[2025-07-27 19:52:57] Experiment directory created at /nvme-data/Komal/documents/results/VisualCloze/lora/depth
[2025-07-27 19:52:58] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-27 19:52:58] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-27 19:53:58] Dataset contains 205,841
[2025-07-27 19:54:01] Training for 2000 epochs...
[2025-07-27 19:54:01] Beginning epoch 0...
[2025-07-27 19:54:12] (step=0000001) Train Loss: 0.2763, Train Steps/Sec: 0.09, Epoch: 1.9432568985619897e-05, LR: 0.001
[2025-07-27 19:54:20] (step=0000002) Train Loss: 0.2919, Train Steps/Sec: 0.12, Epoch: 3.8865137971239795e-05, LR: 0.001
[2025-07-27 19:54:28] (step=0000003) Train Loss: 0.2478, Train Steps/Sec: 0.12, Epoch: 5.82977069568597e-05, LR: 0.001
[2025-07-27 19:54:36] (step=0000004) Train Loss: 0.3027, Train Steps/Sec: 0.13, Epoch: 7.773027594247959e-05, LR: 0.001
[2025-07-27 19:54:44] (step=0000005) Train Loss: 0.2705, Train Steps/Sec: 0.12, Epoch: 9.71628449280995e-05, LR: 0.001
[2025-07-27 19:54:52] (step=0000006) Train Loss: 0.2694, Train Steps/Sec: 0.12, Epoch: 0.0001165954139137194, LR: 0.001
[2025-07-27 19:55:01] (step=0000007) Train Loss: 0.3011, Train Steps/Sec: 0.12, Epoch: 0.0001360279828993393, LR: 0.001
[2025-07-27 19:55:09] (step=0000008) Train Loss: 0.2686, Train Steps/Sec: 0.12, Epoch: 0.00015546055188495918, LR: 0.001
[2025-07-27 19:55:17] (step=0000009) Train Loss: 0.2553, Train Steps/Sec: 0.12, Epoch: 0.00017489312087057908, LR: 0.001
[2025-07-27 19:55:25] (step=0000010) Train Loss: 0.2940, Train Steps/Sec: 0.12, Epoch: 0.000194325689856199, LR: 0.001
[2025-07-27 19:55:33] (step=0000011) Train Loss: 0.3032, Train Steps/Sec: 0.12, Epoch: 0.0002137582588418189, LR: 0.001
[2025-07-27 19:55:41] (step=0000012) Train Loss: 0.2278, Train Steps/Sec: 0.12, Epoch: 0.0002331908278274388, LR: 0.001
[2025-07-27 19:55:49] (step=0000013) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.0002526233968130587, LR: 0.001
[2025-07-27 19:55:57] (step=0000014) Train Loss: 0.3077, Train Steps/Sec: 0.13, Epoch: 0.0002720559657986786, LR: 0.001
[2025-07-27 19:56:05] (step=0000015) Train Loss: 0.2789, Train Steps/Sec: 0.12, Epoch: 0.0002914885347842985, LR: 0.001
[2025-07-27 19:56:13] (step=0000016) Train Loss: 0.2582, Train Steps/Sec: 0.12, Epoch: 0.00031092110376991836, LR: 0.001
[2025-07-27 19:56:21] (step=0000017) Train Loss: 0.2466, Train Steps/Sec: 0.12, Epoch: 0.00033035367275553826, LR: 0.001
[2025-07-27 19:56:29] (step=0000018) Train Loss: 0.2692, Train Steps/Sec: 0.12, Epoch: 0.00034978624174115817, LR: 0.001
[2025-07-27 19:56:37] (step=0000019) Train Loss: 0.2715, Train Steps/Sec: 0.12, Epoch: 0.00036921881072677807, LR: 0.001
[2025-07-27 19:56:45] (step=0000020) Train Loss: 0.2189, Train Steps/Sec: 0.12, Epoch: 0.000388651379712398, LR: 0.001
[2025-07-27 19:56:53] (step=0000021) Train Loss: 0.2194, Train Steps/Sec: 0.13, Epoch: 0.0004080839486980179, LR: 0.001
[2025-07-27 19:57:01] (step=0000022) Train Loss: 0.2153, Train Steps/Sec: 0.12, Epoch: 0.0004275165176836378, LR: 0.001
[2025-07-27 19:57:09] (step=0000023) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.0004469490866692577, LR: 0.001
[2025-07-27 19:57:17] (step=0000024) Train Loss: 0.2842, Train Steps/Sec: 0.12, Epoch: 0.0004663816556548776, LR: 0.001
[2025-07-27 19:57:25] (step=0000025) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.0004858142246404975, LR: 0.001
[2025-07-27 19:57:34] (step=0000026) Train Loss: 0.1923, Train Steps/Sec: 0.12, Epoch: 0.0005052467936261174, LR: 0.001
[2025-07-27 19:57:41] (step=0000027) Train Loss: 0.2611, Train Steps/Sec: 0.13, Epoch: 0.0005246793626117373, LR: 0.001
[2025-07-27 19:57:50] (step=0000028) Train Loss: 0.3150, Train Steps/Sec: 0.12, Epoch: 0.0005441119315973572, LR: 0.001
[2025-07-27 19:57:58] (step=0000029) Train Loss: 0.2399, Train Steps/Sec: 0.12, Epoch: 0.0005635445005829771, LR: 0.001
[2025-07-27 19:58:06] (step=0000030) Train Loss: 0.2967, Train Steps/Sec: 0.12, Epoch: 0.000582977069568597, LR: 0.001
[2025-07-27 19:58:14] (step=0000031) Train Loss: 0.2773, Train Steps/Sec: 0.13, Epoch: 0.0006024096385542169, LR: 0.001
[2025-07-27 19:58:22] (step=0000032) Train Loss: 0.3011, Train Steps/Sec: 0.12, Epoch: 0.0006218422075398367, LR: 0.001
[2025-07-27 19:58:28] (step=0000033) Train Loss: 0.1966, Train Steps/Sec: 0.15, Epoch: 0.0006412747765254566, LR: 0.001
[2025-07-27 19:58:35] (step=0000034) Train Loss: 0.1931, Train Steps/Sec: 0.15, Epoch: 0.0006607073455110765, LR: 0.001
[2025-07-27 19:58:43] (step=0000035) Train Loss: 0.2151, Train Steps/Sec: 0.12, Epoch: 0.0006801399144966964, LR: 0.001
[2025-07-27 19:58:51] (step=0000036) Train Loss: 0.3416, Train Steps/Sec: 0.12, Epoch: 0.0006995724834823163, LR: 0.001
[2025-07-27 19:58:59] (step=0000037) Train Loss: 0.2982, Train Steps/Sec: 0.13, Epoch: 0.0007190050524679362, LR: 0.001
[2025-07-27 19:59:07] (step=0000038) Train Loss: 0.2114, Train Steps/Sec: 0.12, Epoch: 0.0007384376214535561, LR: 0.001
[2025-07-27 19:59:15] (step=0000039) Train Loss: 0.1834, Train Steps/Sec: 0.12, Epoch: 0.000757870190439176, LR: 0.001
[2025-07-27 19:59:23] (step=0000040) Train Loss: 0.1878, Train Steps/Sec: 0.12, Epoch: 0.000777302759424796, LR: 0.001
[2025-07-27 19:59:31] (step=0000041) Train Loss: 0.2172, Train Steps/Sec: 0.13, Epoch: 0.0007967353284104159, LR: 0.001
[2025-07-27 19:59:40] (step=0000042) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.0008161678973960358, LR: 0.001
[2025-07-27 19:59:48] (step=0000043) Train Loss: 0.2382, Train Steps/Sec: 0.12, Epoch: 0.0008356004663816557, LR: 0.001
[2025-07-27 19:59:56] (step=0000044) Train Loss: 0.3374, Train Steps/Sec: 0.12, Epoch: 0.0008550330353672756, LR: 0.001
[2025-07-27 20:00:04] (step=0000045) Train Loss: 0.2600, Train Steps/Sec: 0.12, Epoch: 0.0008744656043528955, LR: 0.001
[2025-07-27 20:00:12] (step=0000046) Train Loss: 0.2162, Train Steps/Sec: 0.12, Epoch: 0.0008938981733385154, LR: 0.001
[2025-07-27 20:00:20] (step=0000047) Train Loss: 0.2266, Train Steps/Sec: 0.12, Epoch: 0.0009133307423241353, LR: 0.001
[2025-07-27 20:00:28] (step=0000048) Train Loss: 0.2300, Train Steps/Sec: 0.12, Epoch: 0.0009327633113097552, LR: 0.001
[2025-07-27 20:00:36] (step=0000049) Train Loss: 0.2508, Train Steps/Sec: 0.12, Epoch: 0.0009521958802953751, LR: 0.001
[2025-07-27 20:00:44] (step=0000050) Train Loss: 0.3138, Train Steps/Sec: 0.12, Epoch: 0.000971628449280995, LR: 0.001
[2025-07-27 20:00:52] (step=0000051) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.000991061018266615, LR: 0.001
[2025-07-27 20:01:00] (step=0000052) Train Loss: 0.2117, Train Steps/Sec: 0.12, Epoch: 0.0010104935872522348, LR: 0.001
[2025-07-27 20:01:08] (step=0000053) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.0010299261562378547, LR: 0.001
[2025-07-27 20:01:16] (step=0000054) Train Loss: 0.2349, Train Steps/Sec: 0.12, Epoch: 0.0010493587252234746, LR: 0.001
[2025-07-27 20:01:24] (step=0000055) Train Loss: 0.2789, Train Steps/Sec: 0.12, Epoch: 0.0010687912942090945, LR: 0.001
[2025-07-27 20:01:32] (step=0000056) Train Loss: 0.2771, Train Steps/Sec: 0.13, Epoch: 0.0010882238631947144, LR: 0.001
[2025-07-27 20:01:41] (step=0000057) Train Loss: 0.2997, Train Steps/Sec: 0.12, Epoch: 0.0011076564321803343, LR: 0.001
[2025-07-27 20:01:49] (step=0000058) Train Loss: 0.2409, Train Steps/Sec: 0.12, Epoch: 0.0011270890011659542, LR: 0.001
[2025-07-27 20:01:57] (step=0000059) Train Loss: 0.2564, Train Steps/Sec: 0.12, Epoch: 0.0011465215701515741, LR: 0.001
[2025-07-27 20:02:05] (step=0000060) Train Loss: 0.2459, Train Steps/Sec: 0.12, Epoch: 0.001165954139137194, LR: 0.001
[2025-07-27 20:02:13] (step=0000061) Train Loss: 0.2854, Train Steps/Sec: 0.12, Epoch: 0.001185386708122814, LR: 0.001
[2025-07-27 20:02:21] (step=0000062) Train Loss: 0.3201, Train Steps/Sec: 0.12, Epoch: 0.0012048192771084338, LR: 0.001
[2025-07-27 20:02:29] (step=0000063) Train Loss: 0.2857, Train Steps/Sec: 0.12, Epoch: 0.0012242518460940535, LR: 0.001
[2025-07-27 20:02:37] (step=0000064) Train Loss: 0.2726, Train Steps/Sec: 0.13, Epoch: 0.0012436844150796734, LR: 0.001
[2025-07-27 20:02:45] (step=0000065) Train Loss: 0.2100, Train Steps/Sec: 0.12, Epoch: 0.0012631169840652933, LR: 0.001
[2025-07-27 20:02:51] (step=0000066) Train Loss: 0.2253, Train Steps/Sec: 0.16, Epoch: 0.0012825495530509132, LR: 0.001
[2025-07-27 20:02:59] (step=0000067) Train Loss: 0.2192, Train Steps/Sec: 0.13, Epoch: 0.0013019821220365331, LR: 0.001
[2025-07-27 20:03:07] (step=0000068) Train Loss: 0.2809, Train Steps/Sec: 0.12, Epoch: 0.001321414691022153, LR: 0.001
[2025-07-27 20:03:15] (step=0000069) Train Loss: 0.2426, Train Steps/Sec: 0.12, Epoch: 0.001340847260007773, LR: 0.001
[2025-07-27 20:03:23] (step=0000070) Train Loss: 0.1510, Train Steps/Sec: 0.12, Epoch: 0.0013602798289933929, LR: 0.001
[2025-07-27 20:03:31] (step=0000071) Train Loss: 0.2540, Train Steps/Sec: 0.12, Epoch: 0.0013797123979790128, LR: 0.001
[2025-07-27 20:03:39] (step=0000072) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.0013991449669646327, LR: 0.001
[2025-07-27 20:03:47] (step=0000073) Train Loss: 0.2057, Train Steps/Sec: 0.13, Epoch: 0.0014185775359502526, LR: 0.001
[2025-07-27 20:03:55] (step=0000074) Train Loss: 0.2550, Train Steps/Sec: 0.13, Epoch: 0.0014380101049358725, LR: 0.001
[2025-07-27 20:04:03] (step=0000075) Train Loss: 0.2184, Train Steps/Sec: 0.13, Epoch: 0.0014574426739214924, LR: 0.001
[2025-07-27 20:04:11] (step=0000076) Train Loss: 0.2076, Train Steps/Sec: 0.12, Epoch: 0.0014768752429071123, LR: 0.001
[2025-07-27 20:04:19] (step=0000077) Train Loss: 0.2500, Train Steps/Sec: 0.13, Epoch: 0.0014963078118927322, LR: 0.001
[2025-07-27 20:04:27] (step=0000078) Train Loss: 0.1971, Train Steps/Sec: 0.13, Epoch: 0.001515740380878352, LR: 0.001
[2025-07-27 20:04:35] (step=0000079) Train Loss: 0.2430, Train Steps/Sec: 0.13, Epoch: 0.001535172949863972, LR: 0.001
[2025-07-27 20:04:43] (step=0000080) Train Loss: 0.2123, Train Steps/Sec: 0.13, Epoch: 0.001554605518849592, LR: 0.001
[2025-07-27 20:04:51] (step=0000081) Train Loss: 0.1928, Train Steps/Sec: 0.12, Epoch: 0.0015740380878352118, LR: 0.001
[2025-07-27 20:04:59] (step=0000082) Train Loss: 0.2358, Train Steps/Sec: 0.13, Epoch: 0.0015934706568208317, LR: 0.001
[2025-07-27 20:05:07] (step=0000083) Train Loss: 0.2942, Train Steps/Sec: 0.13, Epoch: 0.0016129032258064516, LR: 0.001
[2025-07-27 20:05:15] (step=0000084) Train Loss: 0.2643, Train Steps/Sec: 0.13, Epoch: 0.0016323357947920715, LR: 0.001
[2025-07-27 20:05:23] (step=0000085) Train Loss: 0.2570, Train Steps/Sec: 0.13, Epoch: 0.0016517683637776914, LR: 0.001
[2025-07-27 20:05:31] (step=0000086) Train Loss: 0.1902, Train Steps/Sec: 0.12, Epoch: 0.0016712009327633113, LR: 0.001
[2025-07-27 20:05:39] (step=0000087) Train Loss: 0.1716, Train Steps/Sec: 0.13, Epoch: 0.0016906335017489312, LR: 0.001
[2025-07-27 20:05:47] (step=0000088) Train Loss: 0.2318, Train Steps/Sec: 0.12, Epoch: 0.0017100660707345511, LR: 0.001
[2025-07-27 20:05:55] (step=0000089) Train Loss: 0.3302, Train Steps/Sec: 0.13, Epoch: 0.001729498639720171, LR: 0.001
[2025-07-27 20:06:03] (step=0000090) Train Loss: 0.2776, Train Steps/Sec: 0.13, Epoch: 0.001748931208705791, LR: 0.001
[2025-07-27 20:06:11] (step=0000091) Train Loss: 0.2617, Train Steps/Sec: 0.12, Epoch: 0.0017683637776914108, LR: 0.001
[2025-07-27 20:06:19] (step=0000092) Train Loss: 0.2223, Train Steps/Sec: 0.13, Epoch: 0.0017877963466770307, LR: 0.001
[2025-07-27 20:06:27] (step=0000093) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.0018072289156626507, LR: 0.001
[2025-07-27 20:06:35] (step=0000094) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.0018266614846482706, LR: 0.001
[2025-07-27 20:06:43] (step=0000095) Train Loss: 0.2366, Train Steps/Sec: 0.13, Epoch: 0.0018460940536338905, LR: 0.001
[2025-07-27 20:06:51] (step=0000096) Train Loss: 0.2706, Train Steps/Sec: 0.12, Epoch: 0.0018655266226195104, LR: 0.001
[2025-07-27 20:06:59] (step=0000097) Train Loss: 0.2249, Train Steps/Sec: 0.13, Epoch: 0.0018849591916051303, LR: 0.001
[2025-07-27 20:07:07] (step=0000098) Train Loss: 0.2289, Train Steps/Sec: 0.12, Epoch: 0.0019043917605907502, LR: 0.001
[2025-07-27 20:07:13] (step=0000099) Train Loss: 0.2985, Train Steps/Sec: 0.19, Epoch: 0.00192382432957637, LR: 0.001
[2025-07-27 20:07:21] (step=0000100) Train Loss: 0.3072, Train Steps/Sec: 0.12, Epoch: 0.00194325689856199, LR: 0.001
[2025-07-27 20:07:29] (step=0000101) Train Loss: 0.2772, Train Steps/Sec: 0.12, Epoch: 0.0019626894675476097, LR: 0.001
[2025-07-27 20:07:37] (step=0000102) Train Loss: 0.3058, Train Steps/Sec: 0.12, Epoch: 0.00198212203653323, LR: 0.001
[2025-07-27 20:07:45] (step=0000103) Train Loss: 0.2560, Train Steps/Sec: 0.12, Epoch: 0.0020015546055188495, LR: 0.001
[2025-07-27 20:07:53] (step=0000104) Train Loss: 0.1682, Train Steps/Sec: 0.12, Epoch: 0.0020209871745044696, LR: 0.001
[2025-07-27 20:08:01] (step=0000105) Train Loss: 0.2188, Train Steps/Sec: 0.12, Epoch: 0.0020404197434900893, LR: 0.001
[2025-07-27 20:08:09] (step=0000106) Train Loss: 0.2238, Train Steps/Sec: 0.12, Epoch: 0.0020598523124757094, LR: 0.001
[2025-07-27 20:08:17] (step=0000107) Train Loss: 0.1893, Train Steps/Sec: 0.12, Epoch: 0.002079284881461329, LR: 0.001
[2025-07-27 20:08:25] (step=0000108) Train Loss: 0.2157, Train Steps/Sec: 0.12, Epoch: 0.002098717450446949, LR: 0.001
[2025-07-27 20:08:33] (step=0000109) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.002118150019432569, LR: 0.001
[2025-07-27 20:08:41] (step=0000110) Train Loss: 0.2497, Train Steps/Sec: 0.13, Epoch: 0.002137582588418189, LR: 0.001
[2025-07-27 20:08:49] (step=0000111) Train Loss: 0.2825, Train Steps/Sec: 0.12, Epoch: 0.0021570151574038087, LR: 0.001
[2025-07-27 20:08:57] (step=0000112) Train Loss: 0.2710, Train Steps/Sec: 0.12, Epoch: 0.002176447726389429, LR: 0.001
[2025-07-27 20:09:05] (step=0000113) Train Loss: 0.2917, Train Steps/Sec: 0.13, Epoch: 0.0021958802953750485, LR: 0.001
[2025-07-27 20:09:13] (step=0000114) Train Loss: 0.2873, Train Steps/Sec: 0.12, Epoch: 0.0022153128643606686, LR: 0.001
[2025-07-27 20:09:21] (step=0000115) Train Loss: 0.2077, Train Steps/Sec: 0.12, Epoch: 0.0022347454333462883, LR: 0.001
[2025-07-27 20:09:29] (step=0000116) Train Loss: 0.2563, Train Steps/Sec: 0.12, Epoch: 0.0022541780023319084, LR: 0.001
[2025-07-27 20:09:38] (step=0000117) Train Loss: 0.2663, Train Steps/Sec: 0.12, Epoch: 0.002273610571317528, LR: 0.001
[2025-07-27 20:09:46] (step=0000118) Train Loss: 0.2058, Train Steps/Sec: 0.13, Epoch: 0.0022930431403031483, LR: 0.001
[2025-07-27 20:09:54] (step=0000119) Train Loss: 0.2992, Train Steps/Sec: 0.12, Epoch: 0.002312475709288768, LR: 0.001
[2025-07-27 20:10:02] (step=0000120) Train Loss: 0.2732, Train Steps/Sec: 0.13, Epoch: 0.002331908278274388, LR: 0.001
[2025-07-27 20:10:10] (step=0000121) Train Loss: 0.2827, Train Steps/Sec: 0.12, Epoch: 0.0023513408472600078, LR: 0.001
[2025-07-27 20:10:18] (step=0000122) Train Loss: 0.3050, Train Steps/Sec: 0.12, Epoch: 0.002370773416245628, LR: 0.001
[2025-07-27 20:10:26] (step=0000123) Train Loss: 0.2026, Train Steps/Sec: 0.12, Epoch: 0.0023902059852312476, LR: 0.001
[2025-07-27 20:10:34] (step=0000124) Train Loss: 0.2050, Train Steps/Sec: 0.12, Epoch: 0.0024096385542168677, LR: 0.001
[2025-07-27 20:10:42] (step=0000125) Train Loss: 0.2060, Train Steps/Sec: 0.13, Epoch: 0.0024290711232024874, LR: 0.001
[2025-07-27 20:10:50] (step=0000126) Train Loss: 0.2502, Train Steps/Sec: 0.12, Epoch: 0.002448503692188107, LR: 0.001
[2025-07-27 20:10:58] (step=0000127) Train Loss: 0.2362, Train Steps/Sec: 0.12, Epoch: 0.002467936261173727, LR: 0.001
[2025-07-27 20:11:06] (step=0000128) Train Loss: 0.2714, Train Steps/Sec: 0.12, Epoch: 0.002487368830159347, LR: 0.001
[2025-07-27 20:11:14] (step=0000129) Train Loss: 0.2877, Train Steps/Sec: 0.12, Epoch: 0.002506801399144967, LR: 0.001
[2025-07-27 20:11:22] (step=0000130) Train Loss: 0.2331, Train Steps/Sec: 0.13, Epoch: 0.0025262339681305867, LR: 0.001
[2025-07-27 20:11:30] (step=0000131) Train Loss: 0.2436, Train Steps/Sec: 0.12, Epoch: 0.002545666537116207, LR: 0.001
[2025-07-27 20:11:36] (step=0000132) Train Loss: 0.2678, Train Steps/Sec: 0.18, Epoch: 0.0025650991061018265, LR: 0.001
[2025-07-27 20:11:44] (step=0000133) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.0025845316750874466, LR: 0.001
[2025-07-27 20:11:52] (step=0000134) Train Loss: 0.3067, Train Steps/Sec: 0.13, Epoch: 0.0026039642440730663, LR: 0.001
[2025-07-27 20:12:00] (step=0000135) Train Loss: 0.2041, Train Steps/Sec: 0.12, Epoch: 0.0026233968130586864, LR: 0.001
[2025-07-27 20:12:08] (step=0000136) Train Loss: 0.2820, Train Steps/Sec: 0.13, Epoch: 0.002642829382044306, LR: 0.001
[2025-07-27 20:12:16] (step=0000137) Train Loss: 0.2515, Train Steps/Sec: 0.12, Epoch: 0.0026622619510299262, LR: 0.001
[2025-07-27 20:12:24] (step=0000138) Train Loss: 0.2633, Train Steps/Sec: 0.12, Epoch: 0.002681694520015546, LR: 0.001
[2025-07-27 20:12:32] (step=0000139) Train Loss: 0.2109, Train Steps/Sec: 0.12, Epoch: 0.002701127089001166, LR: 0.001
[2025-07-27 20:12:40] (step=0000140) Train Loss: 0.2725, Train Steps/Sec: 0.12, Epoch: 0.0027205596579867857, LR: 0.001
[2025-07-27 20:12:48] (step=0000141) Train Loss: 0.2724, Train Steps/Sec: 0.12, Epoch: 0.002739992226972406, LR: 0.001
[2025-07-27 20:12:56] (step=0000142) Train Loss: 0.2413, Train Steps/Sec: 0.13, Epoch: 0.0027594247959580255, LR: 0.001
[2025-07-27 20:13:04] (step=0000143) Train Loss: 0.3327, Train Steps/Sec: 0.12, Epoch: 0.0027788573649436456, LR: 0.001
[2025-07-27 20:13:12] (step=0000144) Train Loss: 0.3171, Train Steps/Sec: 0.12, Epoch: 0.0027982899339292653, LR: 0.001
[2025-07-27 20:13:20] (step=0000145) Train Loss: 0.2489, Train Steps/Sec: 0.12, Epoch: 0.0028177225029148855, LR: 0.001
[2025-07-27 20:13:28] (step=0000146) Train Loss: 0.1963, Train Steps/Sec: 0.12, Epoch: 0.002837155071900505, LR: 0.001
[2025-07-27 20:13:36] (step=0000147) Train Loss: 0.2421, Train Steps/Sec: 0.12, Epoch: 0.0028565876408861253, LR: 0.001
[2025-07-27 20:13:44] (step=0000148) Train Loss: 0.2661, Train Steps/Sec: 0.12, Epoch: 0.002876020209871745, LR: 0.001
[2025-07-27 20:13:53] (step=0000149) Train Loss: 0.2011, Train Steps/Sec: 0.12, Epoch: 0.002895452778857365, LR: 0.001
[2025-07-27 20:14:01] (step=0000150) Train Loss: 0.2324, Train Steps/Sec: 0.12, Epoch: 0.0029148853478429848, LR: 0.001
[2025-07-27 20:14:09] (step=0000151) Train Loss: 0.3833, Train Steps/Sec: 0.12, Epoch: 0.002934317916828605, LR: 0.001
[2025-07-27 20:14:17] (step=0000152) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.0029537504858142246, LR: 0.001
[2025-07-27 20:14:25] (step=0000153) Train Loss: 0.1673, Train Steps/Sec: 0.12, Epoch: 0.0029731830547998447, LR: 0.001
[2025-07-27 20:14:33] (step=0000154) Train Loss: 0.2501, Train Steps/Sec: 0.13, Epoch: 0.0029926156237854644, LR: 0.001
[2025-07-27 20:14:41] (step=0000155) Train Loss: 0.2950, Train Steps/Sec: 0.12, Epoch: 0.0030120481927710845, LR: 0.001
[2025-07-27 20:14:49] (step=0000156) Train Loss: 0.2472, Train Steps/Sec: 0.13, Epoch: 0.003031480761756704, LR: 0.001
[2025-07-27 20:14:57] (step=0000157) Train Loss: 0.2149, Train Steps/Sec: 0.12, Epoch: 0.0030509133307423243, LR: 0.001
[2025-07-27 20:15:05] (step=0000158) Train Loss: 0.2408, Train Steps/Sec: 0.12, Epoch: 0.003070345899727944, LR: 0.001
[2025-07-27 20:15:13] (step=0000159) Train Loss: 0.3428, Train Steps/Sec: 0.12, Epoch: 0.003089778468713564, LR: 0.001
[2025-07-27 20:15:21] (step=0000160) Train Loss: 0.2963, Train Steps/Sec: 0.12, Epoch: 0.003109211037699184, LR: 0.001
[2025-07-27 20:15:29] (step=0000161) Train Loss: 0.2003, Train Steps/Sec: 0.12, Epoch: 0.003128643606684804, LR: 0.001
[2025-07-27 20:15:37] (step=0000162) Train Loss: 0.2381, Train Steps/Sec: 0.12, Epoch: 0.0031480761756704236, LR: 0.001
[2025-07-27 20:15:45] (step=0000163) Train Loss: 0.2164, Train Steps/Sec: 0.13, Epoch: 0.0031675087446560437, LR: 0.001
[2025-07-27 20:15:53] (step=0000164) Train Loss: 0.1799, Train Steps/Sec: 0.13, Epoch: 0.0031869413136416634, LR: 0.001
[2025-07-27 20:15:59] (step=0000165) Train Loss: 0.2941, Train Steps/Sec: 0.16, Epoch: 0.0032063738826272835, LR: 0.001
[2025-07-27 20:16:07] (step=0000166) Train Loss: 0.2051, Train Steps/Sec: 0.12, Epoch: 0.0032258064516129032, LR: 0.001
[2025-07-27 20:16:15] (step=0000167) Train Loss: 0.2600, Train Steps/Sec: 0.12, Epoch: 0.003245239020598523, LR: 0.001
[2025-07-27 20:16:23] (step=0000168) Train Loss: 0.2872, Train Steps/Sec: 0.12, Epoch: 0.003264671589584143, LR: 0.001
[2025-07-27 20:16:31] (step=0000169) Train Loss: 0.2334, Train Steps/Sec: 0.12, Epoch: 0.0032841041585697627, LR: 0.001
[2025-07-27 20:16:39] (step=0000170) Train Loss: 0.3094, Train Steps/Sec: 0.12, Epoch: 0.003303536727555383, LR: 0.001
[2025-07-27 20:16:47] (step=0000171) Train Loss: 0.2613, Train Steps/Sec: 0.12, Epoch: 0.0033229692965410025, LR: 0.001
[2025-07-27 20:16:56] (step=0000172) Train Loss: 0.2730, Train Steps/Sec: 0.12, Epoch: 0.0033424018655266226, LR: 0.001
[2025-07-27 20:17:04] (step=0000173) Train Loss: 0.3100, Train Steps/Sec: 0.12, Epoch: 0.0033618344345122423, LR: 0.001
[2025-07-27 20:17:12] (step=0000174) Train Loss: 0.3326, Train Steps/Sec: 0.12, Epoch: 0.0033812670034978625, LR: 0.001
[2025-07-27 20:17:20] (step=0000175) Train Loss: 0.2737, Train Steps/Sec: 0.13, Epoch: 0.003400699572483482, LR: 0.001
[2025-07-27 20:17:28] (step=0000176) Train Loss: 0.2691, Train Steps/Sec: 0.12, Epoch: 0.0034201321414691023, LR: 0.001
[2025-07-27 20:17:36] (step=0000177) Train Loss: 0.2925, Train Steps/Sec: 0.12, Epoch: 0.003439564710454722, LR: 0.001
[2025-07-27 20:17:44] (step=0000178) Train Loss: 0.2622, Train Steps/Sec: 0.12, Epoch: 0.003458997279440342, LR: 0.001
[2025-07-27 20:17:52] (step=0000179) Train Loss: 0.2246, Train Steps/Sec: 0.12, Epoch: 0.0034784298484259618, LR: 0.001
[2025-07-27 20:18:00] (step=0000180) Train Loss: 0.2466, Train Steps/Sec: 0.12, Epoch: 0.003497862417411582, LR: 0.001
[2025-07-27 20:18:08] (step=0000181) Train Loss: 0.2382, Train Steps/Sec: 0.13, Epoch: 0.0035172949863972016, LR: 0.001
[2025-07-27 20:18:16] (step=0000182) Train Loss: 0.2205, Train Steps/Sec: 0.12, Epoch: 0.0035367275553828217, LR: 0.001
[2025-07-27 20:18:24] (step=0000183) Train Loss: 0.2809, Train Steps/Sec: 0.12, Epoch: 0.0035561601243684414, LR: 0.001
[2025-07-27 20:18:32] (step=0000184) Train Loss: 0.2491, Train Steps/Sec: 0.13, Epoch: 0.0035755926933540615, LR: 0.001
[2025-07-27 20:18:40] (step=0000185) Train Loss: 0.2709, Train Steps/Sec: 0.12, Epoch: 0.003595025262339681, LR: 0.001
[2025-07-27 20:18:48] (step=0000186) Train Loss: 0.3090, Train Steps/Sec: 0.12, Epoch: 0.0036144578313253013, LR: 0.001
[2025-07-27 20:18:56] (step=0000187) Train Loss: 0.1652, Train Steps/Sec: 0.12, Epoch: 0.003633890400310921, LR: 0.001
[2025-07-27 20:19:04] (step=0000188) Train Loss: 0.2244, Train Steps/Sec: 0.12, Epoch: 0.003653322969296541, LR: 0.001
[2025-07-27 20:19:12] (step=0000189) Train Loss: 0.2647, Train Steps/Sec: 0.13, Epoch: 0.003672755538282161, LR: 0.001
[2025-07-27 20:19:20] (step=0000190) Train Loss: 0.1680, Train Steps/Sec: 0.13, Epoch: 0.003692188107267781, LR: 0.001
[2025-07-27 20:19:28] (step=0000191) Train Loss: 0.2829, Train Steps/Sec: 0.12, Epoch: 0.0037116206762534006, LR: 0.001
[2025-07-27 20:19:36] (step=0000192) Train Loss: 0.2512, Train Steps/Sec: 0.13, Epoch: 0.0037310532452390207, LR: 0.001
[2025-07-27 20:19:44] (step=0000193) Train Loss: 0.2187, Train Steps/Sec: 0.12, Epoch: 0.0037504858142246404, LR: 0.001
[2025-07-27 20:19:52] (step=0000194) Train Loss: 0.3259, Train Steps/Sec: 0.13, Epoch: 0.0037699183832102605, LR: 0.001
[2025-07-27 20:20:00] (step=0000195) Train Loss: 0.3265, Train Steps/Sec: 0.13, Epoch: 0.0037893509521958802, LR: 0.001
[2025-07-27 20:20:08] (step=0000196) Train Loss: 0.2403, Train Steps/Sec: 0.12, Epoch: 0.0038087835211815003, LR: 0.001
[2025-07-27 20:20:15] (step=0000197) Train Loss: 0.2251, Train Steps/Sec: 0.15, Epoch: 0.00382821609016712, LR: 0.001
[2025-07-27 20:20:22] (step=0000198) Train Loss: 0.2720, Train Steps/Sec: 0.13, Epoch: 0.00384764865915274, LR: 0.001
[2025-07-27 20:20:30] (step=0000199) Train Loss: 0.1977, Train Steps/Sec: 0.13, Epoch: 0.00386708122813836, LR: 0.001
[2025-07-27 20:20:38] (step=0000200) Train Loss: 0.2300, Train Steps/Sec: 0.12, Epoch: 0.00388651379712398, LR: 0.001
[2025-07-27 20:20:46] (step=0000201) Train Loss: 0.2584, Train Steps/Sec: 0.13, Epoch: 0.0039059463661095997, LR: 0.001
[2025-07-27 20:20:54] (step=0000202) Train Loss: 0.2023, Train Steps/Sec: 0.12, Epoch: 0.003925378935095219, LR: 0.001
[2025-07-27 20:21:02] (step=0000203) Train Loss: 0.2699, Train Steps/Sec: 0.12, Epoch: 0.00394481150408084, LR: 0.001
[2025-07-27 20:21:11] (step=0000204) Train Loss: 0.2311, Train Steps/Sec: 0.12, Epoch: 0.00396424407306646, LR: 0.001
[2025-07-27 20:21:19] (step=0000205) Train Loss: 0.2747, Train Steps/Sec: 0.12, Epoch: 0.003983676642052079, LR: 0.001
[2025-07-27 20:21:27] (step=0000206) Train Loss: 0.3325, Train Steps/Sec: 0.12, Epoch: 0.004003109211037699, LR: 0.001
[2025-07-27 20:21:35] (step=0000207) Train Loss: 0.1948, Train Steps/Sec: 0.12, Epoch: 0.0040225417800233195, LR: 0.001
[2025-07-27 20:21:43] (step=0000208) Train Loss: 0.2185, Train Steps/Sec: 0.12, Epoch: 0.004041974349008939, LR: 0.001
[2025-07-27 20:21:51] (step=0000209) Train Loss: 0.2705, Train Steps/Sec: 0.12, Epoch: 0.004061406917994559, LR: 0.001
[2025-07-27 20:21:59] (step=0000210) Train Loss: 0.2191, Train Steps/Sec: 0.12, Epoch: 0.004080839486980179, LR: 0.001
[2025-07-27 20:22:07] (step=0000211) Train Loss: 0.3210, Train Steps/Sec: 0.12, Epoch: 0.004100272055965798, LR: 0.001
[2025-07-27 20:22:15] (step=0000212) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.004119704624951419, LR: 0.001
[2025-07-27 20:22:23] (step=0000213) Train Loss: 0.2534, Train Steps/Sec: 0.12, Epoch: 0.0041391371939370385, LR: 0.001
[2025-07-27 20:22:31] (step=0000214) Train Loss: 0.2570, Train Steps/Sec: 0.12, Epoch: 0.004158569762922658, LR: 0.001
[2025-07-27 20:22:39] (step=0000215) Train Loss: 0.2753, Train Steps/Sec: 0.13, Epoch: 0.004178002331908278, LR: 0.001
[2025-07-27 20:22:47] (step=0000216) Train Loss: 0.2544, Train Steps/Sec: 0.12, Epoch: 0.004197434900893898, LR: 0.001
[2025-07-27 20:22:55] (step=0000217) Train Loss: 0.1431, Train Steps/Sec: 0.12, Epoch: 0.004216867469879518, LR: 0.001
[2025-07-27 20:23:03] (step=0000218) Train Loss: 0.2690, Train Steps/Sec: 0.13, Epoch: 0.004236300038865138, LR: 0.001
[2025-07-27 20:23:11] (step=0000219) Train Loss: 0.2329, Train Steps/Sec: 0.12, Epoch: 0.0042557326078507575, LR: 0.001
[2025-07-27 20:23:19] (step=0000220) Train Loss: 0.2744, Train Steps/Sec: 0.12, Epoch: 0.004275165176836378, LR: 0.001
[2025-07-27 20:23:27] (step=0000221) Train Loss: 0.2810, Train Steps/Sec: 0.12, Epoch: 0.004294597745821998, LR: 0.001
[2025-07-27 20:23:36] (step=0000222) Train Loss: 0.2793, Train Steps/Sec: 0.12, Epoch: 0.004314030314807617, LR: 0.001
[2025-07-27 20:23:44] (step=0000223) Train Loss: 0.3247, Train Steps/Sec: 0.12, Epoch: 0.004333462883793237, LR: 0.001
[2025-07-27 20:23:52] (step=0000224) Train Loss: 0.1619, Train Steps/Sec: 0.12, Epoch: 0.004352895452778858, LR: 0.001
[2025-07-27 20:24:00] (step=0000225) Train Loss: 0.3282, Train Steps/Sec: 0.13, Epoch: 0.004372328021764477, LR: 0.001
[2025-07-27 20:24:08] (step=0000226) Train Loss: 0.2192, Train Steps/Sec: 0.12, Epoch: 0.004391760590750097, LR: 0.001
[2025-07-27 20:24:16] (step=0000227) Train Loss: 0.1856, Train Steps/Sec: 0.12, Epoch: 0.004411193159735717, LR: 0.001
[2025-07-27 20:24:24] (step=0000228) Train Loss: 0.2671, Train Steps/Sec: 0.13, Epoch: 0.004430625728721337, LR: 0.001
[2025-07-27 20:24:32] (step=0000229) Train Loss: 0.1377, Train Steps/Sec: 0.12, Epoch: 0.004450058297706957, LR: 0.001
[2025-07-27 20:24:38] (step=0000230) Train Loss: 0.2633, Train Steps/Sec: 0.17, Epoch: 0.004469490866692577, LR: 0.001
[2025-07-27 20:24:46] (step=0000231) Train Loss: 0.2881, Train Steps/Sec: 0.12, Epoch: 0.004488923435678196, LR: 0.001
[2025-07-27 20:24:54] (step=0000232) Train Loss: 0.2952, Train Steps/Sec: 0.12, Epoch: 0.004508356004663817, LR: 0.001
[2025-07-27 20:25:02] (step=0000233) Train Loss: 0.2493, Train Steps/Sec: 0.12, Epoch: 0.004527788573649437, LR: 0.001
[2025-07-27 20:25:10] (step=0000234) Train Loss: 0.2560, Train Steps/Sec: 0.12, Epoch: 0.004547221142635056, LR: 0.001
[2025-07-27 20:25:18] (step=0000235) Train Loss: 0.2389, Train Steps/Sec: 0.12, Epoch: 0.004566653711620676, LR: 0.001
[2025-07-27 20:25:26] (step=0000236) Train Loss: 0.2002, Train Steps/Sec: 0.12, Epoch: 0.0045860862806062965, LR: 0.001
[2025-07-27 20:25:34] (step=0000237) Train Loss: 0.2795, Train Steps/Sec: 0.12, Epoch: 0.004605518849591916, LR: 0.001
[2025-07-27 20:25:42] (step=0000238) Train Loss: 0.1839, Train Steps/Sec: 0.12, Epoch: 0.004624951418577536, LR: 0.001
[2025-07-27 20:25:50] (step=0000239) Train Loss: 0.2881, Train Steps/Sec: 0.13, Epoch: 0.004644383987563156, LR: 0.001
[2025-07-27 20:25:58] (step=0000240) Train Loss: 0.3234, Train Steps/Sec: 0.12, Epoch: 0.004663816556548776, LR: 0.001
[2025-07-27 20:26:06] (step=0000241) Train Loss: 0.2111, Train Steps/Sec: 0.13, Epoch: 0.004683249125534396, LR: 0.001
[2025-07-27 20:26:14] (step=0000242) Train Loss: 0.2185, Train Steps/Sec: 0.12, Epoch: 0.0047026816945200155, LR: 0.001
[2025-07-27 20:26:22] (step=0000243) Train Loss: 0.2644, Train Steps/Sec: 0.12, Epoch: 0.004722114263505635, LR: 0.001
[2025-07-27 20:26:30] (step=0000244) Train Loss: 0.2160, Train Steps/Sec: 0.12, Epoch: 0.004741546832491256, LR: 0.001
[2025-07-27 20:26:38] (step=0000245) Train Loss: 0.2266, Train Steps/Sec: 0.12, Epoch: 0.0047609794014768754, LR: 0.001
[2025-07-27 20:26:46] (step=0000246) Train Loss: 0.3068, Train Steps/Sec: 0.12, Epoch: 0.004780411970462495, LR: 0.001
[2025-07-27 20:26:54] (step=0000247) Train Loss: 0.2652, Train Steps/Sec: 0.13, Epoch: 0.004799844539448115, LR: 0.001
[2025-07-27 20:27:02] (step=0000248) Train Loss: 0.2462, Train Steps/Sec: 0.12, Epoch: 0.004819277108433735, LR: 0.001
[2025-07-27 20:27:10] (step=0000249) Train Loss: 0.2299, Train Steps/Sec: 0.12, Epoch: 0.004838709677419355, LR: 0.001
[2025-07-27 20:27:19] (step=0000250) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.004858142246404975, LR: 0.001
[2025-07-27 20:27:27] (step=0000251) Train Loss: 0.2591, Train Steps/Sec: 0.13, Epoch: 0.004877574815390594, LR: 0.001
[2025-07-27 20:27:35] (step=0000252) Train Loss: 0.2369, Train Steps/Sec: 0.12, Epoch: 0.004897007384376214, LR: 0.001
[2025-07-27 20:27:43] (step=0000253) Train Loss: 0.2116, Train Steps/Sec: 0.12, Epoch: 0.004916439953361835, LR: 0.001
[2025-07-27 20:27:51] (step=0000254) Train Loss: 0.3068, Train Steps/Sec: 0.12, Epoch: 0.004935872522347454, LR: 0.001
[2025-07-27 20:27:59] (step=0000255) Train Loss: 0.2685, Train Steps/Sec: 0.12, Epoch: 0.004955305091333074, LR: 0.001
[2025-07-27 20:28:07] (step=0000256) Train Loss: 0.2869, Train Steps/Sec: 0.12, Epoch: 0.004974737660318694, LR: 0.001
[2025-07-27 20:28:15] (step=0000257) Train Loss: 0.1853, Train Steps/Sec: 0.12, Epoch: 0.004994170229304314, LR: 0.001
[2025-07-27 20:28:23] (step=0000258) Train Loss: 0.2604, Train Steps/Sec: 0.12, Epoch: 0.005013602798289934, LR: 0.001
[2025-07-27 20:28:31] (step=0000259) Train Loss: 0.2720, Train Steps/Sec: 0.12, Epoch: 0.005033035367275554, LR: 0.001
[2025-07-27 20:28:39] (step=0000260) Train Loss: 0.3115, Train Steps/Sec: 0.12, Epoch: 0.005052467936261173, LR: 0.001
[2025-07-27 20:28:47] (step=0000261) Train Loss: 0.2676, Train Steps/Sec: 0.13, Epoch: 0.005071900505246794, LR: 0.001
[2025-07-27 20:28:55] (step=0000262) Train Loss: 0.2764, Train Steps/Sec: 0.12, Epoch: 0.005091333074232414, LR: 0.001
[2025-07-27 20:29:01] (step=0000263) Train Loss: 0.2180, Train Steps/Sec: 0.18, Epoch: 0.005110765643218033, LR: 0.001
[2025-07-27 20:29:08] (step=0000264) Train Loss: 0.2936, Train Steps/Sec: 0.13, Epoch: 0.005130198212203653, LR: 0.001
[2025-07-27 20:29:17] (step=0000265) Train Loss: 0.3038, Train Steps/Sec: 0.13, Epoch: 0.0051496307811892735, LR: 0.001
[2025-07-27 20:29:25] (step=0000266) Train Loss: 0.2417, Train Steps/Sec: 0.12, Epoch: 0.005169063350174893, LR: 0.001
[2025-07-27 20:29:33] (step=0000267) Train Loss: 0.2703, Train Steps/Sec: 0.12, Epoch: 0.005188495919160513, LR: 0.001
[2025-07-27 20:29:41] (step=0000268) Train Loss: 0.2318, Train Steps/Sec: 0.13, Epoch: 0.005207928488146133, LR: 0.001
[2025-07-27 20:29:49] (step=0000269) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.005227361057131753, LR: 0.001
[2025-07-27 20:29:57] (step=0000270) Train Loss: 0.2166, Train Steps/Sec: 0.12, Epoch: 0.005246793626117373, LR: 0.001
[2025-07-27 20:30:05] (step=0000271) Train Loss: 0.2045, Train Steps/Sec: 0.12, Epoch: 0.0052662261951029925, LR: 0.001
[2025-07-27 20:30:13] (step=0000272) Train Loss: 0.2169, Train Steps/Sec: 0.12, Epoch: 0.005285658764088612, LR: 0.001
[2025-07-27 20:30:21] (step=0000273) Train Loss: 0.1798, Train Steps/Sec: 0.12, Epoch: 0.005305091333074233, LR: 0.001
[2025-07-27 20:30:29] (step=0000274) Train Loss: 0.1937, Train Steps/Sec: 0.12, Epoch: 0.0053245239020598524, LR: 0.001
[2025-07-27 20:30:37] (step=0000275) Train Loss: 0.2644, Train Steps/Sec: 0.12, Epoch: 0.005343956471045472, LR: 0.001
[2025-07-27 20:30:45] (step=0000276) Train Loss: 0.3097, Train Steps/Sec: 0.12, Epoch: 0.005363389040031092, LR: 0.001
[2025-07-27 20:30:53] (step=0000277) Train Loss: 0.2592, Train Steps/Sec: 0.13, Epoch: 0.005382821609016712, LR: 0.001
[2025-07-27 20:31:01] (step=0000278) Train Loss: 0.2923, Train Steps/Sec: 0.12, Epoch: 0.005402254178002332, LR: 0.001
[2025-07-27 20:31:09] (step=0000279) Train Loss: 0.3051, Train Steps/Sec: 0.12, Epoch: 0.005421686746987952, LR: 0.001
[2025-07-27 20:31:17] (step=0000280) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.005441119315973571, LR: 0.001
[2025-07-27 20:31:25] (step=0000281) Train Loss: 0.2486, Train Steps/Sec: 0.12, Epoch: 0.005460551884959192, LR: 0.001
[2025-07-27 20:31:33] (step=0000282) Train Loss: 0.2484, Train Steps/Sec: 0.13, Epoch: 0.005479984453944812, LR: 0.001
[2025-07-27 20:31:42] (step=0000283) Train Loss: 0.2828, Train Steps/Sec: 0.12, Epoch: 0.005499417022930431, LR: 0.001
[2025-07-27 20:31:50] (step=0000284) Train Loss: 0.2110, Train Steps/Sec: 0.12, Epoch: 0.005518849591916051, LR: 0.001
[2025-07-27 20:31:58] (step=0000285) Train Loss: 0.3143, Train Steps/Sec: 0.13, Epoch: 0.005538282160901672, LR: 0.001
[2025-07-27 20:32:06] (step=0000286) Train Loss: 0.2918, Train Steps/Sec: 0.12, Epoch: 0.005557714729887291, LR: 0.001
[2025-07-27 20:32:14] (step=0000287) Train Loss: 0.2202, Train Steps/Sec: 0.13, Epoch: 0.005577147298872911, LR: 0.001
[2025-07-27 20:32:22] (step=0000288) Train Loss: 0.3361, Train Steps/Sec: 0.12, Epoch: 0.005596579867858531, LR: 0.001
[2025-07-27 20:32:30] (step=0000289) Train Loss: 0.2586, Train Steps/Sec: 0.13, Epoch: 0.005616012436844151, LR: 0.001
[2025-07-27 20:32:38] (step=0000290) Train Loss: 0.2646, Train Steps/Sec: 0.12, Epoch: 0.005635445005829771, LR: 0.001
[2025-07-27 20:32:46] (step=0000291) Train Loss: 0.1752, Train Steps/Sec: 0.12, Epoch: 0.005654877574815391, LR: 0.001
[2025-07-27 20:32:54] (step=0000292) Train Loss: 0.1661, Train Steps/Sec: 0.12, Epoch: 0.00567431014380101, LR: 0.001
[2025-07-27 20:33:02] (step=0000293) Train Loss: 0.2181, Train Steps/Sec: 0.12, Epoch: 0.00569374271278663, LR: 0.001
[2025-07-27 20:33:10] (step=0000294) Train Loss: 0.2233, Train Steps/Sec: 0.13, Epoch: 0.0057131752817722505, LR: 0.001
[2025-07-27 20:33:18] (step=0000295) Train Loss: 0.1859, Train Steps/Sec: 0.12, Epoch: 0.00573260785075787, LR: 0.001
[2025-07-27 20:33:24] (step=0000296) Train Loss: 0.2962, Train Steps/Sec: 0.17, Epoch: 0.00575204041974349, LR: 0.001
[2025-07-27 20:33:32] (step=0000297) Train Loss: 0.2663, Train Steps/Sec: 0.13, Epoch: 0.00577147298872911, LR: 0.001
[2025-07-27 20:33:40] (step=0000298) Train Loss: 0.3069, Train Steps/Sec: 0.12, Epoch: 0.00579090555771473, LR: 0.001
[2025-07-27 20:33:48] (step=0000299) Train Loss: 0.2847, Train Steps/Sec: 0.12, Epoch: 0.00581033812670035, LR: 0.001
[2025-07-27 20:33:56] (step=0000300) Train Loss: 0.2504, Train Steps/Sec: 0.12, Epoch: 0.0058297706956859695, LR: 0.001
[2025-07-27 20:34:04] (step=0000301) Train Loss: 0.3108, Train Steps/Sec: 0.12, Epoch: 0.005849203264671589, LR: 0.001
[2025-07-27 20:34:12] (step=0000302) Train Loss: 0.3004, Train Steps/Sec: 0.13, Epoch: 0.00586863583365721, LR: 0.001
[2025-07-27 20:34:20] (step=0000303) Train Loss: 0.1499, Train Steps/Sec: 0.12, Epoch: 0.0058880684026428294, LR: 0.001
[2025-07-27 20:34:28] (step=0000304) Train Loss: 0.1991, Train Steps/Sec: 0.12, Epoch: 0.005907500971628449, LR: 0.001
[2025-07-27 20:34:36] (step=0000305) Train Loss: 0.3300, Train Steps/Sec: 0.13, Epoch: 0.005926933540614069, LR: 0.001
[2025-07-27 20:34:44] (step=0000306) Train Loss: 0.2829, Train Steps/Sec: 0.12, Epoch: 0.005946366109599689, LR: 0.001
[2025-07-27 20:34:52] (step=0000307) Train Loss: 0.2155, Train Steps/Sec: 0.13, Epoch: 0.005965798678585309, LR: 0.001
[2025-07-27 20:35:00] (step=0000308) Train Loss: 0.2384, Train Steps/Sec: 0.12, Epoch: 0.005985231247570929, LR: 0.001
[2025-07-27 20:35:08] (step=0000309) Train Loss: 0.2257, Train Steps/Sec: 0.12, Epoch: 0.006004663816556548, LR: 0.001
[2025-07-27 20:35:16] (step=0000310) Train Loss: 0.2771, Train Steps/Sec: 0.13, Epoch: 0.006024096385542169, LR: 0.001
[2025-07-27 20:35:24] (step=0000311) Train Loss: 0.2273, Train Steps/Sec: 0.12, Epoch: 0.006043528954527789, LR: 0.001
[2025-07-27 20:35:32] (step=0000312) Train Loss: 0.2497, Train Steps/Sec: 0.12, Epoch: 0.006062961523513408, LR: 0.001
[2025-07-27 20:35:40] (step=0000313) Train Loss: 0.2652, Train Steps/Sec: 0.12, Epoch: 0.006082394092499028,
LR: 0.001 [2025-07-27 20:35:48] (step=0000314) Train Loss: 0.2399, Train Steps/Sec: 0.13, Epoch: 0.006101826661484649, LR: 0.001 [2025-07-27 20:35:56] (step=0000315) Train Loss: 0.2039, Train Steps/Sec: 0.12, Epoch: 0.006121259230470268, LR: 0.001 [2025-07-27 20:36:04] (step=0000316) Train Loss: 0.2037, Train Steps/Sec: 0.12, Epoch: 0.006140691799455888, LR: 0.001 [2025-07-27 20:36:12] (step=0000317) Train Loss: 0.2092, Train Steps/Sec: 0.13, Epoch: 0.006160124368441508, LR: 0.001 [2025-07-27 20:36:20] (step=0000318) Train Loss: 0.2372, Train Steps/Sec: 0.12, Epoch: 0.006179556937427128, LR: 0.001 [2025-07-27 20:36:28] (step=0000319) Train Loss: 0.3210, Train Steps/Sec: 0.13, Epoch: 0.006198989506412748, LR: 0.001 [2025-07-27 20:36:36] (step=0000320) Train Loss: 0.3460, Train Steps/Sec: 0.12, Epoch: 0.006218422075398368, LR: 0.001 [2025-07-27 20:36:44] (step=0000321) Train Loss: 0.2378, Train Steps/Sec: 0.12, Epoch: 0.006237854644383987, LR: 0.001 [2025-07-27 20:36:53] (step=0000322) Train Loss: 0.2499, Train Steps/Sec: 0.12, Epoch: 0.006257287213369608, LR: 0.001 [2025-07-27 20:37:01] (step=0000323) Train Loss: 0.3234, Train Steps/Sec: 0.13, Epoch: 0.0062767197823552275, LR: 0.001 [2025-07-27 20:37:09] (step=0000324) Train Loss: 0.2020, Train Steps/Sec: 0.12, Epoch: 0.006296152351340847, LR: 0.001 [2025-07-27 20:37:17] (step=0000325) Train Loss: 0.2799, Train Steps/Sec: 0.12, Epoch: 0.006315584920326467, LR: 0.001 [2025-07-27 20:37:25] (step=0000326) Train Loss: 0.2724, Train Steps/Sec: 0.13, Epoch: 0.0063350174893120875, LR: 0.001 [2025-07-27 20:37:33] (step=0000327) Train Loss: 0.2809, Train Steps/Sec: 0.13, Epoch: 0.006354450058297707, LR: 0.001 [2025-07-27 20:37:41] (step=0000328) Train Loss: 0.2814, Train Steps/Sec: 0.12, Epoch: 0.006373882627283327, LR: 0.001 [2025-07-27 20:37:47] (step=0000329) Train Loss: 0.2363, Train Steps/Sec: 0.16, Epoch: 0.0063933151962689465, LR: 0.001 [2025-07-27 20:37:54] (step=0000330) Train Loss: 0.1891, Train Steps/Sec: 0.14, 
Epoch: 0.006412747765254567, LR: 0.001 [2025-07-27 20:38:02] (step=0000331) Train Loss: 0.2849, Train Steps/Sec: 0.12, Epoch: 0.006432180334240187, LR: 0.001 [2025-07-27 20:38:11] (step=0000332) Train Loss: 0.2592, Train Steps/Sec: 0.12, Epoch: 0.0064516129032258064, LR: 0.001 [2025-07-27 20:38:19] (step=0000333) Train Loss: 0.2311, Train Steps/Sec: 0.12, Epoch: 0.006471045472211426, LR: 0.001 [2025-07-27 20:38:27] (step=0000334) Train Loss: 0.2589, Train Steps/Sec: 0.12, Epoch: 0.006490478041197046, LR: 0.001 [2025-07-27 20:38:35] (step=0000335) Train Loss: 0.2816, Train Steps/Sec: 0.13, Epoch: 0.006509910610182666, LR: 0.001 [2025-07-27 20:38:43] (step=0000336) Train Loss: 0.2983, Train Steps/Sec: 0.12, Epoch: 0.006529343179168286, LR: 0.001 [2025-07-27 20:38:51] (step=0000337) Train Loss: 0.2541, Train Steps/Sec: 0.13, Epoch: 0.006548775748153906, LR: 0.001 [2025-07-27 20:38:59] (step=0000338) Train Loss: 0.2460, Train Steps/Sec: 0.12, Epoch: 0.0065682083171395254, LR: 0.001 [2025-07-27 20:39:07] (step=0000339) Train Loss: 0.3071, Train Steps/Sec: 0.12, Epoch: 0.006587640886125146, LR: 0.001 [2025-07-27 20:39:15] (step=0000340) Train Loss: 0.2067, Train Steps/Sec: 0.12, Epoch: 0.006607073455110766, LR: 0.001 [2025-07-27 20:39:23] (step=0000341) Train Loss: 0.2683, Train Steps/Sec: 0.12, Epoch: 0.006626506024096385, LR: 0.001 [2025-07-27 20:39:31] (step=0000342) Train Loss: 0.3073, Train Steps/Sec: 0.12, Epoch: 0.006645938593082005, LR: 0.001 [2025-07-27 20:39:39] (step=0000343) Train Loss: 0.2660, Train Steps/Sec: 0.12, Epoch: 0.006665371162067626, LR: 0.001 [2025-07-27 20:39:47] (step=0000344) Train Loss: 0.2541, Train Steps/Sec: 0.12, Epoch: 0.006684803731053245, LR: 0.001 [2025-07-27 20:39:55] (step=0000345) Train Loss: 0.2100, Train Steps/Sec: 0.13, Epoch: 0.006704236300038865, LR: 0.001 [2025-07-27 20:40:03] (step=0000346) Train Loss: 0.2469, Train Steps/Sec: 0.13, Epoch: 0.006723668869024485, LR: 0.001 [2025-07-27 20:40:11] (step=0000347) Train Loss: 
0.3086, Train Steps/Sec: 0.12, Epoch: 0.006743101438010105, LR: 0.001 [2025-07-27 20:40:19] (step=0000348) Train Loss: 0.2932, Train Steps/Sec: 0.12, Epoch: 0.006762534006995725, LR: 0.001 [2025-07-27 20:40:27] (step=0000349) Train Loss: 0.2072, Train Steps/Sec: 0.12, Epoch: 0.006781966575981345, LR: 0.001 [2025-07-27 20:40:35] (step=0000350) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.006801399144966964, LR: 0.001 [2025-07-27 20:40:43] (step=0000351) Train Loss: 0.2909, Train Steps/Sec: 0.12, Epoch: 0.006820831713952585, LR: 0.001 [2025-07-27 20:40:51] (step=0000352) Train Loss: 0.1926, Train Steps/Sec: 0.12, Epoch: 0.0068402642829382045, LR: 0.001 [2025-07-27 20:40:59] (step=0000353) Train Loss: 0.2680, Train Steps/Sec: 0.12, Epoch: 0.006859696851923824, LR: 0.001 [2025-07-27 20:41:07] (step=0000354) Train Loss: 0.2124, Train Steps/Sec: 0.12, Epoch: 0.006879129420909444, LR: 0.001 [2025-07-27 20:41:15] (step=0000355) Train Loss: 0.2160, Train Steps/Sec: 0.13, Epoch: 0.0068985619898950645, LR: 0.001 [2025-07-27 20:41:23] (step=0000356) Train Loss: 0.2124, Train Steps/Sec: 0.12, Epoch: 0.006917994558880684, LR: 0.001 [2025-07-27 20:41:31] (step=0000357) Train Loss: 0.3135, Train Steps/Sec: 0.12, Epoch: 0.006937427127866304, LR: 0.001 [2025-07-27 20:41:40] (step=0000358) Train Loss: 0.2761, Train Steps/Sec: 0.12, Epoch: 0.0069568596968519235, LR: 0.001 [2025-07-27 20:41:48] (step=0000359) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.006976292265837544, LR: 0.001 [2025-07-27 20:41:56] (step=0000360) Train Loss: 0.1523, Train Steps/Sec: 0.13, Epoch: 0.006995724834823164, LR: 0.001 [2025-07-27 20:42:04] (step=0000361) Train Loss: 0.2401, Train Steps/Sec: 0.12, Epoch: 0.0070151574038087834, LR: 0.001 [2025-07-27 20:42:10] (step=0000362) Train Loss: 0.2551, Train Steps/Sec: 0.15, Epoch: 0.007034589972794403, LR: 0.001 [2025-07-27 20:42:17] (step=0000363) Train Loss: 0.1791, Train Steps/Sec: 0.14, Epoch: 0.007054022541780024, LR: 0.001 [2025-07-27 20:42:25] 
(step=0000364) Train Loss: 0.1324, Train Steps/Sec: 0.13, Epoch: 0.007073455110765643, LR: 0.001 [2025-07-27 20:42:33] (step=0000365) Train Loss: 0.3076, Train Steps/Sec: 0.12, Epoch: 0.007092887679751263, LR: 0.001 [2025-07-27 20:42:41] (step=0000366) Train Loss: 0.1959, Train Steps/Sec: 0.13, Epoch: 0.007112320248736883, LR: 0.001 [2025-07-27 20:42:49] (step=0000367) Train Loss: 0.2831, Train Steps/Sec: 0.12, Epoch: 0.007131752817722503, LR: 0.001 [2025-07-27 20:42:57] (step=0000368) Train Loss: 0.2496, Train Steps/Sec: 0.12, Epoch: 0.007151185386708123, LR: 0.001 [2025-07-27 20:43:05] (step=0000369) Train Loss: 0.2420, Train Steps/Sec: 0.12, Epoch: 0.007170617955693743, LR: 0.001 [2025-07-27 20:43:13] (step=0000370) Train Loss: 0.2588, Train Steps/Sec: 0.13, Epoch: 0.007190050524679362, LR: 0.001 [2025-07-27 20:43:22] (step=0000371) Train Loss: 0.2484, Train Steps/Sec: 0.12, Epoch: 0.007209483093664983, LR: 0.001 [2025-07-27 20:43:30] (step=0000372) Train Loss: 0.2367, Train Steps/Sec: 0.12, Epoch: 0.007228915662650603, LR: 0.001 [2025-07-27 20:43:38] (step=0000373) Train Loss: 0.3017, Train Steps/Sec: 0.13, Epoch: 0.007248348231636222, LR: 0.001 [2025-07-27 20:43:46] (step=0000374) Train Loss: 0.1931, Train Steps/Sec: 0.12, Epoch: 0.007267780800621842, LR: 0.001 [2025-07-27 20:43:54] (step=0000375) Train Loss: 0.2344, Train Steps/Sec: 0.13, Epoch: 0.007287213369607462, LR: 0.001 [2025-07-27 20:44:02] (step=0000376) Train Loss: 0.2114, Train Steps/Sec: 0.12, Epoch: 0.007306645938593082, LR: 0.001 [2025-07-27 20:44:10] (step=0000377) Train Loss: 0.2957, Train Steps/Sec: 0.12, Epoch: 0.007326078507578702, LR: 0.001 [2025-07-27 20:44:18] (step=0000378) Train Loss: 0.2104, Train Steps/Sec: 0.12, Epoch: 0.007345511076564322, LR: 0.001 [2025-07-27 20:44:26] (step=0000379) Train Loss: 0.3290, Train Steps/Sec: 0.12, Epoch: 0.007364943645549941, LR: 0.001 [2025-07-27 20:44:34] (step=0000380) Train Loss: 0.2952, Train Steps/Sec: 0.12, Epoch: 0.007384376214535562, LR: 
0.001 [2025-07-27 20:44:42] (step=0000381) Train Loss: 0.2570, Train Steps/Sec: 0.12, Epoch: 0.0074038087835211815, LR: 0.001 [2025-07-27 20:44:50] (step=0000382) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.007423241352506801, LR: 0.001 [2025-07-27 20:44:58] (step=0000383) Train Loss: 0.1505, Train Steps/Sec: 0.12, Epoch: 0.007442673921492421, LR: 0.001 [2025-07-27 20:45:06] (step=0000384) Train Loss: 0.2404, Train Steps/Sec: 0.12, Epoch: 0.0074621064904780415, LR: 0.001 [2025-07-27 20:45:14] (step=0000385) Train Loss: 0.3014, Train Steps/Sec: 0.12, Epoch: 0.007481539059463661, LR: 0.001 [2025-07-27 20:45:22] (step=0000386) Train Loss: 0.2578, Train Steps/Sec: 0.12, Epoch: 0.007500971628449281, LR: 0.001 [2025-07-27 20:45:30] (step=0000387) Train Loss: 0.2577, Train Steps/Sec: 0.12, Epoch: 0.0075204041974349005, LR: 0.001 [2025-07-27 20:45:38] (step=0000388) Train Loss: 0.2755, Train Steps/Sec: 0.12, Epoch: 0.007539836766420521, LR: 0.001 [2025-07-27 20:45:46] (step=0000389) Train Loss: 0.1887, Train Steps/Sec: 0.13, Epoch: 0.007559269335406141, LR: 0.001 [2025-07-27 20:45:54] (step=0000390) Train Loss: 0.2083, Train Steps/Sec: 0.13, Epoch: 0.0075787019043917605, LR: 0.001 [2025-07-27 20:46:02] (step=0000391) Train Loss: 0.2206, Train Steps/Sec: 0.12, Epoch: 0.00759813447337738, LR: 0.001 [2025-07-27 20:46:10] (step=0000392) Train Loss: 0.2907, Train Steps/Sec: 0.13, Epoch: 0.007617567042363001, LR: 0.001 [2025-07-27 20:46:19] (step=0000393) Train Loss: 0.1659, Train Steps/Sec: 0.12, Epoch: 0.00763699961134862, LR: 0.001 [2025-07-27 20:46:26] (step=0000394) Train Loss: 0.1802, Train Steps/Sec: 0.13, Epoch: 0.00765643218033424, LR: 0.001 [2025-07-27 20:46:34] (step=0000395) Train Loss: 0.2264, Train Steps/Sec: 0.14, Epoch: 0.00767586474931986, LR: 0.001 [2025-07-27 20:46:40] (step=0000396) Train Loss: 0.2240, Train Steps/Sec: 0.16, Epoch: 0.00769529731830548, LR: 0.001 [2025-07-27 20:46:48] (step=0000397) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 
0.0077147298872911, LR: 0.001 [2025-07-27 20:46:56] (step=0000398) Train Loss: 0.2803, Train Steps/Sec: 0.12, Epoch: 0.00773416245627672, LR: 0.001 [2025-07-27 20:47:04] (step=0000399) Train Loss: 0.3290, Train Steps/Sec: 0.13, Epoch: 0.007753595025262339, LR: 0.001 [2025-07-27 20:47:12] (step=0000400) Train Loss: 0.2853, Train Steps/Sec: 0.12, Epoch: 0.00777302759424796, LR: 0.001 [2025-07-27 20:47:20] (step=0000401) Train Loss: 0.2943, Train Steps/Sec: 0.13, Epoch: 0.00779246016323358, LR: 0.001 [2025-07-27 20:47:28] (step=0000402) Train Loss: 0.2238, Train Steps/Sec: 0.12, Epoch: 0.007811892732219199, LR: 0.001 [2025-07-27 20:47:36] (step=0000403) Train Loss: 0.2531, Train Steps/Sec: 0.13, Epoch: 0.00783132530120482, LR: 0.001 [2025-07-27 20:47:44] (step=0000404) Train Loss: 0.2511, Train Steps/Sec: 0.12, Epoch: 0.007850757870190439, LR: 0.001 [2025-07-27 20:47:52] (step=0000405) Train Loss: 0.2625, Train Steps/Sec: 0.13, Epoch: 0.00787019043917606, LR: 0.001 [2025-07-27 20:48:00] (step=0000406) Train Loss: 0.3081, Train Steps/Sec: 0.12, Epoch: 0.00788962300816168, LR: 0.001 [2025-07-27 20:48:09] (step=0000407) Train Loss: 0.2427, Train Steps/Sec: 0.12, Epoch: 0.007909055577147299, LR: 0.001 [2025-07-27 20:48:17] (step=0000408) Train Loss: 0.2792, Train Steps/Sec: 0.12, Epoch: 0.00792848814613292, LR: 0.001 [2025-07-27 20:48:25] (step=0000409) Train Loss: 0.2539, Train Steps/Sec: 0.12, Epoch: 0.007947920715118538, LR: 0.001 [2025-07-27 20:48:33] (step=0000410) Train Loss: 0.2477, Train Steps/Sec: 0.12, Epoch: 0.007967353284104159, LR: 0.001 [2025-07-27 20:48:41] (step=0000411) Train Loss: 0.1709, Train Steps/Sec: 0.13, Epoch: 0.007986785853089779, LR: 0.001 [2025-07-27 20:48:49] (step=0000412) Train Loss: 0.2638, Train Steps/Sec: 0.12, Epoch: 0.008006218422075398, LR: 0.001 [2025-07-27 20:48:57] (step=0000413) Train Loss: 0.2855, Train Steps/Sec: 0.12, Epoch: 0.008025650991061018, LR: 0.001 [2025-07-27 20:49:05] (step=0000414) Train Loss: 0.3281, Train 
Steps/Sec: 0.12, Epoch: 0.008045083560046639, LR: 0.001 [2025-07-27 20:49:13] (step=0000415) Train Loss: 0.2446, Train Steps/Sec: 0.13, Epoch: 0.008064516129032258, LR: 0.001 [2025-07-27 20:49:21] (step=0000416) Train Loss: 0.2366, Train Steps/Sec: 0.12, Epoch: 0.008083948698017878, LR: 0.001 [2025-07-27 20:49:29] (step=0000417) Train Loss: 0.2401, Train Steps/Sec: 0.12, Epoch: 0.008103381267003497, LR: 0.001 [2025-07-27 20:49:37] (step=0000418) Train Loss: 0.2107, Train Steps/Sec: 0.12, Epoch: 0.008122813835989118, LR: 0.001 [2025-07-27 20:49:45] (step=0000419) Train Loss: 0.2422, Train Steps/Sec: 0.12, Epoch: 0.008142246404974738, LR: 0.001 [2025-07-27 20:49:53] (step=0000420) Train Loss: 0.2736, Train Steps/Sec: 0.12, Epoch: 0.008161678973960357, LR: 0.001 [2025-07-27 20:50:01] (step=0000421) Train Loss: 0.2911, Train Steps/Sec: 0.12, Epoch: 0.008181111542945978, LR: 0.001 [2025-07-27 20:50:09] (step=0000422) Train Loss: 0.2404, Train Steps/Sec: 0.12, Epoch: 0.008200544111931597, LR: 0.001 [2025-07-27 20:50:17] (step=0000423) Train Loss: 0.2735, Train Steps/Sec: 0.13, Epoch: 0.008219976680917217, LR: 0.001 [2025-07-27 20:50:25] (step=0000424) Train Loss: 0.2845, Train Steps/Sec: 0.12, Epoch: 0.008239409249902838, LR: 0.001 [2025-07-27 20:50:33] (step=0000425) Train Loss: 0.2972, Train Steps/Sec: 0.12, Epoch: 0.008258841818888456, LR: 0.001 [2025-07-27 20:50:41] (step=0000426) Train Loss: 0.2011, Train Steps/Sec: 0.12, Epoch: 0.008278274387874077, LR: 0.001 [2025-07-27 20:50:49] (step=0000427) Train Loss: 0.3518, Train Steps/Sec: 0.12, Epoch: 0.008297706956859698, LR: 0.001 [2025-07-27 20:50:57] (step=0000428) Train Loss: 0.2402, Train Steps/Sec: 0.13, Epoch: 0.008317139525845316, LR: 0.001 [2025-07-27 20:51:03] (step=0000429) Train Loss: 0.1508, Train Steps/Sec: 0.17, Epoch: 0.008336572094830937, LR: 0.001 [2025-07-27 20:51:11] (step=0000430) Train Loss: 0.2498, Train Steps/Sec: 0.12, Epoch: 0.008356004663816556, LR: 0.001 [2025-07-27 20:51:19] (step=0000431) 
Train Loss: 0.1588, Train Steps/Sec: 0.12, Epoch: 0.008375437232802176, LR: 0.001 [2025-07-27 20:51:27] (step=0000432) Train Loss: 0.2770, Train Steps/Sec: 0.12, Epoch: 0.008394869801787797, LR: 0.001 [2025-07-27 20:51:35] (step=0000433) Train Loss: 0.1876, Train Steps/Sec: 0.12, Epoch: 0.008414302370773416, LR: 0.001 [2025-07-27 20:51:44] (step=0000434) Train Loss: 0.2478, Train Steps/Sec: 0.12, Epoch: 0.008433734939759036, LR: 0.001 [2025-07-27 20:51:52] (step=0000435) Train Loss: 0.1776, Train Steps/Sec: 0.12, Epoch: 0.008453167508744657, LR: 0.001 [2025-07-27 20:52:00] (step=0000436) Train Loss: 0.1550, Train Steps/Sec: 0.12, Epoch: 0.008472600077730276, LR: 0.001 [2025-07-27 20:52:08] (step=0000437) Train Loss: 0.2740, Train Steps/Sec: 0.12, Epoch: 0.008492032646715896, LR: 0.001 [2025-07-27 20:52:16] (step=0000438) Train Loss: 0.2450, Train Steps/Sec: 0.12, Epoch: 0.008511465215701515, LR: 0.001 [2025-07-27 20:52:24] (step=0000439) Train Loss: 0.2528, Train Steps/Sec: 0.12, Epoch: 0.008530897784687136, LR: 0.001 [2025-07-27 20:52:32] (step=0000440) Train Loss: 0.2102, Train Steps/Sec: 0.12, Epoch: 0.008550330353672756, LR: 0.001 [2025-07-27 20:52:40] (step=0000441) Train Loss: 0.2353, Train Steps/Sec: 0.13, Epoch: 0.008569762922658375, LR: 0.001 [2025-07-27 20:52:48] (step=0000442) Train Loss: 0.2282, Train Steps/Sec: 0.12, Epoch: 0.008589195491643995, LR: 0.001 [2025-07-27 20:52:56] (step=0000443) Train Loss: 0.2899, Train Steps/Sec: 0.13, Epoch: 0.008608628060629616, LR: 0.001 [2025-07-27 20:53:04] (step=0000444) Train Loss: 0.2443, Train Steps/Sec: 0.12, Epoch: 0.008628060629615235, LR: 0.001 [2025-07-27 20:53:12] (step=0000445) Train Loss: 0.2295, Train Steps/Sec: 0.12, Epoch: 0.008647493198600855, LR: 0.001 [2025-07-27 20:53:20] (step=0000446) Train Loss: 0.3433, Train Steps/Sec: 0.13, Epoch: 0.008666925767586474, LR: 0.001 [2025-07-27 20:53:28] (step=0000447) Train Loss: 0.1989, Train Steps/Sec: 0.12, Epoch: 0.008686358336572095, LR: 0.001 [2025-07-27 
20:53:36] (step=0000448) Train Loss: 0.2475, Train Steps/Sec: 0.13, Epoch: 0.008705790905557715, LR: 0.001 [2025-07-27 20:53:44] (step=0000449) Train Loss: 0.2894, Train Steps/Sec: 0.12, Epoch: 0.008725223474543334, LR: 0.001 [2025-07-27 20:53:52] (step=0000450) Train Loss: 0.3158, Train Steps/Sec: 0.12, Epoch: 0.008744656043528955, LR: 0.001 [2025-07-27 20:54:00] (step=0000451) Train Loss: 0.1697, Train Steps/Sec: 0.13, Epoch: 0.008764088612514575, LR: 0.001 [2025-07-27 20:54:08] (step=0000452) Train Loss: 0.2066, Train Steps/Sec: 0.12, Epoch: 0.008783521181500194, LR: 0.001 [2025-07-27 20:54:16] (step=0000453) Train Loss: 0.2470, Train Steps/Sec: 0.13, Epoch: 0.008802953750485815, LR: 0.001 [2025-07-27 20:54:24] (step=0000454) Train Loss: 0.2502, Train Steps/Sec: 0.12, Epoch: 0.008822386319471433, LR: 0.001 [2025-07-27 20:54:32] (step=0000455) Train Loss: 0.2641, Train Steps/Sec: 0.13, Epoch: 0.008841818888457054, LR: 0.001 [2025-07-27 20:54:40] (step=0000456) Train Loss: 0.3115, Train Steps/Sec: 0.12, Epoch: 0.008861251457442675, LR: 0.001 [2025-07-27 20:54:49] (step=0000457) Train Loss: 0.2276, Train Steps/Sec: 0.12, Epoch: 0.008880684026428293, LR: 0.001 [2025-07-27 20:54:57] (step=0000458) Train Loss: 0.2768, Train Steps/Sec: 0.13, Epoch: 0.008900116595413914, LR: 0.001 [2025-07-27 20:55:05] (step=0000459) Train Loss: 0.1971, Train Steps/Sec: 0.12, Epoch: 0.008919549164399533, LR: 0.001 [2025-07-27 20:55:12] (step=0000460) Train Loss: 0.2320, Train Steps/Sec: 0.13, Epoch: 0.008938981733385153, LR: 0.001 [2025-07-27 20:55:21] (step=0000461) Train Loss: 0.2284, Train Steps/Sec: 0.12, Epoch: 0.008958414302370774, LR: 0.001 [2025-07-27 20:55:26] (step=0000462) Train Loss: 0.2026, Train Steps/Sec: 0.19, Epoch: 0.008977846871356393, LR: 0.001 [2025-07-27 20:55:34] (step=0000463) Train Loss: 0.1894, Train Steps/Sec: 0.12, Epoch: 0.008997279440342013, LR: 0.001 [2025-07-27 20:55:42] (step=0000464) Train Loss: 0.1762, Train Steps/Sec: 0.12, Epoch: 
0.009016712009327634, LR: 0.001 [2025-07-27 20:55:50] (step=0000465) Train Loss: 0.2082, Train Steps/Sec: 0.12, Epoch: 0.009036144578313253, LR: 0.001 [2025-07-27 20:55:58] (step=0000466) Train Loss: 0.2567, Train Steps/Sec: 0.12, Epoch: 0.009055577147298873, LR: 0.001 [2025-07-27 20:56:06] (step=0000467) Train Loss: 0.2533, Train Steps/Sec: 0.12, Epoch: 0.009075009716284492, LR: 0.001 [2025-07-27 20:56:14] (step=0000468) Train Loss: 0.2651, Train Steps/Sec: 0.12, Epoch: 0.009094442285270113, LR: 0.001 [2025-07-27 20:56:22] (step=0000469) Train Loss: 0.2244, Train Steps/Sec: 0.12, Epoch: 0.009113874854255733, LR: 0.001 [2025-07-27 20:56:30] (step=0000470) Train Loss: 0.3137, Train Steps/Sec: 0.12, Epoch: 0.009133307423241352, LR: 0.001 [2025-07-27 20:56:38] (step=0000471) Train Loss: 0.2500, Train Steps/Sec: 0.13, Epoch: 0.009152739992226972, LR: 0.001 [2025-07-27 20:56:46] (step=0000472) Train Loss: 0.3088, Train Steps/Sec: 0.12, Epoch: 0.009172172561212593, LR: 0.001 [2025-07-27 20:56:54] (step=0000473) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.009191605130198212, LR: 0.001 [2025-07-27 20:57:02] (step=0000474) Train Loss: 0.2769, Train Steps/Sec: 0.12, Epoch: 0.009211037699183832, LR: 0.001 [2025-07-27 20:57:11] (step=0000475) Train Loss: 0.2414, Train Steps/Sec: 0.12, Epoch: 0.009230470268169451, LR: 0.001 [2025-07-27 20:57:19] (step=0000476) Train Loss: 0.2671, Train Steps/Sec: 0.12, Epoch: 0.009249902837155072, LR: 0.001 [2025-07-27 20:57:27] (step=0000477) Train Loss: 0.2407, Train Steps/Sec: 0.12, Epoch: 0.009269335406140692, LR: 0.001 [2025-07-27 20:57:35] (step=0000478) Train Loss: 0.2767, Train Steps/Sec: 0.12, Epoch: 0.009288767975126311, LR: 0.001 [2025-07-27 20:57:43] (step=0000479) Train Loss: 0.1906, Train Steps/Sec: 0.12, Epoch: 0.009308200544111932, LR: 0.001 [2025-07-27 20:57:51] (step=0000480) Train Loss: 0.2731, Train Steps/Sec: 0.12, Epoch: 0.009327633113097552, LR: 0.001 [2025-07-27 20:57:59] (step=0000481) Train Loss: 0.2405, Train 
Steps/Sec: 0.12, Epoch: 0.009347065682083171, LR: 0.001 [2025-07-27 20:58:07] (step=0000482) Train Loss: 0.1902, Train Steps/Sec: 0.12, Epoch: 0.009366498251068792, LR: 0.001 [2025-07-27 20:58:15] (step=0000483) Train Loss: 0.2537, Train Steps/Sec: 0.13, Epoch: 0.00938593082005441, LR: 0.001 [2025-07-27 20:58:23] (step=0000484) Train Loss: 0.2104, Train Steps/Sec: 0.12, Epoch: 0.009405363389040031, LR: 0.001 [2025-07-27 20:58:31] (step=0000485) Train Loss: 0.1892, Train Steps/Sec: 0.12, Epoch: 0.009424795958025652, LR: 0.001 [2025-07-27 20:58:39] (step=0000486) Train Loss: 0.2658, Train Steps/Sec: 0.13, Epoch: 0.00944422852701127, LR: 0.001 [2025-07-27 20:58:47] (step=0000487) Train Loss: 0.2284, Train Steps/Sec: 0.12, Epoch: 0.009463661095996891, LR: 0.001 [2025-07-27 20:58:55] (step=0000488) Train Loss: 0.2549, Train Steps/Sec: 0.12, Epoch: 0.009483093664982511, LR: 0.001 [2025-07-27 20:59:03] (step=0000489) Train Loss: 0.3014, Train Steps/Sec: 0.12, Epoch: 0.00950252623396813, LR: 0.001 [2025-07-27 20:59:11] (step=0000490) Train Loss: 0.2235, Train Steps/Sec: 0.12, Epoch: 0.009521958802953751, LR: 0.001 [2025-07-27 20:59:19] (step=0000491) Train Loss: 0.2296, Train Steps/Sec: 0.13, Epoch: 0.00954139137193937, LR: 0.001 [2025-07-27 20:59:27] (step=0000492) Train Loss: 0.3054, Train Steps/Sec: 0.12, Epoch: 0.00956082394092499, LR: 0.001 [2025-07-27 20:59:35] (step=0000493) Train Loss: 0.2448, Train Steps/Sec: 0.13, Epoch: 0.00958025650991061, LR: 0.001 [2025-07-27 20:59:43] (step=0000494) Train Loss: 0.3024, Train Steps/Sec: 0.12, Epoch: 0.00959968907889623, LR: 0.001 [2025-07-27 20:59:49] (step=0000495) Train Loss: 0.2333, Train Steps/Sec: 0.19, Epoch: 0.00961912164788185, LR: 0.001 [2025-07-27 20:59:57] (step=0000496) Train Loss: 0.2887, Train Steps/Sec: 0.12, Epoch: 0.00963855421686747, LR: 0.001 [2025-07-27 21:00:05] (step=0000497) Train Loss: 0.2631, Train Steps/Sec: 0.12, Epoch: 0.00965798678585309, LR: 0.001 [2025-07-27 21:00:13] (step=0000498) Train Loss: 
0.2416, Train Steps/Sec: 0.12, Epoch: 0.00967741935483871, LR: 0.001 [2025-07-27 21:00:21] (step=0000499) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.009696851923824329, LR: 0.001 [2025-07-27 21:00:29] (step=0000500) Train Loss: 0.3141, Train Steps/Sec: 0.12, Epoch: 0.00971628449280995, LR: 0.001 [2025-07-27 21:00:37] (step=0000501) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 0.00973571706179557, LR: 0.001 [2025-07-27 21:00:45] (step=0000502) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.009755149630781189, LR: 0.001 [2025-07-27 21:00:53] (step=0000503) Train Loss: 0.2489, Train Steps/Sec: 0.12, Epoch: 0.00977458219976681, LR: 0.001 [2025-07-27 21:01:01] (step=0000504) Train Loss: 0.1952, Train Steps/Sec: 0.13, Epoch: 0.009794014768752428, LR: 0.001 [2025-07-27 21:01:09] (step=0000505) Train Loss: 0.2608, Train Steps/Sec: 0.12, Epoch: 0.009813447337738049, LR: 0.001 [2025-07-27 21:01:17] (step=0000506) Train Loss: 0.2209, Train Steps/Sec: 0.12, Epoch: 0.00983287990672367, LR: 0.001 [2025-07-27 21:01:25] (step=0000507) Train Loss: 0.3063, Train Steps/Sec: 0.13, Epoch: 0.009852312475709288, LR: 0.001 [2025-07-27 21:01:33] (step=0000508) Train Loss: 0.2160, Train Steps/Sec: 0.12, Epoch: 0.009871745044694909, LR: 0.001 [2025-07-27 21:01:41] (step=0000509) Train Loss: 0.2419, Train Steps/Sec: 0.13, Epoch: 0.00989117761368053, LR: 0.001 [2025-07-27 21:01:49] (step=0000510) Train Loss: 0.2089, Train Steps/Sec: 0.12, Epoch: 0.009910610182666148, LR: 0.001 [2025-07-27 21:01:57] (step=0000511) Train Loss: 0.2414, Train Steps/Sec: 0.12, Epoch: 0.009930042751651769, LR: 0.001 [2025-07-27 21:02:05] (step=0000512) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.009949475320637387, LR: 0.001 [2025-07-27 21:02:13] (step=0000513) Train Loss: 0.2866, Train Steps/Sec: 0.12, Epoch: 0.009968907889623008, LR: 0.001 [2025-07-27 21:02:21] (step=0000514) Train Loss: 0.1949, Train Steps/Sec: 0.13, Epoch: 0.009988340458608629, LR: 0.001 [2025-07-27 21:02:29] 
(step=0000515) Train Loss: 0.3125, Train Steps/Sec: 0.12, Epoch: 0.010007773027594247, LR: 0.001 [2025-07-27 21:02:37] (step=0000516) Train Loss: 0.3006, Train Steps/Sec: 0.12, Epoch: 0.010027205596579868, LR: 0.001 [2025-07-27 21:02:45] (step=0000517) Train Loss: 0.2718, Train Steps/Sec: 0.12, Epoch: 0.010046638165565488, LR: 0.001 [2025-07-27 21:02:53] (step=0000518) Train Loss: 0.2424, Train Steps/Sec: 0.12, Epoch: 0.010066070734551107, LR: 0.001 [2025-07-27 21:03:01] (step=0000519) Train Loss: 0.1865, Train Steps/Sec: 0.12, Epoch: 0.010085503303536728, LR: 0.001 [2025-07-27 21:03:09] (step=0000520) Train Loss: 0.2766, Train Steps/Sec: 0.12, Epoch: 0.010104935872522347, LR: 0.001 [2025-07-27 21:03:17] (step=0000521) Train Loss: 0.3352, Train Steps/Sec: 0.13, Epoch: 0.010124368441507967, LR: 0.001 [2025-07-27 21:03:26] (step=0000522) Train Loss: 0.2140, Train Steps/Sec: 0.12, Epoch: 0.010143801010493588, LR: 0.001 [2025-07-27 21:03:34] (step=0000523) Train Loss: 0.2850, Train Steps/Sec: 0.12, Epoch: 0.010163233579479207, LR: 0.001 [2025-07-27 21:03:42] (step=0000524) Train Loss: 0.1629, Train Steps/Sec: 0.12, Epoch: 0.010182666148464827, LR: 0.001 [2025-07-27 21:03:50] (step=0000525) Train Loss: 0.2440, Train Steps/Sec: 0.13, Epoch: 0.010202098717450448, LR: 0.001 [2025-07-27 21:03:57] (step=0000526) Train Loss: 0.3203, Train Steps/Sec: 0.13, Epoch: 0.010221531286436067, LR: 0.001 [2025-07-27 21:04:05] (step=0000527) Train Loss: 0.2686, Train Steps/Sec: 0.12, Epoch: 0.010240963855421687, LR: 0.001 [2025-07-27 21:04:11] (step=0000528) Train Loss: 0.1914, Train Steps/Sec: 0.18, Epoch: 0.010260396424407306, LR: 0.001 [2025-07-27 21:04:19] (step=0000529) Train Loss: 0.1713, Train Steps/Sec: 0.12, Epoch: 0.010279828993392926, LR: 0.001 [2025-07-27 21:04:27] (step=0000530) Train Loss: 0.2622, Train Steps/Sec: 0.12, Epoch: 0.010299261562378547, LR: 0.001 [2025-07-27 21:04:35] (step=0000531) Train Loss: 0.2070, Train Steps/Sec: 0.12, Epoch: 0.010318694131364166, LR: 
0.001 [2025-07-27 21:04:43] (step=0000532) Train Loss: 0.2579, Train Steps/Sec: 0.12, Epoch: 0.010338126700349786, LR: 0.001 [2025-07-27 21:04:51] (step=0000533) Train Loss: 0.2362, Train Steps/Sec: 0.12, Epoch: 0.010357559269335407, LR: 0.001 [2025-07-27 21:04:59] (step=0000534) Train Loss: 0.2393, Train Steps/Sec: 0.13, Epoch: 0.010376991838321026, LR: 0.001 [2025-07-27 21:05:07] (step=0000535) Train Loss: 0.2309, Train Steps/Sec: 0.12, Epoch: 0.010396424407306646, LR: 0.001 [2025-07-27 21:05:15] (step=0000536) Train Loss: 0.2425, Train Steps/Sec: 0.12, Epoch: 0.010415856976292265, LR: 0.001 [2025-07-27 21:05:24] (step=0000537) Train Loss: 0.2305, Train Steps/Sec: 0.12, Epoch: 0.010435289545277886, LR: 0.001 [2025-07-27 21:05:32] (step=0000538) Train Loss: 0.2608, Train Steps/Sec: 0.12, Epoch: 0.010454722114263506, LR: 0.001 [2025-07-27 21:05:40] (step=0000539) Train Loss: 0.2575, Train Steps/Sec: 0.12, Epoch: 0.010474154683249125, LR: 0.001 [2025-07-27 21:05:48] (step=0000540) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.010493587252234746, LR: 0.001 [2025-07-27 21:05:56] (step=0000541) Train Loss: 0.2872, Train Steps/Sec: 0.12, Epoch: 0.010513019821220366, LR: 0.001 [2025-07-27 21:06:04] (step=0000542) Train Loss: 0.3403, Train Steps/Sec: 0.12, Epoch: 0.010532452390205985, LR: 0.001 [2025-07-27 21:06:12] (step=0000543) Train Loss: 0.1887, Train Steps/Sec: 0.12, Epoch: 0.010551884959191606, LR: 0.001 [2025-07-27 21:06:20] (step=0000544) Train Loss: 0.2683, Train Steps/Sec: 0.12, Epoch: 0.010571317528177224, LR: 0.001 [2025-07-27 21:06:28] (step=0000545) Train Loss: 0.2509, Train Steps/Sec: 0.12, Epoch: 0.010590750097162845, LR: 0.001 [2025-07-27 21:06:36] (step=0000546) Train Loss: 0.2050, Train Steps/Sec: 0.12, Epoch: 0.010610182666148466, LR: 0.001 [2025-07-27 21:06:44] (step=0000547) Train Loss: 0.2597, Train Steps/Sec: 0.12, Epoch: 0.010629615235134084, LR: 0.001 [2025-07-27 21:06:52] (step=0000548) Train Loss: 0.2571, Train Steps/Sec: 0.12, Epoch: 
0.010649047804119705, LR: 0.001
[2025-07-27 21:07:00] (step=0000549) Train Loss: 0.2545, Train Steps/Sec: 0.13, Epoch: 0.010668480373105324, LR: 0.001
[2025-07-27 21:07:08] (step=0000550) Train Loss: 0.1696, Train Steps/Sec: 0.12, Epoch: 0.010687912942090944, LR: 0.001
[2025-07-27 21:07:16] (step=0000551) Train Loss: 0.1883, Train Steps/Sec: 0.13, Epoch: 0.010707345511076565, LR: 0.001
[2025-07-27 21:07:24] (step=0000552) Train Loss: 0.3241, Train Steps/Sec: 0.12, Epoch: 0.010726778080062184, LR: 0.001
[2025-07-27 21:07:32] (step=0000553) Train Loss: 0.2113, Train Steps/Sec: 0.12, Epoch: 0.010746210649047804, LR: 0.001
[2025-07-27 21:07:40] (step=0000554) Train Loss: 0.3202, Train Steps/Sec: 0.12, Epoch: 0.010765643218033425, LR: 0.001
[2025-07-27 21:07:49] (step=0000555) Train Loss: 0.2804, Train Steps/Sec: 0.12, Epoch: 0.010785075787019044, LR: 0.001
[2025-07-27 21:07:57] (step=0000556) Train Loss: 0.1668, Train Steps/Sec: 0.12, Epoch: 0.010804508356004664, LR: 0.001
[2025-07-27 21:08:05] (step=0000557) Train Loss: 0.2213, Train Steps/Sec: 0.12, Epoch: 0.010823940924990283, LR: 0.001
[2025-07-27 21:08:13] (step=0000558) Train Loss: 0.2321, Train Steps/Sec: 0.12, Epoch: 0.010843373493975903, LR: 0.001
[2025-07-27 21:08:21] (step=0000559) Train Loss: 0.1769, Train Steps/Sec: 0.13, Epoch: 0.010862806062961524, LR: 0.001
[2025-07-27 21:08:29] (step=0000560) Train Loss: 0.2480, Train Steps/Sec: 0.12, Epoch: 0.010882238631947143, LR: 0.001
[2025-07-27 21:08:35] (step=0000561) Train Loss: 0.3059, Train Steps/Sec: 0.18, Epoch: 0.010901671200932763, LR: 0.001
[2025-07-27 21:08:43] (step=0000562) Train Loss: 0.2247, Train Steps/Sec: 0.12, Epoch: 0.010921103769918384, LR: 0.001
[2025-07-27 21:08:51] (step=0000563) Train Loss: 0.2263, Train Steps/Sec: 0.12, Epoch: 0.010940536338904003, LR: 0.001
[2025-07-27 21:08:58] (step=0000564) Train Loss: 0.1693, Train Steps/Sec: 0.13, Epoch: 0.010959968907889623, LR: 0.001
[2025-07-27 21:09:06] (step=0000565) Train Loss: 0.2851, Train Steps/Sec: 0.12, Epoch: 0.010979401476875242, LR: 0.001
[2025-07-27 21:09:15] (step=0000566) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.010998834045860863, LR: 0.001
[2025-07-27 21:09:23] (step=0000567) Train Loss: 0.2782, Train Steps/Sec: 0.13, Epoch: 0.011018266614846483, LR: 0.001
[2025-07-27 21:09:31] (step=0000568) Train Loss: 0.1528, Train Steps/Sec: 0.12, Epoch: 0.011037699183832102, LR: 0.001
[2025-07-27 21:09:39] (step=0000569) Train Loss: 0.1918, Train Steps/Sec: 0.12, Epoch: 0.011057131752817723, LR: 0.001
[2025-07-27 21:09:47] (step=0000570) Train Loss: 0.1516, Train Steps/Sec: 0.12, Epoch: 0.011076564321803343, LR: 0.001
[2025-07-27 21:09:55] (step=0000571) Train Loss: 0.2266, Train Steps/Sec: 0.12, Epoch: 0.011095996890788962, LR: 0.001
[2025-07-27 21:10:03] (step=0000572) Train Loss: 0.1924, Train Steps/Sec: 0.12, Epoch: 0.011115429459774583, LR: 0.001
[2025-07-27 21:10:11] (step=0000573) Train Loss: 0.2576, Train Steps/Sec: 0.12, Epoch: 0.011134862028760201, LR: 0.001
[2025-07-27 21:10:19] (step=0000574) Train Loss: 0.2561, Train Steps/Sec: 0.13, Epoch: 0.011154294597745822, LR: 0.001
[2025-07-27 21:10:27] (step=0000575) Train Loss: 0.2692, Train Steps/Sec: 0.12, Epoch: 0.011173727166731443, LR: 0.001
[2025-07-27 21:10:35] (step=0000576) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.011193159735717061, LR: 0.001
[2025-07-27 21:10:43] (step=0000577) Train Loss: 0.2159, Train Steps/Sec: 0.12, Epoch: 0.011212592304702682, LR: 0.001
[2025-07-27 21:10:51] (step=0000578) Train Loss: 0.2183, Train Steps/Sec: 0.12, Epoch: 0.011232024873688302, LR: 0.001
[2025-07-27 21:10:59] (step=0000579) Train Loss: 0.2870, Train Steps/Sec: 0.13, Epoch: 0.011251457442673921, LR: 0.001
[2025-07-27 21:11:07] (step=0000580) Train Loss: 0.3105, Train Steps/Sec: 0.12, Epoch: 0.011270890011659542, LR: 0.001
[2025-07-27 21:11:15] (step=0000581) Train Loss: 0.3012, Train Steps/Sec: 0.12, Epoch: 0.01129032258064516, LR: 0.001
[2025-07-27 21:11:23] (step=0000582) Train Loss: 0.2258, Train Steps/Sec: 0.13, Epoch: 0.011309755149630781, LR: 0.001
[2025-07-27 21:11:31] (step=0000583) Train Loss: 0.2216, Train Steps/Sec: 0.12, Epoch: 0.011329187718616402, LR: 0.001
[2025-07-27 21:11:39] (step=0000584) Train Loss: 0.3140, Train Steps/Sec: 0.13, Epoch: 0.01134862028760202, LR: 0.001
[2025-07-27 21:11:47] (step=0000585) Train Loss: 0.2798, Train Steps/Sec: 0.12, Epoch: 0.011368052856587641, LR: 0.001
[2025-07-27 21:11:55] (step=0000586) Train Loss: 0.3286, Train Steps/Sec: 0.13, Epoch: 0.01138748542557326, LR: 0.001
[2025-07-27 21:12:03] (step=0000587) Train Loss: 0.2906, Train Steps/Sec: 0.13, Epoch: 0.01140691799455888, LR: 0.001
[2025-07-27 21:12:11] (step=0000588) Train Loss: 0.2174, Train Steps/Sec: 0.12, Epoch: 0.011426350563544501, LR: 0.001
[2025-07-27 21:12:19] (step=0000589) Train Loss: 0.2324, Train Steps/Sec: 0.13, Epoch: 0.01144578313253012, LR: 0.001
[2025-07-27 21:12:28] (step=0000590) Train Loss: 0.2605, Train Steps/Sec: 0.12, Epoch: 0.01146521570151574, LR: 0.001
[2025-07-27 21:12:36] (step=0000591) Train Loss: 0.2813, Train Steps/Sec: 0.13, Epoch: 0.011484648270501361, LR: 0.001
[2025-07-27 21:12:43] (step=0000592) Train Loss: 0.2323, Train Steps/Sec: 0.13, Epoch: 0.01150408083948698, LR: 0.001
[2025-07-27 21:12:51] (step=0000593) Train Loss: 0.1722, Train Steps/Sec: 0.12, Epoch: 0.0115235134084726, LR: 0.001
[2025-07-27 21:12:58] (step=0000594) Train Loss: 0.2774, Train Steps/Sec: 0.17, Epoch: 0.01154294597745822, LR: 0.001
[2025-07-27 21:13:06] (step=0000595) Train Loss: 0.2944, Train Steps/Sec: 0.13, Epoch: 0.01156237854644384, LR: 0.001
[2025-07-27 21:13:14] (step=0000596) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.01158181111542946, LR: 0.001
[2025-07-27 21:13:22] (step=0000597) Train Loss: 0.3054, Train Steps/Sec: 0.12, Epoch: 0.011601243684415079, LR: 0.001
[2025-07-27 21:13:30] (step=0000598) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.0116206762534007, LR: 0.001
[2025-07-27 21:13:38] (step=0000599) Train Loss: 0.2625, Train Steps/Sec: 0.12, Epoch: 0.01164010882238632, LR: 0.001
[2025-07-27 21:13:46] (step=0000600) Train Loss: 0.2855, Train Steps/Sec: 0.12, Epoch: 0.011659541391371939, LR: 0.001
[2025-07-27 21:13:54] (step=0000601) Train Loss: 0.2754, Train Steps/Sec: 0.12, Epoch: 0.01167897396035756, LR: 0.001
[2025-07-27 21:14:02] (step=0000602) Train Loss: 0.2683, Train Steps/Sec: 0.12, Epoch: 0.011698406529343178, LR: 0.001
[2025-07-27 21:14:10] (step=0000603) Train Loss: 0.1876, Train Steps/Sec: 0.12, Epoch: 0.011717839098328799, LR: 0.001
[2025-07-27 21:14:18] (step=0000604) Train Loss: 0.2299, Train Steps/Sec: 0.12, Epoch: 0.01173727166731442, LR: 0.001
[2025-07-27 21:14:26] (step=0000605) Train Loss: 0.2814, Train Steps/Sec: 0.12, Epoch: 0.011756704236300038, LR: 0.001
[2025-07-27 21:14:34] (step=0000606) Train Loss: 0.2107, Train Steps/Sec: 0.12, Epoch: 0.011776136805285659, LR: 0.001
[2025-07-27 21:14:42] (step=0000607) Train Loss: 0.2447, Train Steps/Sec: 0.12, Epoch: 0.01179556937427128, LR: 0.001
[2025-07-27 21:14:50] (step=0000608) Train Loss: 0.1723, Train Steps/Sec: 0.12, Epoch: 0.011815001943256898, LR: 0.001
[2025-07-27 21:14:58] (step=0000609) Train Loss: 0.2685, Train Steps/Sec: 0.12, Epoch: 0.011834434512242519, LR: 0.001
[2025-07-27 21:15:06] (step=0000610) Train Loss: 0.2406, Train Steps/Sec: 0.12, Epoch: 0.011853867081228138, LR: 0.001
[2025-07-27 21:15:14] (step=0000611) Train Loss: 0.2002, Train Steps/Sec: 0.12, Epoch: 0.011873299650213758, LR: 0.001
[2025-07-27 21:15:22] (step=0000612) Train Loss: 0.2989, Train Steps/Sec: 0.13, Epoch: 0.011892732219199379, LR: 0.001
[2025-07-27 21:15:30] (step=0000613) Train Loss: 0.2544, Train Steps/Sec: 0.12, Epoch: 0.011912164788184998, LR: 0.001
[2025-07-27 21:15:38] (step=0000614) Train Loss: 0.2411, Train Steps/Sec: 0.12, Epoch: 0.011931597357170618, LR: 0.001
[2025-07-27 21:15:46] (step=0000615) Train Loss: 0.1878, Train Steps/Sec: 0.12, Epoch: 0.011951029926156239, LR: 0.001
[2025-07-27 21:15:54] (step=0000616) Train Loss: 0.1958, Train Steps/Sec: 0.12, Epoch: 0.011970462495141857, LR: 0.001
[2025-07-27 21:16:02] (step=0000617) Train Loss: 0.2057, Train Steps/Sec: 0.12, Epoch: 0.011989895064127478, LR: 0.001
[2025-07-27 21:16:11] (step=0000618) Train Loss: 0.2748, Train Steps/Sec: 0.12, Epoch: 0.012009327633113097, LR: 0.001
[2025-07-27 21:16:19] (step=0000619) Train Loss: 0.2380, Train Steps/Sec: 0.12, Epoch: 0.012028760202098717, LR: 0.001
[2025-07-27 21:16:27] (step=0000620) Train Loss: 0.1964, Train Steps/Sec: 0.12, Epoch: 0.012048192771084338, LR: 0.001
[2025-07-27 21:16:35] (step=0000621) Train Loss: 0.2842, Train Steps/Sec: 0.12, Epoch: 0.012067625340069957, LR: 0.001
[2025-07-27 21:16:43] (step=0000622) Train Loss: 0.2598, Train Steps/Sec: 0.13, Epoch: 0.012087057909055577, LR: 0.001
[2025-07-27 21:16:51] (step=0000623) Train Loss: 0.1764, Train Steps/Sec: 0.12, Epoch: 0.012106490478041198, LR: 0.001
[2025-07-27 21:16:59] (step=0000624) Train Loss: 0.2801, Train Steps/Sec: 0.13, Epoch: 0.012125923047026817, LR: 0.001
[2025-07-27 21:17:07] (step=0000625) Train Loss: 0.2022, Train Steps/Sec: 0.13, Epoch: 0.012145355616012437, LR: 0.001
[2025-07-27 21:17:15] (step=0000626) Train Loss: 0.2720, Train Steps/Sec: 0.12, Epoch: 0.012164788184998056, LR: 0.001
[2025-07-27 21:17:21] (step=0000627) Train Loss: 0.2503, Train Steps/Sec: 0.15, Epoch: 0.012184220753983677, LR: 0.001
[2025-07-27 21:17:29] (step=0000628) Train Loss: 0.2578, Train Steps/Sec: 0.13, Epoch: 0.012203653322969297, LR: 0.001
[2025-07-27 21:17:37] (step=0000629) Train Loss: 0.2231, Train Steps/Sec: 0.12, Epoch: 0.012223085891954916, LR: 0.001
[2025-07-27 21:17:45] (step=0000630) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.012242518460940537, LR: 0.001
[2025-07-27 21:17:53] (step=0000631) Train Loss: 0.2789, Train Steps/Sec: 0.12, Epoch: 0.012261951029926155, LR: 0.001
[2025-07-27 21:18:01] (step=0000632) Train Loss: 0.2222, Train Steps/Sec: 0.12, Epoch: 0.012281383598911776, LR: 0.001
[2025-07-27 21:18:09] (step=0000633) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.012300816167897397, LR: 0.001
[2025-07-27 21:18:17] (step=0000634) Train Loss: 0.2041, Train Steps/Sec: 0.12, Epoch: 0.012320248736883015, LR: 0.001
[2025-07-27 21:18:25] (step=0000635) Train Loss: 0.2911, Train Steps/Sec: 0.12, Epoch: 0.012339681305868636, LR: 0.001
[2025-07-27 21:18:33] (step=0000636) Train Loss: 0.2154, Train Steps/Sec: 0.13, Epoch: 0.012359113874854256, LR: 0.001
[2025-07-27 21:18:41] (step=0000637) Train Loss: 0.2857, Train Steps/Sec: 0.12, Epoch: 0.012378546443839875, LR: 0.001
[2025-07-27 21:18:49] (step=0000638) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.012397979012825496, LR: 0.001
[2025-07-27 21:18:57] (step=0000639) Train Loss: 0.2970, Train Steps/Sec: 0.12, Epoch: 0.012417411581811115, LR: 0.001
[2025-07-27 21:19:05] (step=0000640) Train Loss: 0.2741, Train Steps/Sec: 0.12, Epoch: 0.012436844150796735, LR: 0.001
[2025-07-27 21:19:13] (step=0000641) Train Loss: 0.2559, Train Steps/Sec: 0.13, Epoch: 0.012456276719782356, LR: 0.001
[2025-07-27 21:19:21] (step=0000642) Train Loss: 0.2648, Train Steps/Sec: 0.12, Epoch: 0.012475709288767975, LR: 0.001
[2025-07-27 21:19:29] (step=0000643) Train Loss: 0.2245, Train Steps/Sec: 0.12, Epoch: 0.012495141857753595, LR: 0.001
[2025-07-27 21:19:37] (step=0000644) Train Loss: 0.2658, Train Steps/Sec: 0.12, Epoch: 0.012514574426739216, LR: 0.001
[2025-07-27 21:19:45] (step=0000645) Train Loss: 0.2649, Train Steps/Sec: 0.13, Epoch: 0.012534006995724834, LR: 0.001
[2025-07-27 21:19:53] (step=0000646) Train Loss: 0.2656, Train Steps/Sec: 0.12, Epoch: 0.012553439564710455, LR: 0.001
[2025-07-27 21:20:01] (step=0000647) Train Loss: 0.2289, Train Steps/Sec: 0.13, Epoch: 0.012572872133696074, LR: 0.001
[2025-07-27 21:20:09] (step=0000648) Train Loss: 0.2088, Train Steps/Sec: 0.12, Epoch: 0.012592304702681694, LR: 0.001
[2025-07-27 21:20:17] (step=0000649) Train Loss: 0.2629, Train Steps/Sec: 0.12, Epoch: 0.012611737271667315, LR: 0.001
[2025-07-27 21:20:25] (step=0000650) Train Loss: 0.1974, Train Steps/Sec: 0.12, Epoch: 0.012631169840652934, LR: 0.001
[2025-07-27 21:20:33] (step=0000651) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 0.012650602409638554, LR: 0.001
[2025-07-27 21:20:41] (step=0000652) Train Loss: 0.2572, Train Steps/Sec: 0.13, Epoch: 0.012670034978624175, LR: 0.001
[2025-07-27 21:20:50] (step=0000653) Train Loss: 0.2478, Train Steps/Sec: 0.12, Epoch: 0.012689467547609794, LR: 0.001
[2025-07-27 21:20:58] (step=0000654) Train Loss: 0.2543, Train Steps/Sec: 0.12, Epoch: 0.012708900116595414, LR: 0.001
[2025-07-27 21:21:06] (step=0000655) Train Loss: 0.2356, Train Steps/Sec: 0.12, Epoch: 0.012728332685581033, LR: 0.001
[2025-07-27 21:21:14] (step=0000656) Train Loss: 0.2281, Train Steps/Sec: 0.12, Epoch: 0.012747765254566654, LR: 0.001
[2025-07-27 21:21:22] (step=0000657) Train Loss: 0.2768, Train Steps/Sec: 0.12, Epoch: 0.012767197823552274, LR: 0.001
[2025-07-27 21:21:30] (step=0000658) Train Loss: 0.2578, Train Steps/Sec: 0.12, Epoch: 0.012786630392537893, LR: 0.001
[2025-07-27 21:21:38] (step=0000659) Train Loss: 0.2396, Train Steps/Sec: 0.13, Epoch: 0.012806062961523514, LR: 0.001
[2025-07-27 21:21:45] (step=0000660) Train Loss: 0.2500, Train Steps/Sec: 0.14, Epoch: 0.012825495530509134, LR: 0.001
[2025-07-27 21:21:52] (step=0000661) Train Loss: 0.2761, Train Steps/Sec: 0.14, Epoch: 0.012844928099494753, LR: 0.001
[2025-07-27 21:22:00] (step=0000662) Train Loss: 0.1797, Train Steps/Sec: 0.12, Epoch: 0.012864360668480374, LR: 0.001
[2025-07-27 21:22:08] (step=0000663) Train Loss: 0.2509, Train Steps/Sec: 0.13, Epoch: 0.012883793237465992, LR: 0.001
[2025-07-27 21:22:16] (step=0000664) Train Loss: 0.2147, Train Steps/Sec: 0.13, Epoch: 0.012903225806451613, LR: 0.001
[2025-07-27 21:22:24] (step=0000665) Train Loss: 0.1836, Train Steps/Sec: 0.12, Epoch: 0.012922658375437233, LR: 0.001
[2025-07-27 21:22:32] (step=0000666) Train Loss: 0.2475, Train Steps/Sec: 0.12, Epoch: 0.012942090944422852, LR: 0.001
[2025-07-27 21:22:40] (step=0000667) Train Loss: 0.2348, Train Steps/Sec: 0.12, Epoch: 0.012961523513408473, LR: 0.001
[2025-07-27 21:22:48] (step=0000668) Train Loss: 0.2230, Train Steps/Sec: 0.12, Epoch: 0.012980956082394092, LR: 0.001
[2025-07-27 21:22:56] (step=0000669) Train Loss: 0.1686, Train Steps/Sec: 0.12, Epoch: 0.013000388651379712, LR: 0.001
[2025-07-27 21:23:04] (step=0000670) Train Loss: 0.1900, Train Steps/Sec: 0.12, Epoch: 0.013019821220365333, LR: 0.001
[2025-07-27 21:23:12] (step=0000671) Train Loss: 0.2370, Train Steps/Sec: 0.13, Epoch: 0.013039253789350952, LR: 0.001
[2025-07-27 21:23:20] (step=0000672) Train Loss: 0.2134, Train Steps/Sec: 0.12, Epoch: 0.013058686358336572, LR: 0.001
[2025-07-27 21:23:28] (step=0000673) Train Loss: 0.3301, Train Steps/Sec: 0.13, Epoch: 0.013078118927322193, LR: 0.001
[2025-07-27 21:23:36] (step=0000674) Train Loss: 0.2110, Train Steps/Sec: 0.12, Epoch: 0.013097551496307811, LR: 0.001
[2025-07-27 21:23:44] (step=0000675) Train Loss: 0.2902, Train Steps/Sec: 0.12, Epoch: 0.013116984065293432, LR: 0.001
[2025-07-27 21:23:52] (step=0000676) Train Loss: 0.2895, Train Steps/Sec: 0.12, Epoch: 0.013136416634279051, LR: 0.001
[2025-07-27 21:24:00] (step=0000677) Train Loss: 0.2816, Train Steps/Sec: 0.12, Epoch: 0.013155849203264671, LR: 0.001
[2025-07-27 21:24:08] (step=0000678) Train Loss: 0.2379, Train Steps/Sec: 0.13, Epoch: 0.013175281772250292, LR: 0.001
[2025-07-27 21:24:17] (step=0000679) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.01319471434123591, LR: 0.001
[2025-07-27 21:24:25] (step=0000680) Train Loss: 0.3019, Train Steps/Sec: 0.12, Epoch: 0.013214146910221531, LR: 0.001
[2025-07-27 21:24:33] (step=0000681) Train Loss: 0.2509, Train Steps/Sec: 0.13, Epoch: 0.013233579479207152, LR: 0.001
[2025-07-27 21:24:41] (step=0000682) Train Loss: 0.3076, Train Steps/Sec: 0.12, Epoch: 0.01325301204819277, LR: 0.001
[2025-07-27 21:24:49] (step=0000683) Train Loss: 0.1941, Train Steps/Sec: 0.13, Epoch: 0.013272444617178391, LR: 0.001
[2025-07-27 21:24:57] (step=0000684) Train Loss: 0.2296, Train Steps/Sec: 0.12, Epoch: 0.01329187718616401, LR: 0.001
[2025-07-27 21:25:05] (step=0000685) Train Loss: 0.2335, Train Steps/Sec: 0.13, Epoch: 0.01331130975514963, LR: 0.001
[2025-07-27 21:25:13] (step=0000686) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.013330742324135251, LR: 0.001
[2025-07-27 21:25:21] (step=0000687) Train Loss: 0.2841, Train Steps/Sec: 0.12, Epoch: 0.01335017489312087, LR: 0.001
[2025-07-27 21:25:29] (step=0000688) Train Loss: 0.2273, Train Steps/Sec: 0.12, Epoch: 0.01336960746210649, LR: 0.001
[2025-07-27 21:25:37] (step=0000689) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.013389040031092111, LR: 0.001
[2025-07-27 21:25:45] (step=0000690) Train Loss: 0.2497, Train Steps/Sec: 0.12, Epoch: 0.01340847260007773, LR: 0.001
[2025-07-27 21:25:53] (step=0000691) Train Loss: 0.1886, Train Steps/Sec: 0.12, Epoch: 0.01342790516906335, LR: 0.001
[2025-07-27 21:26:01] (step=0000692) Train Loss: 0.2272, Train Steps/Sec: 0.13, Epoch: 0.01344733773804897, LR: 0.001
[2025-07-27 21:26:08] (step=0000693) Train Loss: 0.2650, Train Steps/Sec: 0.14, Epoch: 0.01346677030703459, LR: 0.001
[2025-07-27 21:26:15] (step=0000694) Train Loss: 0.1844, Train Steps/Sec: 0.15, Epoch: 0.01348620287602021, LR: 0.001
[2025-07-27 21:26:23] (step=0000695) Train Loss: 0.1750, Train Steps/Sec: 0.12, Epoch: 0.01350563544500583, LR: 0.001
[2025-07-27 21:26:31] (step=0000696) Train Loss: 0.2346, Train Steps/Sec: 0.13, Epoch: 0.01352506801399145, LR: 0.001
[2025-07-27 21:26:39] (step=0000697) Train Loss: 0.1648, Train Steps/Sec: 0.12, Epoch: 0.01354450058297707, LR: 0.001
[2025-07-27 21:26:47] (step=0000698) Train Loss: 0.2491, Train Steps/Sec: 0.12, Epoch: 0.01356393315196269, LR: 0.001
[2025-07-27 21:26:55] (step=0000699) Train Loss: 0.2298, Train Steps/Sec: 0.12, Epoch: 0.01358336572094831, LR: 0.001
[2025-07-27 21:27:03] (step=0000700) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.013602798289933929, LR: 0.001
[2025-07-27 21:27:11] (step=0000701) Train Loss: 0.2244, Train Steps/Sec: 0.13, Epoch: 0.013622230858919549, LR: 0.001
[2025-07-27 21:27:19] (step=0000702) Train Loss: 0.2270, Train Steps/Sec: 0.12, Epoch: 0.01364166342790517, LR: 0.001
[2025-07-27 21:27:28] (step=0000703) Train Loss: 0.2837, Train Steps/Sec: 0.12, Epoch: 0.013661095996890789, LR: 0.001
[2025-07-27 21:27:36] (step=0000704) Train Loss: 0.2416, Train Steps/Sec: 0.12, Epoch: 0.013680528565876409, LR: 0.001
[2025-07-27 21:27:44] (step=0000705) Train Loss: 0.2003, Train Steps/Sec: 0.12, Epoch: 0.01369996113486203, LR: 0.001
[2025-07-27 21:27:52] (step=0000706) Train Loss: 0.2023, Train Steps/Sec: 0.13, Epoch: 0.013719393703847648, LR: 0.001
[2025-07-27 21:27:59] (step=0000707) Train Loss: 0.3019, Train Steps/Sec: 0.13, Epoch: 0.013738826272833269, LR: 0.001
[2025-07-27 21:28:07] (step=0000708) Train Loss: 0.2351, Train Steps/Sec: 0.12, Epoch: 0.013758258841818888, LR: 0.001
[2025-07-27 21:28:16] (step=0000709) Train Loss: 0.2603, Train Steps/Sec: 0.12, Epoch: 0.013777691410804508, LR: 0.001
[2025-07-27 21:28:24] (step=0000710) Train Loss: 0.2278, Train Steps/Sec: 0.12, Epoch: 0.013797123979790129, LR: 0.001
[2025-07-27 21:28:32] (step=0000711) Train Loss: 0.2402, Train Steps/Sec: 0.12, Epoch: 0.013816556548775748, LR: 0.001
[2025-07-27 21:28:40] (step=0000712) Train Loss: 0.2173, Train Steps/Sec: 0.12, Epoch: 0.013835989117761368, LR: 0.001
[2025-07-27 21:28:48] (step=0000713) Train Loss: 0.2753, Train Steps/Sec: 0.12, Epoch: 0.013855421686746987, LR: 0.001
[2025-07-27 21:28:56] (step=0000714) Train Loss: 0.2731, Train Steps/Sec: 0.12, Epoch: 0.013874854255732608, LR: 0.001
[2025-07-27 21:29:04] (step=0000715) Train Loss: 0.2290, Train Steps/Sec: 0.13, Epoch: 0.013894286824718228, LR: 0.001
[2025-07-27 21:29:12] (step=0000716) Train Loss: 0.2075, Train Steps/Sec: 0.13, Epoch: 0.013913719393703847, LR: 0.001
[2025-07-27 21:29:20] (step=0000717) Train Loss: 0.2443, Train Steps/Sec: 0.12, Epoch: 0.013933151962689468, LR: 0.001
[2025-07-27 21:29:28] (step=0000718) Train Loss: 0.2175, Train Steps/Sec: 0.12, Epoch: 0.013952584531675088, LR: 0.001
[2025-07-27 21:29:36] (step=0000719) Train Loss: 0.3442, Train Steps/Sec: 0.12, Epoch: 0.013972017100660707, LR: 0.001
[2025-07-27 21:29:44] (step=0000720) Train Loss: 0.3475, Train Steps/Sec: 0.12, Epoch: 0.013991449669646328, LR: 0.001
[2025-07-27 21:29:52] (step=0000721) Train Loss: 0.2385, Train Steps/Sec: 0.12, Epoch: 0.014010882238631946, LR: 0.001
[2025-07-27 21:30:00] (step=0000722) Train Loss: 0.2109, Train Steps/Sec: 0.12, Epoch: 0.014030314807617567, LR: 0.001
[2025-07-27 21:30:08] (step=0000723) Train Loss: 0.2333, Train Steps/Sec: 0.13, Epoch: 0.014049747376603187, LR: 0.001
[2025-07-27 21:30:16] (step=0000724) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 0.014069179945588806, LR: 0.001
[2025-07-27 21:30:24] (step=0000725) Train Loss: 0.1865, Train Steps/Sec: 0.13, Epoch: 0.014088612514574427, LR: 0.001
[2025-07-27 21:30:32] (step=0000726) Train Loss: 0.2473, Train Steps/Sec: 0.13, Epoch: 0.014108045083560047, LR: 0.001
[2025-07-27 21:30:38] (step=0000727) Train Loss: 0.2119, Train Steps/Sec: 0.16, Epoch: 0.014127477652545666, LR: 0.001
[2025-07-27 21:30:46] (step=0000728) Train Loss: 0.1939, Train Steps/Sec: 0.12, Epoch: 0.014146910221531287, LR: 0.001
[2025-07-27 21:30:54] (step=0000729) Train Loss: 0.3606, Train Steps/Sec: 0.12, Epoch: 0.014166342790516906, LR: 0.001
[2025-07-27 21:31:02] (step=0000730) Train Loss: 0.2371, Train Steps/Sec: 0.13, Epoch: 0.014185775359502526, LR: 0.001
[2025-07-27 21:31:10] (step=0000731) Train Loss: 0.2592, Train Steps/Sec: 0.12, Epoch: 0.014205207928488147, LR: 0.001
[2025-07-27 21:31:18] (step=0000732) Train Loss: 0.2523, Train Steps/Sec: 0.12, Epoch: 0.014224640497473766, LR: 0.001
[2025-07-27 21:31:26] (step=0000733) Train Loss: 0.2019, Train Steps/Sec: 0.12, Epoch: 0.014244073066459386, LR: 0.001
[2025-07-27 21:31:34] (step=0000734) Train Loss: 0.2764, Train Steps/Sec: 0.12, Epoch: 0.014263505635445007, LR: 0.001
[2025-07-27 21:31:42] (step=0000735) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.014282938204430625, LR: 0.001
[2025-07-27 21:31:50] (step=0000736) Train Loss: 0.2327, Train Steps/Sec: 0.13, Epoch: 0.014302370773416246, LR: 0.001
[2025-07-27 21:31:58] (step=0000737) Train Loss: 0.2608, Train Steps/Sec: 0.12, Epoch: 0.014321803342401865, LR: 0.001
[2025-07-27 21:32:06] (step=0000738) Train Loss: 0.3016, Train Steps/Sec: 0.12, Epoch: 0.014341235911387485, LR: 0.001
[2025-07-27 21:32:14] (step=0000739) Train Loss: 0.1722, Train Steps/Sec: 0.12, Epoch: 0.014360668480373106, LR: 0.001
[2025-07-27 21:32:22] (step=0000740) Train Loss: 0.2521, Train Steps/Sec: 0.12, Epoch: 0.014380101049358725, LR: 0.001
[2025-07-27 21:32:30] (step=0000741) Train Loss: 0.1990, Train Steps/Sec: 0.13, Epoch: 0.014399533618344345, LR: 0.001
[2025-07-27 21:32:38] (step=0000742) Train Loss: 0.2405, Train Steps/Sec: 0.12, Epoch: 0.014418966187329966, LR: 0.001
[2025-07-27 21:32:46] (step=0000743) Train Loss: 0.3057, Train Steps/Sec: 0.12, Epoch: 0.014438398756315585, LR: 0.001
[2025-07-27 21:32:54] (step=0000744) Train Loss: 0.1910, Train Steps/Sec: 0.13, Epoch: 0.014457831325301205, LR: 0.001
[2025-07-27 21:33:02] (step=0000745) Train Loss: 0.1568, Train Steps/Sec: 0.12, Epoch: 0.014477263894286824, LR: 0.001
[2025-07-27 21:33:10] (step=0000746) Train Loss: 0.2705, Train Steps/Sec: 0.13, Epoch: 0.014496696463272445, LR: 0.001
[2025-07-27 21:33:18] (step=0000747) Train Loss: 0.2090, Train Steps/Sec: 0.12, Epoch: 0.014516129032258065, LR: 0.001
[2025-07-27 21:33:26] (step=0000748) Train Loss: 0.3154, Train Steps/Sec: 0.13, Epoch: 0.014535561601243684, LR: 0.001
[2025-07-27 21:33:34] (step=0000749) Train Loss: 0.2124, Train Steps/Sec: 0.12, Epoch: 0.014554994170229305, LR: 0.001
[2025-07-27 21:33:43] (step=0000750) Train Loss: 0.2544, Train Steps/Sec: 0.12, Epoch: 0.014574426739214923, LR: 0.001
[2025-07-27 21:33:51] (step=0000751) Train Loss: 0.2459, Train Steps/Sec: 0.13, Epoch: 0.014593859308200544, LR: 0.001
[2025-07-27 21:33:59] (step=0000752) Train Loss: 0.2480, Train Steps/Sec: 0.12, Epoch: 0.014613291877186164, LR: 0.001
[2025-07-27 21:34:07] (step=0000753) Train Loss: 0.2685, Train Steps/Sec: 0.13, Epoch: 0.014632724446171783, LR: 0.001
[2025-07-27 21:34:15] (step=0000754) Train Loss: 0.2901, Train Steps/Sec: 0.12, Epoch: 0.014652157015157404, LR: 0.001
[2025-07-27 21:34:23] (step=0000755) Train Loss: 0.1634, Train Steps/Sec: 0.12, Epoch: 0.014671589584143024, LR: 0.001
[2025-07-27 21:34:31] (step=0000756) Train Loss: 0.1832, Train Steps/Sec: 0.12, Epoch: 0.014691022153128643, LR: 0.001
[2025-07-27 21:34:39] (step=0000757) Train Loss: 0.2828, Train Steps/Sec: 0.12, Epoch: 0.014710454722114264, LR: 0.001
[2025-07-27 21:34:47] (step=0000758) Train Loss: 0.2403, Train Steps/Sec: 0.13, Epoch: 0.014729887291099883, LR: 0.001
[2025-07-27 21:34:55] (step=0000759) Train Loss: 0.2497, Train Steps/Sec: 0.12, Epoch: 0.014749319860085503, LR: 0.001
[2025-07-27 21:35:00] (step=0000760) Train Loss: 0.1589, Train Steps/Sec: 0.18, Epoch: 0.014768752429071124, LR: 0.001
[2025-07-27 21:35:08] (step=0000761) Train Loss: 0.2421, Train Steps/Sec: 0.12, Epoch: 0.014788184998056743, LR: 0.001
[2025-07-27 21:35:16] (step=0000762) Train Loss: 0.1928, Train Steps/Sec: 0.13, Epoch: 0.014807617567042363, LR: 0.001
[2025-07-27 21:35:25] (step=0000763) Train Loss: 0.2488, Train Steps/Sec: 0.12, Epoch: 0.014827050136027984, LR: 0.001
[2025-07-27 21:35:33] (step=0000764) Train Loss: 0.2891, Train Steps/Sec: 0.13, Epoch: 0.014846482705013602, LR: 0.001
[2025-07-27 21:35:41] (step=0000765) Train Loss: 0.2616, Train Steps/Sec: 0.12, Epoch: 0.014865915273999223, LR: 0.001
[2025-07-27 21:35:49] (step=0000766) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.014885347842984842, LR: 0.001
[2025-07-27 21:35:57] (step=0000767) Train Loss: 0.2772, Train Steps/Sec: 0.13, Epoch: 0.014904780411970462, LR: 0.001
[2025-07-27 21:36:05] (step=0000768) Train Loss: 0.2692, Train Steps/Sec: 0.12, Epoch: 0.014924212980956083, LR: 0.001
[2025-07-27 21:36:13] (step=0000769) Train Loss: 0.2499, Train Steps/Sec: 0.13, Epoch: 0.014943645549941702, LR: 0.001
[2025-07-27 21:36:21] (step=0000770) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.014963078118927322, LR: 0.001
[2025-07-27 21:36:29] (step=0000771) Train Loss: 0.2895, Train Steps/Sec: 0.12, Epoch: 0.014982510687912943, LR: 0.001
[2025-07-27 21:36:37] (step=0000772) Train Loss: 0.1849, Train Steps/Sec: 0.13, Epoch: 0.015001943256898562, LR: 0.001
[2025-07-27 21:36:45] (step=0000773) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.015021375825884182, LR: 0.001
[2025-07-27 21:36:53] (step=0000774) Train Loss: 0.3360, Train Steps/Sec: 0.13, Epoch: 0.015040808394869801, LR: 0.001
[2025-07-27 21:37:01] (step=0000775) Train Loss: 0.2654, Train Steps/Sec: 0.12, Epoch: 0.015060240963855422, LR: 0.001
[2025-07-27 21:37:09] (step=0000776) Train Loss: 0.3026, Train Steps/Sec: 0.12, Epoch: 0.015079673532841042, LR: 0.001
[2025-07-27 21:37:17] (step=0000777) Train Loss: 0.2872, Train Steps/Sec: 0.13, Epoch: 0.015099106101826661, LR: 0.001
[2025-07-27 21:37:25] (step=0000778) Train Loss: 0.2310, Train Steps/Sec: 0.12, Epoch: 0.015118538670812282, LR: 0.001
[2025-07-27 21:37:33] (step=0000779) Train Loss: 0.2038, Train Steps/Sec: 0.13, Epoch: 0.015137971239797902, LR: 0.001
[2025-07-27 21:37:41] (step=0000780) Train Loss: 0.2076, Train Steps/Sec: 0.12, Epoch: 0.015157403808783521, LR: 0.001
[2025-07-27 21:37:49] (step=0000781) Train Loss: 0.2628, Train Steps/Sec: 0.13, Epoch: 0.015176836377769141, LR: 0.001
[2025-07-27 21:37:57] (step=0000782) Train Loss: 0.2753, Train Steps/Sec: 0.13, Epoch: 0.01519626894675476, LR: 0.001
[2025-07-27 21:38:05] (step=0000783) Train Loss: 0.3551, Train Steps/Sec: 0.12, Epoch: 0.01521570151574038, LR: 0.001
[2025-07-27 21:38:13] (step=0000784) Train Loss: 0.1961, Train Steps/Sec: 0.12, Epoch: 0.015235134084726001, LR: 0.001
[2025-07-27 21:38:21] (step=0000785) Train Loss: 0.2855, Train Steps/Sec: 0.12, Epoch: 0.01525456665371162, LR: 0.001
[2025-07-27 21:38:29] (step=0000786) Train Loss: 0.3162, Train Steps/Sec: 0.12, Epoch: 0.01527399922269724, LR: 0.001
[2025-07-27 21:38:37] (step=0000787) Train Loss: 0.1563, Train Steps/Sec: 0.12, Epoch: 0.015293431791682861, LR: 0.001
[2025-07-27 21:38:45] (step=0000788) Train Loss: 0.2723, Train Steps/Sec: 0.13, Epoch: 0.01531286436066848, LR: 0.001
[2025-07-27 21:38:53] (step=0000789) Train Loss: 0.2593, Train Steps/Sec: 0.12, Epoch: 0.0153322969296541, LR: 0.001
[2025-07-27 21:39:01] (step=0000790) Train Loss: 0.2457, Train Steps/Sec: 0.12, Epoch: 0.01535172949863972, LR: 0.001
[2025-07-27 21:39:09] (step=0000791) Train Loss: 0.2608, Train Steps/Sec: 0.13, Epoch: 0.01537116206762534, LR: 0.001
[2025-07-27 21:39:17] (step=0000792) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.01539059463661096, LR: 0.001
[2025-07-27 21:39:23] (step=0000793) Train Loss: 0.1891, Train Steps/Sec: 0.17, Epoch: 0.01541002720559658, LR: 0.001
[2025-07-27 21:39:31] (step=0000794) Train Loss: 0.2852, Train Steps/Sec: 0.12, Epoch: 0.0154294597745822, LR: 0.001
[2025-07-27 21:39:39] (step=0000795) Train Loss: 0.3183, Train Steps/Sec: 0.13, Epoch: 0.015448892343567819, LR: 0.001
[2025-07-27 21:39:47] (step=0000796) Train Loss: 0.2037, Train Steps/Sec: 0.12, Epoch: 0.01546832491255344, LR: 0.001
[2025-07-27 21:39:55] (step=0000797) Train Loss: 0.3249, Train Steps/Sec: 0.12, Epoch: 0.01548775748153906, LR: 0.001
[2025-07-27 21:40:03] (step=0000798) Train Loss: 0.2916, Train Steps/Sec: 0.12, Epoch: 0.015507190050524679, LR: 0.001
[2025-07-27 21:40:11] (step=0000799) Train Loss: 0.2886, Train Steps/Sec: 0.12, Epoch: 0.0155266226195103, LR: 0.001
[2025-07-27 21:40:19] (step=0000800) Train Loss: 0.2596, Train Steps/Sec: 0.12, Epoch: 0.01554605518849592, LR: 0.001
[2025-07-27 21:40:27] (step=0000801) Train Loss: 0.2092, Train Steps/Sec: 0.12, Epoch: 0.015565487757481539, LR: 0.001
[2025-07-27 21:40:35] (step=0000802) Train Loss: 0.1961, Train Steps/Sec: 0.13, Epoch: 0.01558492032646716, LR: 0.001
[2025-07-27 21:40:44] (step=0000803) Train Loss: 0.2580, Train Steps/Sec: 0.12, Epoch: 0.015604352895452778, LR: 0.001
[2025-07-27 21:40:52] (step=0000804) Train Loss: 0.2510, Train Steps/Sec: 0.13, Epoch: 0.015623785464438399, LR: 0.001
[2025-07-27 21:41:00] (step=0000805) Train Loss: 0.2158, Train Steps/Sec: 0.12, Epoch: 0.015643218033424017, LR: 0.001
[2025-07-27 21:41:08] (step=0000806) Train Loss: 0.2364, Train Steps/Sec: 0.12, Epoch: 0.01566265060240964, LR: 0.001
[2025-07-27 21:41:16] (step=0000807) Train Loss: 0.2984, Train Steps/Sec: 0.13, Epoch: 0.01568208317139526, LR: 0.001
[2025-07-27 21:41:24] (step=0000808) Train Loss: 0.2403, Train Steps/Sec: 0.12, Epoch: 0.015701515740380877, LR: 0.001
[2025-07-27 21:41:32] (step=0000809) Train Loss: 0.2256, Train Steps/Sec: 0.13, Epoch: 0.0157209483093665, LR: 0.001
[2025-07-27 21:41:40] (step=0000810) Train Loss: 0.2759, Train Steps/Sec: 0.13, Epoch: 0.01574038087835212, LR: 0.001
[2025-07-27 21:41:48] (step=0000811) Train Loss: 0.2851, Train Steps/Sec: 0.12, Epoch: 0.015759813447337737, LR: 0.001
[2025-07-27 21:41:56] (step=0000812) Train Loss: 0.3210, Train Steps/Sec: 0.12, Epoch: 0.01577924601632336, LR: 0.001
[2025-07-27 21:42:04] (step=0000813) Train Loss: 0.2358, Train Steps/Sec: 0.12, Epoch: 0.01579867858530898, LR: 0.001
[2025-07-27 21:42:12] (step=0000814) Train Loss: 0.2815, Train Steps/Sec: 0.12, Epoch: 0.015818111154294597, LR: 0.001
[2025-07-27 21:42:20] (step=0000815) Train Loss: 0.2579, Train Steps/Sec: 0.12, Epoch: 0.015837543723280216, LR: 0.001
[2025-07-27 21:42:28] (step=0000816) Train Loss: 0.3026, Train Steps/Sec: 0.13, Epoch: 0.01585697629226584, LR: 0.001
[2025-07-27 21:42:36] (step=0000817) Train Loss: 0.2410, Train Steps/Sec: 0.13, Epoch: 0.015876408861251457, LR: 0.001
[2025-07-27 21:42:44] (step=0000818) Train Loss: 0.2451, Train Steps/Sec: 0.12, Epoch: 0.015895841430237076, LR: 0.001
[2025-07-27 21:42:52] (step=0000819) Train Loss: 0.2051, Train Steps/Sec: 0.13, Epoch: 0.015915273999222698, LR: 0.001
[2025-07-27 21:43:00] (step=0000820) Train Loss: 0.2552, Train Steps/Sec: 0.12, Epoch: 0.015934706568208317, LR: 0.001
[2025-07-27 21:43:08] (step=0000821) Train Loss: 0.2806, Train Steps/Sec: 0.12, Epoch: 0.015954139137193936, LR: 0.001
[2025-07-27 21:43:16] (step=0000822) Train Loss: 0.2894, Train Steps/Sec: 0.12, Epoch: 0.015973571706179558, LR: 0.001
[2025-07-27 21:43:24] (step=0000823) Train Loss: 0.2335, Train Steps/Sec: 0.12, Epoch: 0.015993004275165177, LR: 0.001
[2025-07-27 21:43:32] (step=0000824) Train Loss: 0.2155, Train Steps/Sec: 0.13, Epoch: 0.016012436844150796, LR: 0.001
[2025-07-27 21:43:40] (step=0000825) Train Loss: 0.2664, Train Steps/Sec: 0.12, Epoch: 0.016031869413136418, LR: 0.001
[2025-07-27 21:43:45] (step=0000826) Train Loss: 0.2491, Train Steps/Sec: 0.19, Epoch: 0.016051301982122037, LR: 0.001
[2025-07-27 21:43:54] (step=0000827) Train Loss: 0.2057, Train Steps/Sec: 0.12, Epoch: 0.016070734551107656, LR: 0.001
[2025-07-27 21:44:02] (step=0000828) Train Loss: 0.2931, Train Steps/Sec: 0.12, Epoch: 0.016090167120093278, LR: 0.001
[2025-07-27 21:44:10] (step=0000829) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.016109599689078897, LR: 0.001
[2025-07-27 21:44:18] (step=0000830) Train Loss: 0.1780, Train Steps/Sec: 0.13, Epoch: 0.016129032258064516, LR: 0.001
[2025-07-27 21:44:26] (step=0000831) Train Loss: 0.2558, Train Steps/Sec: 0.12, Epoch: 0.016148464827050135, LR: 0.001
[2025-07-27 21:44:34] (step=0000832) Train Loss: 0.2877, Train Steps/Sec: 0.12, Epoch: 0.016167897396035757, LR: 0.001
[2025-07-27 21:44:42] (step=0000833) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.016187329965021376, LR: 0.001
[2025-07-27 21:44:50] (step=0000834) Train Loss: 0.3166, Train Steps/Sec: 0.12, Epoch: 0.016206762534006994, LR: 0.001
[2025-07-27 21:44:58] (step=0000835) Train Loss: 0.1714, Train Steps/Sec: 0.13, Epoch: 0.016226195102992617, LR: 0.001
[2025-07-27 21:45:06] (step=0000836) Train Loss: 0.2279, Train Steps/Sec: 0.12, Epoch: 0.016245627671978236, LR: 0.001
[2025-07-27 21:45:14] (step=0000837) Train Loss: 0.2969, Train Steps/Sec: 0.12, Epoch: 0.016265060240963854, LR: 0.001
[2025-07-27 21:45:22] (step=0000838) Train Loss: 0.2982, Train Steps/Sec: 0.12, Epoch: 0.016284492809949477, LR: 0.001
[2025-07-27 21:45:30] (step=0000839) Train Loss: 0.2511, Train Steps/Sec: 0.12, Epoch: 0.016303925378935095, LR: 0.001
[2025-07-27 21:45:38] (step=0000840) Train Loss: 0.1857, Train Steps/Sec: 0.12, Epoch: 0.016323357947920714, LR: 0.001
[2025-07-27 21:45:46] (step=0000841) Train Loss: 0.2347, Train Steps/Sec: 0.12, Epoch: 0.016342790516906337, LR: 0.001
[2025-07-27 21:45:54] (step=0000842) Train Loss: 0.2329, Train Steps/Sec: 0.12, Epoch: 0.016362223085891955, LR: 0.001
[2025-07-27 21:46:02] (step=0000843) Train Loss: 0.1883, Train Steps/Sec: 0.12, Epoch: 0.016381655654877574, LR: 0.001
[2025-07-27 21:46:10] (step=0000844) Train Loss: 0.1306, Train Steps/Sec: 0.13, Epoch: 0.016401088223863193, LR: 0.001
[2025-07-27 21:46:18] (step=0000845) Train Loss: 0.2631, Train Steps/Sec: 0.12, Epoch: 0.016420520792848815, LR: 0.001
[2025-07-27 21:46:26] (step=0000846) Train Loss: 0.2721, Train Steps/Sec: 0.12, Epoch: 0.016439953361834434, LR: 0.001
[2025-07-27 21:46:34] (step=0000847) Train Loss: 0.2945, Train Steps/Sec: 0.13, Epoch: 0.016459385930820053, LR: 0.001
[2025-07-27 21:46:42] (step=0000848) Train Loss: 0.2538, Train Steps/Sec: 0.12, Epoch: 0.016478818499805675, LR: 0.001
[2025-07-27 21:46:50] (step=0000849) Train Loss: 0.2024, Train Steps/Sec: 0.13, Epoch: 0.016498251068791294, LR: 0.001
[2025-07-27 21:46:58] (step=0000850) Train Loss: 0.2339, Train Steps/Sec: 0.12, Epoch: 0.016517683637776913, LR: 0.001
[2025-07-27 21:47:07] (step=0000851) Train Loss: 0.2470, Train Steps/Sec: 0.12, Epoch: 0.016537116206762535, LR: 0.001
[2025-07-27 21:47:15] (step=0000852) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.016556548775748154, LR: 0.001
[2025-07-27 21:47:23] (step=0000853) Train Loss: 0.2310, Train Steps/Sec: 0.12, Epoch: 0.016575981344733773, LR: 0.001
[2025-07-27 21:47:31] (step=0000854) Train Loss: 0.2162, Train Steps/Sec: 0.12, Epoch: 0.016595413913719395, LR: 0.001
[2025-07-27 21:47:39] (step=0000855) Train Loss: 0.2197, Train Steps/Sec: 0.12, Epoch: 0.016614846482705014, LR: 0.001
[2025-07-27 21:47:47] (step=0000856) Train Loss: 0.1793, Train Steps/Sec: 0.12, Epoch: 0.016634279051690633, LR: 0.001
[2025-07-27 21:47:55] (step=0000857) Train Loss: 0.3015, Train Steps/Sec: 0.13, Epoch: 0.016653711620676255, LR: 0.001
[2025-07-27 21:48:03] (step=0000858) Train Loss: 0.1957, Train Steps/Sec: 0.12, Epoch: 0.016673144189661874, LR: 0.001
[2025-07-27 21:48:09] (step=0000859) Train Loss: 0.2490, Train Steps/Sec: 0.17, Epoch: 0.016692576758647493, LR: 0.001
[2025-07-27 21:48:17] (step=0000860) Train Loss: 0.2703, Train Steps/Sec: 0.13, Epoch: 0.01671200932763311, LR: 0.001
[2025-07-27 21:48:25] (step=0000861) Train Loss: 0.2915, Train Steps/Sec: 0.13, Epoch: 0.016731441896618734, LR: 0.001
[2025-07-27 21:48:33] (step=0000862) Train Loss: 0.2918, Train Steps/Sec: 0.12, Epoch: 0.016750874465604353, LR: 0.001
[2025-07-27 21:48:41] (step=0000863) Train Loss: 0.2065, Train Steps/Sec: 0.12, Epoch: 0.01677030703458997, LR: 0.001
[2025-07-27 21:48:49] (step=0000864) Train Loss: 0.2275, Train Steps/Sec: 0.12, Epoch: 0.016789739603575594, LR: 0.001
[2025-07-27 21:48:57] (step=0000865) Train Loss: 0.2533, Train Steps/Sec: 0.13, Epoch: 0.016809172172561213, LR: 0.001
[2025-07-27 21:49:05] (step=0000866) Train Loss: 0.2891, Train Steps/Sec: 0.12, Epoch: 0.01682860474154683, LR: 0.001
[2025-07-27 21:49:13] (step=0000867) Train Loss: 0.2699, Train Steps/Sec: 0.12, Epoch: 0.016848037310532454, LR: 0.001
[2025-07-27 21:49:21] (step=0000868) Train Loss: 0.2448, Train Steps/Sec: 0.12, Epoch: 0.016867469879518072, LR: 0.001
[2025-07-27 21:49:29] (step=0000869) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.01688690244850369, LR: 0.001
[2025-07-27 21:49:37] (step=0000870) Train Loss: 0.2529, Train Steps/Sec: 0.13, Epoch: 0.016906335017489314, LR: 0.001
[2025-07-27 21:49:45] (step=0000871) Train Loss: 0.2469, Train Steps/Sec: 0.12, Epoch: 0.016925767586474932, LR: 0.001
[2025-07-27 21:49:53] (step=0000872) Train Loss: 0.3033, Train Steps/Sec: 0.12, Epoch: 0.01694520015546055, LR: 0.001
[2025-07-27 21:50:01] (step=0000873) Train Loss: 0.2366, Train Steps/Sec: 0.13, Epoch: 0.016964632724446174, LR: 0.001
[2025-07-27 21:50:09] (step=0000874) Train Loss: 0.2406, Train Steps/Sec: 0.12, Epoch: 0.016984065293431792, LR: 0.001
[2025-07-27 21:50:17] (step=0000875) Train Loss: 0.2787, Train Steps/Sec: 0.12, Epoch: 0.01700349786241741, LR: 0.001
[2025-07-27 21:50:25] (step=0000876) Train Loss: 0.2230, Train Steps/Sec: 0.12, Epoch: 0.01702293043140303, LR: 0.001
[2025-07-27 21:50:33] (step=0000877) Train Loss: 0.2430, Train Steps/Sec: 0.13, Epoch: 0.017042363000388652, LR: 0.001
[2025-07-27 21:50:41] (step=0000878) Train Loss: 0.2649, Train Steps/Sec: 0.12, Epoch: 0.01706179556937427, LR: 0.001
[2025-07-27 21:50:49] (step=0000879) Train Loss: 0.2125, Train Steps/Sec: 0.12, Epoch: 0.01708122813835989, LR: 0.001
[2025-07-27 21:50:57] (step=0000880) Train Loss: 0.1824, Train Steps/Sec: 0.12, Epoch: 0.017100660707345512, LR: 0.001
[2025-07-27 21:51:05] (step=0000881) Train Loss: 0.2902, Train Steps/Sec: 0.12, Epoch: 0.01712009327633113, LR: 0.001
[2025-07-27 21:51:13] (step=0000882) Train Loss: 0.2729, Train Steps/Sec: 0.12, Epoch: 0.01713952584531675, LR: 0.001
[2025-07-27 21:51:21] (step=0000883) Train Loss: 0.2787, Train Steps/Sec: 0.12, Epoch: 0.017158958414302372, LR: 0.001
[2025-07-27 21:51:29] (step=0000884) Train Loss: 0.2597, Train Steps/Sec: 0.12, Epoch:
0.01717839098328799, LR: 0.001 [2025-07-27 21:51:37] (step=0000885) Train Loss: 0.2099, Train Steps/Sec: 0.13, Epoch: 0.01719782355227361, LR: 0.001 [2025-07-27 21:51:45] (step=0000886) Train Loss: 0.2549, Train Steps/Sec: 0.12, Epoch: 0.017217256121259232, LR: 0.001 [2025-07-27 21:51:53] (step=0000887) Train Loss: 0.1977, Train Steps/Sec: 0.12, Epoch: 0.01723668869024485, LR: 0.001 [2025-07-27 21:52:01] (step=0000888) Train Loss: 0.2980, Train Steps/Sec: 0.12, Epoch: 0.01725612125923047, LR: 0.001 [2025-07-27 21:52:09] (step=0000889) Train Loss: 0.2055, Train Steps/Sec: 0.13, Epoch: 0.01727555382821609, LR: 0.001 [2025-07-27 21:52:17] (step=0000890) Train Loss: 0.2339, Train Steps/Sec: 0.13, Epoch: 0.01729498639720171, LR: 0.001 [2025-07-27 21:52:25] (step=0000891) Train Loss: 0.2470, Train Steps/Sec: 0.12, Epoch: 0.01731441896618733, LR: 0.001 [2025-07-27 21:52:32] (step=0000892) Train Loss: 0.1887, Train Steps/Sec: 0.15, Epoch: 0.01733385153517295, LR: 0.001 [2025-07-27 21:52:39] (step=0000893) Train Loss: 0.3486, Train Steps/Sec: 0.15, Epoch: 0.01735328410415857, LR: 0.001 [2025-07-27 21:52:47] (step=0000894) Train Loss: 0.2856, Train Steps/Sec: 0.12, Epoch: 0.01737271667314419, LR: 0.001 [2025-07-27 21:52:55] (step=0000895) Train Loss: 0.2144, Train Steps/Sec: 0.13, Epoch: 0.01739214924212981, LR: 0.001 [2025-07-27 21:53:03] (step=0000896) Train Loss: 0.2604, Train Steps/Sec: 0.12, Epoch: 0.01741158181111543, LR: 0.001 [2025-07-27 21:53:11] (step=0000897) Train Loss: 0.2424, Train Steps/Sec: 0.12, Epoch: 0.01743101438010105, LR: 0.001 [2025-07-27 21:53:19] (step=0000898) Train Loss: 0.2361, Train Steps/Sec: 0.12, Epoch: 0.01745044694908667, LR: 0.001 [2025-07-27 21:53:27] (step=0000899) Train Loss: 0.2276, Train Steps/Sec: 0.12, Epoch: 0.01746987951807229, LR: 0.001 [2025-07-27 21:53:35] (step=0000900) Train Loss: 0.2183, Train Steps/Sec: 0.13, Epoch: 0.01748931208705791, LR: 0.001 [2025-07-27 21:53:43] (step=0000901) Train Loss: 0.2532, Train Steps/Sec: 0.12, 
Epoch: 0.017508744656043528, LR: 0.001 [2025-07-27 21:53:51] (step=0000902) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 0.01752817722502915, LR: 0.001 [2025-07-27 21:53:59] (step=0000903) Train Loss: 0.2367, Train Steps/Sec: 0.13, Epoch: 0.01754760979401477, LR: 0.001 [2025-07-27 21:54:07] (step=0000904) Train Loss: 0.2663, Train Steps/Sec: 0.12, Epoch: 0.017567042363000388, LR: 0.001 [2025-07-27 21:54:15] (step=0000905) Train Loss: 0.2886, Train Steps/Sec: 0.12, Epoch: 0.017586474931986007, LR: 0.001 [2025-07-27 21:54:23] (step=0000906) Train Loss: 0.2025, Train Steps/Sec: 0.12, Epoch: 0.01760590750097163, LR: 0.001 [2025-07-27 21:54:32] (step=0000907) Train Loss: 0.1952, Train Steps/Sec: 0.12, Epoch: 0.017625340069957248, LR: 0.001 [2025-07-27 21:54:40] (step=0000908) Train Loss: 0.2326, Train Steps/Sec: 0.13, Epoch: 0.017644772638942867, LR: 0.001 [2025-07-27 21:54:48] (step=0000909) Train Loss: 0.3393, Train Steps/Sec: 0.12, Epoch: 0.01766420520792849, LR: 0.001 [2025-07-27 21:54:56] (step=0000910) Train Loss: 0.2337, Train Steps/Sec: 0.13, Epoch: 0.017683637776914108, LR: 0.001 [2025-07-27 21:55:04] (step=0000911) Train Loss: 0.2774, Train Steps/Sec: 0.12, Epoch: 0.017703070345899727, LR: 0.001 [2025-07-27 21:55:12] (step=0000912) Train Loss: 0.2121, Train Steps/Sec: 0.12, Epoch: 0.01772250291488535, LR: 0.001 [2025-07-27 21:55:20] (step=0000913) Train Loss: 0.3034, Train Steps/Sec: 0.12, Epoch: 0.017741935483870968, LR: 0.001 [2025-07-27 21:55:28] (step=0000914) Train Loss: 0.3207, Train Steps/Sec: 0.12, Epoch: 0.017761368052856587, LR: 0.001 [2025-07-27 21:55:36] (step=0000915) Train Loss: 0.2018, Train Steps/Sec: 0.12, Epoch: 0.01778080062184221, LR: 0.001 [2025-07-27 21:55:44] (step=0000916) Train Loss: 0.2422, Train Steps/Sec: 0.12, Epoch: 0.017800233190827828, LR: 0.001 [2025-07-27 21:55:52] (step=0000917) Train Loss: 0.1996, Train Steps/Sec: 0.13, Epoch: 0.017819665759813447, LR: 0.001 [2025-07-27 21:56:00] (step=0000918) Train Loss: 0.2085, Train 
Steps/Sec: 0.12, Epoch: 0.017839098328799066, LR: 0.001 [2025-07-27 21:56:08] (step=0000919) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.017858530897784688, LR: 0.001 [2025-07-27 21:56:16] (step=0000920) Train Loss: 0.2943, Train Steps/Sec: 0.13, Epoch: 0.017877963466770307, LR: 0.001 [2025-07-27 21:56:24] (step=0000921) Train Loss: 0.1984, Train Steps/Sec: 0.12, Epoch: 0.017897396035755925, LR: 0.001 [2025-07-27 21:56:32] (step=0000922) Train Loss: 0.2478, Train Steps/Sec: 0.12, Epoch: 0.017916828604741548, LR: 0.001 [2025-07-27 21:56:40] (step=0000923) Train Loss: 0.2639, Train Steps/Sec: 0.12, Epoch: 0.017936261173727167, LR: 0.001 [2025-07-27 21:56:48] (step=0000924) Train Loss: 0.2380, Train Steps/Sec: 0.13, Epoch: 0.017955693742712785, LR: 0.001 [2025-07-27 21:56:55] (step=0000925) Train Loss: 0.2962, Train Steps/Sec: 0.14, Epoch: 0.017975126311698408, LR: 0.001 [2025-07-27 21:57:02] (step=0000926) Train Loss: 0.2364, Train Steps/Sec: 0.16, Epoch: 0.017994558880684026, LR: 0.001 [2025-07-27 21:57:10] (step=0000927) Train Loss: 0.2519, Train Steps/Sec: 0.12, Epoch: 0.018013991449669645, LR: 0.001 [2025-07-27 21:57:18] (step=0000928) Train Loss: 0.1956, Train Steps/Sec: 0.12, Epoch: 0.018033424018655268, LR: 0.001 [2025-07-27 21:57:26] (step=0000929) Train Loss: 0.2108, Train Steps/Sec: 0.13, Epoch: 0.018052856587640886, LR: 0.001 [2025-07-27 21:57:34] (step=0000930) Train Loss: 0.2685, Train Steps/Sec: 0.12, Epoch: 0.018072289156626505, LR: 0.001 [2025-07-27 21:57:42] (step=0000931) Train Loss: 0.3135, Train Steps/Sec: 0.12, Epoch: 0.018091721725612128, LR: 0.001 [2025-07-27 21:57:50] (step=0000932) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.018111154294597746, LR: 0.001 [2025-07-27 21:57:58] (step=0000933) Train Loss: 0.2322, Train Steps/Sec: 0.12, Epoch: 0.018130586863583365, LR: 0.001 [2025-07-27 21:58:06] (step=0000934) Train Loss: 0.2317, Train Steps/Sec: 0.12, Epoch: 0.018150019432568984, LR: 0.001 [2025-07-27 21:58:14] (step=0000935) 
Train Loss: 0.2748, Train Steps/Sec: 0.12, Epoch: 0.018169452001554606, LR: 0.001 [2025-07-27 21:58:22] (step=0000936) Train Loss: 0.2306, Train Steps/Sec: 0.12, Epoch: 0.018188884570540225, LR: 0.001 [2025-07-27 21:58:30] (step=0000937) Train Loss: 0.1428, Train Steps/Sec: 0.12, Epoch: 0.018208317139525844, LR: 0.001 [2025-07-27 21:58:38] (step=0000938) Train Loss: 0.2785, Train Steps/Sec: 0.13, Epoch: 0.018227749708511466, LR: 0.001 [2025-07-27 21:58:46] (step=0000939) Train Loss: 0.3295, Train Steps/Sec: 0.13, Epoch: 0.018247182277497085, LR: 0.001 [2025-07-27 21:58:54] (step=0000940) Train Loss: 0.2404, Train Steps/Sec: 0.13, Epoch: 0.018266614846482704, LR: 0.001 [2025-07-27 21:59:02] (step=0000941) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.018286047415468326, LR: 0.001 [2025-07-27 21:59:10] (step=0000942) Train Loss: 0.2498, Train Steps/Sec: 0.12, Epoch: 0.018305479984453945, LR: 0.001 [2025-07-27 21:59:18] (step=0000943) Train Loss: 0.2700, Train Steps/Sec: 0.12, Epoch: 0.018324912553439564, LR: 0.001 [2025-07-27 21:59:26] (step=0000944) Train Loss: 0.2330, Train Steps/Sec: 0.12, Epoch: 0.018344345122425186, LR: 0.001 [2025-07-27 21:59:34] (step=0000945) Train Loss: 0.1533, Train Steps/Sec: 0.12, Epoch: 0.018363777691410805, LR: 0.001 [2025-07-27 21:59:42] (step=0000946) Train Loss: 0.2248, Train Steps/Sec: 0.12, Epoch: 0.018383210260396424, LR: 0.001 [2025-07-27 21:59:50] (step=0000947) Train Loss: 0.2468, Train Steps/Sec: 0.13, Epoch: 0.018402642829382046, LR: 0.001 [2025-07-27 21:59:58] (step=0000948) Train Loss: 0.2471, Train Steps/Sec: 0.13, Epoch: 0.018422075398367665, LR: 0.001 [2025-07-27 22:00:06] (step=0000949) Train Loss: 0.2265, Train Steps/Sec: 0.12, Epoch: 0.018441507967353284, LR: 0.001 [2025-07-27 22:00:14] (step=0000950) Train Loss: 0.2342, Train Steps/Sec: 0.13, Epoch: 0.018460940536338902, LR: 0.001 [2025-07-27 22:00:22] (step=0000951) Train Loss: 0.2041, Train Steps/Sec: 0.12, Epoch: 0.018480373105324525, LR: 0.001 [2025-07-27 
22:00:30] (step=0000952) Train Loss: 0.2043, Train Steps/Sec: 0.12, Epoch: 0.018499805674310144, LR: 0.001 [2025-07-27 22:00:38] (step=0000953) Train Loss: 0.2714, Train Steps/Sec: 0.13, Epoch: 0.018519238243295762, LR: 0.001 [2025-07-27 22:00:46] (step=0000954) Train Loss: 0.2238, Train Steps/Sec: 0.12, Epoch: 0.018538670812281385, LR: 0.001 [2025-07-27 22:00:55] (step=0000955) Train Loss: 0.2245, Train Steps/Sec: 0.12, Epoch: 0.018558103381267003, LR: 0.001 [2025-07-27 22:01:03] (step=0000956) Train Loss: 0.2651, Train Steps/Sec: 0.12, Epoch: 0.018577535950252622, LR: 0.001 [2025-07-27 22:01:11] (step=0000957) Train Loss: 0.2417, Train Steps/Sec: 0.13, Epoch: 0.018596968519238245, LR: 0.001 [2025-07-27 22:01:19] (step=0000958) Train Loss: 0.3466, Train Steps/Sec: 0.12, Epoch: 0.018616401088223863, LR: 0.001 [2025-07-27 22:01:24] (step=0000959) Train Loss: 0.2863, Train Steps/Sec: 0.19, Epoch: 0.018635833657209482, LR: 0.001 [2025-07-27 22:01:32] (step=0000960) Train Loss: 0.2937, Train Steps/Sec: 0.13, Epoch: 0.018655266226195105, LR: 0.001 [2025-07-27 22:01:40] (step=0000961) Train Loss: 0.2575, Train Steps/Sec: 0.12, Epoch: 0.018674698795180723, LR: 0.001 [2025-07-27 22:01:48] (step=0000962) Train Loss: 0.2581, Train Steps/Sec: 0.12, Epoch: 0.018694131364166342, LR: 0.001 [2025-07-27 22:01:56] (step=0000963) Train Loss: 0.2974, Train Steps/Sec: 0.13, Epoch: 0.01871356393315196, LR: 0.001 [2025-07-27 22:02:04] (step=0000964) Train Loss: 0.2087, Train Steps/Sec: 0.12, Epoch: 0.018732996502137583, LR: 0.001 [2025-07-27 22:02:12] (step=0000965) Train Loss: 0.2639, Train Steps/Sec: 0.12, Epoch: 0.018752429071123202, LR: 0.001 [2025-07-27 22:02:20] (step=0000966) Train Loss: 0.2066, Train Steps/Sec: 0.12, Epoch: 0.01877186164010882, LR: 0.001 [2025-07-27 22:02:28] (step=0000967) Train Loss: 0.2715, Train Steps/Sec: 0.12, Epoch: 0.018791294209094443, LR: 0.001 [2025-07-27 22:02:36] (step=0000968) Train Loss: 0.2695, Train Steps/Sec: 0.12, Epoch: 0.018810726778080062, 
LR: 0.001 [2025-07-27 22:02:44] (step=0000969) Train Loss: 0.2393, Train Steps/Sec: 0.12, Epoch: 0.01883015934706568, LR: 0.001 [2025-07-27 22:02:52] (step=0000970) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.018849591916051303, LR: 0.001 [2025-07-27 22:03:00] (step=0000971) Train Loss: 0.3054, Train Steps/Sec: 0.12, Epoch: 0.018869024485036922, LR: 0.001 [2025-07-27 22:03:09] (step=0000972) Train Loss: 0.1517, Train Steps/Sec: 0.12, Epoch: 0.01888845705402254, LR: 0.001 [2025-07-27 22:03:17] (step=0000973) Train Loss: 0.2095, Train Steps/Sec: 0.12, Epoch: 0.018907889623008163, LR: 0.001 [2025-07-27 22:03:25] (step=0000974) Train Loss: 0.2683, Train Steps/Sec: 0.12, Epoch: 0.018927322191993782, LR: 0.001 [2025-07-27 22:03:33] (step=0000975) Train Loss: 0.2157, Train Steps/Sec: 0.12, Epoch: 0.0189467547609794, LR: 0.001 [2025-07-27 22:03:41] (step=0000976) Train Loss: 0.2372, Train Steps/Sec: 0.12, Epoch: 0.018966187329965023, LR: 0.001 [2025-07-27 22:03:49] (step=0000977) Train Loss: 0.2651, Train Steps/Sec: 0.12, Epoch: 0.018985619898950642, LR: 0.001 [2025-07-27 22:03:57] (step=0000978) Train Loss: 0.2000, Train Steps/Sec: 0.12, Epoch: 0.01900505246793626, LR: 0.001 [2025-07-27 22:04:05] (step=0000979) Train Loss: 0.3081, Train Steps/Sec: 0.12, Epoch: 0.01902448503692188, LR: 0.001 [2025-07-27 22:04:13] (step=0000980) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.019043917605907502, LR: 0.001 [2025-07-27 22:04:21] (step=0000981) Train Loss: 0.2566, Train Steps/Sec: 0.12, Epoch: 0.01906335017489312, LR: 0.001 [2025-07-27 22:04:29] (step=0000982) Train Loss: 0.2102, Train Steps/Sec: 0.13, Epoch: 0.01908278274387874, LR: 0.001 [2025-07-27 22:04:37] (step=0000983) Train Loss: 0.2743, Train Steps/Sec: 0.13, Epoch: 0.01910221531286436, LR: 0.001 [2025-07-27 22:04:45] (step=0000984) Train Loss: 0.2429, Train Steps/Sec: 0.12, Epoch: 0.01912164788184998, LR: 0.001 [2025-07-27 22:04:53] (step=0000985) Train Loss: 0.2483, Train Steps/Sec: 0.12, Epoch: 
0.0191410804508356, LR: 0.001 [2025-07-27 22:05:01] (step=0000986) Train Loss: 0.2050, Train Steps/Sec: 0.12, Epoch: 0.01916051301982122, LR: 0.001 [2025-07-27 22:05:09] (step=0000987) Train Loss: 0.2379, Train Steps/Sec: 0.13, Epoch: 0.01917994558880684, LR: 0.001 [2025-07-27 22:05:17] (step=0000988) Train Loss: 0.3011, Train Steps/Sec: 0.13, Epoch: 0.01919937815779246, LR: 0.001 [2025-07-27 22:05:25] (step=0000989) Train Loss: 0.2495, Train Steps/Sec: 0.12, Epoch: 0.01921881072677808, LR: 0.001 [2025-07-27 22:05:33] (step=0000990) Train Loss: 0.2147, Train Steps/Sec: 0.13, Epoch: 0.0192382432957637, LR: 0.001 [2025-07-27 22:05:41] (step=0000991) Train Loss: 0.2513, Train Steps/Sec: 0.12, Epoch: 0.01925767586474932, LR: 0.001 [2025-07-27 22:05:47] (step=0000992) Train Loss: 0.2819, Train Steps/Sec: 0.18, Epoch: 0.01927710843373494, LR: 0.001 [2025-07-27 22:05:55] (step=0000993) Train Loss: 0.2590, Train Steps/Sec: 0.12, Epoch: 0.01929654100272056, LR: 0.001 [2025-07-27 22:06:03] (step=0000994) Train Loss: 0.3260, Train Steps/Sec: 0.12, Epoch: 0.01931597357170618, LR: 0.001 [2025-07-27 22:06:11] (step=0000995) Train Loss: 0.2775, Train Steps/Sec: 0.12, Epoch: 0.019335406140691798, LR: 0.001 [2025-07-27 22:06:19] (step=0000996) Train Loss: 0.2095, Train Steps/Sec: 0.13, Epoch: 0.01935483870967742, LR: 0.001 [2025-07-27 22:06:27] (step=0000997) Train Loss: 0.2741, Train Steps/Sec: 0.12, Epoch: 0.01937427127866304, LR: 0.001 [2025-07-27 22:06:35] (step=0000998) Train Loss: 0.1992, Train Steps/Sec: 0.13, Epoch: 0.019393703847648658, LR: 0.001 [2025-07-27 22:06:43] (step=0000999) Train Loss: 0.2349, Train Steps/Sec: 0.12, Epoch: 0.01941313641663428, LR: 0.001 [2025-07-27 22:06:51] (step=0001000) Train Loss: 0.1729, Train Steps/Sec: 0.12, Epoch: 0.0194325689856199, LR: 0.001 [2025-07-27 22:06:59] (step=0001001) Train Loss: 0.3038, Train Steps/Sec: 0.12, Epoch: 0.019452001554605518, LR: 0.001 [2025-07-27 22:07:07] (step=0001002) Train Loss: 0.2471, Train Steps/Sec: 0.12, 
Epoch: 0.01947143412359114, LR: 0.001 [2025-07-27 22:07:15] (step=0001003) Train Loss: 0.2956, Train Steps/Sec: 0.12, Epoch: 0.01949086669257676, LR: 0.001 [2025-07-27 22:07:23] (step=0001004) Train Loss: 0.2781, Train Steps/Sec: 0.12, Epoch: 0.019510299261562378, LR: 0.001 [2025-07-27 22:07:31] (step=0001005) Train Loss: 0.2113, Train Steps/Sec: 0.12, Epoch: 0.019529731830548, LR: 0.001 [2025-07-27 22:07:39] (step=0001006) Train Loss: 0.1978, Train Steps/Sec: 0.12, Epoch: 0.01954916439953362, LR: 0.001 [2025-07-27 22:07:48] (step=0001007) Train Loss: 0.2141, Train Steps/Sec: 0.12, Epoch: 0.019568596968519238, LR: 0.001 [2025-07-27 22:07:56] (step=0001008) Train Loss: 0.2534, Train Steps/Sec: 0.13, Epoch: 0.019588029537504856, LR: 0.001 [2025-07-27 22:08:03] (step=0001009) Train Loss: 0.2331, Train Steps/Sec: 0.13, Epoch: 0.01960746210649048, LR: 0.001 [2025-07-27 22:08:11] (step=0001010) Train Loss: 0.2550, Train Steps/Sec: 0.12, Epoch: 0.019626894675476098, LR: 0.001 [2025-07-27 22:08:19] (step=0001011) Train Loss: 0.3227, Train Steps/Sec: 0.12, Epoch: 0.019646327244461716, LR: 0.001 [2025-07-27 22:08:27] (step=0001012) Train Loss: 0.2702, Train Steps/Sec: 0.12, Epoch: 0.01966575981344734, LR: 0.001 [2025-07-27 22:08:35] (step=0001013) Train Loss: 0.2621, Train Steps/Sec: 0.12, Epoch: 0.019685192382432957, LR: 0.001 [2025-07-27 22:08:43] (step=0001014) Train Loss: 0.1747, Train Steps/Sec: 0.12, Epoch: 0.019704624951418576, LR: 0.001 [2025-07-27 22:08:51] (step=0001015) Train Loss: 0.3054, Train Steps/Sec: 0.13, Epoch: 0.0197240575204042, LR: 0.001 [2025-07-27 22:08:59] (step=0001016) Train Loss: 0.3141, Train Steps/Sec: 0.12, Epoch: 0.019743490089389817, LR: 0.001 [2025-07-27 22:09:07] (step=0001017) Train Loss: 0.2997, Train Steps/Sec: 0.12, Epoch: 0.019762922658375436, LR: 0.001 [2025-07-27 22:09:16] (step=0001018) Train Loss: 0.1867, Train Steps/Sec: 0.12, Epoch: 0.01978235522736106, LR: 0.001 [2025-07-27 22:09:24] (step=0001019) Train Loss: 0.2175, Train 
Steps/Sec: 0.12, Epoch: 0.019801787796346677, LR: 0.001 [2025-07-27 22:09:32] (step=0001020) Train Loss: 0.2034, Train Steps/Sec: 0.12, Epoch: 0.019821220365332296, LR: 0.001 [2025-07-27 22:09:40] (step=0001021) Train Loss: 0.2058, Train Steps/Sec: 0.12, Epoch: 0.01984065293431792, LR: 0.001 [2025-07-27 22:09:48] (step=0001022) Train Loss: 0.2502, Train Steps/Sec: 0.12, Epoch: 0.019860085503303537, LR: 0.001 [2025-07-27 22:09:56] (step=0001023) Train Loss: 0.1951, Train Steps/Sec: 0.13, Epoch: 0.019879518072289156, LR: 0.001 [2025-07-27 22:10:04] (step=0001024) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.019898950641274775, LR: 0.001 [2025-07-27 22:10:09] (step=0001025) Train Loss: 0.2592, Train Steps/Sec: 0.18, Epoch: 0.019918383210260397, LR: 0.001 [2025-07-27 22:10:17] (step=0001026) Train Loss: 0.2970, Train Steps/Sec: 0.12, Epoch: 0.019937815779246016, LR: 0.001 [2025-07-27 22:10:25] (step=0001027) Train Loss: 0.2391, Train Steps/Sec: 0.12, Epoch: 0.019957248348231635, LR: 0.001 [2025-07-27 22:10:33] (step=0001028) Train Loss: 0.2291, Train Steps/Sec: 0.12, Epoch: 0.019976680917217257, LR: 0.001 [2025-07-27 22:10:42] (step=0001029) Train Loss: 0.2546, Train Steps/Sec: 0.13, Epoch: 0.019996113486202876, LR: 0.001 [2025-07-27 22:10:50] (step=0001030) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.020015546055188495, LR: 0.001 [2025-07-27 22:10:58] (step=0001031) Train Loss: 0.2166, Train Steps/Sec: 0.12, Epoch: 0.020034978624174117, LR: 0.001 [2025-07-27 22:11:06] (step=0001032) Train Loss: 0.2184, Train Steps/Sec: 0.12, Epoch: 0.020054411193159736, LR: 0.001 [2025-07-27 22:11:14] (step=0001033) Train Loss: 0.2478, Train Steps/Sec: 0.12, Epoch: 0.020073843762145355, LR: 0.001 [2025-07-27 22:11:22] (step=0001034) Train Loss: 0.2743, Train Steps/Sec: 0.13, Epoch: 0.020093276331130977, LR: 0.001 [2025-07-27 22:11:30] (step=0001035) Train Loss: 0.2850, Train Steps/Sec: 0.12, Epoch: 0.020112708900116596, LR: 0.001 [2025-07-27 22:11:38] (step=0001036) 
Train Loss: 0.2590, Train Steps/Sec: 0.13, Epoch: 0.020132141469102215, LR: 0.001 [2025-07-27 22:11:46] (step=0001037) Train Loss: 0.2367, Train Steps/Sec: 0.12, Epoch: 0.020151574038087837, LR: 0.001 [2025-07-27 22:11:54] (step=0001038) Train Loss: 0.2587, Train Steps/Sec: 0.12, Epoch: 0.020171006607073456, LR: 0.001 [2025-07-27 22:12:02] (step=0001039) Train Loss: 0.2741, Train Steps/Sec: 0.12, Epoch: 0.020190439176059075, LR: 0.001 [2025-07-27 22:12:10] (step=0001040) Train Loss: 0.2363, Train Steps/Sec: 0.12, Epoch: 0.020209871745044693, LR: 0.001 [2025-07-27 22:12:18] (step=0001041) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.020229304314030316, LR: 0.001 [2025-07-27 22:12:26] (step=0001042) Train Loss: 0.2770, Train Steps/Sec: 0.12, Epoch: 0.020248736883015934, LR: 0.001 [2025-07-27 22:12:34] (step=0001043) Train Loss: 0.2818, Train Steps/Sec: 0.13, Epoch: 0.020268169452001553, LR: 0.001 [2025-07-27 22:12:42] (step=0001044) Train Loss: 0.2967, Train Steps/Sec: 0.12, Epoch: 0.020287602020987176, LR: 0.001 [2025-07-27 22:12:50] (step=0001045) Train Loss: 0.2695, Train Steps/Sec: 0.12, Epoch: 0.020307034589972794, LR: 0.001 [2025-07-27 22:12:58] (step=0001046) Train Loss: 0.2180, Train Steps/Sec: 0.12, Epoch: 0.020326467158958413, LR: 0.001 [2025-07-27 22:13:06] (step=0001047) Train Loss: 0.3072, Train Steps/Sec: 0.12, Epoch: 0.020345899727944036, LR: 0.001 [2025-07-27 22:13:14] (step=0001048) Train Loss: 0.2569, Train Steps/Sec: 0.13, Epoch: 0.020365332296929654, LR: 0.001 [2025-07-27 22:13:22] (step=0001049) Train Loss: 0.2866, Train Steps/Sec: 0.12, Epoch: 0.020384764865915273, LR: 0.001 [2025-07-27 22:13:30] (step=0001050) Train Loss: 0.1866, Train Steps/Sec: 0.12, Epoch: 0.020404197434900895, LR: 0.001 [2025-07-27 22:13:39] (step=0001051) Train Loss: 0.1579, Train Steps/Sec: 0.12, Epoch: 0.020423630003886514, LR: 0.001 [2025-07-27 22:13:47] (step=0001052) Train Loss: 0.2810, Train Steps/Sec: 0.12, Epoch: 0.020443062572872133, LR: 0.001 [2025-07-27 
22:13:55] (step=0001053) Train Loss: 0.2458, Train Steps/Sec: 0.13, Epoch: 0.020462495141857752, LR: 0.001 [2025-07-27 22:14:03] (step=0001054) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.020481927710843374, LR: 0.001 [2025-07-27 22:14:11] (step=0001055) Train Loss: 0.2870, Train Steps/Sec: 0.12, Epoch: 0.020501360279828993, LR: 0.001 [2025-07-27 22:14:19] (step=0001056) Train Loss: 0.2112, Train Steps/Sec: 0.13, Epoch: 0.020520792848814612, LR: 0.001 [2025-07-27 22:14:27] (step=0001057) Train Loss: 0.2608, Train Steps/Sec: 0.12, Epoch: 0.020540225417800234, LR: 0.001 [2025-07-27 22:14:32] (step=0001058) Train Loss: 0.2851, Train Steps/Sec: 0.18, Epoch: 0.020559657986785853, LR: 0.001 [2025-07-27 22:14:40] (step=0001059) Train Loss: 0.1838, Train Steps/Sec: 0.12, Epoch: 0.020579090555771472, LR: 0.001 [2025-07-27 22:14:48] (step=0001060) Train Loss: 0.2872, Train Steps/Sec: 0.12, Epoch: 0.020598523124757094, LR: 0.001 [2025-07-27 22:14:56] (step=0001061) Train Loss: 0.3022, Train Steps/Sec: 0.12, Epoch: 0.020617955693742713, LR: 0.001 [2025-07-27 22:15:05] (step=0001062) Train Loss: 0.3103, Train Steps/Sec: 0.12, Epoch: 0.02063738826272833, LR: 0.001 [2025-07-27 22:15:13] (step=0001063) Train Loss: 0.2184, Train Steps/Sec: 0.12, Epoch: 0.020656820831713954, LR: 0.001 [2025-07-27 22:15:21] (step=0001064) Train Loss: 0.2391, Train Steps/Sec: 0.13, Epoch: 0.020676253400699573, LR: 0.001 [2025-07-27 22:15:29] (step=0001065) Train Loss: 0.2885, Train Steps/Sec: 0.13, Epoch: 0.02069568596968519, LR: 0.001 [2025-07-27 22:15:37] (step=0001066) Train Loss: 0.2175, Train Steps/Sec: 0.12, Epoch: 0.020715118538670814, LR: 0.001 [2025-07-27 22:15:45] (step=0001067) Train Loss: 0.3023, Train Steps/Sec: 0.12, Epoch: 0.020734551107656433, LR: 0.001 [2025-07-27 22:15:53] (step=0001068) Train Loss: 0.2797, Train Steps/Sec: 0.12, Epoch: 0.02075398367664205, LR: 0.001 [2025-07-27 22:16:01] (step=0001069) Train Loss: 0.1958, Train Steps/Sec: 0.12, Epoch: 0.02077341624562767, 
LR: 0.001 [2025-07-27 22:16:09] (step=0001070) Train Loss: 0.2264, Train Steps/Sec: 0.12, Epoch: 0.020792848814613293, LR: 0.001 [2025-07-27 22:16:17] (step=0001071) Train Loss: 0.2309, Train Steps/Sec: 0.12, Epoch: 0.02081228138359891, LR: 0.001 [2025-07-27 22:16:25] (step=0001072) Train Loss: 0.2041, Train Steps/Sec: 0.12, Epoch: 0.02083171395258453, LR: 0.001 [2025-07-27 22:16:33] (step=0001073) Train Loss: 0.2422, Train Steps/Sec: 0.12, Epoch: 0.020851146521570153, LR: 0.001 [2025-07-27 22:16:41] (step=0001074) Train Loss: 0.2496, Train Steps/Sec: 0.12, Epoch: 0.02087057909055577, LR: 0.001 [2025-07-27 22:16:49] (step=0001075) Train Loss: 0.1796, Train Steps/Sec: 0.12, Epoch: 0.02089001165954139, LR: 0.001 [2025-07-27 22:16:57] (step=0001076) Train Loss: 0.3101, Train Steps/Sec: 0.13, Epoch: 0.020909444228527013, LR: 0.001 [2025-07-27 22:17:05] (step=0001077) Train Loss: 0.3274, Train Steps/Sec: 0.12, Epoch: 0.02092887679751263, LR: 0.001 [2025-07-27 22:17:13] (step=0001078) Train Loss: 0.2719, Train Steps/Sec: 0.12, Epoch: 0.02094830936649825, LR: 0.001 [2025-07-27 22:17:21] (step=0001079) Train Loss: 0.3177, Train Steps/Sec: 0.13, Epoch: 0.020967741935483872, LR: 0.001 [2025-07-27 22:17:29] (step=0001080) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.02098717450446949, LR: 0.001 [2025-07-27 22:17:37] (step=0001081) Train Loss: 0.2197, Train Steps/Sec: 0.13, Epoch: 0.02100660707345511, LR: 0.001 [2025-07-27 22:17:45] (step=0001082) Train Loss: 0.2864, Train Steps/Sec: 0.12, Epoch: 0.021026039642440732, LR: 0.001 [2025-07-27 22:17:53] (step=0001083) Train Loss: 0.2669, Train Steps/Sec: 0.13, Epoch: 0.02104547221142635, LR: 0.001 [2025-07-27 22:18:01] (step=0001084) Train Loss: 0.2353, Train Steps/Sec: 0.13, Epoch: 0.02106490478041197, LR: 0.001 [2025-07-27 22:18:10] (step=0001085) Train Loss: 0.2220, Train Steps/Sec: 0.12, Epoch: 0.02108433734939759, LR: 0.001 [2025-07-27 22:18:18] (step=0001086) Train Loss: 0.2621, Train Steps/Sec: 0.12, Epoch: 
0.02110376991838321, LR: 0.001 [2025-07-27 22:18:26] (step=0001087) Train Loss: 0.1992, Train Steps/Sec: 0.12, Epoch: 0.02112320248736883, LR: 0.001 [2025-07-27 22:18:34] (step=0001088) Train Loss: 0.2469, Train Steps/Sec: 0.13, Epoch: 0.02114263505635445, LR: 0.001 [2025-07-27 22:18:42] (step=0001089) Train Loss: 0.3038, Train Steps/Sec: 0.13, Epoch: 0.02116206762534007, LR: 0.001 [2025-07-27 22:18:50] (step=0001090) Train Loss: 0.2721, Train Steps/Sec: 0.12, Epoch: 0.02118150019432569, LR: 0.001 [2025-07-27 22:18:56] (step=0001091) Train Loss: 0.2107, Train Steps/Sec: 0.17, Epoch: 0.02120093276331131, LR: 0.001 [2025-07-27 22:19:04] (step=0001092) Train Loss: 0.3118, Train Steps/Sec: 0.13, Epoch: 0.02122036533229693, LR: 0.001 [2025-07-27 22:19:12] (step=0001093) Train Loss: 0.2008, Train Steps/Sec: 0.13, Epoch: 0.02123979790128255, LR: 0.001 [2025-07-27 22:19:20] (step=0001094) Train Loss: 0.1600, Train Steps/Sec: 0.12, Epoch: 0.02125923047026817, LR: 0.001 [2025-07-27 22:19:28] (step=0001095) Train Loss: 0.2728, Train Steps/Sec: 0.12, Epoch: 0.02127866303925379, LR: 0.001 [2025-07-27 22:19:36] (step=0001096) Train Loss: 0.2279, Train Steps/Sec: 0.12, Epoch: 0.02129809560823941, LR: 0.001 [2025-07-27 22:19:44] (step=0001097) Train Loss: 0.1341, Train Steps/Sec: 0.13, Epoch: 0.02131752817722503, LR: 0.001 [2025-07-27 22:19:52] (step=0001098) Train Loss: 0.2640, Train Steps/Sec: 0.12, Epoch: 0.021336960746210647, LR: 0.001 [2025-07-27 22:20:00] (step=0001099) Train Loss: 0.2404, Train Steps/Sec: 0.12, Epoch: 0.02135639331519627, LR: 0.001 [2025-07-27 22:20:08] (step=0001100) Train Loss: 0.2309, Train Steps/Sec: 0.13, Epoch: 0.02137582588418189, LR: 0.001 [2025-07-27 22:20:16] (step=0001101) Train Loss: 0.2181, Train Steps/Sec: 0.12, Epoch: 0.021395258453167507, LR: 0.001 [2025-07-27 22:20:24] (step=0001102) Train Loss: 0.2174, Train Steps/Sec: 0.13, Epoch: 0.02141469102215313, LR: 0.001 [2025-07-27 22:20:32] (step=0001103) Train Loss: 0.2242, Train Steps/Sec: 
0.12, Epoch: 0.02143412359113875, LR: 0.001
[2025-07-27 22:20:40] (step=0001104) Train Loss: 0.2622, Train Steps/Sec: 0.12, Epoch: 0.021453556160124367, LR: 0.001
[2025-07-27 22:20:48] (step=0001105) Train Loss: 0.2215, Train Steps/Sec: 0.12, Epoch: 0.02147298872910999, LR: 0.001
[2025-07-27 22:20:56] (step=0001106) Train Loss: 0.3352, Train Steps/Sec: 0.12, Epoch: 0.02149242129809561, LR: 0.001
[2025-07-27 22:21:04] (step=0001107) Train Loss: 0.2985, Train Steps/Sec: 0.13, Epoch: 0.021511853867081227, LR: 0.001
[2025-07-27 22:21:12] (step=0001108) Train Loss: 0.2675, Train Steps/Sec: 0.12, Epoch: 0.02153128643606685, LR: 0.001
[2025-07-27 22:21:20] (step=0001109) Train Loss: 0.2698, Train Steps/Sec: 0.12, Epoch: 0.02155071900505247, LR: 0.001
[2025-07-27 22:21:28] (step=0001110) Train Loss: 0.2592, Train Steps/Sec: 0.12, Epoch: 0.021570151574038087, LR: 0.001
[2025-07-27 22:21:36] (step=0001111) Train Loss: 0.2912, Train Steps/Sec: 0.12, Epoch: 0.02158958414302371, LR: 0.001
[2025-07-27 22:21:44] (step=0001112) Train Loss: 0.2250, Train Steps/Sec: 0.13, Epoch: 0.021609016712009328, LR: 0.001
[2025-07-27 22:21:52] (step=0001113) Train Loss: 0.2483, Train Steps/Sec: 0.12, Epoch: 0.021628449280994947, LR: 0.001
[2025-07-27 22:22:00] (step=0001114) Train Loss: 0.2320, Train Steps/Sec: 0.13, Epoch: 0.021647881849980566, LR: 0.001
[2025-07-27 22:22:08] (step=0001115) Train Loss: 0.2499, Train Steps/Sec: 0.12, Epoch: 0.021667314418966188, LR: 0.001
[2025-07-27 22:22:16] (step=0001116) Train Loss: 0.2227, Train Steps/Sec: 0.12, Epoch: 0.021686746987951807, LR: 0.001
[2025-07-27 22:22:24] (step=0001117) Train Loss: 0.2185, Train Steps/Sec: 0.12, Epoch: 0.021706179556937426, LR: 0.001
[2025-07-27 22:22:32] (step=0001118) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.021725612125923048, LR: 0.001
[2025-07-27 22:22:41] (step=0001119) Train Loss: 0.2602, Train Steps/Sec: 0.12, Epoch: 0.021745044694908667, LR: 0.001
[2025-07-27 22:22:49] (step=0001120) Train Loss: 0.1712, Train Steps/Sec: 0.12, Epoch: 0.021764477263894286, LR: 0.001
[2025-07-27 22:22:57] (step=0001121) Train Loss: 0.2058, Train Steps/Sec: 0.13, Epoch: 0.021783909832879908, LR: 0.001
[2025-07-27 22:23:04] (step=0001122) Train Loss: 0.2181, Train Steps/Sec: 0.13, Epoch: 0.021803342401865527, LR: 0.001
[2025-07-27 22:23:13] (step=0001123) Train Loss: 0.2567, Train Steps/Sec: 0.12, Epoch: 0.021822774970851146, LR: 0.001
[2025-07-27 22:23:19] (step=0001124) Train Loss: 0.2751, Train Steps/Sec: 0.15, Epoch: 0.021842207539836768, LR: 0.001
[2025-07-27 22:23:26] (step=0001125) Train Loss: 0.3242, Train Steps/Sec: 0.14, Epoch: 0.021861640108822387, LR: 0.001
[2025-07-27 22:23:34] (step=0001126) Train Loss: 0.1886, Train Steps/Sec: 0.12, Epoch: 0.021881072677808006, LR: 0.001
[2025-07-27 22:23:42] (step=0001127) Train Loss: 0.2188, Train Steps/Sec: 0.12, Epoch: 0.021900505246793624, LR: 0.001
[2025-07-27 22:23:50] (step=0001128) Train Loss: 0.2414, Train Steps/Sec: 0.13, Epoch: 0.021919937815779247, LR: 0.001
[2025-07-27 22:23:59] (step=0001129) Train Loss: 0.1961, Train Steps/Sec: 0.12, Epoch: 0.021939370384764866, LR: 0.001
[2025-07-27 22:24:07] (step=0001130) Train Loss: 0.2447, Train Steps/Sec: 0.12, Epoch: 0.021958802953750484, LR: 0.001
[2025-07-27 22:24:15] (step=0001131) Train Loss: 0.2484, Train Steps/Sec: 0.12, Epoch: 0.021978235522736107, LR: 0.001
[2025-07-27 22:24:23] (step=0001132) Train Loss: 0.2659, Train Steps/Sec: 0.12, Epoch: 0.021997668091721725, LR: 0.001
[2025-07-27 22:24:31] (step=0001133) Train Loss: 0.1966, Train Steps/Sec: 0.12, Epoch: 0.022017100660707344, LR: 0.001
[2025-07-27 22:24:39] (step=0001134) Train Loss: 0.2432, Train Steps/Sec: 0.12, Epoch: 0.022036533229692967, LR: 0.001
[2025-07-27 22:24:47] (step=0001135) Train Loss: 0.2609, Train Steps/Sec: 0.13, Epoch: 0.022055965798678585, LR: 0.001
[2025-07-27 22:24:55] (step=0001136) Train Loss: 0.2253, Train Steps/Sec: 0.12, Epoch: 0.022075398367664204, LR: 0.001
[2025-07-27 22:25:03] (step=0001137) Train Loss: 0.2353, Train Steps/Sec: 0.13, Epoch: 0.022094830936649826, LR: 0.001
[2025-07-27 22:25:11] (step=0001138) Train Loss: 0.2409, Train Steps/Sec: 0.12, Epoch: 0.022114263505635445, LR: 0.001
[2025-07-27 22:25:19] (step=0001139) Train Loss: 0.2268, Train Steps/Sec: 0.12, Epoch: 0.022133696074621064, LR: 0.001
[2025-07-27 22:25:27] (step=0001140) Train Loss: 0.2223, Train Steps/Sec: 0.12, Epoch: 0.022153128643606686, LR: 0.001
[2025-07-27 22:25:35] (step=0001141) Train Loss: 0.2553, Train Steps/Sec: 0.12, Epoch: 0.022172561212592305, LR: 0.001
[2025-07-27 22:25:43] (step=0001142) Train Loss: 0.2757, Train Steps/Sec: 0.13, Epoch: 0.022191993781577924, LR: 0.001
[2025-07-27 22:25:51] (step=0001143) Train Loss: 0.2730, Train Steps/Sec: 0.12, Epoch: 0.022211426350563543, LR: 0.001
[2025-07-27 22:25:59] (step=0001144) Train Loss: 0.2240, Train Steps/Sec: 0.13, Epoch: 0.022230858919549165, LR: 0.001
[2025-07-27 22:26:07] (step=0001145) Train Loss: 0.2660, Train Steps/Sec: 0.12, Epoch: 0.022250291488534784, LR: 0.001
[2025-07-27 22:26:15] (step=0001146) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.022269724057520403, LR: 0.001
[2025-07-27 22:26:23] (step=0001147) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.022289156626506025, LR: 0.001
[2025-07-27 22:26:31] (step=0001148) Train Loss: 0.2440, Train Steps/Sec: 0.12, Epoch: 0.022308589195491644, LR: 0.001
[2025-07-27 22:26:39] (step=0001149) Train Loss: 0.2959, Train Steps/Sec: 0.13, Epoch: 0.022328021764477263, LR: 0.001
[2025-07-27 22:26:47] (step=0001150) Train Loss: 0.2272, Train Steps/Sec: 0.12, Epoch: 0.022347454333462885, LR: 0.001
[2025-07-27 22:26:56] (step=0001151) Train Loss: 0.2745, Train Steps/Sec: 0.12, Epoch: 0.022366886902448504, LR: 0.001
[2025-07-27 22:27:04] (step=0001152) Train Loss: 0.2130, Train Steps/Sec: 0.12, Epoch: 0.022386319471434123, LR: 0.001
[2025-07-27 22:27:12] (step=0001153) Train Loss: 0.2078, Train Steps/Sec: 0.12, Epoch: 0.022405752040419745, LR: 0.001
[2025-07-27 22:27:20] (step=0001154) Train Loss: 0.2826, Train Steps/Sec: 0.13, Epoch: 0.022425184609405364, LR: 0.001
[2025-07-27 22:27:28] (step=0001155) Train Loss: 0.2695, Train Steps/Sec: 0.12, Epoch: 0.022444617178390983, LR: 0.001
[2025-07-27 22:27:36] (step=0001156) Train Loss: 0.2349, Train Steps/Sec: 0.13, Epoch: 0.022464049747376605, LR: 0.001
[2025-07-27 22:27:43] (step=0001157) Train Loss: 0.1624, Train Steps/Sec: 0.14, Epoch: 0.022483482316362224, LR: 0.001
[2025-07-27 22:27:49] (step=0001158) Train Loss: 0.2816, Train Steps/Sec: 0.16, Epoch: 0.022502914885347843, LR: 0.001
[2025-07-27 22:27:57] (step=0001159) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.02252234745433346, LR: 0.001
[2025-07-27 22:28:05] (step=0001160) Train Loss: 0.2707, Train Steps/Sec: 0.13, Epoch: 0.022541780023319084, LR: 0.001
[2025-07-27 22:28:13] (step=0001161) Train Loss: 0.1800, Train Steps/Sec: 0.13, Epoch: 0.022561212592304702, LR: 0.001
[2025-07-27 22:28:21] (step=0001162) Train Loss: 0.2131, Train Steps/Sec: 0.12, Epoch: 0.02258064516129032, LR: 0.001
[2025-07-27 22:28:29] (step=0001163) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.022600077730275944, LR: 0.001
[2025-07-27 22:28:37] (step=0001164) Train Loss: 0.2778, Train Steps/Sec: 0.12, Epoch: 0.022619510299261562, LR: 0.001
[2025-07-27 22:28:45] (step=0001165) Train Loss: 0.2302, Train Steps/Sec: 0.13, Epoch: 0.02263894286824718, LR: 0.001
[2025-07-27 22:28:53] (step=0001166) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.022658375437232803, LR: 0.001
[2025-07-27 22:29:01] (step=0001167) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.022677808006218422, LR: 0.001
[2025-07-27 22:29:09] (step=0001168) Train Loss: 0.2924, Train Steps/Sec: 0.13, Epoch: 0.02269724057520404, LR: 0.001
[2025-07-27 22:29:17] (step=0001169) Train Loss: 0.2069, Train Steps/Sec: 0.12, Epoch: 0.022716673144189663, LR: 0.001
[2025-07-27 22:29:25] (step=0001170) Train Loss: 0.3330, Train Steps/Sec: 0.13, Epoch: 0.022736105713175282, LR: 0.001
[2025-07-27 22:29:33] (step=0001171) Train Loss: 0.3016, Train Steps/Sec: 0.12, Epoch: 0.0227555382821609, LR: 0.001
[2025-07-27 22:29:41] (step=0001172) Train Loss: 0.1927, Train Steps/Sec: 0.13, Epoch: 0.02277497085114652, LR: 0.001
[2025-07-27 22:29:49] (step=0001173) Train Loss: 0.2852, Train Steps/Sec: 0.12, Epoch: 0.022794403420132142, LR: 0.001
[2025-07-27 22:29:57] (step=0001174) Train Loss: 0.1786, Train Steps/Sec: 0.13, Epoch: 0.02281383598911776, LR: 0.001
[2025-07-27 22:30:05] (step=0001175) Train Loss: 0.3193, Train Steps/Sec: 0.12, Epoch: 0.02283326855810338, LR: 0.001
[2025-07-27 22:30:13] (step=0001176) Train Loss: 0.2221, Train Steps/Sec: 0.12, Epoch: 0.022852701127089002, LR: 0.001
[2025-07-27 22:30:21] (step=0001177) Train Loss: 0.2540, Train Steps/Sec: 0.13, Epoch: 0.02287213369607462, LR: 0.001
[2025-07-27 22:30:29] (step=0001178) Train Loss: 0.2551, Train Steps/Sec: 0.13, Epoch: 0.02289156626506024, LR: 0.001
[2025-07-27 22:30:37] (step=0001179) Train Loss: 0.2274, Train Steps/Sec: 0.13, Epoch: 0.022910998834045862, LR: 0.001
[2025-07-27 22:30:45] (step=0001180) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.02293043140303148, LR: 0.001
[2025-07-27 22:30:53] (step=0001181) Train Loss: 0.2496, Train Steps/Sec: 0.12, Epoch: 0.0229498639720171, LR: 0.001
[2025-07-27 22:31:01] (step=0001182) Train Loss: 0.2714, Train Steps/Sec: 0.12, Epoch: 0.022969296541002722, LR: 0.001
[2025-07-27 22:31:09] (step=0001183) Train Loss: 0.2694, Train Steps/Sec: 0.12, Epoch: 0.02298872910998834, LR: 0.001
[2025-07-27 22:31:17] (step=0001184) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.02300816167897396, LR: 0.001
[2025-07-27 22:31:25] (step=0001185) Train Loss: 0.2161, Train Steps/Sec: 0.12, Epoch: 0.023027594247959582, LR: 0.001
[2025-07-27 22:31:33] (step=0001186) Train Loss: 0.2345, Train Steps/Sec: 0.12, Epoch: 0.0230470268169452, LR: 0.001
[2025-07-27 22:31:41] (step=0001187) Train Loss: 0.3320, Train Steps/Sec: 0.13, Epoch: 0.02306645938593082, LR: 0.001
[2025-07-27 22:31:49] (step=0001188) Train Loss: 0.1917, Train Steps/Sec: 0.12, Epoch: 0.02308589195491644, LR: 0.001
[2025-07-27 22:31:57] (step=0001189) Train Loss: 0.1852, Train Steps/Sec: 0.13, Epoch: 0.02310532452390206, LR: 0.001
[2025-07-27 22:32:05] (step=0001190) Train Loss: 0.2095, Train Steps/Sec: 0.13, Epoch: 0.02312475709288768, LR: 0.001
[2025-07-27 22:32:11] (step=0001191) Train Loss: 0.1933, Train Steps/Sec: 0.18, Epoch: 0.023144189661873298, LR: 0.001
[2025-07-27 22:32:19] (step=0001192) Train Loss: 0.2582, Train Steps/Sec: 0.12, Epoch: 0.02316362223085892, LR: 0.001
[2025-07-27 22:32:27] (step=0001193) Train Loss: 0.2226, Train Steps/Sec: 0.13, Epoch: 0.02318305479984454, LR: 0.001
[2025-07-27 22:32:35] (step=0001194) Train Loss: 0.2202, Train Steps/Sec: 0.12, Epoch: 0.023202487368830158, LR: 0.001
[2025-07-27 22:32:43] (step=0001195) Train Loss: 0.2933, Train Steps/Sec: 0.12, Epoch: 0.02322191993781578, LR: 0.001
[2025-07-27 22:32:51] (step=0001196) Train Loss: 0.2356, Train Steps/Sec: 0.12, Epoch: 0.0232413525068014, LR: 0.001
[2025-07-27 22:32:59] (step=0001197) Train Loss: 0.2545, Train Steps/Sec: 0.12, Epoch: 0.023260785075787018, LR: 0.001
[2025-07-27 22:33:07] (step=0001198) Train Loss: 0.2589, Train Steps/Sec: 0.13, Epoch: 0.02328021764477264, LR: 0.001
[2025-07-27 22:33:15] (step=0001199) Train Loss: 0.2096, Train Steps/Sec: 0.12, Epoch: 0.02329965021375826, LR: 0.001
[2025-07-27 22:33:23] (step=0001200) Train Loss: 0.2463, Train Steps/Sec: 0.12, Epoch: 0.023319082782743878, LR: 0.001
[2025-07-27 22:33:31] (step=0001201) Train Loss: 0.2528, Train Steps/Sec: 0.12, Epoch: 0.0233385153517295, LR: 0.001
[2025-07-27 22:33:39] (step=0001202) Train Loss: 0.2206, Train Steps/Sec: 0.13, Epoch: 0.02335794792071512, LR: 0.001
[2025-07-27 22:33:47] (step=0001203) Train Loss: 0.2507, Train Steps/Sec: 0.13, Epoch: 0.023377380489700738, LR: 0.001
[2025-07-27 22:33:55] (step=0001204) Train Loss: 0.2145, Train Steps/Sec: 0.12, Epoch: 0.023396813058686357, LR: 0.001
[2025-07-27 22:34:03] (step=0001205) Train Loss: 0.2175, Train Steps/Sec: 0.13, Epoch: 0.02341624562767198, LR: 0.001
[2025-07-27 22:34:11] (step=0001206) Train Loss: 0.2338, Train Steps/Sec: 0.12, Epoch: 0.023435678196657598, LR: 0.001
[2025-07-27 22:34:19] (step=0001207) Train Loss: 0.1819, Train Steps/Sec: 0.13, Epoch: 0.023455110765643217, LR: 0.001
[2025-07-27 22:34:27] (step=0001208) Train Loss: 0.2617, Train Steps/Sec: 0.12, Epoch: 0.02347454333462884, LR: 0.001
[2025-07-27 22:34:35] (step=0001209) Train Loss: 0.3283, Train Steps/Sec: 0.12, Epoch: 0.023493975903614458, LR: 0.001
[2025-07-27 22:34:43] (step=0001210) Train Loss: 0.3208, Train Steps/Sec: 0.12, Epoch: 0.023513408472600077, LR: 0.001
[2025-07-27 22:34:51] (step=0001211) Train Loss: 0.2687, Train Steps/Sec: 0.12, Epoch: 0.0235328410415857, LR: 0.001
[2025-07-27 22:34:59] (step=0001212) Train Loss: 0.2259, Train Steps/Sec: 0.13, Epoch: 0.023552273610571318, LR: 0.001
[2025-07-27 22:35:07] (step=0001213) Train Loss: 0.1785, Train Steps/Sec: 0.12, Epoch: 0.023571706179556937, LR: 0.001
[2025-07-27 22:35:15] (step=0001214) Train Loss: 0.2631, Train Steps/Sec: 0.12, Epoch: 0.02359113874854256, LR: 0.001
[2025-07-27 22:35:23] (step=0001215) Train Loss: 0.2846, Train Steps/Sec: 0.12, Epoch: 0.023610571317528178, LR: 0.001
[2025-07-27 22:35:32] (step=0001216) Train Loss: 0.2906, Train Steps/Sec: 0.12, Epoch: 0.023630003886513797, LR: 0.001
[2025-07-27 22:35:40] (step=0001217) Train Loss: 0.2016, Train Steps/Sec: 0.12, Epoch: 0.023649436455499415, LR: 0.001
[2025-07-27 22:35:48] (step=0001218) Train Loss: 0.2667, Train Steps/Sec: 0.12, Epoch: 0.023668869024485038, LR: 0.001
[2025-07-27 22:35:56] (step=0001219) Train Loss: 0.1997, Train Steps/Sec: 0.12, Epoch: 0.023688301593470656, LR: 0.001
[2025-07-27 22:36:04] (step=0001220) Train Loss: 0.2191, Train Steps/Sec: 0.12, Epoch: 0.023707734162456275, LR: 0.001
[2025-07-27 22:36:12] (step=0001221) Train Loss: 0.2987, Train Steps/Sec: 0.12, Epoch: 0.023727166731441898, LR: 0.001
[2025-07-27 22:36:20] (step=0001222) Train Loss: 0.2420, Train Steps/Sec: 0.13, Epoch: 0.023746599300427516, LR: 0.001
[2025-07-27 22:36:28] (step=0001223) Train Loss: 0.2629, Train Steps/Sec: 0.12, Epoch: 0.023766031869413135, LR: 0.001
[2025-07-27 22:36:33] (step=0001224) Train Loss: 0.2400, Train Steps/Sec: 0.18, Epoch: 0.023785464438398757, LR: 0.001
[2025-07-27 22:36:41] (step=0001225) Train Loss: 0.2621, Train Steps/Sec: 0.12, Epoch: 0.023804897007384376, LR: 0.001
[2025-07-27 22:36:49] (step=0001226) Train Loss: 0.1855, Train Steps/Sec: 0.13, Epoch: 0.023824329576369995, LR: 0.001
[2025-07-27 22:36:57] (step=0001227) Train Loss: 0.1755, Train Steps/Sec: 0.12, Epoch: 0.023843762145355617, LR: 0.001
[2025-07-27 22:37:05] (step=0001228) Train Loss: 0.3069, Train Steps/Sec: 0.12, Epoch: 0.023863194714341236, LR: 0.001
[2025-07-27 22:37:13] (step=0001229) Train Loss: 0.3143, Train Steps/Sec: 0.12, Epoch: 0.023882627283326855, LR: 0.001
[2025-07-27 22:37:22] (step=0001230) Train Loss: 0.2498, Train Steps/Sec: 0.12, Epoch: 0.023902059852312477, LR: 0.001
[2025-07-27 22:37:30] (step=0001231) Train Loss: 0.3064, Train Steps/Sec: 0.13, Epoch: 0.023921492421298096, LR: 0.001
[2025-07-27 22:37:38] (step=0001232) Train Loss: 0.2228, Train Steps/Sec: 0.12, Epoch: 0.023940924990283715, LR: 0.001
[2025-07-27 22:37:46] (step=0001233) Train Loss: 0.3002, Train Steps/Sec: 0.12, Epoch: 0.023960357559269334, LR: 0.001
[2025-07-27 22:37:54] (step=0001234) Train Loss: 0.2501, Train Steps/Sec: 0.12, Epoch: 0.023979790128254956, LR: 0.001
[2025-07-27 22:38:02] (step=0001235) Train Loss: 0.2721, Train Steps/Sec: 0.12, Epoch: 0.023999222697240575, LR: 0.001
[2025-07-27 22:38:10] (step=0001236) Train Loss: 0.2796, Train Steps/Sec: 0.12, Epoch: 0.024018655266226194, LR: 0.001
[2025-07-27 22:38:18] (step=0001237) Train Loss: 0.1794, Train Steps/Sec: 0.12, Epoch: 0.024038087835211816, LR: 0.001
[2025-07-27 22:38:26] (step=0001238) Train Loss: 0.1911, Train Steps/Sec: 0.13, Epoch: 0.024057520404197435, LR: 0.001
[2025-07-27 22:38:34] (step=0001239) Train Loss: 0.2366, Train Steps/Sec: 0.12, Epoch: 0.024076952973183054, LR: 0.001
[2025-07-27 22:38:42] (step=0001240) Train Loss: 0.2236, Train Steps/Sec: 0.12, Epoch: 0.024096385542168676, LR: 0.001
[2025-07-27 22:38:50] (step=0001241) Train Loss: 0.2570, Train Steps/Sec: 0.12, Epoch: 0.024115818111154295, LR: 0.001
[2025-07-27 22:38:58] (step=0001242) Train Loss: 0.2492, Train Steps/Sec: 0.13, Epoch: 0.024135250680139914, LR: 0.001
[2025-07-27 22:39:06] (step=0001243) Train Loss: 0.2673, Train Steps/Sec: 0.12, Epoch: 0.024154683249125536, LR: 0.001
[2025-07-27 22:39:14] (step=0001244) Train Loss: 0.2814, Train Steps/Sec: 0.12, Epoch: 0.024174115818111155, LR: 0.001
[2025-07-27 22:39:22] (step=0001245) Train Loss: 0.2777, Train Steps/Sec: 0.13, Epoch: 0.024193548387096774, LR: 0.001
[2025-07-27 22:39:30] (step=0001246) Train Loss: 0.2615, Train Steps/Sec: 0.12, Epoch: 0.024212980956082396, LR: 0.001
[2025-07-27 22:39:38] (step=0001247) Train Loss: 0.1782, Train Steps/Sec: 0.13, Epoch: 0.024232413525068015, LR: 0.001
[2025-07-27 22:39:46] (step=0001248) Train Loss: 0.2293, Train Steps/Sec: 0.13, Epoch: 0.024251846094053633, LR: 0.001
[2025-07-27 22:39:54] (step=0001249) Train Loss: 0.2224, Train Steps/Sec: 0.12, Epoch: 0.024271278663039252, LR: 0.001
[2025-07-27 22:40:02] (step=0001250) Train Loss: 0.1931, Train Steps/Sec: 0.13, Epoch: 0.024290711232024875, LR: 0.001
[2025-07-27 22:40:10] (step=0001251) Train Loss: 0.2831, Train Steps/Sec: 0.12, Epoch: 0.024310143801010493, LR: 0.001
[2025-07-27 22:40:18] (step=0001252) Train Loss: 0.2430, Train Steps/Sec: 0.12, Epoch: 0.024329576369996112, LR: 0.001
[2025-07-27 22:40:26] (step=0001253) Train Loss: 0.2049, Train Steps/Sec: 0.12, Epoch: 0.024349008938981734, LR: 0.001
[2025-07-27 22:40:34] (step=0001254) Train Loss: 0.2659, Train Steps/Sec: 0.12, Epoch: 0.024368441507967353, LR: 0.001
[2025-07-27 22:40:42] (step=0001255) Train Loss: 0.2330, Train Steps/Sec: 0.13, Epoch: 0.024387874076952972, LR: 0.001
[2025-07-27 22:40:50] (step=0001256) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.024407306645938594, LR: 0.001
[2025-07-27 22:40:57] (step=0001257) Train Loss: 0.2295, Train Steps/Sec: 0.16, Epoch: 0.024426739214924213, LR: 0.001
[2025-07-27 22:41:04] (step=0001258) Train Loss: 0.2257, Train Steps/Sec: 0.13, Epoch: 0.024446171783909832, LR: 0.001
[2025-07-27 22:41:12] (step=0001259) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.024465604352895454, LR: 0.001
[2025-07-27 22:41:20] (step=0001260) Train Loss: 0.2570, Train Steps/Sec: 0.12, Epoch: 0.024485036921881073, LR: 0.001
[2025-07-27 22:41:28] (step=0001261) Train Loss: 0.1967, Train Steps/Sec: 0.12, Epoch: 0.024504469490866692, LR: 0.001
[2025-07-27 22:41:36] (step=0001262) Train Loss: 0.2511, Train Steps/Sec: 0.12, Epoch: 0.02452390205985231, LR: 0.001
[2025-07-27 22:41:44] (step=0001263) Train Loss: 0.2329, Train Steps/Sec: 0.12, Epoch: 0.024543334628837933, LR: 0.001
[2025-07-27 22:41:53] (step=0001264) Train Loss: 0.1955, Train Steps/Sec: 0.13, Epoch: 0.024562767197823552, LR: 0.001
[2025-07-27 22:42:01] (step=0001265) Train Loss: 0.2211, Train Steps/Sec: 0.12, Epoch: 0.02458219976680917, LR: 0.001
[2025-07-27 22:42:09] (step=0001266) Train Loss: 0.2262, Train Steps/Sec: 0.12, Epoch: 0.024601632335794793, LR: 0.001
[2025-07-27 22:42:17] (step=0001267) Train Loss: 0.2515, Train Steps/Sec: 0.12, Epoch: 0.024621064904780412, LR: 0.001
[2025-07-27 22:42:25] (step=0001268) Train Loss: 0.2349, Train Steps/Sec: 0.12, Epoch: 0.02464049747376603, LR: 0.001
[2025-07-27 22:42:33] (step=0001269) Train Loss: 0.2643, Train Steps/Sec: 0.12, Epoch: 0.024659930042751653, LR: 0.001
[2025-07-27 22:42:41] (step=0001270) Train Loss: 0.1768, Train Steps/Sec: 0.12, Epoch: 0.024679362611737272, LR: 0.001
[2025-07-27 22:42:49] (step=0001271) Train Loss: 0.3336, Train Steps/Sec: 0.12, Epoch: 0.02469879518072289, LR: 0.001
[2025-07-27 22:42:57] (step=0001272) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.024718227749708513, LR: 0.001
[2025-07-27 22:43:05] (step=0001273) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.02473766031869413, LR: 0.001
[2025-07-27 22:43:13] (step=0001274) Train Loss: 0.2077, Train Steps/Sec: 0.13, Epoch: 0.02475709288767975, LR: 0.001
[2025-07-27 22:43:21] (step=0001275) Train Loss: 0.2402, Train Steps/Sec: 0.12, Epoch: 0.024776525456665373, LR: 0.001
[2025-07-27 22:43:29] (step=0001276) Train Loss: 0.2865, Train Steps/Sec: 0.13, Epoch: 0.02479595802565099, LR: 0.001
[2025-07-27 22:43:37] (step=0001277) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.02481539059463661, LR: 0.001
[2025-07-27 22:43:45] (step=0001278) Train Loss: 0.2427, Train Steps/Sec: 0.12, Epoch: 0.02483482316362223, LR: 0.001
[2025-07-27 22:43:53] (step=0001279) Train Loss: 0.2814, Train Steps/Sec: 0.12, Epoch: 0.02485425573260785, LR: 0.001
[2025-07-27 22:44:01] (step=0001280) Train Loss: 0.2371, Train Steps/Sec: 0.13, Epoch: 0.02487368830159347, LR: 0.001
[2025-07-27 22:44:09] (step=0001281) Train Loss: 0.2278, Train Steps/Sec: 0.12, Epoch: 0.02489312087057909, LR: 0.001
[2025-07-27 22:44:17] (step=0001282) Train Loss: 0.2266, Train Steps/Sec: 0.12, Epoch: 0.02491255343956471, LR: 0.001
[2025-07-27 22:44:25] (step=0001283) Train Loss: 0.2615, Train Steps/Sec: 0.13, Epoch: 0.02493198600855033, LR: 0.001
[2025-07-27 22:44:33] (step=0001284) Train Loss: 0.1929, Train Steps/Sec: 0.12, Epoch: 0.02495141857753595, LR: 0.001
[2025-07-27 22:44:41] (step=0001285) Train Loss: 0.2275, Train Steps/Sec: 0.12, Epoch: 0.02497085114652157, LR: 0.001
[2025-07-27 22:44:49] (step=0001286) Train Loss: 0.2594, Train Steps/Sec: 0.13, Epoch: 0.02499028371550719, LR: 0.001
[2025-07-27 22:44:57] (step=0001287) Train Loss: 0.3161, Train Steps/Sec: 0.12, Epoch: 0.02500971628449281, LR: 0.001
[2025-07-27 22:45:05] (step=0001288) Train Loss: 0.2740, Train Steps/Sec: 0.13, Epoch: 0.02502914885347843, LR: 0.001
[2025-07-27 22:45:13] (step=0001289) Train Loss: 0.2800, Train Steps/Sec: 0.12, Epoch: 0.02504858142246405, LR: 0.001
[2025-07-27 22:45:19] (step=0001290) Train Loss: 0.2201, Train Steps/Sec: 0.19, Epoch: 0.02506801399144967, LR: 0.001
[2025-07-27 22:45:27] (step=0001291) Train Loss: 0.2745, Train Steps/Sec: 0.12, Epoch: 0.025087446560435288, LR: 0.001
[2025-07-27 22:45:35] (step=0001292) Train Loss: 0.2852, Train Steps/Sec: 0.12, Epoch: 0.02510687912942091, LR: 0.001
[2025-07-27 22:45:43] (step=0001293) Train Loss: 0.2837, Train Steps/Sec: 0.12, Epoch: 0.02512631169840653, LR: 0.001
[2025-07-27 22:45:51] (step=0001294) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.025145744267392148, LR: 0.001
[2025-07-27 22:45:59] (step=0001295) Train Loss: 0.2758, Train Steps/Sec: 0.12, Epoch: 0.02516517683637777, LR: 0.001
[2025-07-27 22:46:07] (step=0001296) Train Loss: 0.2159, Train Steps/Sec: 0.12, Epoch: 0.02518460940536339, LR: 0.001
[2025-07-27 22:46:15] (step=0001297) Train Loss: 0.2157, Train Steps/Sec: 0.12, Epoch: 0.025204041974349008, LR: 0.001
[2025-07-27 22:46:23] (step=0001298) Train Loss: 0.2989, Train Steps/Sec: 0.12, Epoch: 0.02522347454333463, LR: 0.001
[2025-07-27 22:46:31] (step=0001299) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.02524290711232025, LR: 0.001
[2025-07-27 22:46:39] (step=0001300) Train Loss: 0.1666, Train Steps/Sec: 0.12, Epoch: 0.025262339681305868, LR: 0.001
[2025-07-27 22:46:47] (step=0001301) Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.02528177225029149, LR: 0.001
[2025-07-27 22:46:55] (step=0001302) Train Loss: 0.2550, Train Steps/Sec: 0.12, Epoch: 0.02530120481927711, LR: 0.001
[2025-07-27 22:47:04] (step=0001303) Train Loss: 0.2305, Train Steps/Sec: 0.12, Epoch: 0.025320637388262728, LR: 0.001
[2025-07-27 22:47:12] (step=0001304) Train Loss: 0.3199, Train Steps/Sec: 0.13, Epoch: 0.02534006995724835, LR: 0.001
[2025-07-27 22:47:20] (step=0001305) Train Loss: 0.2253, Train Steps/Sec: 0.12, Epoch: 0.02535950252623397, LR: 0.001
[2025-07-27 22:47:28] (step=0001306) Train Loss: 0.2725, Train Steps/Sec: 0.13, Epoch: 0.025378935095219587, LR: 0.001
[2025-07-27 22:47:36] (step=0001307) Train Loss: 0.3279, Train Steps/Sec: 0.13, Epoch: 0.025398367664205206, LR: 0.001
[2025-07-27 22:47:44] (step=0001308) Train Loss: 0.3056, Train Steps/Sec: 0.13, Epoch: 0.02541780023319083, LR: 0.001
[2025-07-27 22:47:52] (step=0001309) Train Loss: 0.2700, Train Steps/Sec: 0.12, Epoch: 0.025437232802176447, LR: 0.001
[2025-07-27 22:48:00] (step=0001310) Train Loss: 0.2278, Train Steps/Sec: 0.12, Epoch: 0.025456665371162066, LR: 0.001
[2025-07-27 22:48:08] (step=0001311) Train Loss: 0.2298, Train Steps/Sec: 0.12, Epoch: 0.02547609794014769, LR: 0.001
[2025-07-27 22:48:16] (step=0001312) Train Loss: 0.2860, Train Steps/Sec: 0.12, Epoch: 0.025495530509133307, LR: 0.001
[2025-07-27 22:48:24] (step=0001313) Train Loss: 0.2847, Train Steps/Sec: 0.13, Epoch: 0.025514963078118926, LR: 0.001
[2025-07-27 22:48:32] (step=0001314) Train Loss: 0.2548, Train Steps/Sec: 0.12, Epoch: 0.02553439564710455, LR: 0.001
[2025-07-27 22:48:40] (step=0001315) Train Loss: 0.2833, Train Steps/Sec: 0.13, Epoch: 0.025553828216090167, LR: 0.001
[2025-07-27 22:48:48] (step=0001316) Train Loss: 0.2267, Train Steps/Sec: 0.12, Epoch: 0.025573260785075786, LR: 0.001
[2025-07-27 22:48:56] (step=0001317) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.02559269335406141, LR: 0.001
[2025-07-27 22:49:04] (step=0001318) Train Loss: 0.2694, Train Steps/Sec: 0.13, Epoch: 0.025612125923047027, LR: 0.001
[2025-07-27 22:49:12] (step=0001319) Train Loss: 0.2905, Train Steps/Sec: 0.12, Epoch: 0.025631558492032646, LR: 0.001
[2025-07-27 22:49:20] (step=0001320) Train Loss: 0.2419, Train Steps/Sec: 0.12, Epoch: 0.02565099106101827, LR: 0.001
[2025-07-27 22:49:28] (step=0001321) Train Loss: 0.3472, Train Steps/Sec: 0.13, Epoch: 0.025670423630003887, LR: 0.001
[2025-07-27 22:49:36] (step=0001322) Train Loss: 0.2627, Train Steps/Sec: 0.12, Epoch: 0.025689856198989506, LR: 0.001
[2025-07-27 22:49:42] (step=0001323) Train Loss: 0.2375, Train Steps/Sec: 0.17, Epoch: 0.025709288767975125, LR: 0.001
[2025-07-27 22:49:50] (step=0001324) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.025728721336960747, LR: 0.001
[2025-07-27 22:49:58] (step=0001325) Train Loss: 0.1967, Train Steps/Sec: 0.12, Epoch: 0.025748153905946366, LR: 0.001
[2025-07-27 22:50:06] (step=0001326) Train Loss: 0.2285, Train Steps/Sec: 0.12, Epoch: 0.025767586474931985, LR: 0.001
[2025-07-27 22:50:14] (step=0001327) Train Loss: 0.2374, Train Steps/Sec: 0.12, Epoch: 0.025787019043917607, LR: 0.001
[2025-07-27 22:50:22] (step=0001328) Train Loss: 0.2838, Train Steps/Sec: 0.12, Epoch: 0.025806451612903226, LR: 0.001
[2025-07-27 22:50:30] (step=0001329) Train Loss: 0.3042, Train Steps/Sec: 0.13, Epoch: 0.025825884181888845, LR: 0.001
[2025-07-27 22:50:38] (step=0001330) Train Loss: 0.2773, Train Steps/Sec: 0.12, Epoch: 0.025845316750874467, LR: 0.001
[2025-07-27 22:50:46] (step=0001331) Train Loss: 0.2950, Train Steps/Sec: 0.12, Epoch: 0.025864749319860086, LR: 0.001
[2025-07-27 22:50:54] (step=0001332) Train Loss: 0.1869, Train Steps/Sec: 0.12, Epoch: 0.025884181888845705, LR: 0.001
[2025-07-27 22:51:02] (step=0001333) Train Loss: 0.3358, Train Steps/Sec: 0.12, Epoch: 0.025903614457831327, LR: 0.001
[2025-07-27 22:51:10] (step=0001334) Train Loss: 0.2558, Train Steps/Sec: 0.13, Epoch: 0.025923047026816946, LR: 0.001
[2025-07-27 22:51:18] (step=0001335) Train Loss: 0.2206, Train Steps/Sec: 0.12, Epoch: 0.025942479595802564, LR: 0.001
[2025-07-27 22:51:26] (step=0001336) Train Loss: 0.2117, Train Steps/Sec: 0.13, Epoch: 0.025961912164788183, LR: 0.001
[2025-07-27 22:51:34] (step=0001337) Train Loss: 0.1721, Train Steps/Sec: 0.13, Epoch: 0.025981344733773806, LR: 0.001
[2025-07-27 22:51:42] (step=0001338) Train Loss: 0.2557, Train Steps/Sec: 0.12, Epoch: 0.026000777302759424, LR: 0.001
[2025-07-27 22:51:50] (step=0001339) Train Loss: 0.1928, Train Steps/Sec: 0.12, Epoch: 0.026020209871745043, LR: 0.001
[2025-07-27 22:51:58] (step=0001340) Train Loss: 0.2294, Train Steps/Sec: 0.12, Epoch: 0.026039642440730666, LR: 0.001
[2025-07-27 22:52:06] (step=0001341) Train Loss: 0.2728, Train Steps/Sec: 0.12, Epoch: 0.026059075009716284, LR: 0.001
[2025-07-27 22:52:14] (step=0001342) Train Loss: 0.2281, Train Steps/Sec: 0.13, Epoch: 0.026078507578701903, LR: 0.001
[2025-07-27 22:52:22] (step=0001343) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.026097940147687525, LR: 0.001
[2025-07-27 22:52:30] (step=0001344) Train Loss: 0.2813, Train Steps/Sec: 0.12, Epoch: 0.026117372716673144, LR: 0.001
[2025-07-27 22:52:39] (step=0001345) Train Loss: 0.3047, Train Steps/Sec: 0.12, Epoch: 0.026136805285658763, LR: 0.001
[2025-07-27 22:52:47] (step=0001346) Train Loss: 0.2375, Train Steps/Sec: 0.13, Epoch: 0.026156237854644385, LR: 0.001
[2025-07-27 22:52:55] (step=0001347) Train Loss: 0.2953, Train Steps/Sec: 0.12, Epoch: 0.026175670423630004, LR: 0.001
[2025-07-27 22:53:03] (step=0001348) Train Loss: 0.2373, Train Steps/Sec: 0.13, Epoch: 0.026195102992615623, LR: 0.001
[2025-07-27 22:53:11] (step=0001349) Train Loss: 0.1686, Train Steps/Sec: 0.12, Epoch: 0.026214535561601245, LR: 0.001
[2025-07-27 22:53:19] (step=0001350) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 0.026233968130586864, LR: 0.001
[2025-07-27 22:53:27] (step=0001351) Train Loss: 0.3038, Train Steps/Sec: 0.13, Epoch: 0.026253400699572483, LR: 0.001
[2025-07-27 22:53:35] (step=0001352) Train Loss: 0.1868, Train Steps/Sec: 0.12, Epoch: 0.026272833268558102, LR: 0.001
[2025-07-27 22:53:43] (step=0001353) Train Loss: 0.1708, Train Steps/Sec: 0.12, Epoch: 0.026292265837543724, LR: 0.001
[2025-07-27 22:53:51] (step=0001354) Train Loss: 0.2369, Train Steps/Sec: 0.13, Epoch: 0.026311698406529343, LR: 0.001
[2025-07-27 22:53:59] (step=0001355) Train Loss: 0.2032, Train Steps/Sec: 0.12, Epoch: 0.02633113097551496, LR: 0.001
[2025-07-27 22:54:05] (step=0001356) Train Loss: 0.2351, Train Steps/Sec: 0.16, Epoch: 0.026350563544500584, LR: 0.001
[2025-07-27 22:54:12] (step=0001357) Train Loss: 0.2509, Train Steps/Sec: 0.14, Epoch: 0.026369996113486203, LR: 0.001
[2025-07-27 22:54:20] (step=0001358) Train Loss: 0.2225, Train Steps/Sec: 0.12, Epoch: 0.02638942868247182, LR: 0.001
[2025-07-27 22:54:28] (step=0001359) Train Loss: 0.2379, Train Steps/Sec: 0.13, Epoch: 0.026408861251457444, LR: 0.001
[2025-07-27 22:54:36] (step=0001360) Train Loss: 0.3039, Train Steps/Sec: 0.13, Epoch: 0.026428293820443063, LR: 0.001
[2025-07-27 22:54:44] (step=0001361) Train Loss: 0.1669, Train Steps/Sec: 0.12, Epoch: 0.02644772638942868, LR: 0.001
[2025-07-27 22:54:52] (step=0001362) Train Loss: 0.2678, Train Steps/Sec: 0.13, Epoch: 0.026467158958414304, LR: 0.001
[2025-07-27 22:55:00] (step=0001363) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.026486591527399923, LR: 0.001
[2025-07-27 22:55:08] (step=0001364) Train Loss: 0.2551, Train Steps/Sec: 0.13, Epoch: 0.02650602409638554, LR: 0.001
[2025-07-27 22:55:16] (step=0001365) Train Loss: 0.2463, Train Steps/Sec: 0.13, Epoch: 0.026525456665371164, LR: 0.001
[2025-07-27 22:55:25] (step=0001366) Train Loss: 0.2811, Train Steps/Sec: 0.12, Epoch: 0.026544889234356783, LR: 0.001
[2025-07-27 22:55:33] (step=0001367) Train Loss: 0.3042, Train Steps/Sec: 0.13, Epoch: 0.0265643218033424, LR: 0.001
[2025-07-27 22:55:41] (step=0001368) Train Loss: 0.2576, Train Steps/Sec: 0.12, Epoch: 0.02658375437232802, LR: 0.001
[2025-07-27 22:55:49] (step=0001369) Train Loss: 0.2515, Train Steps/Sec: 0.12, Epoch: 0.026603186941313643, LR: 0.001
[2025-07-27 22:55:57] (step=0001370) Train Loss: 0.2646, Train Steps/Sec: 0.12, Epoch: 0.02662261951029926, LR: 0.001
[2025-07-27 22:56:05] (step=0001371) Train Loss: 0.2007, Train Steps/Sec: 0.12, Epoch: 0.02664205207928488, LR: 0.001
[2025-07-27 22:56:13] (step=0001372) Train Loss: 0.2562, Train Steps/Sec: 0.13, Epoch: 0.026661484648270502, LR: 0.001
[2025-07-27 22:56:21] (step=0001373) Train Loss: 0.2726, Train Steps/Sec: 0.12, Epoch: 0.02668091721725612, LR: 0.001
[2025-07-27 22:56:29] (step=0001374) Train Loss: 0.2722, Train Steps/Sec: 0.13, Epoch: 0.02670034978624174, LR: 0.001
[2025-07-27 22:56:37] (step=0001375) Train Loss: 0.2029, Train Steps/Sec: 0.12, Epoch: 0.026719782355227362, LR: 0.001
[2025-07-27 22:56:45] (step=0001376) Train Loss: 0.2885, Train Steps/Sec: 0.13, Epoch: 0.02673921492421298, LR: 0.001
[2025-07-27 22:56:53] (step=0001377) Train Loss: 0.2454, Train Steps/Sec: 0.12, Epoch: 0.0267586474931986, LR: 0.001
[2025-07-27 22:57:01] (step=0001378) Train Loss: 0.1946, Train Steps/Sec: 0.12, Epoch: 0.026778080062184222, LR: 0.001
[2025-07-27 22:57:09] (step=0001379) Train Loss: 0.3609, Train Steps/Sec: 0.13, Epoch: 0.02679751263116984, LR: 0.001
[2025-07-27 22:57:17] (step=0001380) Train Loss: 0.2101, Train Steps/Sec: 0.12, Epoch: 0.02681694520015546, LR: 0.001
[2025-07-27 22:57:25] (step=0001381) Train Loss: 0.2008, Train Steps/Sec: 0.12, Epoch: 0.02683637776914108, LR: 0.001
[2025-07-27 22:57:33] (step=0001382) Train Loss: 0.1814, Train Steps/Sec: 0.12, Epoch: 0.0268558103381267, LR: 0.001
[2025-07-27 22:57:41] (step=0001383) Train Loss: 0.2931, Train Steps/Sec: 0.12, Epoch: 0.02687524290711232, LR: 0.001
[2025-07-27 22:57:49] (step=0001384) Train Loss: 0.2017, Train Steps/Sec: 0.12, Epoch: 0.02689467547609794, LR: 0.001
[2025-07-27 22:57:57] (step=0001385) Train Loss: 0.3213, Train Steps/Sec: 0.12, Epoch: 0.02691410804508356, LR: 0.001
[2025-07-27 22:58:05] (step=0001386) Train Loss: 0.2046, Train Steps/Sec: 0.13, Epoch: 0.02693354061406918, LR: 0.001
[2025-07-27 22:58:13] (step=0001387) Train Loss: 0.3315, Train Steps/Sec: 0.12, Epoch: 0.0269529731830548, LR: 0.001
[2025-07-27 22:58:21] (step=0001388) Train Loss: 0.2434, Train Steps/Sec: 0.13, Epoch: 0.02697240575204042, LR: 0.001
[2025-07-27 22:58:29] (step=0001389) Train Loss: 0.2546, Train Steps/Sec: 0.14, Epoch: 0.02699183832102604, LR: 0.001
[2025-07-27 22:58:35] (step=0001390) Train Loss: 0.2798, Train Steps/Sec: 0.16, Epoch: 0.02701127089001166, LR: 0.001
[2025-07-27 22:58:43] (step=0001391) Train Loss: 0.2087, Train Steps/Sec: 0.12, Epoch: 0.02703070345899728, LR: 0.001
[2025-07-27 22:58:51] (step=0001392) Train Loss: 0.2348, Train Steps/Sec: 0.13, Epoch: 0.0270501360279829, LR: 0.001
[2025-07-27 22:58:59] (step=0001393) Train Loss: 0.2616, Train Steps/Sec: 0.13, Epoch: 0.02706956859696852, LR: 0.001
[2025-07-27 22:59:07] (step=0001394) Train Loss: 0.2234, Train Steps/Sec: 0.12, Epoch: 0.02708900116595414, LR: 0.001
[2025-07-27 22:59:15] (step=0001395) Train Loss: 0.2707, Train Steps/Sec: 0.12, Epoch: 0.02710843373493976, LR: 0.001
[2025-07-27 22:59:23] (step=0001396) Train Loss: 0.2497, Train Steps/Sec: 0.12, Epoch: 0.02712786630392538, LR: 0.001
[2025-07-27 22:59:31] (step=0001397) Train Loss: 0.2422, Train Steps/Sec: 0.13, Epoch: 0.027147298872910997, LR: 0.001
[2025-07-27 22:59:39] (step=0001398) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.02716673144189662, LR: 0.001
[2025-07-27 22:59:47] (step=0001399) Train Loss: 0.2717, Train Steps/Sec: 0.12, Epoch: 0.02718616401088224, LR: 0.001
[2025-07-27 22:59:55] (step=0001400) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.027205596579867857, LR: 0.001
[2025-07-27 23:00:03] (step=0001401) Train Loss: 0.2360, Train Steps/Sec: 0.12, Epoch: 0.02722502914885348, LR: 0.001
[2025-07-27 23:00:12] (step=0001402) Train Loss: 0.3107, Train Steps/Sec: 0.12, Epoch: 0.027244461717839098, LR: 0.001
[2025-07-27 23:00:20] (step=0001403) Train Loss: 0.2518, Train Steps/Sec: 0.12, Epoch: 0.027263894286824717, LR: 0.001
[2025-07-27 23:00:28] (step=0001404) Train Loss: 0.2125, Train Steps/Sec: 0.13, Epoch: 0.02728332685581034, LR: 0.001
[2025-07-27 23:00:36] (step=0001405) Train Loss: 0.1980, Train Steps/Sec: 0.12, Epoch: 0.027302759424795958, LR: 0.001
[2025-07-27 23:00:44] (step=0001406) Train Loss: 0.2206, Train Steps/Sec: 0.12, Epoch: 0.027322191993781577, LR: 0.001
[2025-07-27 23:00:52] (step=0001407) Train Loss: 0.2176, Train Steps/Sec: 0.12, Epoch: 0.0273416245627672, LR: 0.001
[2025-07-27 23:01:00] (step=0001408) Train Loss: 0.2359, Train Steps/Sec: 0.12, Epoch: 0.027361057131752818, LR: 0.001
[2025-07-27 23:01:08] (step=0001409) Train Loss: 0.3219, Train Steps/Sec: 0.12, Epoch: 0.027380489700738437, LR: 0.001
[2025-07-27 23:01:16] (step=0001410) Train Loss: 0.1884, Train Steps/Sec: 0.12, Epoch: 0.02739992226972406, LR: 0.001
[2025-07-27 23:01:24] (step=0001411) Train Loss: 0.2489, Train Steps/Sec: 0.12, Epoch: 0.027419354838709678, LR: 0.001
[2025-07-27 23:01:32] (step=0001412) Train Loss: 0.1874, Train Steps/Sec: 0.12, Epoch: 0.027438787407695297, LR: 0.001
[2025-07-27 23:01:40] (step=0001413) Train Loss: 0.3080, Train Steps/Sec: 0.13, Epoch: 0.027458219976680916, LR: 0.001
[2025-07-27 23:01:48] (step=0001414) Train Loss: 0.2068, Train Steps/Sec: 0.13, Epoch: 0.027477652545666538, LR: 0.001
[2025-07-27 23:01:56] (step=0001415) Train Loss: 0.2645, Train Steps/Sec: 0.12, Epoch: 0.027497085114652157, LR: 0.001
[2025-07-27 23:02:04] (step=0001416) Train Loss: 0.2811, Train Steps/Sec: 0.13, Epoch: 0.027516517683637776, LR: 0.001
[2025-07-27 23:02:12] (step=0001417) Train Loss: 0.1776, Train Steps/Sec: 0.13, Epoch: 0.027535950252623398, LR: 0.001
[2025-07-27 23:02:20] (step=0001418) Train Loss: 0.2460, Train Steps/Sec: 0.12, Epoch: 0.027555382821609017, LR: 0.001
[2025-07-27 23:02:28] (step=0001419) Train Loss: 0.2432, Train Steps/Sec: 0.13, Epoch: 0.027574815390594636, LR: 0.001
[2025-07-27 23:02:36] (step=0001420) Train Loss: 0.1655, Train Steps/Sec: 0.12, Epoch: 0.027594247959580258, LR: 0.001
[2025-07-27 23:02:44] (step=0001421) Train Loss: 0.2333, Train Steps/Sec: 0.13, Epoch: 0.027613680528565877, LR: 0.001
[2025-07-27 23:02:52] (step=0001422) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.027633113097551495, LR: 0.001
[2025-07-27 23:02:58] (step=0001423) Train Loss: 0.2720, Train Steps/Sec: 0.17, Epoch: 0.027652545666537118, LR: 0.001
[2025-07-27 23:03:06] (step=0001424) Train Loss: 0.1984, Train Steps/Sec: 0.12, Epoch: 0.027671978235522737, LR: 0.001
[2025-07-27 23:03:14] (step=0001425) Train Loss: 0.2061, Train Steps/Sec: 0.13, Epoch: 0.027691410804508355, LR: 0.001
[2025-07-27 23:03:22] (step=0001426) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.027710843373493974, LR: 0.001
[2025-07-27 23:03:30] (step=0001427) Train Loss: 0.1420, Train Steps/Sec: 0.12, Epoch: 0.027730275942479597, LR: 0.001
[2025-07-27 23:03:38] (step=0001428) Train Loss: 0.2287, Train Steps/Sec: 0.13, Epoch: 0.027749708511465215, LR: 0.001
[2025-07-27 23:03:46] (step=0001429) Train Loss: 0.1868, Train Steps/Sec: 0.12, Epoch: 0.027769141080450834, LR: 0.001
[2025-07-27 23:03:54] (step=0001430) Train Loss: 0.2661, Train Steps/Sec: 0.13, Epoch: 0.027788573649436456, LR: 0.001
[2025-07-27 23:04:02] (step=0001431) Train Loss: 0.1892, Train Steps/Sec: 0.12, Epoch: 0.027808006218422075, LR: 0.001
[2025-07-27 23:04:10] (step=0001432) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.027827438787407694, LR: 0.001
[2025-07-27 23:04:18] (step=0001433) Train Loss: 0.1855, Train Steps/Sec: 0.13, Epoch: 0.027846871356393316, LR: 0.001
[2025-07-27 23:04:26] (step=0001434) Train Loss: 0.2079, Train Steps/Sec: 0.12, Epoch: 0.027866303925378935, LR: 0.001
[2025-07-27 23:04:34] (step=0001435) Train Loss: 0.1803, Train Steps/Sec: 0.12, Epoch: 0.027885736494364554, LR: 0.001
[2025-07-27 23:04:43] (step=0001436) Train Loss: 0.3645, Train Steps/Sec: 0.12, Epoch: 0.027905169063350176, LR: 0.001
[2025-07-27 23:04:51] (step=0001437) Train Loss: 0.2340, Train Steps/Sec: 0.12, Epoch: 0.027924601632335795, LR: 0.001
[2025-07-27 23:04:59] (step=0001438) Train Loss: 0.2397, Train Steps/Sec: 0.12, Epoch: 0.027944034201321414, LR: 0.001
[2025-07-27 23:05:07] (step=0001439) Train Loss: 0.2603, Train Steps/Sec: 0.12, Epoch: 0.027963466770307036, LR: 0.001
[2025-07-27 23:05:15] (step=0001440) Train Loss: 0.2146, Train Steps/Sec: 0.12, Epoch: 0.027982899339292655, LR: 0.001
[2025-07-27 23:05:23] (step=0001441) Train Loss: 0.2874, Train Steps/Sec: 0.12, Epoch: 0.028002331908278274, LR: 0.001
[2025-07-27 23:05:31] (step=0001442) Train Loss: 0.2752, Train Steps/Sec: 0.12, Epoch: 0.028021764477263893, LR: 0.001
[2025-07-27 23:05:39] (step=0001443) Train Loss: 0.3592, Train Steps/Sec: 0.12, Epoch: 0.028041197046249515, LR: 0.001
[2025-07-27 23:05:47] (step=0001444) Train Loss: 0.2179, Train Steps/Sec: 0.12, Epoch: 0.028060629615235134, LR: 0.001
[2025-07-27 23:05:55] (step=0001445) Train Loss: 0.2396, Train Steps/Sec: 0.12, Epoch: 0.028080062184220753, LR: 0.001
[2025-07-27 23:06:03] (step=0001446) Train Loss: 0.2090, Train Steps/Sec: 0.12, Epoch: 0.028099494753206375, LR: 0.001
[2025-07-27 23:06:11] (step=0001447) Train Loss: 0.2279, Train Steps/Sec: 0.12, Epoch: 0.028118927322191994, LR: 0.001
[2025-07-27 23:06:19] (step=0001448) Train Loss: 0.2768, Train Steps/Sec: 0.12, Epoch: 0.028138359891177613, LR: 0.001
[2025-07-27 23:06:27] (step=0001449) Train Loss: 0.2057, Train Steps/Sec: 0.12, Epoch: 0.028157792460163235, LR: 0.001
[2025-07-27 23:06:35] (step=0001450) Train Loss: 0.2062, Train Steps/Sec: 0.12, Epoch: 0.028177225029148854, LR: 0.001
[2025-07-27 23:06:43] (step=0001451) Train Loss: 0.1998, Train Steps/Sec: 0.12, Epoch: 0.028196657598134472, LR: 0.001
[2025-07-27 23:06:51] (step=0001452) Train Loss: 0.1550, Train Steps/Sec: 0.12, Epoch: 0.028216090167120095, LR: 0.001
[2025-07-27 23:06:59] (step=0001453) Train Loss: 0.2617, Train Steps/Sec: 0.12, Epoch: 0.028235522736105714, LR: 0.001
[2025-07-27 23:07:07] (step=0001454) Train Loss: 0.2499, Train Steps/Sec: 0.13, Epoch: 0.028254955305091332, LR: 0.001
[2025-07-27 23:07:15] (step=0001455) Train Loss: 0.2743, Train Steps/Sec: 0.12, Epoch: 0.028274387874076955, LR: 0.001
[2025-07-27 23:07:21] (step=0001456) Train Loss: 0.2082, Train
Steps/Sec: 0.19, Epoch: 0.028293820443062574, LR: 0.001 [2025-07-27 23:07:29] (step=0001457) Train Loss: 0.2773, Train Steps/Sec: 0.12, Epoch: 0.028313253012048192, LR: 0.001 [2025-07-27 23:07:37] (step=0001458) Train Loss: 0.1954, Train Steps/Sec: 0.12, Epoch: 0.02833268558103381, LR: 0.001 [2025-07-27 23:07:45] (step=0001459) Train Loss: 0.2282, Train Steps/Sec: 0.12, Epoch: 0.028352118150019433, LR: 0.001 [2025-07-27 23:07:53] (step=0001460) Train Loss: 0.2320, Train Steps/Sec: 0.12, Epoch: 0.028371550719005052, LR: 0.001 [2025-07-27 23:08:01] (step=0001461) Train Loss: 0.2244, Train Steps/Sec: 0.12, Epoch: 0.02839098328799067, LR: 0.001 [2025-07-27 23:08:09] (step=0001462) Train Loss: 0.1977, Train Steps/Sec: 0.12, Epoch: 0.028410415856976293, LR: 0.001 [2025-07-27 23:08:17] (step=0001463) Train Loss: 0.1752, Train Steps/Sec: 0.12, Epoch: 0.028429848425961912, LR: 0.001 [2025-07-27 23:08:25] (step=0001464) Train Loss: 0.2415, Train Steps/Sec: 0.12, Epoch: 0.02844928099494753, LR: 0.001 [2025-07-27 23:08:33] (step=0001465) Train Loss: 0.1872, Train Steps/Sec: 0.13, Epoch: 0.028468713563933153, LR: 0.001 [2025-07-27 23:08:41] (step=0001466) Train Loss: 0.2560, Train Steps/Sec: 0.12, Epoch: 0.028488146132918772, LR: 0.001 [2025-07-27 23:08:49] (step=0001467) Train Loss: 0.1867, Train Steps/Sec: 0.12, Epoch: 0.02850757870190439, LR: 0.001 [2025-07-27 23:08:57] (step=0001468) Train Loss: 0.2589, Train Steps/Sec: 0.12, Epoch: 0.028527011270890013, LR: 0.001 [2025-07-27 23:09:05] (step=0001469) Train Loss: 0.3113, Train Steps/Sec: 0.12, Epoch: 0.028546443839875632, LR: 0.001 [2025-07-27 23:09:13] (step=0001470) Train Loss: 0.2780, Train Steps/Sec: 0.12, Epoch: 0.02856587640886125, LR: 0.001 [2025-07-27 23:09:21] (step=0001471) Train Loss: 0.1897, Train Steps/Sec: 0.12, Epoch: 0.02858530897784687, LR: 0.001 [2025-07-27 23:09:29] (step=0001472) Train Loss: 0.2675, Train Steps/Sec: 0.13, Epoch: 0.028604741546832492, LR: 0.001 [2025-07-27 23:09:37] (step=0001473) Train 
Loss: 0.2015, Train Steps/Sec: 0.13, Epoch: 0.02862417411581811, LR: 0.001 [2025-07-27 23:09:45] (step=0001474) Train Loss: 0.2457, Train Steps/Sec: 0.13, Epoch: 0.02864360668480373, LR: 0.001 [2025-07-27 23:09:53] (step=0001475) Train Loss: 0.2267, Train Steps/Sec: 0.13, Epoch: 0.028663039253789352, LR: 0.001 [2025-07-27 23:10:01] (step=0001476) Train Loss: 0.2580, Train Steps/Sec: 0.12, Epoch: 0.02868247182277497, LR: 0.001 [2025-07-27 23:10:09] (step=0001477) Train Loss: 0.3142, Train Steps/Sec: 0.13, Epoch: 0.02870190439176059, LR: 0.001 [2025-07-27 23:10:17] (step=0001478) Train Loss: 0.2331, Train Steps/Sec: 0.12, Epoch: 0.028721336960746212, LR: 0.001 [2025-07-27 23:10:25] (step=0001479) Train Loss: 0.3545, Train Steps/Sec: 0.12, Epoch: 0.02874076952973183, LR: 0.001 [2025-07-27 23:10:34] (step=0001480) Train Loss: 0.3167, Train Steps/Sec: 0.12, Epoch: 0.02876020209871745, LR: 0.001 [2025-07-27 23:10:42] (step=0001481) Train Loss: 0.2251, Train Steps/Sec: 0.12, Epoch: 0.028779634667703072, LR: 0.001 [2025-07-27 23:10:50] (step=0001482) Train Loss: 0.1834, Train Steps/Sec: 0.13, Epoch: 0.02879906723668869, LR: 0.001 [2025-07-27 23:10:58] (step=0001483) Train Loss: 0.2721, Train Steps/Sec: 0.12, Epoch: 0.02881849980567431, LR: 0.001 [2025-07-27 23:11:05] (step=0001484) Train Loss: 0.2253, Train Steps/Sec: 0.13, Epoch: 0.02883793237465993, LR: 0.001 [2025-07-27 23:11:13] (step=0001485) Train Loss: 0.2893, Train Steps/Sec: 0.13, Epoch: 0.02885736494364555, LR: 0.001 [2025-07-27 23:11:21] (step=0001486) Train Loss: 0.3225, Train Steps/Sec: 0.13, Epoch: 0.02887679751263117, LR: 0.001 [2025-07-27 23:11:29] (step=0001487) Train Loss: 0.2866, Train Steps/Sec: 0.13, Epoch: 0.028896230081616788, LR: 0.001 [2025-07-27 23:11:37] (step=0001488) Train Loss: 0.2152, Train Steps/Sec: 0.12, Epoch: 0.02891566265060241, LR: 0.001 [2025-07-27 23:11:42] (step=0001489) Train Loss: 0.2524, Train Steps/Sec: 0.18, Epoch: 0.02893509521958803, LR: 0.001 [2025-07-27 23:11:51] 
(step=0001490) Train Loss: 0.2446, Train Steps/Sec: 0.12, Epoch: 0.028954527788573648, LR: 0.001 [2025-07-27 23:11:59] (step=0001491) Train Loss: 0.2070, Train Steps/Sec: 0.13, Epoch: 0.02897396035755927, LR: 0.001 [2025-07-27 23:12:07] (step=0001492) Train Loss: 0.2516, Train Steps/Sec: 0.12, Epoch: 0.02899339292654489, LR: 0.001 [2025-07-27 23:12:15] (step=0001493) Train Loss: 0.2296, Train Steps/Sec: 0.13, Epoch: 0.029012825495530508, LR: 0.001 [2025-07-27 23:12:23] (step=0001494) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.02903225806451613, LR: 0.001 [2025-07-27 23:12:31] (step=0001495) Train Loss: 0.2528, Train Steps/Sec: 0.13, Epoch: 0.02905169063350175, LR: 0.001 [2025-07-27 23:12:39] (step=0001496) Train Loss: 0.1798, Train Steps/Sec: 0.12, Epoch: 0.029071123202487368, LR: 0.001 [2025-07-27 23:12:47] (step=0001497) Train Loss: 0.1374, Train Steps/Sec: 0.12, Epoch: 0.02909055577147299, LR: 0.001 [2025-07-27 23:12:55] (step=0001498) Train Loss: 0.2295, Train Steps/Sec: 0.12, Epoch: 0.02910998834045861, LR: 0.001 [2025-07-27 23:13:03] (step=0001499) Train Loss: 0.2586, Train Steps/Sec: 0.12, Epoch: 0.029129420909444228, LR: 0.001 [2025-07-27 23:13:11] (step=0001500) Train Loss: 0.2551, Train Steps/Sec: 0.12, Epoch: 0.029148853478429847, LR: 0.001 [2025-07-27 23:13:19] (step=0001501) Train Loss: 0.2593, Train Steps/Sec: 0.12, Epoch: 0.02916828604741547, LR: 0.001 [2025-07-27 23:13:27] (step=0001502) Train Loss: 0.2320, Train Steps/Sec: 0.12, Epoch: 0.029187718616401088, LR: 0.001 [2025-07-27 23:13:35] (step=0001503) Train Loss: 0.2138, Train Steps/Sec: 0.13, Epoch: 0.029207151185386707, LR: 0.001 [2025-07-27 23:13:43] (step=0001504) Train Loss: 0.2778, Train Steps/Sec: 0.12, Epoch: 0.02922658375437233, LR: 0.001 [2025-07-27 23:13:51] (step=0001505) Train Loss: 0.2701, Train Steps/Sec: 0.12, Epoch: 0.029246016323357948, LR: 0.001 [2025-07-27 23:13:59] (step=0001506) Train Loss: 0.2040, Train Steps/Sec: 0.13, Epoch: 0.029265448892343567, LR: 0.001 
[2025-07-27 23:14:07] (step=0001507) Train Loss: 0.2630, Train Steps/Sec: 0.12, Epoch: 0.02928488146132919, LR: 0.001 [2025-07-27 23:14:15] (step=0001508) Train Loss: 0.2615, Train Steps/Sec: 0.12, Epoch: 0.029304314030314808, LR: 0.001 [2025-07-27 23:14:23] (step=0001509) Train Loss: 0.2681, Train Steps/Sec: 0.12, Epoch: 0.029323746599300426, LR: 0.001 [2025-07-27 23:14:31] (step=0001510) Train Loss: 0.1750, Train Steps/Sec: 0.13, Epoch: 0.02934317916828605, LR: 0.001 [2025-07-27 23:14:39] (step=0001511) Train Loss: 0.2331, Train Steps/Sec: 0.12, Epoch: 0.029362611737271668, LR: 0.001 [2025-07-27 23:14:48] (step=0001512) Train Loss: 0.2278, Train Steps/Sec: 0.12, Epoch: 0.029382044306257286, LR: 0.001 [2025-07-27 23:14:56] (step=0001513) Train Loss: 0.2822, Train Steps/Sec: 0.12, Epoch: 0.02940147687524291, LR: 0.001 [2025-07-27 23:15:04] (step=0001514) Train Loss: 0.2720, Train Steps/Sec: 0.12, Epoch: 0.029420909444228528, LR: 0.001 [2025-07-27 23:15:12] (step=0001515) Train Loss: 0.1853, Train Steps/Sec: 0.13, Epoch: 0.029440342013214146, LR: 0.001 [2025-07-27 23:15:20] (step=0001516) Train Loss: 0.2080, Train Steps/Sec: 0.12, Epoch: 0.029459774582199765, LR: 0.001 [2025-07-27 23:15:28] (step=0001517) Train Loss: 0.2610, Train Steps/Sec: 0.13, Epoch: 0.029479207151185387, LR: 0.001 [2025-07-27 23:15:36] (step=0001518) Train Loss: 0.3112, Train Steps/Sec: 0.12, Epoch: 0.029498639720171006, LR: 0.001 [2025-07-27 23:15:44] (step=0001519) Train Loss: 0.2987, Train Steps/Sec: 0.12, Epoch: 0.029518072289156625, LR: 0.001 [2025-07-27 23:15:52] (step=0001520) Train Loss: 0.2353, Train Steps/Sec: 0.13, Epoch: 0.029537504858142247, LR: 0.001 [2025-07-27 23:16:00] (step=0001521) Train Loss: 0.1807, Train Steps/Sec: 0.12, Epoch: 0.029556937427127866, LR: 0.001 [2025-07-27 23:16:06] (step=0001522) Train Loss: 0.2373, Train Steps/Sec: 0.17, Epoch: 0.029576369996113485, LR: 0.001 [2025-07-27 23:16:14] (step=0001523) Train Loss: 0.2348, Train Steps/Sec: 0.12, Epoch: 
0.029595802565099107, LR: 0.001 [2025-07-27 23:16:22] (step=0001524) Train Loss: 0.2154, Train Steps/Sec: 0.13, Epoch: 0.029615235134084726, LR: 0.001 [2025-07-27 23:16:30] (step=0001525) Train Loss: 0.2726, Train Steps/Sec: 0.12, Epoch: 0.029634667703070345, LR: 0.001 [2025-07-27 23:16:38] (step=0001526) Train Loss: 0.1799, Train Steps/Sec: 0.13, Epoch: 0.029654100272055967, LR: 0.001 [2025-07-27 23:16:46] (step=0001527) Train Loss: 0.1708, Train Steps/Sec: 0.12, Epoch: 0.029673532841041586, LR: 0.001 [2025-07-27 23:16:54] (step=0001528) Train Loss: 0.2115, Train Steps/Sec: 0.13, Epoch: 0.029692965410027205, LR: 0.001 [2025-07-27 23:17:02] (step=0001529) Train Loss: 0.2550, Train Steps/Sec: 0.13, Epoch: 0.029712397979012827, LR: 0.001 [2025-07-27 23:17:10] (step=0001530) Train Loss: 0.2402, Train Steps/Sec: 0.12, Epoch: 0.029731830547998446, LR: 0.001 [2025-07-27 23:17:18] (step=0001531) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.029751263116984065, LR: 0.001 [2025-07-27 23:17:26] (step=0001532) Train Loss: 0.2080, Train Steps/Sec: 0.12, Epoch: 0.029770695685969684, LR: 0.001 [2025-07-27 23:17:34] (step=0001533) Train Loss: 0.1844, Train Steps/Sec: 0.12, Epoch: 0.029790128254955306, LR: 0.001 [2025-07-27 23:17:42] (step=0001534) Train Loss: 0.2811, Train Steps/Sec: 0.12, Epoch: 0.029809560823940925, LR: 0.001 [2025-07-27 23:17:50] (step=0001535) Train Loss: 0.2880, Train Steps/Sec: 0.12, Epoch: 0.029828993392926544, LR: 0.001 [2025-07-27 23:17:58] (step=0001536) Train Loss: 0.2395, Train Steps/Sec: 0.12, Epoch: 0.029848425961912166, LR: 0.001 [2025-07-27 23:18:06] (step=0001537) Train Loss: 0.2910, Train Steps/Sec: 0.12, Epoch: 0.029867858530897785, LR: 0.001 [2025-07-27 23:18:14] (step=0001538) Train Loss: 0.1917, Train Steps/Sec: 0.12, Epoch: 0.029887291099883403, LR: 0.001 [2025-07-27 23:18:22] (step=0001539) Train Loss: 0.3065, Train Steps/Sec: 0.12, Epoch: 0.029906723668869026, LR: 0.001 [2025-07-27 23:18:30] (step=0001540) Train Loss: 0.1479, Train 
Steps/Sec: 0.12, Epoch: 0.029926156237854645, LR: 0.001 [2025-07-27 23:18:38] (step=0001541) Train Loss: 0.2771, Train Steps/Sec: 0.13, Epoch: 0.029945588806840263, LR: 0.001 [2025-07-27 23:18:46] (step=0001542) Train Loss: 0.2716, Train Steps/Sec: 0.12, Epoch: 0.029965021375825886, LR: 0.001 [2025-07-27 23:18:55] (step=0001543) Train Loss: 0.3131, Train Steps/Sec: 0.13, Epoch: 0.029984453944811505, LR: 0.001 [2025-07-27 23:19:03] (step=0001544) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.030003886513797123, LR: 0.001 [2025-07-27 23:19:11] (step=0001545) Train Loss: 0.1597, Train Steps/Sec: 0.13, Epoch: 0.030023319082782742, LR: 0.001 [2025-07-27 23:19:19] (step=0001546) Train Loss: 0.3036, Train Steps/Sec: 0.12, Epoch: 0.030042751651768364, LR: 0.001 [2025-07-27 23:19:27] (step=0001547) Train Loss: 0.2915, Train Steps/Sec: 0.12, Epoch: 0.030062184220753983, LR: 0.001 [2025-07-27 23:19:35] (step=0001548) Train Loss: 0.2844, Train Steps/Sec: 0.13, Epoch: 0.030081616789739602, LR: 0.001 [2025-07-27 23:19:43] (step=0001549) Train Loss: 0.2606, Train Steps/Sec: 0.12, Epoch: 0.030101049358725224, LR: 0.001 [2025-07-27 23:19:51] (step=0001550) Train Loss: 0.2557, Train Steps/Sec: 0.13, Epoch: 0.030120481927710843, LR: 0.001 [2025-07-27 23:19:59] (step=0001551) Train Loss: 0.2225, Train Steps/Sec: 0.12, Epoch: 0.030139914496696462, LR: 0.001 [2025-07-27 23:20:07] (step=0001552) Train Loss: 0.2580, Train Steps/Sec: 0.12, Epoch: 0.030159347065682084, LR: 0.001 [2025-07-27 23:20:15] (step=0001553) Train Loss: 0.2364, Train Steps/Sec: 0.13, Epoch: 0.030178779634667703, LR: 0.001 [2025-07-27 23:20:23] (step=0001554) Train Loss: 0.2559, Train Steps/Sec: 0.12, Epoch: 0.030198212203653322, LR: 0.001 [2025-07-27 23:20:29] (step=0001555) Train Loss: 0.2283, Train Steps/Sec: 0.15, Epoch: 0.030217644772638944, LR: 0.001 [2025-07-27 23:20:36] (step=0001556) Train Loss: 0.2679, Train Steps/Sec: 0.14, Epoch: 0.030237077341624563, LR: 0.001 [2025-07-27 23:20:44] (step=0001557) 
Train Loss: 0.2930, Train Steps/Sec: 0.13, Epoch: 0.030256509910610182, LR: 0.001 [2025-07-27 23:20:52] (step=0001558) Train Loss: 0.1578, Train Steps/Sec: 0.12, Epoch: 0.030275942479595804, LR: 0.001 [2025-07-27 23:21:00] (step=0001559) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.030295375048581423, LR: 0.001 [2025-07-27 23:21:08] (step=0001560) Train Loss: 0.1733, Train Steps/Sec: 0.12, Epoch: 0.030314807617567042, LR: 0.001 [2025-07-27 23:21:16] (step=0001561) Train Loss: 0.2147, Train Steps/Sec: 0.13, Epoch: 0.03033424018655266, LR: 0.001 [2025-07-27 23:21:25] (step=0001562) Train Loss: 0.2178, Train Steps/Sec: 0.12, Epoch: 0.030353672755538283, LR: 0.001 [2025-07-27 23:21:33] (step=0001563) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.030373105324523902, LR: 0.001 [2025-07-27 23:21:41] (step=0001564) Train Loss: 0.2072, Train Steps/Sec: 0.13, Epoch: 0.03039253789350952, LR: 0.001 [2025-07-27 23:21:49] (step=0001565) Train Loss: 0.2526, Train Steps/Sec: 0.12, Epoch: 0.030411970462495143, LR: 0.001 [2025-07-27 23:21:57] (step=0001566) Train Loss: 0.2741, Train Steps/Sec: 0.13, Epoch: 0.03043140303148076, LR: 0.001 [2025-07-27 23:22:05] (step=0001567) Train Loss: 0.2931, Train Steps/Sec: 0.12, Epoch: 0.03045083560046638, LR: 0.001 [2025-07-27 23:22:13] (step=0001568) Train Loss: 0.2304, Train Steps/Sec: 0.13, Epoch: 0.030470268169452003, LR: 0.001 [2025-07-27 23:22:21] (step=0001569) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.03048970073843762, LR: 0.001 [2025-07-27 23:22:29] (step=0001570) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.03050913330742324, LR: 0.001 [2025-07-27 23:22:37] (step=0001571) Train Loss: 0.2440, Train Steps/Sec: 0.13, Epoch: 0.030528565876408863, LR: 0.001 [2025-07-27 23:22:45] (step=0001572) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.03054799844539448, LR: 0.001 [2025-07-27 23:22:53] (step=0001573) Train Loss: 0.2618, Train Steps/Sec: 0.13, Epoch: 0.0305674310143801, LR: 0.001 [2025-07-27 23:23:01] 
(step=0001574) Train Loss: 0.3404, Train Steps/Sec: 0.12, Epoch: 0.030586863583365723, LR: 0.001 [2025-07-27 23:23:09] (step=0001575) Train Loss: 0.2182, Train Steps/Sec: 0.13, Epoch: 0.03060629615235134, LR: 0.001 [2025-07-27 23:23:17] (step=0001576) Train Loss: 0.1990, Train Steps/Sec: 0.13, Epoch: 0.03062572872133696, LR: 0.001 [2025-07-27 23:23:25] (step=0001577) Train Loss: 0.1780, Train Steps/Sec: 0.12, Epoch: 0.03064516129032258, LR: 0.001 [2025-07-27 23:23:33] (step=0001578) Train Loss: 0.2885, Train Steps/Sec: 0.13, Epoch: 0.0306645938593082, LR: 0.001 [2025-07-27 23:23:41] (step=0001579) Train Loss: 0.1490, Train Steps/Sec: 0.12, Epoch: 0.03068402642829382, LR: 0.001 [2025-07-27 23:23:49] (step=0001580) Train Loss: 0.2451, Train Steps/Sec: 0.12, Epoch: 0.03070345899727944, LR: 0.001 [2025-07-27 23:23:57] (step=0001581) Train Loss: 0.1633, Train Steps/Sec: 0.12, Epoch: 0.03072289156626506, LR: 0.001 [2025-07-27 23:24:05] (step=0001582) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 0.03074232413525068, LR: 0.001 [2025-07-27 23:24:13] (step=0001583) Train Loss: 0.2874, Train Steps/Sec: 0.13, Epoch: 0.0307617567042363, LR: 0.001 [2025-07-27 23:24:21] (step=0001584) Train Loss: 0.3038, Train Steps/Sec: 0.12, Epoch: 0.03078118927322192, LR: 0.001 [2025-07-27 23:24:29] (step=0001585) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.03080062184220754, LR: 0.001 [2025-07-27 23:24:37] (step=0001586) Train Loss: 0.2126, Train Steps/Sec: 0.12, Epoch: 0.03082005441119316, LR: 0.001 [2025-07-27 23:24:45] (step=0001587) Train Loss: 0.3163, Train Steps/Sec: 0.13, Epoch: 0.03083948698017878, LR: 0.001 [2025-07-27 23:24:53] (step=0001588) Train Loss: 0.2860, Train Steps/Sec: 0.13, Epoch: 0.0308589195491644, LR: 0.001 [2025-07-27 23:24:59] (step=0001589) Train Loss: 0.2577, Train Steps/Sec: 0.17, Epoch: 0.03087835211815002, LR: 0.001 [2025-07-27 23:25:07] (step=0001590) Train Loss: 0.2789, Train Steps/Sec: 0.12, Epoch: 0.030897784687135638, LR: 0.001 [2025-07-27 
23:25:15] (step=0001591) Train Loss: 0.2344, Train Steps/Sec: 0.13, Epoch: 0.03091721725612126, LR: 0.001 [2025-07-27 23:25:23] (step=0001592) Train Loss: 0.2637, Train Steps/Sec: 0.12, Epoch: 0.03093664982510688, LR: 0.001 [2025-07-27 23:25:31] (step=0001593) Train Loss: 0.2034, Train Steps/Sec: 0.12, Epoch: 0.030956082394092498, LR: 0.001 [2025-07-27 23:25:39] (step=0001594) Train Loss: 0.1966, Train Steps/Sec: 0.12, Epoch: 0.03097551496307812, LR: 0.001 [2025-07-27 23:25:47] (step=0001595) Train Loss: 0.2326, Train Steps/Sec: 0.12, Epoch: 0.03099494753206374, LR: 0.001 [2025-07-27 23:25:55] (step=0001596) Train Loss: 0.2378, Train Steps/Sec: 0.13, Epoch: 0.031014380101049357, LR: 0.001 [2025-07-27 23:26:03] (step=0001597) Train Loss: 0.2728, Train Steps/Sec: 0.12, Epoch: 0.03103381267003498, LR: 0.001 [2025-07-27 23:26:11] (step=0001598) Train Loss: 0.2843, Train Steps/Sec: 0.13, Epoch: 0.0310532452390206, LR: 0.001 [2025-07-27 23:26:19] (step=0001599) Train Loss: 0.2493, Train Steps/Sec: 0.12, Epoch: 0.031072677808006217, LR: 0.001 [2025-07-27 23:26:27] (step=0001600) Train Loss: 0.2622, Train Steps/Sec: 0.12, Epoch: 0.03109211037699184, LR: 0.001 [2025-07-27 23:26:35] (step=0001601) Train Loss: 0.1944, Train Steps/Sec: 0.13, Epoch: 0.03111154294597746, LR: 0.001 [2025-07-27 23:26:43] (step=0001602) Train Loss: 0.2523, Train Steps/Sec: 0.12, Epoch: 0.031130975514963077, LR: 0.001 [2025-07-27 23:26:51] (step=0001603) Train Loss: 0.1817, Train Steps/Sec: 0.12, Epoch: 0.0311504080839487, LR: 0.001 [2025-07-27 23:26:59] (step=0001604) Train Loss: 0.1922, Train Steps/Sec: 0.13, Epoch: 0.03116984065293432, LR: 0.001 [2025-07-27 23:27:07] (step=0001605) Train Loss: 0.3121, Train Steps/Sec: 0.12, Epoch: 0.031189273221919937, LR: 0.001 [2025-07-27 23:27:15] (step=0001606) Train Loss: 0.1967, Train Steps/Sec: 0.12, Epoch: 0.031208705790905556, LR: 0.001 [2025-07-27 23:27:23] (step=0001607) Train Loss: 0.2800, Train Steps/Sec: 0.12, Epoch: 0.03122813835989118, LR: 0.001 
[2025-07-27 23:27:31] (step=0001608) Train Loss: 0.2849, Train Steps/Sec: 0.13, Epoch: 0.031247570928876797, LR: 0.001 [2025-07-27 23:27:39] (step=0001609) Train Loss: 0.2502, Train Steps/Sec: 0.12, Epoch: 0.03126700349786242, LR: 0.001 [2025-07-27 23:27:47] (step=0001610) Train Loss: 0.1995, Train Steps/Sec: 0.12, Epoch: 0.031286436066848035, LR: 0.001 [2025-07-27 23:27:55] (step=0001611) Train Loss: 0.3239, Train Steps/Sec: 0.12, Epoch: 0.03130586863583366, LR: 0.001 [2025-07-27 23:28:04] (step=0001612) Train Loss: 0.2672, Train Steps/Sec: 0.12, Epoch: 0.03132530120481928, LR: 0.001 [2025-07-27 23:28:12] (step=0001613) Train Loss: 0.2659, Train Steps/Sec: 0.12, Epoch: 0.031344733773804895, LR: 0.001 [2025-07-27 23:28:20] (step=0001614) Train Loss: 0.2368, Train Steps/Sec: 0.12, Epoch: 0.03136416634279052, LR: 0.001 [2025-07-27 23:28:28] (step=0001615) Train Loss: 0.2477, Train Steps/Sec: 0.12, Epoch: 0.03138359891177614, LR: 0.001 [2025-07-27 23:28:36] (step=0001616) Train Loss: 0.2598, Train Steps/Sec: 0.12, Epoch: 0.031403031480761755, LR: 0.001 [2025-07-27 23:28:44] (step=0001617) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.03142246404974738, LR: 0.001 [2025-07-27 23:28:52] (step=0001618) Train Loss: 0.2596, Train Steps/Sec: 0.12, Epoch: 0.031441896618733, LR: 0.001 [2025-07-27 23:29:00] (step=0001619) Train Loss: 0.2382, Train Steps/Sec: 0.12, Epoch: 0.031461329187718615, LR: 0.001 [2025-07-27 23:29:08] (step=0001620) Train Loss: 0.2445, Train Steps/Sec: 0.13, Epoch: 0.03148076175670424, LR: 0.001 [2025-07-27 23:29:16] (step=0001621) Train Loss: 0.2666, Train Steps/Sec: 0.12, Epoch: 0.03150019432568986, LR: 0.001 [2025-07-27 23:29:21] (step=0001622) Train Loss: 0.1891, Train Steps/Sec: 0.18, Epoch: 0.031519626894675475, LR: 0.001 [2025-07-27 23:29:29] (step=0001623) Train Loss: 0.1791, Train Steps/Sec: 0.12, Epoch: 0.0315390594636611, LR: 0.001 [2025-07-27 23:29:37] (step=0001624) Train Loss: 0.2042, Train Steps/Sec: 0.13, Epoch: 0.03155849203264672, 
LR: 0.001 [2025-07-27 23:29:45] (step=0001625) Train Loss: 0.2134, Train Steps/Sec: 0.12, Epoch: 0.031577924601632335, LR: 0.001 [2025-07-27 23:29:53] (step=0001626) Train Loss: 0.2057, Train Steps/Sec: 0.13, Epoch: 0.03159735717061796, LR: 0.001 [2025-07-27 23:30:02] (step=0001627) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.03161678973960357, LR: 0.001 [2025-07-27 23:30:10] (step=0001628) Train Loss: 0.2656, Train Steps/Sec: 0.12, Epoch: 0.031636222308589194, LR: 0.001 [2025-07-27 23:30:18] (step=0001629) Train Loss: 0.2154, Train Steps/Sec: 0.13, Epoch: 0.03165565487757482, LR: 0.001 [2025-07-27 23:30:26] (step=0001630) Train Loss: 0.2700, Train Steps/Sec: 0.12, Epoch: 0.03167508744656043, LR: 0.001 [2025-07-27 23:30:34] (step=0001631) Train Loss: 0.2878, Train Steps/Sec: 0.12, Epoch: 0.031694520015546054, LR: 0.001 [2025-07-27 23:30:42] (step=0001632) Train Loss: 0.2666, Train Steps/Sec: 0.12, Epoch: 0.03171395258453168, LR: 0.001 [2025-07-27 23:30:50] (step=0001633) Train Loss: 0.2308, Train Steps/Sec: 0.12, Epoch: 0.03173338515351729, LR: 0.001 [2025-07-27 23:30:58] (step=0001634) Train Loss: 0.2886, Train Steps/Sec: 0.12, Epoch: 0.031752817722502914, LR: 0.001 [2025-07-27 23:31:06] (step=0001635) Train Loss: 0.2205, Train Steps/Sec: 0.12, Epoch: 0.03177225029148854, LR: 0.001 [2025-07-27 23:31:14] (step=0001636) Train Loss: 0.3323, Train Steps/Sec: 0.12, Epoch: 0.03179168286047415, LR: 0.001 [2025-07-27 23:31:22] (step=0001637) Train Loss: 0.2227, Train Steps/Sec: 0.12, Epoch: 0.031811115429459774, LR: 0.001 [2025-07-27 23:31:30] (step=0001638) Train Loss: 0.1990, Train Steps/Sec: 0.13, Epoch: 0.031830547998445397, LR: 0.001 [2025-07-27 23:31:38] (step=0001639) Train Loss: 0.2507, Train Steps/Sec: 0.13, Epoch: 0.03184998056743101, LR: 0.001 [2025-07-27 23:31:46] (step=0001640) Train Loss: 0.2595, Train Steps/Sec: 0.13, Epoch: 0.031869413136416634, LR: 0.001 [2025-07-27 23:31:54] (step=0001641) Train Loss: 0.2033, Train Steps/Sec: 0.13, Epoch: 
0.031888845705402256, LR: 0.001 [2025-07-27 23:32:02] (step=0001642) Train Loss: 0.3079, Train Steps/Sec: 0.12, Epoch: 0.03190827827438787, LR: 0.001 [2025-07-27 23:32:10] (step=0001643) Train Loss: 0.1914, Train Steps/Sec: 0.13, Epoch: 0.031927710843373494, LR: 0.001 [2025-07-27 23:32:18] (step=0001644) Train Loss: 0.2532, Train Steps/Sec: 0.12, Epoch: 0.031947143412359116, LR: 0.001 [2025-07-27 23:32:26] (step=0001645) Train Loss: 0.2586, Train Steps/Sec: 0.12, Epoch: 0.03196657598134473, LR: 0.001 [2025-07-27 23:32:34] (step=0001646) Train Loss: 0.2839, Train Steps/Sec: 0.12, Epoch: 0.031986008550330354, LR: 0.001 [2025-07-27 23:32:42] (step=0001647) Train Loss: 0.2159, Train Steps/Sec: 0.12, Epoch: 0.032005441119315976, LR: 0.001 [2025-07-27 23:32:50] (step=0001648) Train Loss: 0.2586, Train Steps/Sec: 0.13, Epoch: 0.03202487368830159, LR: 0.001 [2025-07-27 23:32:58] (step=0001649) Train Loss: 0.3294, Train Steps/Sec: 0.12, Epoch: 0.032044306257287214, LR: 0.001 [2025-07-27 23:33:06] (step=0001650) Train Loss: 0.2625, Train Steps/Sec: 0.13, Epoch: 0.032063738826272836, LR: 0.001 [2025-07-27 23:33:14] (step=0001651) Train Loss: 0.2033, Train Steps/Sec: 0.13, Epoch: 0.03208317139525845, LR: 0.001 [2025-07-27 23:33:22] (step=0001652) Train Loss: 0.2992, Train Steps/Sec: 0.12, Epoch: 0.032102603964244074, LR: 0.001 [2025-07-27 23:33:30] (step=0001653) Train Loss: 0.2406, Train Steps/Sec: 0.13, Epoch: 0.032122036533229696, LR: 0.001 [2025-07-27 23:33:38] (step=0001654) Train Loss: 0.2294, Train Steps/Sec: 0.12, Epoch: 0.03214146910221531, LR: 0.001 [2025-07-27 23:33:44] (step=0001655) Train Loss: 0.2385, Train Steps/Sec: 0.18, Epoch: 0.032160901671200934, LR: 0.001 [2025-07-27 23:33:52] (step=0001656) Train Loss: 0.2384, Train Steps/Sec: 0.12, Epoch: 0.032180334240186556, LR: 0.001 [2025-07-27 23:34:00] (step=0001657) Train Loss: 0.2487, Train Steps/Sec: 0.12, Epoch: 0.03219976680917217, LR: 0.001 [2025-07-27 23:34:08] (step=0001658) Train Loss: 0.2031, Train 
Steps/Sec: 0.13, Epoch: 0.032219199378157794, LR: 0.001 [2025-07-27 23:34:16] (step=0001659) Train Loss: 0.2737, Train Steps/Sec: 0.13, Epoch: 0.03223863194714341, LR: 0.001 [2025-07-27 23:34:24] (step=0001660) Train Loss: 0.2127, Train Steps/Sec: 0.12, Epoch: 0.03225806451612903, LR: 0.001 [2025-07-27 23:34:32] (step=0001661) Train Loss: 0.2346, Train Steps/Sec: 0.13, Epoch: 0.032277497085114654, LR: 0.001 [2025-07-27 23:34:40] (step=0001662) Train Loss: 0.2963, Train Steps/Sec: 0.13, Epoch: 0.03229692965410027, LR: 0.001 [2025-07-27 23:34:48] (step=0001663) Train Loss: 0.2179, Train Steps/Sec: 0.12, Epoch: 0.03231636222308589, LR: 0.001 [2025-07-27 23:34:56] (step=0001664) Train Loss: 0.2145, Train Steps/Sec: 0.13, Epoch: 0.032335794792071514, LR: 0.001 [2025-07-27 23:35:04] (step=0001665) Train Loss: 0.2823, Train Steps/Sec: 0.13, Epoch: 0.03235522736105713, LR: 0.001 [2025-07-27 23:35:12] (step=0001666) Train Loss: 0.2806, Train Steps/Sec: 0.13, Epoch: 0.03237465993004275, LR: 0.001 [2025-07-27 23:35:20] (step=0001667) Train Loss: 0.2963, Train Steps/Sec: 0.12, Epoch: 0.032394092499028374, LR: 0.001 [2025-07-27 23:35:28] (step=0001668) Train Loss: 0.2233, Train Steps/Sec: 0.12, Epoch: 0.03241352506801399, LR: 0.001 [2025-07-27 23:35:36] (step=0001669) Train Loss: 0.2473, Train Steps/Sec: 0.12, Epoch: 0.03243295763699961, LR: 0.001 [2025-07-27 23:35:44] (step=0001670) Train Loss: 0.1979, Train Steps/Sec: 0.12, Epoch: 0.03245239020598523, LR: 0.001 [2025-07-27 23:35:52] (step=0001671) Train Loss: 0.2546, Train Steps/Sec: 0.12, Epoch: 0.03247182277497085, LR: 0.001 [2025-07-27 23:36:00] (step=0001672) Train Loss: 0.1910, Train Steps/Sec: 0.12, Epoch: 0.03249125534395647, LR: 0.001 [2025-07-27 23:36:08] (step=0001673) Train Loss: 0.2080, Train Steps/Sec: 0.13, Epoch: 0.03251068791294209, LR: 0.001 [2025-07-27 23:36:16] (step=0001674) Train Loss: 0.1753, Train Steps/Sec: 0.13, Epoch: 0.03253012048192771, LR: 0.001 [2025-07-27 23:36:24] (step=0001675) Train Loss: 
0.2688, Train Steps/Sec: 0.12, Epoch: 0.03254955305091333, LR: 0.001 [2025-07-27 23:36:32] (step=0001676) Train Loss: 0.2290, Train Steps/Sec: 0.13, Epoch: 0.03256898561989895, LR: 0.001 [2025-07-27 23:36:40] (step=0001677) Train Loss: 0.1911, Train Steps/Sec: 0.12, Epoch: 0.03258841818888457, LR: 0.001 [2025-07-27 23:36:48] (step=0001678) Train Loss: 0.2188, Train Steps/Sec: 0.13, Epoch: 0.03260785075787019, LR: 0.001 [2025-07-27 23:36:56] (step=0001679) Train Loss: 0.1468, Train Steps/Sec: 0.12, Epoch: 0.03262728332685581, LR: 0.001 [2025-07-27 23:37:04] (step=0001680) Train Loss: 0.1797, Train Steps/Sec: 0.12, Epoch: 0.03264671589584143, LR: 0.001 [2025-07-27 23:37:12] (step=0001681) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.03266614846482705, LR: 0.001 [2025-07-27 23:37:20] (step=0001682) Train Loss: 0.2495, Train Steps/Sec: 0.12, Epoch: 0.03268558103381267, LR: 0.001 [2025-07-27 23:37:28] (step=0001683) Train Loss: 0.2927, Train Steps/Sec: 0.13, Epoch: 0.03270501360279829, LR: 0.001 [2025-07-27 23:37:36] (step=0001684) Train Loss: 0.2936, Train Steps/Sec: 0.12, Epoch: 0.03272444617178391, LR: 0.001 [2025-07-27 23:37:44] (step=0001685) Train Loss: 0.1709, Train Steps/Sec: 0.12, Epoch: 0.03274387874076953, LR: 0.001 [2025-07-27 23:37:52] (step=0001686) Train Loss: 0.1974, Train Steps/Sec: 0.13, Epoch: 0.03276331130975515, LR: 0.001 [2025-07-27 23:38:00] (step=0001687) Train Loss: 0.2819, Train Steps/Sec: 0.12, Epoch: 0.03278274387874077, LR: 0.001 [2025-07-27 23:38:06] (step=0001688) Train Loss: 0.2686, Train Steps/Sec: 0.18, Epoch: 0.032802176447726386, LR: 0.001 [2025-07-27 23:38:14] (step=0001689) Train Loss: 0.1634, Train Steps/Sec: 0.12, Epoch: 0.03282160901671201, LR: 0.001 [2025-07-27 23:38:22] (step=0001690) Train Loss: 0.2659, Train Steps/Sec: 0.12, Epoch: 0.03284104158569763, LR: 0.001 [2025-07-27 23:38:30] (step=0001691) Train Loss: 0.2051, Train Steps/Sec: 0.12, Epoch: 0.032860474154683246, LR: 0.001 [2025-07-27 23:38:38] (step=0001692) 
Train Loss: 0.2645, Train Steps/Sec: 0.13, Epoch: 0.03287990672366887, LR: 0.001 [2025-07-27 23:38:46] (step=0001693) Train Loss: 0.2612, Train Steps/Sec: 0.12, Epoch: 0.03289933929265449, LR: 0.001 [2025-07-27 23:38:54] (step=0001694) Train Loss: 0.2730, Train Steps/Sec: 0.12, Epoch: 0.032918771861640106, LR: 0.001 [2025-07-27 23:39:02] (step=0001695) Train Loss: 0.3069, Train Steps/Sec: 0.13, Epoch: 0.03293820443062573, LR: 0.001 [2025-07-27 23:39:10] (step=0001696) Train Loss: 0.2577, Train Steps/Sec: 0.12, Epoch: 0.03295763699961135, LR: 0.001 [2025-07-27 23:39:18] (step=0001697) Train Loss: 0.2733, Train Steps/Sec: 0.12, Epoch: 0.032977069568596966, LR: 0.001 [2025-07-27 23:39:26] (step=0001698) Train Loss: 0.2033, Train Steps/Sec: 0.12, Epoch: 0.03299650213758259, LR: 0.001 [2025-07-27 23:39:34] (step=0001699) Train Loss: 0.3310, Train Steps/Sec: 0.12, Epoch: 0.03301593470656821, LR: 0.001 [2025-07-27 23:39:42] (step=0001700) Train Loss: 0.2963, Train Steps/Sec: 0.12, Epoch: 0.033035367275553826, LR: 0.001 [2025-07-27 23:39:50] (step=0001701) Train Loss: 0.2736, Train Steps/Sec: 0.13, Epoch: 0.03305479984453945, LR: 0.001 [2025-07-27 23:39:58] (step=0001702) Train Loss: 0.2976, Train Steps/Sec: 0.13, Epoch: 0.03307423241352507, LR: 0.001 [2025-07-27 23:40:06] (step=0001703) Train Loss: 0.2534, Train Steps/Sec: 0.12, Epoch: 0.033093664982510686, LR: 0.001 [2025-07-27 23:40:14] (step=0001704) Train Loss: 0.2063, Train Steps/Sec: 0.13, Epoch: 0.03311309755149631, LR: 0.001 [2025-07-27 23:40:23] (step=0001705) Train Loss: 0.1880, Train Steps/Sec: 0.12, Epoch: 0.03313253012048193, LR: 0.001 [2025-07-27 23:40:31] (step=0001706) Train Loss: 0.1803, Train Steps/Sec: 0.12, Epoch: 0.033151962689467546, LR: 0.001 [2025-07-27 23:40:39] (step=0001707) Train Loss: 0.1802, Train Steps/Sec: 0.13, Epoch: 0.03317139525845317, LR: 0.001 [2025-07-27 23:40:47] (step=0001708) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.03319082782743879, LR: 0.001 [2025-07-27 23:40:55] 
(step=0001709) Train Loss: 0.2028, Train Steps/Sec: 0.13, Epoch: 0.033210260396424406, LR: 0.001 [2025-07-27 23:41:03] (step=0001710) Train Loss: 0.2237, Train Steps/Sec: 0.12, Epoch: 0.03322969296541003, LR: 0.001 [2025-07-27 23:41:11] (step=0001711) Train Loss: 0.2129, Train Steps/Sec: 0.13, Epoch: 0.03324912553439565, LR: 0.001 [2025-07-27 23:41:19] (step=0001712) Train Loss: 0.2889, Train Steps/Sec: 0.12, Epoch: 0.033268558103381266, LR: 0.001 [2025-07-27 23:41:27] (step=0001713) Train Loss: 0.1987, Train Steps/Sec: 0.12, Epoch: 0.03328799067236689, LR: 0.001 [2025-07-27 23:41:35] (step=0001714) Train Loss: 0.2257, Train Steps/Sec: 0.12, Epoch: 0.03330742324135251, LR: 0.001 [2025-07-27 23:41:43] (step=0001715) Train Loss: 0.2598, Train Steps/Sec: 0.12, Epoch: 0.033326855810338125, LR: 0.001 [2025-07-27 23:41:51] (step=0001716) Train Loss: 0.2309, Train Steps/Sec: 0.13, Epoch: 0.03334628837932375, LR: 0.001 [2025-07-27 23:41:59] (step=0001717) Train Loss: 0.2593, Train Steps/Sec: 0.12, Epoch: 0.03336572094830936, LR: 0.001 [2025-07-27 23:42:07] (step=0001718) Train Loss: 0.2426, Train Steps/Sec: 0.12, Epoch: 0.033385153517294985, LR: 0.001 [2025-07-27 23:42:15] (step=0001719) Train Loss: 0.2231, Train Steps/Sec: 0.13, Epoch: 0.03340458608628061, LR: 0.001 [2025-07-27 23:42:23] (step=0001720) Train Loss: 0.3318, Train Steps/Sec: 0.12, Epoch: 0.03342401865526622, LR: 0.001 [2025-07-27 23:42:29] (step=0001721) Train Loss: 0.2481, Train Steps/Sec: 0.16, Epoch: 0.033443451224251845, LR: 0.001 [2025-07-27 23:42:36] (step=0001722) Train Loss: 0.2364, Train Steps/Sec: 0.14, Epoch: 0.03346288379323747, LR: 0.001 [2025-07-27 23:42:45] (step=0001723) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.03348231636222308, LR: 0.001 [2025-07-27 23:42:52] (step=0001724) Train Loss: 0.3543, Train Steps/Sec: 0.13, Epoch: 0.033501748931208705, LR: 0.001 [2025-07-27 23:43:00] (step=0001725) Train Loss: 0.2707, Train Steps/Sec: 0.13, Epoch: 0.03352118150019433, LR: 0.001 
[2025-07-27 23:43:08] (step=0001726) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.03354061406917994, LR: 0.001 [2025-07-27 23:43:16] (step=0001727) Train Loss: 0.2655, Train Steps/Sec: 0.13, Epoch: 0.033560046638165565, LR: 0.001 [2025-07-27 23:43:25] (step=0001728) Train Loss: 0.2990, Train Steps/Sec: 0.12, Epoch: 0.03357947920715119, LR: 0.001 [2025-07-27 23:43:33] (step=0001729) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.0335989117761368, LR: 0.001 [2025-07-27 23:43:41] (step=0001730) Train Loss: 0.2321, Train Steps/Sec: 0.12, Epoch: 0.033618344345122425, LR: 0.001 [2025-07-27 23:43:49] (step=0001731) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.03363777691410805, LR: 0.001 [2025-07-27 23:43:57] (step=0001732) Train Loss: 0.2544, Train Steps/Sec: 0.12, Epoch: 0.03365720948309366, LR: 0.001 [2025-07-27 23:44:05] (step=0001733) Train Loss: 0.2086, Train Steps/Sec: 0.12, Epoch: 0.033676642052079285, LR: 0.001 [2025-07-27 23:44:13] (step=0001734) Train Loss: 0.3072, Train Steps/Sec: 0.13, Epoch: 0.03369607462106491, LR: 0.001 [2025-07-27 23:44:21] (step=0001735) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.03371550719005052, LR: 0.001 [2025-07-27 23:44:29] (step=0001736) Train Loss: 0.2489, Train Steps/Sec: 0.12, Epoch: 0.033734939759036145, LR: 0.001 [2025-07-27 23:44:37] (step=0001737) Train Loss: 0.2595, Train Steps/Sec: 0.13, Epoch: 0.03375437232802177, LR: 0.001 [2025-07-27 23:44:45] (step=0001738) Train Loss: 0.2312, Train Steps/Sec: 0.12, Epoch: 0.03377380489700738, LR: 0.001 [2025-07-27 23:44:53] (step=0001739) Train Loss: 0.2690, Train Steps/Sec: 0.13, Epoch: 0.033793237465993005, LR: 0.001 [2025-07-27 23:45:01] (step=0001740) Train Loss: 0.2154, Train Steps/Sec: 0.12, Epoch: 0.03381267003497863, LR: 0.001 [2025-07-27 23:45:09] (step=0001741) Train Loss: 0.2175, Train Steps/Sec: 0.13, Epoch: 0.03383210260396424, LR: 0.001 [2025-07-27 23:45:17] (step=0001742) Train Loss: 0.2908, Train Steps/Sec: 0.12, Epoch: 
0.033851535172949865, LR: 0.001 [2025-07-27 23:45:25] (step=0001743) Train Loss: 0.1860, Train Steps/Sec: 0.12, Epoch: 0.03387096774193549, LR: 0.001 [2025-07-27 23:45:33] (step=0001744) Train Loss: 0.3234, Train Steps/Sec: 0.12, Epoch: 0.0338904003109211, LR: 0.001 [2025-07-27 23:45:41] (step=0001745) Train Loss: 0.2257, Train Steps/Sec: 0.12, Epoch: 0.033909832879906725, LR: 0.001 [2025-07-27 23:45:49] (step=0001746) Train Loss: 0.3304, Train Steps/Sec: 0.13, Epoch: 0.03392926544889235, LR: 0.001 [2025-07-27 23:45:57] (step=0001747) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.03394869801787796, LR: 0.001 [2025-07-27 23:46:05] (step=0001748) Train Loss: 0.2198, Train Steps/Sec: 0.12, Epoch: 0.033968130586863585, LR: 0.001 [2025-07-27 23:46:13] (step=0001749) Train Loss: 0.2429, Train Steps/Sec: 0.13, Epoch: 0.0339875631558492, LR: 0.001 [2025-07-27 23:46:21] (step=0001750) Train Loss: 0.2069, Train Steps/Sec: 0.12, Epoch: 0.03400699572483482, LR: 0.001 [2025-07-27 23:46:29] (step=0001751) Train Loss: 0.2700, Train Steps/Sec: 0.13, Epoch: 0.034026428293820445, LR: 0.001 [2025-07-27 23:46:38] (step=0001752) Train Loss: 0.2942, Train Steps/Sec: 0.12, Epoch: 0.03404586086280606, LR: 0.001 [2025-07-27 23:46:45] (step=0001753) Train Loss: 0.2074, Train Steps/Sec: 0.13, Epoch: 0.03406529343179168, LR: 0.001 [2025-07-27 23:46:52] (step=0001754) Train Loss: 0.2604, Train Steps/Sec: 0.14, Epoch: 0.034084726000777305, LR: 0.001 [2025-07-27 23:46:59] (step=0001755) Train Loss: 0.2545, Train Steps/Sec: 0.15, Epoch: 0.03410415856976292, LR: 0.001 [2025-07-27 23:47:07] (step=0001756) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.03412359113874854, LR: 0.001 [2025-07-27 23:47:15] (step=0001757) Train Loss: 0.1840, Train Steps/Sec: 0.13, Epoch: 0.034143023707734164, LR: 0.001 [2025-07-27 23:47:23] (step=0001758) Train Loss: 0.2541, Train Steps/Sec: 0.13, Epoch: 0.03416245627671978, LR: 0.001 [2025-07-27 23:47:31] (step=0001759) Train Loss: 0.2311, Train Steps/Sec: 
0.12, Epoch: 0.0341818888457054, LR: 0.001 [2025-07-27 23:47:39] (step=0001760) Train Loss: 0.2368, Train Steps/Sec: 0.13, Epoch: 0.034201321414691024, LR: 0.001 [2025-07-27 23:47:47] (step=0001761) Train Loss: 0.2674, Train Steps/Sec: 0.12, Epoch: 0.03422075398367664, LR: 0.001 [2025-07-27 23:47:55] (step=0001762) Train Loss: 0.2972, Train Steps/Sec: 0.12, Epoch: 0.03424018655266226, LR: 0.001 [2025-07-27 23:48:03] (step=0001763) Train Loss: 0.3107, Train Steps/Sec: 0.12, Epoch: 0.034259619121647884, LR: 0.001 [2025-07-27 23:48:11] (step=0001764) Train Loss: 0.2407, Train Steps/Sec: 0.12, Epoch: 0.0342790516906335, LR: 0.001 [2025-07-27 23:48:19] (step=0001765) Train Loss: 0.3064, Train Steps/Sec: 0.13, Epoch: 0.03429848425961912, LR: 0.001 [2025-07-27 23:48:27] (step=0001766) Train Loss: 0.2216, Train Steps/Sec: 0.12, Epoch: 0.034317916828604744, LR: 0.001 [2025-07-27 23:48:35] (step=0001767) Train Loss: 0.1750, Train Steps/Sec: 0.12, Epoch: 0.03433734939759036, LR: 0.001 [2025-07-27 23:48:43] (step=0001768) Train Loss: 0.2480, Train Steps/Sec: 0.12, Epoch: 0.03435678196657598, LR: 0.001 [2025-07-27 23:48:51] (step=0001769) Train Loss: 0.1588, Train Steps/Sec: 0.13, Epoch: 0.034376214535561604, LR: 0.001 [2025-07-27 23:48:59] (step=0001770) Train Loss: 0.2498, Train Steps/Sec: 0.12, Epoch: 0.03439564710454722, LR: 0.001 [2025-07-27 23:49:07] (step=0001771) Train Loss: 0.2253, Train Steps/Sec: 0.12, Epoch: 0.03441507967353284, LR: 0.001 [2025-07-27 23:49:15] (step=0001772) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.034434512242518464, LR: 0.001 [2025-07-27 23:49:23] (step=0001773) Train Loss: 0.2276, Train Steps/Sec: 0.12, Epoch: 0.03445394481150408, LR: 0.001 [2025-07-27 23:49:32] (step=0001774) Train Loss: 0.3018, Train Steps/Sec: 0.12, Epoch: 0.0344733773804897, LR: 0.001 [2025-07-27 23:49:40] (step=0001775) Train Loss: 0.2949, Train Steps/Sec: 0.13, Epoch: 0.034492809949475324, LR: 0.001 [2025-07-27 23:49:48] (step=0001776) Train Loss: 0.2534, Train 
Steps/Sec: 0.12, Epoch: 0.03451224251846094, LR: 0.001 [2025-07-27 23:49:56] (step=0001777) Train Loss: 0.2709, Train Steps/Sec: 0.12, Epoch: 0.03453167508744656, LR: 0.001 [2025-07-27 23:50:04] (step=0001778) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.03455110765643218, LR: 0.001 [2025-07-27 23:50:12] (step=0001779) Train Loss: 0.2463, Train Steps/Sec: 0.13, Epoch: 0.0345705402254178, LR: 0.001 [2025-07-27 23:50:20] (step=0001780) Train Loss: 0.2854, Train Steps/Sec: 0.12, Epoch: 0.03458997279440342, LR: 0.001 [2025-07-27 23:50:28] (step=0001781) Train Loss: 0.3216, Train Steps/Sec: 0.12, Epoch: 0.03460940536338904, LR: 0.001 [2025-07-27 23:50:36] (step=0001782) Train Loss: 0.2464, Train Steps/Sec: 0.13, Epoch: 0.03462883793237466, LR: 0.001 [2025-07-27 23:50:44] (step=0001783) Train Loss: 0.2988, Train Steps/Sec: 0.12, Epoch: 0.03464827050136028, LR: 0.001 [2025-07-27 23:50:52] (step=0001784) Train Loss: 0.1862, Train Steps/Sec: 0.13, Epoch: 0.0346677030703459, LR: 0.001 [2025-07-27 23:51:00] (step=0001785) Train Loss: 0.2277, Train Steps/Sec: 0.12, Epoch: 0.03468713563933152, LR: 0.001 [2025-07-27 23:51:08] (step=0001786) Train Loss: 0.3085, Train Steps/Sec: 0.13, Epoch: 0.03470656820831714, LR: 0.001 [2025-07-27 23:51:15] (step=0001787) Train Loss: 0.2689, Train Steps/Sec: 0.13, Epoch: 0.03472600077730276, LR: 0.001 [2025-07-27 23:51:21] (step=0001788) Train Loss: 0.2857, Train Steps/Sec: 0.17, Epoch: 0.03474543334628838, LR: 0.001 [2025-07-27 23:51:30] (step=0001789) Train Loss: 0.2662, Train Steps/Sec: 0.12, Epoch: 0.034764865915274, LR: 0.001 [2025-07-27 23:51:38] (step=0001790) Train Loss: 0.2016, Train Steps/Sec: 0.12, Epoch: 0.03478429848425962, LR: 0.001 [2025-07-27 23:51:46] (step=0001791) Train Loss: 0.2111, Train Steps/Sec: 0.12, Epoch: 0.03480373105324524, LR: 0.001 [2025-07-27 23:51:54] (step=0001792) Train Loss: 0.2959, Train Steps/Sec: 0.12, Epoch: 0.03482316362223086, LR: 0.001 [2025-07-27 23:52:02] (step=0001793) Train Loss: 0.1942, 
Train Steps/Sec: 0.13, Epoch: 0.03484259619121648, LR: 0.001 [2025-07-27 23:52:10] (step=0001794) Train Loss: 0.2824, Train Steps/Sec: 0.12, Epoch: 0.0348620287602021, LR: 0.001 [2025-07-27 23:52:18] (step=0001795) Train Loss: 0.2813, Train Steps/Sec: 0.13, Epoch: 0.03488146132918772, LR: 0.001 [2025-07-27 23:52:26] (step=0001796) Train Loss: 0.3280, Train Steps/Sec: 0.12, Epoch: 0.03490089389817334, LR: 0.001 [2025-07-27 23:52:34] (step=0001797) Train Loss: 0.1806, Train Steps/Sec: 0.13, Epoch: 0.03492032646715896, LR: 0.001 [2025-07-27 23:52:42] (step=0001798) Train Loss: 0.2367, Train Steps/Sec: 0.13, Epoch: 0.03493975903614458, LR: 0.001 [2025-07-27 23:52:50] (step=0001799) Train Loss: 0.3138, Train Steps/Sec: 0.12, Epoch: 0.034959191605130197, LR: 0.001 [2025-07-27 23:52:58] (step=0001800) Train Loss: 0.2617, Train Steps/Sec: 0.13, Epoch: 0.03497862417411582, LR: 0.001 [2025-07-27 23:53:06] (step=0001801) Train Loss: 0.2912, Train Steps/Sec: 0.12, Epoch: 0.03499805674310144, LR: 0.001 [2025-07-27 23:53:14] (step=0001802) Train Loss: 0.2749, Train Steps/Sec: 0.13, Epoch: 0.035017489312087056, LR: 0.001 [2025-07-27 23:53:22] (step=0001803) Train Loss: 0.2753, Train Steps/Sec: 0.12, Epoch: 0.03503692188107268, LR: 0.001 [2025-07-27 23:53:30] (step=0001804) Train Loss: 0.2703, Train Steps/Sec: 0.12, Epoch: 0.0350563544500583, LR: 0.001 [2025-07-27 23:53:38] (step=0001805) Train Loss: 0.2972, Train Steps/Sec: 0.13, Epoch: 0.035075787019043916, LR: 0.001 [2025-07-27 23:53:46] (step=0001806) Train Loss: 0.1710, Train Steps/Sec: 0.12, Epoch: 0.03509521958802954, LR: 0.001 [2025-07-27 23:53:54] (step=0001807) Train Loss: 0.2030, Train Steps/Sec: 0.13, Epoch: 0.035114652157015154, LR: 0.001 [2025-07-27 23:54:02] (step=0001808) Train Loss: 0.2549, Train Steps/Sec: 0.12, Epoch: 0.035134084726000776, LR: 0.001 [2025-07-27 23:54:10] (step=0001809) Train Loss: 0.2895, Train Steps/Sec: 0.12, Epoch: 0.0351535172949864, LR: 0.001 [2025-07-27 23:54:18] (step=0001810) Train Loss: 
0.2723, Train Steps/Sec: 0.13, Epoch: 0.035172949863972014, LR: 0.001 [2025-07-27 23:54:26] (step=0001811) Train Loss: 0.2957, Train Steps/Sec: 0.12, Epoch: 0.035192382432957636, LR: 0.001 [2025-07-27 23:54:34] (step=0001812) Train Loss: 0.2035, Train Steps/Sec: 0.13, Epoch: 0.03521181500194326, LR: 0.001 [2025-07-27 23:54:42] (step=0001813) Train Loss: 0.2669, Train Steps/Sec: 0.12, Epoch: 0.035231247570928874, LR: 0.001 [2025-07-27 23:54:50] (step=0001814) Train Loss: 0.2276, Train Steps/Sec: 0.13, Epoch: 0.035250680139914496, LR: 0.001 [2025-07-27 23:54:58] (step=0001815) Train Loss: 0.2493, Train Steps/Sec: 0.12, Epoch: 0.03527011270890012, LR: 0.001 [2025-07-27 23:55:06] (step=0001816) Train Loss: 0.2761, Train Steps/Sec: 0.12, Epoch: 0.035289545277885734, LR: 0.001 [2025-07-27 23:55:14] (step=0001817) Train Loss: 0.2019, Train Steps/Sec: 0.12, Epoch: 0.035308977846871356, LR: 0.001 [2025-07-27 23:55:22] (step=0001818) Train Loss: 0.2872, Train Steps/Sec: 0.12, Epoch: 0.03532841041585698, LR: 0.001 [2025-07-27 23:55:30] (step=0001819) Train Loss: 0.2204, Train Steps/Sec: 0.13, Epoch: 0.035347842984842594, LR: 0.001 [2025-07-27 23:55:38] (step=0001820) Train Loss: 0.2820, Train Steps/Sec: 0.12, Epoch: 0.035367275553828216, LR: 0.001 [2025-07-27 23:55:44] (step=0001821) Train Loss: 0.2822, Train Steps/Sec: 0.18, Epoch: 0.03538670812281384, LR: 0.001 [2025-07-27 23:55:52] (step=0001822) Train Loss: 0.1852, Train Steps/Sec: 0.12, Epoch: 0.035406140691799454, LR: 0.001 [2025-07-27 23:56:00] (step=0001823) Train Loss: 0.1923, Train Steps/Sec: 0.13, Epoch: 0.035425573260785076, LR: 0.001 [2025-07-27 23:56:08] (step=0001824) Train Loss: 0.2068, Train Steps/Sec: 0.12, Epoch: 0.0354450058297707, LR: 0.001 [2025-07-27 23:56:16] (step=0001825) Train Loss: 0.2620, Train Steps/Sec: 0.12, Epoch: 0.035464438398756314, LR: 0.001 [2025-07-27 23:56:24] (step=0001826) Train Loss: 0.2116, Train Steps/Sec: 0.12, Epoch: 0.035483870967741936, LR: 0.001 [2025-07-27 23:56:32] 
(step=0001827) Train Loss: 0.2088, Train Steps/Sec: 0.12, Epoch: 0.03550330353672756, LR: 0.001 [2025-07-27 23:56:40] (step=0001828) Train Loss: 0.2463, Train Steps/Sec: 0.12, Epoch: 0.035522736105713174, LR: 0.001 [2025-07-27 23:56:48] (step=0001829) Train Loss: 0.1940, Train Steps/Sec: 0.12, Epoch: 0.035542168674698796, LR: 0.001 [2025-07-27 23:56:56] (step=0001830) Train Loss: 0.2994, Train Steps/Sec: 0.13, Epoch: 0.03556160124368442, LR: 0.001 [2025-07-27 23:57:04] (step=0001831) Train Loss: 0.2084, Train Steps/Sec: 0.13, Epoch: 0.03558103381267003, LR: 0.001 [2025-07-27 23:57:12] (step=0001832) Train Loss: 0.2243, Train Steps/Sec: 0.12, Epoch: 0.035600466381655656, LR: 0.001 [2025-07-27 23:57:20] (step=0001833) Train Loss: 0.2628, Train Steps/Sec: 0.12, Epoch: 0.03561989895064128, LR: 0.001 [2025-07-27 23:57:28] (step=0001834) Train Loss: 0.2944, Train Steps/Sec: 0.12, Epoch: 0.03563933151962689, LR: 0.001 [2025-07-27 23:57:37] (step=0001835) Train Loss: 0.2192, Train Steps/Sec: 0.12, Epoch: 0.035658764088612516, LR: 0.001 [2025-07-27 23:57:45] (step=0001836) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.03567819665759813, LR: 0.001 [2025-07-27 23:57:53] (step=0001837) Train Loss: 0.2896, Train Steps/Sec: 0.12, Epoch: 0.03569762922658375, LR: 0.001 [2025-07-27 23:58:01] (step=0001838) Train Loss: 0.2375, Train Steps/Sec: 0.13, Epoch: 0.035717061795569376, LR: 0.001 [2025-07-27 23:58:09] (step=0001839) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.03573649436455499, LR: 0.001 [2025-07-27 23:58:17] (step=0001840) Train Loss: 0.2234, Train Steps/Sec: 0.12, Epoch: 0.03575592693354061, LR: 0.001 [2025-07-27 23:58:25] (step=0001841) Train Loss: 0.2074, Train Steps/Sec: 0.12, Epoch: 0.035775359502526236, LR: 0.001 [2025-07-27 23:58:33] (step=0001842) Train Loss: 0.2880, Train Steps/Sec: 0.12, Epoch: 0.03579479207151185, LR: 0.001 [2025-07-27 23:58:41] (step=0001843) Train Loss: 0.3101, Train Steps/Sec: 0.12, Epoch: 0.03581422464049747, LR: 0.001 
[2025-07-27 23:58:49] (step=0001844) Train Loss: 0.2293, Train Steps/Sec: 0.12, Epoch: 0.035833657209483095, LR: 0.001 [2025-07-27 23:58:57] (step=0001845) Train Loss: 0.2935, Train Steps/Sec: 0.12, Epoch: 0.03585308977846871, LR: 0.001 [2025-07-27 23:59:05] (step=0001846) Train Loss: 0.2129, Train Steps/Sec: 0.12, Epoch: 0.03587252234745433, LR: 0.001 [2025-07-27 23:59:13] (step=0001847) Train Loss: 0.2720, Train Steps/Sec: 0.13, Epoch: 0.035891954916439955, LR: 0.001 [2025-07-27 23:59:21] (step=0001848) Train Loss: 0.2404, Train Steps/Sec: 0.12, Epoch: 0.03591138748542557, LR: 0.001 [2025-07-27 23:59:29] (step=0001849) Train Loss: 0.2447, Train Steps/Sec: 0.12, Epoch: 0.03593082005441119, LR: 0.001 [2025-07-27 23:59:37] (step=0001850) Train Loss: 0.2429, Train Steps/Sec: 0.13, Epoch: 0.035950252623396815, LR: 0.001 [2025-07-27 23:59:45] (step=0001851) Train Loss: 0.3061, Train Steps/Sec: 0.12, Epoch: 0.03596968519238243, LR: 0.001 [2025-07-27 23:59:53] (step=0001852) Train Loss: 0.2473, Train Steps/Sec: 0.13, Epoch: 0.03598911776136805, LR: 0.001 [2025-07-28 00:00:01] (step=0001853) Train Loss: 0.2648, Train Steps/Sec: 0.12, Epoch: 0.036008550330353675, LR: 0.001 [2025-07-28 00:00:06] (step=0001854) Train Loss: 0.1902, Train Steps/Sec: 0.18, Epoch: 0.03602798289933929, LR: 0.001 [2025-07-28 00:00:14] (step=0001855) Train Loss: 0.2321, Train Steps/Sec: 0.12, Epoch: 0.03604741546832491, LR: 0.001 [2025-07-28 00:00:22] (step=0001856) Train Loss: 0.2097, Train Steps/Sec: 0.12, Epoch: 0.036066848037310535, LR: 0.001 [2025-07-28 00:00:30] (step=0001857) Train Loss: 0.2163, Train Steps/Sec: 0.12, Epoch: 0.03608628060629615, LR: 0.001 [2025-07-28 00:00:38] (step=0001858) Train Loss: 0.3070, Train Steps/Sec: 0.13, Epoch: 0.03610571317528177, LR: 0.001 [2025-07-28 00:00:46] (step=0001859) Train Loss: 0.2891, Train Steps/Sec: 0.12, Epoch: 0.036125145744267395, LR: 0.001 [2025-07-28 00:00:54] (step=0001860) Train Loss: 0.2851, Train Steps/Sec: 0.12, Epoch: 
0.03614457831325301, LR: 0.001 [2025-07-28 00:01:02] (step=0001861) Train Loss: 0.2191, Train Steps/Sec: 0.13, Epoch: 0.03616401088223863, LR: 0.001 [2025-07-28 00:01:11] (step=0001862) Train Loss: 0.2131, Train Steps/Sec: 0.12, Epoch: 0.036183443451224255, LR: 0.001 [2025-07-28 00:01:19] (step=0001863) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.03620287602020987, LR: 0.001 [2025-07-28 00:01:27] (step=0001864) Train Loss: 0.2309, Train Steps/Sec: 0.12, Epoch: 0.03622230858919549, LR: 0.001 [2025-07-28 00:01:35] (step=0001865) Train Loss: 0.2304, Train Steps/Sec: 0.13, Epoch: 0.036241741158181115, LR: 0.001 [2025-07-28 00:01:43] (step=0001866) Train Loss: 0.2294, Train Steps/Sec: 0.12, Epoch: 0.03626117372716673, LR: 0.001 [2025-07-28 00:01:51] (step=0001867) Train Loss: 0.1846, Train Steps/Sec: 0.12, Epoch: 0.03628060629615235, LR: 0.001 [2025-07-28 00:01:59] (step=0001868) Train Loss: 0.1839, Train Steps/Sec: 0.13, Epoch: 0.03630003886513797, LR: 0.001 [2025-07-28 00:02:07] (step=0001869) Train Loss: 0.2026, Train Steps/Sec: 0.12, Epoch: 0.03631947143412359, LR: 0.001 [2025-07-28 00:02:15] (step=0001870) Train Loss: 0.1782, Train Steps/Sec: 0.13, Epoch: 0.03633890400310921, LR: 0.001 [2025-07-28 00:02:23] (step=0001871) Train Loss: 0.1873, Train Steps/Sec: 0.12, Epoch: 0.03635833657209483, LR: 0.001 [2025-07-28 00:02:31] (step=0001872) Train Loss: 0.2221, Train Steps/Sec: 0.12, Epoch: 0.03637776914108045, LR: 0.001 [2025-07-28 00:02:39] (step=0001873) Train Loss: 0.2317, Train Steps/Sec: 0.13, Epoch: 0.03639720171006607, LR: 0.001 [2025-07-28 00:02:47] (step=0001874) Train Loss: 0.1962, Train Steps/Sec: 0.12, Epoch: 0.03641663427905169, LR: 0.001 [2025-07-28 00:02:55] (step=0001875) Train Loss: 0.1727, Train Steps/Sec: 0.13, Epoch: 0.03643606684803731, LR: 0.001 [2025-07-28 00:03:03] (step=0001876) Train Loss: 0.1899, Train Steps/Sec: 0.12, Epoch: 0.03645549941702293, LR: 0.001 [2025-07-28 00:03:11] (step=0001877) Train Loss: 0.2422, Train Steps/Sec: 
0.13, Epoch: 0.03647493198600855, LR: 0.001 [2025-07-28 00:03:19] (step=0001878) Train Loss: 0.2064, Train Steps/Sec: 0.12, Epoch: 0.03649436455499417, LR: 0.001 [2025-07-28 00:03:27] (step=0001879) Train Loss: 0.2656, Train Steps/Sec: 0.13, Epoch: 0.03651379712397979, LR: 0.001 [2025-07-28 00:03:35] (step=0001880) Train Loss: 0.2415, Train Steps/Sec: 0.12, Epoch: 0.03653322969296541, LR: 0.001 [2025-07-28 00:03:43] (step=0001881) Train Loss: 0.2912, Train Steps/Sec: 0.12, Epoch: 0.03655266226195103, LR: 0.001 [2025-07-28 00:03:51] (step=0001882) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.03657209483093665, LR: 0.001 [2025-07-28 00:03:59] (step=0001883) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.03659152739992227, LR: 0.001 [2025-07-28 00:04:07] (step=0001884) Train Loss: 0.1910, Train Steps/Sec: 0.12, Epoch: 0.03661095996890789, LR: 0.001 [2025-07-28 00:04:15] (step=0001885) Train Loss: 0.2003, Train Steps/Sec: 0.13, Epoch: 0.03663039253789351, LR: 0.001 [2025-07-28 00:04:23] (step=0001886) Train Loss: 0.1621, Train Steps/Sec: 0.12, Epoch: 0.03664982510687913, LR: 0.001 [2025-07-28 00:04:28] (step=0001887) Train Loss: 0.2568, Train Steps/Sec: 0.18, Epoch: 0.03666925767586475, LR: 0.001 [2025-07-28 00:04:36] (step=0001888) Train Loss: 0.2850, Train Steps/Sec: 0.13, Epoch: 0.03668869024485037, LR: 0.001 [2025-07-28 00:04:44] (step=0001889) Train Loss: 0.2528, Train Steps/Sec: 0.12, Epoch: 0.03670812281383599, LR: 0.001 [2025-07-28 00:04:53] (step=0001890) Train Loss: 0.2332, Train Steps/Sec: 0.12, Epoch: 0.03672755538282161, LR: 0.001 [2025-07-28 00:05:01] (step=0001891) Train Loss: 0.1745, Train Steps/Sec: 0.12, Epoch: 0.03674698795180723, LR: 0.001 [2025-07-28 00:05:09] (step=0001892) Train Loss: 0.2368, Train Steps/Sec: 0.12, Epoch: 0.03676642052079285, LR: 0.001 [2025-07-28 00:05:17] (step=0001893) Train Loss: 0.1858, Train Steps/Sec: 0.12, Epoch: 0.03678585308977847, LR: 0.001 [2025-07-28 00:05:25] (step=0001894) Train Loss: 0.2238, Train 
Steps/Sec: 0.12, Epoch: 0.03680528565876409, LR: 0.001 [2025-07-28 00:05:33] (step=0001895) Train Loss: 0.3307, Train Steps/Sec: 0.12, Epoch: 0.03682471822774971, LR: 0.001 [2025-07-28 00:05:41] (step=0001896) Train Loss: 0.3358, Train Steps/Sec: 0.12, Epoch: 0.03684415079673533, LR: 0.001 [2025-07-28 00:05:49] (step=0001897) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.036863583365720945, LR: 0.001 [2025-07-28 00:05:57] (step=0001898) Train Loss: 0.1887, Train Steps/Sec: 0.12, Epoch: 0.03688301593470657, LR: 0.001 [2025-07-28 00:06:05] (step=0001899) Train Loss: 0.2270, Train Steps/Sec: 0.12, Epoch: 0.03690244850369219, LR: 0.001 [2025-07-28 00:06:13] (step=0001900) Train Loss: 0.2477, Train Steps/Sec: 0.12, Epoch: 0.036921881072677805, LR: 0.001 [2025-07-28 00:06:21] (step=0001901) Train Loss: 0.2058, Train Steps/Sec: 0.12, Epoch: 0.03694131364166343, LR: 0.001 [2025-07-28 00:06:29] (step=0001902) Train Loss: 0.2094, Train Steps/Sec: 0.12, Epoch: 0.03696074621064905, LR: 0.001 [2025-07-28 00:06:37] (step=0001903) Train Loss: 0.2205, Train Steps/Sec: 0.13, Epoch: 0.036980178779634665, LR: 0.001 [2025-07-28 00:06:45] (step=0001904) Train Loss: 0.2291, Train Steps/Sec: 0.12, Epoch: 0.03699961134862029, LR: 0.001 [2025-07-28 00:06:53] (step=0001905) Train Loss: 0.2217, Train Steps/Sec: 0.12, Epoch: 0.03701904391760591, LR: 0.001 [2025-07-28 00:07:01] (step=0001906) Train Loss: 0.2705, Train Steps/Sec: 0.12, Epoch: 0.037038476486591525, LR: 0.001 [2025-07-28 00:07:09] (step=0001907) Train Loss: 0.2638, Train Steps/Sec: 0.12, Epoch: 0.03705790905557715, LR: 0.001 [2025-07-28 00:07:17] (step=0001908) Train Loss: 0.1456, Train Steps/Sec: 0.13, Epoch: 0.03707734162456277, LR: 0.001 [2025-07-28 00:07:26] (step=0001909) Train Loss: 0.2727, Train Steps/Sec: 0.12, Epoch: 0.037096774193548385, LR: 0.001 [2025-07-28 00:07:34] (step=0001910) Train Loss: 0.2358, Train Steps/Sec: 0.12, Epoch: 0.03711620676253401, LR: 0.001 [2025-07-28 00:07:42] (step=0001911) Train Loss: 
0.1726, Train Steps/Sec: 0.12, Epoch: 0.03713563933151963, LR: 0.001 [2025-07-28 00:07:50] (step=0001912) Train Loss: 0.2892, Train Steps/Sec: 0.12, Epoch: 0.037155071900505245, LR: 0.001 [2025-07-28 00:07:58] (step=0001913) Train Loss: 0.2723, Train Steps/Sec: 0.12, Epoch: 0.03717450446949087, LR: 0.001 [2025-07-28 00:08:06] (step=0001914) Train Loss: 0.1879, Train Steps/Sec: 0.12, Epoch: 0.03719393703847649, LR: 0.001 [2025-07-28 00:08:14] (step=0001915) Train Loss: 0.2803, Train Steps/Sec: 0.12, Epoch: 0.037213369607462105, LR: 0.001 [2025-07-28 00:08:22] (step=0001916) Train Loss: 0.2063, Train Steps/Sec: 0.12, Epoch: 0.03723280217644773, LR: 0.001 [2025-07-28 00:08:30] (step=0001917) Train Loss: 0.2510, Train Steps/Sec: 0.13, Epoch: 0.03725223474543335, LR: 0.001 [2025-07-28 00:08:38] (step=0001918) Train Loss: 0.2924, Train Steps/Sec: 0.13, Epoch: 0.037271667314418964, LR: 0.001 [2025-07-28 00:08:46] (step=0001919) Train Loss: 0.2073, Train Steps/Sec: 0.12, Epoch: 0.03729109988340459, LR: 0.001 [2025-07-28 00:08:52] (step=0001920) Train Loss: 0.2081, Train Steps/Sec: 0.16, Epoch: 0.03731053245239021, LR: 0.001 [2025-07-28 00:09:00] (step=0001921) Train Loss: 0.2911, Train Steps/Sec: 0.13, Epoch: 0.037329965021375824, LR: 0.001 [2025-07-28 00:09:08] (step=0001922) Train Loss: 0.2775, Train Steps/Sec: 0.12, Epoch: 0.03734939759036145, LR: 0.001 [2025-07-28 00:09:16] (step=0001923) Train Loss: 0.2370, Train Steps/Sec: 0.12, Epoch: 0.03736883015934707, LR: 0.001 [2025-07-28 00:09:24] (step=0001924) Train Loss: 0.2374, Train Steps/Sec: 0.12, Epoch: 0.037388262728332684, LR: 0.001 [2025-07-28 00:09:32] (step=0001925) Train Loss: 0.3154, Train Steps/Sec: 0.12, Epoch: 0.03740769529731831, LR: 0.001 [2025-07-28 00:09:40] (step=0001926) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.03742712786630392, LR: 0.001 [2025-07-28 00:09:48] (step=0001927) Train Loss: 0.2443, Train Steps/Sec: 0.12, Epoch: 0.037446560435289544, LR: 0.001 [2025-07-28 00:09:56] (step=0001928) 
Train Loss: 0.2473, Train Steps/Sec: 0.12, Epoch: 0.03746599300427517, LR: 0.001 [2025-07-28 00:10:04] (step=0001929) Train Loss: 0.2705, Train Steps/Sec: 0.12, Epoch: 0.03748542557326078, LR: 0.001 [2025-07-28 00:10:12] (step=0001930) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.037504858142246404, LR: 0.001 [2025-07-28 00:10:20] (step=0001931) Train Loss: 0.2545, Train Steps/Sec: 0.13, Epoch: 0.037524290711232026, LR: 0.001 [2025-07-28 00:10:28] (step=0001932) Train Loss: 0.2299, Train Steps/Sec: 0.12, Epoch: 0.03754372328021764, LR: 0.001 [2025-07-28 00:10:36] (step=0001933) Train Loss: 0.2565, Train Steps/Sec: 0.13, Epoch: 0.037563155849203264, LR: 0.001 [2025-07-28 00:10:45] (step=0001934) Train Loss: 0.2242, Train Steps/Sec: 0.12, Epoch: 0.037582588418188886, LR: 0.001 [2025-07-28 00:10:53] (step=0001935) Train Loss: 0.2737, Train Steps/Sec: 0.12, Epoch: 0.0376020209871745, LR: 0.001 [2025-07-28 00:11:01] (step=0001936) Train Loss: 0.2548, Train Steps/Sec: 0.12, Epoch: 0.037621453556160124, LR: 0.001 [2025-07-28 00:11:09] (step=0001937) Train Loss: 0.2554, Train Steps/Sec: 0.12, Epoch: 0.037640886125145746, LR: 0.001 [2025-07-28 00:11:17] (step=0001938) Train Loss: 0.2068, Train Steps/Sec: 0.13, Epoch: 0.03766031869413136, LR: 0.001 [2025-07-28 00:11:25] (step=0001939) Train Loss: 0.3421, Train Steps/Sec: 0.12, Epoch: 0.037679751263116984, LR: 0.001 [2025-07-28 00:11:33] (step=0001940) Train Loss: 0.2885, Train Steps/Sec: 0.12, Epoch: 0.037699183832102606, LR: 0.001 [2025-07-28 00:11:41] (step=0001941) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.03771861640108822, LR: 0.001 [2025-07-28 00:11:49] (step=0001942) Train Loss: 0.2665, Train Steps/Sec: 0.12, Epoch: 0.037738048970073844, LR: 0.001 [2025-07-28 00:11:57] (step=0001943) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.037757481539059466, LR: 0.001 [2025-07-28 00:12:05] (step=0001944) Train Loss: 0.2714, Train Steps/Sec: 0.12, Epoch: 0.03777691410804508, LR: 0.001 [2025-07-28 00:12:13] 
(step=0001945) Train Loss: 0.2378, Train Steps/Sec: 0.12, Epoch: 0.037796346677030704, LR: 0.001 [2025-07-28 00:12:21] (step=0001946) Train Loss: 0.2653, Train Steps/Sec: 0.13, Epoch: 0.037815779246016326, LR: 0.001 [2025-07-28 00:12:29] (step=0001947) Train Loss: 0.2128, Train Steps/Sec: 0.12, Epoch: 0.03783521181500194, LR: 0.001 [2025-07-28 00:12:37] (step=0001948) Train Loss: 0.2844, Train Steps/Sec: 0.12, Epoch: 0.037854644383987564, LR: 0.001 [2025-07-28 00:12:45] (step=0001949) Train Loss: 0.3008, Train Steps/Sec: 0.12, Epoch: 0.037874076952973186, LR: 0.001 [2025-07-28 00:12:53] (step=0001950) Train Loss: 0.2288, Train Steps/Sec: 0.13, Epoch: 0.0378935095219588, LR: 0.001 [2025-07-28 00:13:01] (step=0001951) Train Loss: 0.1973, Train Steps/Sec: 0.13, Epoch: 0.037912942090944424, LR: 0.001 [2025-07-28 00:13:09] (step=0001952) Train Loss: 0.2770, Train Steps/Sec: 0.12, Epoch: 0.037932374659930046, LR: 0.001 [2025-07-28 00:13:16] (step=0001953) Train Loss: 0.2045, Train Steps/Sec: 0.15, Epoch: 0.03795180722891566, LR: 0.001 [2025-07-28 00:13:23] (step=0001954) Train Loss: 0.2368, Train Steps/Sec: 0.14, Epoch: 0.037971239797901284, LR: 0.001 [2025-07-28 00:13:31] (step=0001955) Train Loss: 0.2103, Train Steps/Sec: 0.12, Epoch: 0.037990672366886906, LR: 0.001 [2025-07-28 00:13:39] (step=0001956) Train Loss: 0.1823, Train Steps/Sec: 0.13, Epoch: 0.03801010493587252, LR: 0.001 [2025-07-28 00:13:47] (step=0001957) Train Loss: 0.2608, Train Steps/Sec: 0.13, Epoch: 0.038029537504858144, LR: 0.001 [2025-07-28 00:13:55] (step=0001958) Train Loss: 0.2913, Train Steps/Sec: 0.12, Epoch: 0.03804897007384376, LR: 0.001 [2025-07-28 00:14:03] (step=0001959) Train Loss: 0.1981, Train Steps/Sec: 0.12, Epoch: 0.03806840264282938, LR: 0.001 [2025-07-28 00:14:11] (step=0001960) Train Loss: 0.2735, Train Steps/Sec: 0.12, Epoch: 0.038087835211815003, LR: 0.001 [2025-07-28 00:14:19] (step=0001961) Train Loss: 0.2821, Train Steps/Sec: 0.12, Epoch: 0.03810726778080062, LR: 0.001 
[2025-07-28 00:14:27] (step=0001962) Train Loss: 0.2416, Train Steps/Sec: 0.13, Epoch: 0.03812670034978624, LR: 0.001
[2025-07-28 00:14:35] (step=0001963) Train Loss: 0.2572, Train Steps/Sec: 0.12, Epoch: 0.03814613291877186, LR: 0.001
[2025-07-28 00:14:43] (step=0001964) Train Loss: 0.1824, Train Steps/Sec: 0.13, Epoch: 0.03816556548775748, LR: 0.001
[2025-07-28 00:14:51] (step=0001965) Train Loss: 0.2350, Train Steps/Sec: 0.12, Epoch: 0.0381849980567431, LR: 0.001
[2025-07-28 00:15:00] (step=0001966) Train Loss: 0.2936, Train Steps/Sec: 0.12, Epoch: 0.03820443062572872, LR: 0.001
[2025-07-28 00:15:08] (step=0001967) Train Loss: 0.1920, Train Steps/Sec: 0.12, Epoch: 0.03822386319471434, LR: 0.001
[2025-07-28 00:15:16] (step=0001968) Train Loss: 0.2925, Train Steps/Sec: 0.12, Epoch: 0.03824329576369996, LR: 0.001
[2025-07-28 00:15:24] (step=0001969) Train Loss: 0.3209, Train Steps/Sec: 0.12, Epoch: 0.03826272833268558, LR: 0.001
[2025-07-28 00:15:32] (step=0001970) Train Loss: 0.2874, Train Steps/Sec: 0.12, Epoch: 0.0382821609016712, LR: 0.001
[2025-07-28 00:15:40] (step=0001971) Train Loss: 0.2605, Train Steps/Sec: 0.13, Epoch: 0.03830159347065682, LR: 0.001
[2025-07-28 00:15:48] (step=0001972) Train Loss: 0.1957, Train Steps/Sec: 0.12, Epoch: 0.03832102603964244, LR: 0.001
[2025-07-28 00:15:56] (step=0001973) Train Loss: 0.2372, Train Steps/Sec: 0.12, Epoch: 0.03834045860862806, LR: 0.001
[2025-07-28 00:16:04] (step=0001974) Train Loss: 0.2501, Train Steps/Sec: 0.13, Epoch: 0.03835989117761368, LR: 0.001
[2025-07-28 00:16:12] (step=0001975) Train Loss: 0.2612, Train Steps/Sec: 0.13, Epoch: 0.0383793237465993, LR: 0.001
[2025-07-28 00:16:20] (step=0001976) Train Loss: 0.2559, Train Steps/Sec: 0.13, Epoch: 0.03839875631558492, LR: 0.001
[2025-07-28 00:16:28] (step=0001977) Train Loss: 0.2359, Train Steps/Sec: 0.12, Epoch: 0.03841818888457054, LR: 0.001
[2025-07-28 00:16:36] (step=0001978) Train Loss: 0.2096, Train Steps/Sec: 0.12, Epoch: 0.03843762145355616, LR: 0.001
[2025-07-28 00:16:44] (step=0001979) Train Loss: 0.2134, Train Steps/Sec: 0.12, Epoch: 0.03845705402254178, LR: 0.001
[2025-07-28 00:16:52] (step=0001980) Train Loss: 0.2208, Train Steps/Sec: 0.12, Epoch: 0.0384764865915274, LR: 0.001
[2025-07-28 00:17:00] (step=0001981) Train Loss: 0.1836, Train Steps/Sec: 0.12, Epoch: 0.03849591916051302, LR: 0.001
[2025-07-28 00:17:08] (step=0001982) Train Loss: 0.2509, Train Steps/Sec: 0.12, Epoch: 0.03851535172949864, LR: 0.001
[2025-07-28 00:17:16] (step=0001983) Train Loss: 0.1991, Train Steps/Sec: 0.13, Epoch: 0.03853478429848426, LR: 0.001
[2025-07-28 00:17:24] (step=0001984) Train Loss: 0.2672, Train Steps/Sec: 0.12, Epoch: 0.03855421686746988, LR: 0.001
[2025-07-28 00:17:32] (step=0001985) Train Loss: 0.2465, Train Steps/Sec: 0.13, Epoch: 0.0385736494364555, LR: 0.001
[2025-07-28 00:17:39] (step=0001986) Train Loss: 0.2738, Train Steps/Sec: 0.14, Epoch: 0.03859308200544112, LR: 0.001
[2025-07-28 00:17:45] (step=0001987) Train Loss: 0.3151, Train Steps/Sec: 0.17, Epoch: 0.038612514574426736, LR: 0.001
[2025-07-28 00:17:53] (step=0001988) Train Loss: 0.2358, Train Steps/Sec: 0.12, Epoch: 0.03863194714341236, LR: 0.001
[2025-07-28 00:18:01] (step=0001989) Train Loss: 0.2840, Train Steps/Sec: 0.13, Epoch: 0.03865137971239798, LR: 0.001
[2025-07-28 00:18:09] (step=0001990) Train Loss: 0.2110, Train Steps/Sec: 0.12, Epoch: 0.038670812281383596, LR: 0.001
[2025-07-28 00:18:17] (step=0001991) Train Loss: 0.3786, Train Steps/Sec: 0.12, Epoch: 0.03869024485036922, LR: 0.001
[2025-07-28 00:18:25] (step=0001992) Train Loss: 0.1942, Train Steps/Sec: 0.13, Epoch: 0.03870967741935484, LR: 0.001
[2025-07-28 00:18:33] (step=0001993) Train Loss: 0.2381, Train Steps/Sec: 0.12, Epoch: 0.038729109988340456, LR: 0.001
[2025-07-28 00:18:41] (step=0001994) Train Loss: 0.2148, Train Steps/Sec: 0.12, Epoch: 0.03874854255732608, LR: 0.001
[2025-07-28 00:18:49] (step=0001995) Train Loss: 0.2155, Train Steps/Sec: 0.12, Epoch: 0.0387679751263117, LR: 0.001
[2025-07-28 00:18:57] (step=0001996) Train Loss: 0.2372, Train Steps/Sec: 0.12, Epoch: 0.038787407695297316, LR: 0.001
[2025-07-28 00:19:05] (step=0001997) Train Loss: 0.2545, Train Steps/Sec: 0.12, Epoch: 0.03880684026428294, LR: 0.001
[2025-07-28 00:19:13] (step=0001998) Train Loss: 0.2437, Train Steps/Sec: 0.12, Epoch: 0.03882627283326856, LR: 0.001
[2025-07-28 00:19:22] (step=0001999) Train Loss: 0.2871, Train Steps/Sec: 0.12, Epoch: 0.038845705402254176, LR: 0.001
[2025-07-28 00:19:30] (step=0002000) Train Loss: 0.2841, Train Steps/Sec: 0.12, Epoch: 0.0388651379712398, LR: 0.001
[2025-07-28 00:19:30] Saved checkpoint to /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0002000/
[2025-07-28 00:19:38] (step=0002001) Train Loss: 0.1918, Train Steps/Sec: 0.12, Epoch: 0.03888457054022542, LR: 0.001
[2025-07-28 00:19:46] (step=0002002) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.038904003109211036, LR: 0.001
[2025-07-28 00:19:54] (step=0002003) Train Loss: 0.1723, Train Steps/Sec: 0.12, Epoch: 0.03892343567819666, LR: 0.001
[2025-07-28 00:20:02] (step=0002004) Train Loss: 0.2396, Train Steps/Sec: 0.13, Epoch: 0.03894286824718228, LR: 0.001
[2025-07-28 00:20:10] (step=0002005) Train Loss: 0.2628, Train Steps/Sec: 0.12, Epoch: 0.038962300816167895, LR: 0.001
[2025-07-28 00:20:18] (step=0002006) Train Loss: 0.1934, Train Steps/Sec: 0.12, Epoch: 0.03898173338515352, LR: 0.001
[2025-07-28 00:20:26] (step=0002007) Train Loss: 0.2881, Train Steps/Sec: 0.12, Epoch: 0.03900116595413914, LR: 0.001
[2025-07-28 00:20:34] (step=0002008) Train Loss: 0.2565, Train Steps/Sec: 0.12, Epoch: 0.039020598523124755, LR: 0.001
[2025-07-28 00:20:42] (step=0002009) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.03904003109211038, LR: 0.001
[2025-07-28 00:20:50] (step=0002010) Train Loss: 0.3064, Train Steps/Sec: 0.12, Epoch: 0.039059463661096, LR: 0.001
[2025-07-28 00:20:58] (step=0002011) Train Loss: 0.2458, Train Steps/Sec: 0.13, Epoch: 0.039078896230081615, LR: 0.001
[2025-07-28 00:21:06] (step=0002012) Train Loss: 0.2145, Train Steps/Sec: 0.12, Epoch: 0.03909832879906724, LR: 0.001
[2025-07-28 00:21:14] (step=0002013) Train Loss: 0.3170, Train Steps/Sec: 0.12, Epoch: 0.03911776136805286, LR: 0.001
[2025-07-28 00:21:23] (step=0002014) Train Loss: 0.2386, Train Steps/Sec: 0.12, Epoch: 0.039137193937038475, LR: 0.001
[2025-07-28 00:21:31] (step=0002015) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.0391566265060241, LR: 0.001
[2025-07-28 00:21:39] (step=0002016) Train Loss: 0.2534, Train Steps/Sec: 0.12, Epoch: 0.03917605907500971, LR: 0.001
[2025-07-28 00:21:47] (step=0002017) Train Loss: 0.3172, Train Steps/Sec: 0.12, Epoch: 0.039195491643995335, LR: 0.001
[2025-07-28 00:21:55] (step=0002018) Train Loss: 0.2774, Train Steps/Sec: 0.13, Epoch: 0.03921492421298096, LR: 0.001
[2025-07-28 00:22:03] (step=0002019) Train Loss: 0.2255, Train Steps/Sec: 0.13, Epoch: 0.03923435678196657, LR: 0.001
[2025-07-28 00:22:08] (step=0002020) Train Loss: 0.2108, Train Steps/Sec: 0.18, Epoch: 0.039253789350952195, LR: 0.001
[2025-07-28 00:22:16] (step=0002021) Train Loss: 0.1995, Train Steps/Sec: 0.12, Epoch: 0.03927322191993782, LR: 0.001
[2025-07-28 00:22:24] (step=0002022) Train Loss: 0.2786, Train Steps/Sec: 0.12, Epoch: 0.03929265448892343, LR: 0.001
[2025-07-28 00:22:32] (step=0002023) Train Loss: 0.1977, Train Steps/Sec: 0.12, Epoch: 0.039312087057909055, LR: 0.001
[2025-07-28 00:22:40] (step=0002024) Train Loss: 0.2730, Train Steps/Sec: 0.13, Epoch: 0.03933151962689468, LR: 0.001
[2025-07-28 00:22:48] (step=0002025) Train Loss: 0.2676, Train Steps/Sec: 0.12, Epoch: 0.03935095219588029, LR: 0.001
[2025-07-28 00:22:56] (step=0002026) Train Loss: 0.2436, Train Steps/Sec: 0.12, Epoch: 0.039370384764865915, LR: 0.001
[2025-07-28 00:23:05] (step=0002027) Train Loss: 0.2754, Train Steps/Sec: 0.12, Epoch: 0.03938981733385154, LR: 0.001
[2025-07-28 00:23:13] (step=0002028) Train Loss: 0.3064, Train Steps/Sec: 0.12, Epoch: 0.03940924990283715, LR: 0.001
[2025-07-28 00:23:20] (step=0002029) Train Loss: 0.3128, Train Steps/Sec: 0.13, Epoch: 0.039428682471822775, LR: 0.001
[2025-07-28 00:23:29] (step=0002030) Train Loss: 0.2645, Train Steps/Sec: 0.12, Epoch: 0.0394481150408084, LR: 0.001
[2025-07-28 00:23:37] (step=0002031) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.03946754760979401, LR: 0.001
[2025-07-28 00:23:45] (step=0002032) Train Loss: 0.2903, Train Steps/Sec: 0.12, Epoch: 0.039486980178779635, LR: 0.001
[2025-07-28 00:23:53] (step=0002033) Train Loss: 0.2208, Train Steps/Sec: 0.12, Epoch: 0.03950641274776526, LR: 0.001
[2025-07-28 00:24:01] (step=0002034) Train Loss: 0.2432, Train Steps/Sec: 0.13, Epoch: 0.03952584531675087, LR: 0.001
[2025-07-28 00:24:09] (step=0002035) Train Loss: 0.2099, Train Steps/Sec: 0.12, Epoch: 0.039545277885736495, LR: 0.001
[2025-07-28 00:24:17] (step=0002036) Train Loss: 0.2428, Train Steps/Sec: 0.12, Epoch: 0.03956471045472212, LR: 0.001
[2025-07-28 00:24:25] (step=0002037) Train Loss: 0.2259, Train Steps/Sec: 0.12, Epoch: 0.03958414302370773, LR: 0.001
[2025-07-28 00:24:33] (step=0002038) Train Loss: 0.2438, Train Steps/Sec: 0.12, Epoch: 0.039603575592693355, LR: 0.001
[2025-07-28 00:24:41] (step=0002039) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.03962300816167898, LR: 0.001
[2025-07-28 00:24:49] (step=0002040) Train Loss: 0.1878, Train Steps/Sec: 0.12, Epoch: 0.03964244073066459, LR: 0.001
[2025-07-28 00:24:57] (step=0002041) Train Loss: 0.2886, Train Steps/Sec: 0.13, Epoch: 0.039661873299650215, LR: 0.001
[2025-07-28 00:25:05] (step=0002042) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.03968130586863584, LR: 0.001
[2025-07-28 00:25:13] (step=0002043) Train Loss: 0.1668, Train Steps/Sec: 0.12, Epoch: 0.03970073843762145, LR: 0.001
[2025-07-28 00:25:21] (step=0002044) Train Loss: 0.2677, Train Steps/Sec: 0.12, Epoch: 0.039720171006607075, LR: 0.001
[2025-07-28 00:25:29] (step=0002045) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.03973960357559269, LR: 0.001
[2025-07-28 00:25:37] (step=0002046) Train Loss: 0.2338, Train Steps/Sec: 0.12, Epoch: 0.03975903614457831, LR: 0.001
[2025-07-28 00:25:45] (step=0002047) Train Loss: 0.2163, Train Steps/Sec: 0.12, Epoch: 0.039778468713563934, LR: 0.001
[2025-07-28 00:25:54] (step=0002048) Train Loss: 0.2312, Train Steps/Sec: 0.12, Epoch: 0.03979790128254955, LR: 0.001
[2025-07-28 00:26:02] (step=0002049) Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.03981733385153517, LR: 0.001
[2025-07-28 00:26:10] (step=0002050) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.039836766420520794, LR: 0.001
[2025-07-28 00:26:17] (step=0002051) Train Loss: 0.2817, Train Steps/Sec: 0.13, Epoch: 0.03985619898950641, LR: 0.001
[2025-07-28 00:26:26] (step=0002052) Train Loss: 0.2238, Train Steps/Sec: 0.12, Epoch: 0.03987563155849203, LR: 0.001
[2025-07-28 00:26:31] (step=0002053) Train Loss: 0.2381, Train Steps/Sec: 0.18, Epoch: 0.039895064127477654, LR: 0.001
[2025-07-28 00:26:39] (step=0002054) Train Loss: 0.2865, Train Steps/Sec: 0.12, Epoch: 0.03991449669646327, LR: 0.001
[2025-07-28 00:26:47] (step=0002055) Train Loss: 0.2461, Train Steps/Sec: 0.13, Epoch: 0.03993392926544889, LR: 0.001
[2025-07-28 00:26:55] (step=0002056) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.039953361834434514, LR: 0.001
[2025-07-28 00:27:03] (step=0002057) Train Loss: 0.2827, Train Steps/Sec: 0.12, Epoch: 0.03997279440342013, LR: 0.001
[2025-07-28 00:27:11] (step=0002058) Train Loss: 0.2345, Train Steps/Sec: 0.12, Epoch: 0.03999222697240575, LR: 0.001
[2025-07-28 00:27:19] (step=0002059) Train Loss: 0.2337, Train Steps/Sec: 0.12, Epoch: 0.040011659541391374, LR: 0.001
[2025-07-28 00:27:28] (step=0002060) Train Loss: 0.2866, Train Steps/Sec: 0.12, Epoch: 0.04003109211037699, LR: 0.001
[2025-07-28 00:27:36] (step=0002061) Train Loss: 0.2397, Train Steps/Sec: 0.12, Epoch: 0.04005052467936261, LR: 0.001
[2025-07-28 00:27:44] (step=0002062) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.040069957248348234, LR: 0.001
[2025-07-28 00:27:52] (step=0002063) Train Loss: 0.3321, Train Steps/Sec: 0.12, Epoch: 0.04008938981733385, LR: 0.001
[2025-07-28 00:28:00] (step=0002064) Train Loss: 0.2703, Train Steps/Sec: 0.12, Epoch: 0.04010882238631947, LR: 0.001
[2025-07-28 00:28:08] (step=0002065) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.040128254955305094, LR: 0.001
[2025-07-28 00:28:16] (step=0002066) Train Loss: 0.2455, Train Steps/Sec: 0.12, Epoch: 0.04014768752429071, LR: 0.001
[2025-07-28 00:28:24] (step=0002067) Train Loss: 0.2517, Train Steps/Sec: 0.13, Epoch: 0.04016712009327633, LR: 0.001
[2025-07-28 00:28:32] (step=0002068) Train Loss: 0.3006, Train Steps/Sec: 0.12, Epoch: 0.040186552662261954, LR: 0.001
[2025-07-28 00:28:40] (step=0002069) Train Loss: 0.2286, Train Steps/Sec: 0.12, Epoch: 0.04020598523124757, LR: 0.001
[2025-07-28 00:28:48] (step=0002070) Train Loss: 0.2373, Train Steps/Sec: 0.12, Epoch: 0.04022541780023319, LR: 0.001
[2025-07-28 00:28:56] (step=0002071) Train Loss: 0.1919, Train Steps/Sec: 0.12, Epoch: 0.040244850369218814, LR: 0.001
[2025-07-28 00:29:04] (step=0002072) Train Loss: 0.2551, Train Steps/Sec: 0.13, Epoch: 0.04026428293820443, LR: 0.001
[2025-07-28 00:29:12] (step=0002073) Train Loss: 0.3228, Train Steps/Sec: 0.12, Epoch: 0.04028371550719005, LR: 0.001
[2025-07-28 00:29:20] (step=0002074) Train Loss: 0.3016, Train Steps/Sec: 0.12, Epoch: 0.040303148076175674, LR: 0.001
[2025-07-28 00:29:24] (step=0002075) Train Loss: 0.2653, Train Steps/Sec: 0.24, Epoch: 0.04032258064516129, LR: 0.001
[2025-07-28 00:29:28] (step=0002076) Train Loss: 0.2702, Train Steps/Sec: 0.27, Epoch: 0.04034201321414691, LR: 0.001
[2025-07-28 00:29:32] (step=0002077) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.04036144578313253, LR: 0.001
[2025-07-28 00:29:35] (step=0002078) Train Loss: 0.2406, Train Steps/Sec: 0.28, Epoch: 0.04038087835211815, LR: 0.001
[2025-07-28 00:29:39] (step=0002079) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.04040031092110377, LR: 0.001
[2025-07-28 00:29:43] (step=0002080) Train Loss: 0.3161, Train Steps/Sec: 0.28, Epoch: 0.04041974349008939, LR: 0.001
[2025-07-28 00:29:46] (step=0002081) Train Loss: 0.2732, Train Steps/Sec: 0.28, Epoch: 0.04043917605907501, LR: 0.001
[2025-07-28 00:29:50] (step=0002082) Train Loss: 0.2626, Train Steps/Sec: 0.28, Epoch: 0.04045860862806063, LR: 0.001
[2025-07-28 00:29:53] (step=0002083) Train Loss: 0.1747, Train Steps/Sec: 0.28, Epoch: 0.04047804119704625, LR: 0.001
[2025-07-28 00:29:57] (step=0002084) Train Loss: 0.1542, Train Steps/Sec: 0.28, Epoch: 0.04049747376603187, LR: 0.001
[2025-07-28 00:30:01] (step=0002085) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.04051690633501749, LR: 0.001
[2025-07-28 00:30:04] (step=0002086) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.04053633890400311, LR: 0.001
[2025-07-28 00:30:08] (step=0002087) Train Loss: 0.2241, Train Steps/Sec: 0.27, Epoch: 0.04055577147298873, LR: 0.001
[2025-07-28 00:30:12] (step=0002088) Train Loss: 0.3127, Train Steps/Sec: 0.27, Epoch: 0.04057520404197435, LR: 0.001
[2025-07-28 00:30:15] (step=0002089) Train Loss: 0.3014, Train Steps/Sec: 0.28, Epoch: 0.04059463661095997, LR: 0.001
[2025-07-28 00:30:19] (step=0002090) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.04061406917994559, LR: 0.001
[2025-07-28 00:30:23] (step=0002091) Train Loss: 0.2247, Train Steps/Sec: 0.27, Epoch: 0.04063350174893121, LR: 0.001
[2025-07-28 00:30:26] (step=0002092) Train Loss: 0.2289, Train Steps/Sec: 0.28, Epoch: 0.040652934317916826, LR: 0.001
[2025-07-28 00:30:30] (step=0002093) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.04067236688690245, LR: 0.001
[2025-07-28 00:30:33] (step=0002094) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.04069179945588807, LR: 0.001
[2025-07-28 00:30:37] (step=0002095) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.040711232024873686, LR: 0.001
[2025-07-28 00:30:41] (step=0002096) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.04073066459385931, LR: 0.001
[2025-07-28 00:30:44] (step=0002097) Train Loss: 0.2112, Train Steps/Sec: 0.27, Epoch: 0.04075009716284493, LR: 0.001
[2025-07-28 00:30:48] (step=0002098) Train Loss: 0.1655, Train Steps/Sec: 0.27, Epoch: 0.040769529731830546, LR: 0.001
[2025-07-28 00:30:52] (step=0002099) Train Loss: 0.2001, Train Steps/Sec: 0.27, Epoch: 0.04078896230081617, LR: 0.001
[2025-07-28 00:30:55] (step=0002100) Train Loss: 0.3297, Train Steps/Sec: 0.28, Epoch: 0.04080839486980179, LR: 0.001
[2025-07-28 00:30:59] (step=0002101) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.040827827438787406, LR: 0.001
[2025-07-28 00:31:03] (step=0002102) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.04084726000777303, LR: 0.001
[2025-07-28 00:31:06] (step=0002103) Train Loss: 0.2369, Train Steps/Sec: 0.27, Epoch: 0.04086669257675865, LR: 0.001
[2025-07-28 00:31:10] (step=0002104) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.040886125145744266, LR: 0.001
[2025-07-28 00:31:13] (step=0002105) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.04090555771472989, LR: 0.001
[2025-07-28 00:31:17] (step=0002106) Train Loss: 0.2971, Train Steps/Sec: 0.27, Epoch: 0.040924990283715504, LR: 0.001
[2025-07-28 00:31:21] (step=0002107) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.040944422852701126, LR: 0.001
[2025-07-28 00:31:24] (step=0002108) Train Loss: 0.2727, Train Steps/Sec: 0.26, Epoch: 0.04096385542168675, LR: 0.001
[2025-07-28 00:31:28] (step=0002109) Train Loss: 0.2909, Train Steps/Sec: 0.28, Epoch: 0.040983287990672364, LR: 0.001
[2025-07-28 00:31:32] (step=0002110) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.041002720559657986, LR: 0.001
[2025-07-28 00:31:35] (step=0002111) Train Loss: 0.2960, Train Steps/Sec: 0.28, Epoch: 0.04102215312864361, LR: 0.001
[2025-07-28 00:31:39] (step=0002112) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.041041585697629224, LR: 0.001
[2025-07-28 00:31:43] (step=0002113) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.041061018266614846, LR: 0.001
[2025-07-28 00:31:46] (step=0002114) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.04108045083560047, LR: 0.001
[2025-07-28 00:31:50] (step=0002115) Train Loss: 0.2622, Train Steps/Sec: 0.28, Epoch: 0.041099883404586084, LR: 0.001
[2025-07-28 00:31:54] (step=0002116) Train Loss: 0.2583, Train Steps/Sec: 0.27, Epoch: 0.041119315973571706, LR: 0.001
[2025-07-28 00:31:57] (step=0002117) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.04113874854255733, LR: 0.001
[2025-07-28 00:32:01] (step=0002118) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.041158181111542944, LR: 0.001
[2025-07-28 00:32:04] (step=0002119) Train Loss: 0.2843, Train Steps/Sec: 0.28, Epoch: 0.041177613680528566, LR: 0.001
[2025-07-28 00:32:08] (step=0002120) Train Loss: 0.2785, Train Steps/Sec: 0.27, Epoch: 0.04119704624951419, LR: 0.001
[2025-07-28 00:32:12] (step=0002121) Train Loss: 0.2312, Train Steps/Sec: 0.27, Epoch: 0.041216478818499803, LR: 0.001
[2025-07-28 00:32:15] (step=0002122) Train Loss: 0.1703, Train Steps/Sec: 0.27, Epoch: 0.041235911387485426, LR: 0.001
[2025-07-28 00:32:19] (step=0002123) Train Loss: 0.3154, Train Steps/Sec: 0.27, Epoch: 0.04125534395647105, LR: 0.001
[2025-07-28 00:32:23] (step=0002124) Train Loss: 0.2945, Train Steps/Sec: 0.27, Epoch: 0.04127477652545666, LR: 0.001
[2025-07-28 00:32:26] (step=0002125) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.041294209094442286, LR: 0.001
[2025-07-28 00:32:30] (step=0002126) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.04131364166342791, LR: 0.001
[2025-07-28 00:32:33] (step=0002127) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.04133307423241352, LR: 0.001
[2025-07-28 00:32:37] (step=0002128) Train Loss: 0.2435, Train Steps/Sec: 0.27, Epoch: 0.041352506801399146, LR: 0.001
[2025-07-28 00:32:41] (step=0002129) Train Loss: 0.2077, Train Steps/Sec: 0.27, Epoch: 0.04137193937038477, LR: 0.001
[2025-07-28 00:32:44] (step=0002130) Train Loss: 0.2066, Train Steps/Sec: 0.27, Epoch: 0.04139137193937038, LR: 0.001
[2025-07-28 00:32:48] (step=0002131) Train Loss: 0.2395, Train Steps/Sec: 0.27, Epoch: 0.041410804508356006, LR: 0.001
[2025-07-28 00:32:52] (step=0002132) Train Loss: 0.1522, Train Steps/Sec: 0.27, Epoch: 0.04143023707734163, LR: 0.001
[2025-07-28 00:32:55] (step=0002133) Train Loss: 0.2346, Train Steps/Sec: 0.27, Epoch: 0.04144966964632724, LR: 0.001
[2025-07-28 00:32:59] (step=0002134) Train Loss: 0.2706, Train Steps/Sec: 0.28, Epoch: 0.041469102215312866, LR: 0.001
[2025-07-28 00:33:03] (step=0002135) Train Loss: 0.2413, Train Steps/Sec: 0.27, Epoch: 0.04148853478429848, LR: 0.001
[2025-07-28 00:33:06] (step=0002136) Train Loss: 0.2255, Train Steps/Sec: 0.27, Epoch: 0.0415079673532841, LR: 0.001
[2025-07-28 00:33:10] (step=0002137) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.041527399922269725, LR: 0.001
[2025-07-28 00:33:14] (step=0002138) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.04154683249125534, LR: 0.001
[2025-07-28 00:33:17] (step=0002139) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.04156626506024096, LR: 0.001
[2025-07-28 00:33:21] (step=0002140) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.041585697629226585, LR: 0.001
[2025-07-28 00:33:24] (step=0002141) Train Loss: 0.1932, Train Steps/Sec: 0.28, Epoch: 0.0416051301982122, LR: 0.001
[2025-07-28 00:33:28] (step=0002142) Train Loss: 0.2820, Train Steps/Sec: 0.28, Epoch: 0.04162456276719782, LR: 0.001
[2025-07-28 00:33:32] (step=0002143) Train Loss: 0.2445, Train Steps/Sec: 0.27, Epoch: 0.041643995336183445, LR: 0.001
[2025-07-28 00:33:35] (step=0002144) Train Loss: 0.2793, Train Steps/Sec: 0.28, Epoch: 0.04166342790516906, LR: 0.001
[2025-07-28 00:33:39] (step=0002145) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.04168286047415468, LR: 0.001
[2025-07-28 00:33:43] (step=0002146) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.041702293043140305, LR: 0.001
[2025-07-28 00:33:46] (step=0002147) Train Loss: 0.1829, Train Steps/Sec: 0.28, Epoch: 0.04172172561212592, LR: 0.001
[2025-07-28 00:33:50] (step=0002148) Train Loss: 0.2708, Train Steps/Sec: 0.27, Epoch: 0.04174115818111154, LR: 0.001
[2025-07-28 00:33:54] (step=0002149) Train Loss: 0.2436, Train Steps/Sec: 0.27, Epoch: 0.041760590750097165, LR: 0.001
[2025-07-28 00:33:57] (step=0002150) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.04178002331908278, LR: 0.001
[2025-07-28 00:34:01] (step=0002151) Train Loss: 0.2830, Train Steps/Sec: 0.27, Epoch: 0.0417994558880684, LR: 0.001
[2025-07-28 00:34:04] (step=0002152) Train Loss: 0.2808, Train Steps/Sec: 0.27, Epoch: 0.041818888457054025, LR: 0.001
[2025-07-28 00:34:08] (step=0002153) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.04183832102603964, LR: 0.001
[2025-07-28 00:34:12] (step=0002154) Train Loss: 0.3467, Train Steps/Sec: 0.28, Epoch: 0.04185775359502526, LR: 0.001
[2025-07-28 00:34:15] (step=0002155) Train Loss: 0.1880, Train Steps/Sec: 0.28, Epoch: 0.041877186164010885, LR: 0.001
[2025-07-28 00:34:19] (step=0002156) Train Loss: 0.2572, Train Steps/Sec: 0.26, Epoch: 0.0418966187329965, LR: 0.001
[2025-07-28 00:34:23] (step=0002157) Train Loss: 0.3085, Train Steps/Sec: 0.28, Epoch: 0.04191605130198212, LR: 0.001
[2025-07-28 00:34:26] (step=0002158) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.041935483870967745, LR: 0.001
[2025-07-28 00:34:30] (step=0002159) Train Loss: 0.2314, Train Steps/Sec: 0.27, Epoch: 0.04195491643995336, LR: 0.001
[2025-07-28 00:34:34] (step=0002160) Train Loss: 0.1576, Train Steps/Sec: 0.27, Epoch: 0.04197434900893898, LR: 0.001
[2025-07-28 00:34:37] (step=0002161) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.041993781577924605, LR: 0.001
[2025-07-28 00:34:41] (step=0002162) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.04201321414691022, LR: 0.001
[2025-07-28 00:34:45] (step=0002163) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.04203264671589584, LR: 0.001
[2025-07-28 00:34:48] (step=0002164) Train Loss: 0.2704, Train Steps/Sec: 0.27, Epoch: 0.042052079284881465, LR: 0.001
[2025-07-28 00:34:52] (step=0002165) Train Loss: 0.1974, Train Steps/Sec: 0.27, Epoch: 0.04207151185386708, LR: 0.001
[2025-07-28 00:34:55] (step=0002166) Train Loss: 0.3222, Train Steps/Sec: 0.27, Epoch: 0.0420909444228527, LR: 0.001
[2025-07-28 00:34:59] (step=0002167) Train Loss: 0.2712, Train Steps/Sec: 0.28, Epoch: 0.04211037699183832, LR: 0.001
[2025-07-28 00:35:03] (step=0002168) Train Loss: 0.2343, Train Steps/Sec: 0.27, Epoch: 0.04212980956082394, LR: 0.001
[2025-07-28 00:35:06] (step=0002169) Train Loss: 0.2474, Train Steps/Sec: 0.28, Epoch: 0.04214924212980956, LR: 0.001
[2025-07-28 00:35:10] (step=0002170) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.04216867469879518, LR: 0.001
[2025-07-28 00:35:14] (step=0002171) Train Loss: 0.2359, Train Steps/Sec: 0.27, Epoch: 0.0421881072677808, LR: 0.001
[2025-07-28 00:35:17] (step=0002172) Train Loss: 0.3045, Train Steps/Sec: 0.27, Epoch: 0.04220753983676642, LR: 0.001
[2025-07-28 00:35:21] (step=0002173) Train Loss: 0.3002, Train Steps/Sec: 0.27, Epoch: 0.04222697240575204, LR: 0.001
[2025-07-28 00:35:25] (step=0002174) Train Loss: 0.1850, Train Steps/Sec: 0.28, Epoch: 0.04224640497473766, LR: 0.001
[2025-07-28 00:35:28] (step=0002175) Train Loss: 0.2664, Train Steps/Sec: 0.27, Epoch: 0.04226583754372328, LR: 0.001
[2025-07-28 00:35:32] (step=0002176) Train Loss: 0.2199, Train Steps/Sec: 0.27, Epoch: 0.0422852701127089, LR: 0.001
[2025-07-28 00:35:36] (step=0002177) Train Loss: 0.2029, Train Steps/Sec: 0.27, Epoch: 0.04230470268169452, LR: 0.001
[2025-07-28 00:35:39] (step=0002178) Train Loss: 0.2188, Train Steps/Sec: 0.27, Epoch: 0.04232413525068014, LR: 0.001
[2025-07-28 00:35:43] (step=0002179) Train Loss: 0.2330, Train Steps/Sec: 0.28, Epoch: 0.04234356781966576, LR: 0.001
[2025-07-28 00:35:46] (step=0002180) Train Loss: 0.1912, Train Steps/Sec: 0.28, Epoch: 0.04236300038865138, LR: 0.001
[2025-07-28 00:35:50] (step=0002181) Train Loss: 0.2623, Train Steps/Sec: 0.27, Epoch: 0.042382432957637, LR: 0.001
[2025-07-28 00:35:54] (step=0002182) Train Loss: 0.3727, Train Steps/Sec: 0.27, Epoch: 0.04240186552662262, LR: 0.001
[2025-07-28 00:35:57] (step=0002183) Train Loss: 0.1244, Train Steps/Sec: 0.27, Epoch: 0.04242129809560824, LR: 0.001
[2025-07-28 00:36:01] (step=0002184) Train Loss: 0.2161, Train Steps/Sec: 0.27, Epoch: 0.04244073066459386, LR: 0.001
[2025-07-28 00:36:05] (step=0002185) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.04246016323357948, LR: 0.001
[2025-07-28 00:36:08] (step=0002186) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.0424795958025651, LR: 0.001
[2025-07-28 00:36:12] (step=0002187) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.04249902837155072, LR: 0.001
[2025-07-28 00:36:16] (step=0002188) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.04251846094053634, LR: 0.001
[2025-07-28 00:36:19] (step=0002189) Train Loss: 0.2770, Train Steps/Sec: 0.28, Epoch: 0.04253789350952196, LR: 0.001
[2025-07-28 00:36:23] (step=0002190) Train Loss: 0.2043, Train Steps/Sec: 0.28, Epoch: 0.04255732607850758, LR: 0.001
[2025-07-28 00:36:26] (step=0002191) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.0425767586474932, LR: 0.001
[2025-07-28 00:36:30] (step=0002192) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.04259619121647882, LR: 0.001
[2025-07-28 00:36:34] (step=0002193) Train Loss: 0.3020, Train Steps/Sec: 0.28, Epoch: 0.04261562378546444, LR: 0.001
[2025-07-28 00:36:37] (step=0002194) Train Loss: 0.1816, Train Steps/Sec: 0.28, Epoch: 0.04263505635445006, LR: 0.001
[2025-07-28 00:36:41] (step=0002195) Train Loss: 0.2477, Train Steps/Sec: 0.27, Epoch: 0.04265448892343568, LR: 0.001
[2025-07-28 00:36:45] (step=0002196) Train Loss: 0.2986, Train Steps/Sec: 0.28, Epoch: 0.042673921492421295, LR: 0.001
[2025-07-28 00:36:48] (step=0002197) Train Loss: 0.2703, Train Steps/Sec: 0.28, Epoch: 0.04269335406140692, LR: 0.001
[2025-07-28 00:36:52] (step=0002198) Train Loss: 0.3115, Train Steps/Sec: 0.28, Epoch: 0.04271278663039254, LR: 0.001
[2025-07-28 00:36:55] (step=0002199) Train Loss: 0.2878, Train Steps/Sec: 0.28, Epoch: 0.042732219199378155, LR: 0.001
[2025-07-28 00:36:59] (step=0002200) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.04275165176836378, LR: 0.001
[2025-07-28 00:37:03] (step=0002201) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.0427710843373494, LR: 0.001
[2025-07-28 00:37:06] (step=0002202) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.042790516906335015, LR: 0.001
[2025-07-28 00:37:10] (step=0002203) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.04280994947532064, LR: 0.001
[2025-07-28 00:37:14] (step=0002204) Train Loss: 0.1992, Train Steps/Sec: 0.27, Epoch: 0.04282938204430626, LR: 0.001
[2025-07-28 00:37:17] (step=0002205) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.042848814613291875, LR: 0.001
[2025-07-28 00:37:21] (step=0002206) Train Loss: 0.2969, Train Steps/Sec: 0.28, Epoch: 0.0428682471822775, LR: 0.001
[2025-07-28 00:37:25] (step=0002207) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.04288767975126312, LR: 0.001
[2025-07-28 00:37:28] (step=0002208) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.042907112320248735, LR: 0.001
[2025-07-28 00:37:32] (step=0002209) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.04292654488923436, LR: 0.001
[2025-07-28 00:37:35] (step=0002210) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.04294597745821998, LR: 0.001
[2025-07-28 00:37:39] (step=0002211) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.042965410027205594, LR: 0.001
[2025-07-28 00:37:43] (step=0002212) Train Loss: 0.3255, Train Steps/Sec: 0.28, Epoch: 0.04298484259619122, LR: 0.001
[2025-07-28 00:37:46] (step=0002213) Train Loss: 0.2566, Train Steps/Sec: 0.28, Epoch: 0.04300427516517684, LR: 0.001
[2025-07-28 00:37:50] (step=0002214) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.043023707734162454, LR: 0.001
[2025-07-28 00:37:54] (step=0002215) Train Loss: 0.2767, Train Steps/Sec: 0.28, Epoch: 0.04304314030314808, LR: 0.001
[2025-07-28 00:37:57] (step=0002216) Train Loss: 0.1881, Train Steps/Sec: 0.28, Epoch: 0.0430625728721337, LR: 0.001
[2025-07-28 00:38:01] (step=0002217) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.043082005441119314, LR: 0.001
[2025-07-28 00:38:04] (step=0002218) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.04310143801010494, LR: 0.001
[2025-07-28 00:38:08] (step=0002219) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.04312087057909056, LR: 0.001
[2025-07-28 00:38:12] (step=0002220) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.043140303148076174, LR: 0.001
[2025-07-28 00:38:15] (step=0002221) Train Loss: 0.2063, Train Steps/Sec: 0.28, Epoch: 0.043159735717061797, LR: 0.001
[2025-07-28 00:38:19] (step=0002222) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.04317916828604742, LR: 0.001
[2025-07-28 00:38:22] (step=0002223) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.043198600855033034, LR: 0.001
[2025-07-28 00:38:26] (step=0002224) Train Loss: 0.2168, Train Steps/Sec: 0.28, Epoch: 0.043218033424018656, LR: 0.001
[2025-07-28 00:38:30] (step=0002225) Train Loss: 0.3263, Train Steps/Sec: 0.28, Epoch: 0.04323746599300427, LR: 0.001
[2025-07-28 00:38:33] (step=0002226) Train Loss: 0.2500, Train Steps/Sec: 0.28, Epoch: 0.043256898561989894, LR: 0.001
[2025-07-28 00:38:37] (step=0002227) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.043276331130975516, LR: 0.001
[2025-07-28 00:38:41] (step=0002228) Train Loss: 0.2505, Train Steps/Sec: 0.28, Epoch: 0.04329576369996113, LR: 0.001
[2025-07-28 00:38:44] (step=0002229) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.043315196268946754, LR: 0.001
[2025-07-28 00:38:48] (step=0002230) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.043334628837932376, LR: 0.001
[2025-07-28 00:38:51] (step=0002231) Train Loss: 0.2467, Train Steps/Sec: 0.28, Epoch: 0.04335406140691799, LR: 0.001
[2025-07-28 00:38:55] (step=0002232) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.043373493975903614, LR: 0.001
[2025-07-28 00:38:59] (step=0002233) Train Loss: 0.2702, Train Steps/Sec: 0.28, Epoch: 0.043392926544889236, LR: 0.001
[2025-07-28 00:39:02] (step=0002234) Train Loss: 0.2945, Train Steps/Sec: 0.27, Epoch: 0.04341235911387485, LR: 0.001
[2025-07-28 00:39:06] (step=0002235) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.043431791682860474, LR: 0.001
[2025-07-28 00:39:10] (step=0002236) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.043451224251846096, LR: 0.001
[2025-07-28 00:39:13] (step=0002237) Train Loss: 0.2023, Train Steps/Sec: 0.28, Epoch: 0.04347065682083171, LR: 0.001
[2025-07-28 00:39:17] (step=0002238) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.043490089389817334, LR: 0.001
[2025-07-28 00:39:20] (step=0002239) Train Loss: 0.1851, Train Steps/Sec: 0.28, Epoch: 0.043509521958802956, LR: 0.001
[2025-07-28 00:39:24] (step=0002240) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.04352895452778857, LR: 0.001
[2025-07-28 00:39:28] (step=0002241) Train Loss: 0.1585, Train Steps/Sec: 0.28, Epoch: 0.043548387096774194, LR: 0.001
[2025-07-28 00:39:31] (step=0002242) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.043567819665759816, LR: 0.001
[2025-07-28 00:39:35] (step=0002243) Train Loss: 0.3273, Train Steps/Sec: 0.28, Epoch: 0.04358725223474543, LR: 0.001
[2025-07-28 00:39:39] (step=0002244) Train Loss: 0.2825, Train Steps/Sec: 0.28, Epoch: 0.043606684803731054, LR: 0.001
[2025-07-28 00:39:42] (step=0002245) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.043626117372716676, LR: 0.001
[2025-07-28 00:39:46] (step=0002246) Train Loss: 0.1863, Train Steps/Sec: 0.28, Epoch: 0.04364554994170229, LR: 0.001
[2025-07-28 00:39:49] (step=0002247) Train Loss: 0.2200, Train Steps/Sec: 0.28, Epoch: 0.043664982510687914, LR: 0.001
[2025-07-28 00:39:53] (step=0002248) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.043684415079673536, LR: 0.001
[2025-07-28 00:39:57] (step=0002249) Train Loss: 0.3716, Train Steps/Sec: 0.28, Epoch: 0.04370384764865915, LR: 0.001
[2025-07-28 00:40:00] (step=0002250) Train Loss: 0.3014, Train Steps/Sec: 0.28, Epoch: 0.043723280217644774, LR: 0.001
[2025-07-28 00:40:04] (step=0002251) Train Loss: 0.2677, Train Steps/Sec: 0.28, Epoch: 0.043742712786630396, LR: 0.001
[2025-07-28 00:40:08] (step=0002252) Train Loss: 0.2365, Train Steps/Sec: 0.26, Epoch: 0.04376214535561601, LR: 0.001
[2025-07-28 00:40:11] (step=0002253) Train Loss: 0.2813, Train Steps/Sec: 0.28, Epoch: 0.04378157792460163, LR: 0.001
[2025-07-28 00:40:15] (step=0002254) Train Loss: 0.3628, Train Steps/Sec: 0.28, Epoch: 0.04380101049358725, LR: 0.001
[2025-07-28 00:40:19] (step=0002255) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.04382044306257287, LR: 0.001
[2025-07-28 00:40:22] (step=0002256) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.04383987563155849, LR: 0.001
[2025-07-28 00:40:26] (step=0002257) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.04385930820054411, LR: 0.001
[2025-07-28 00:40:29] (step=0002258) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.04387874076952973, LR: 0.001
[2025-07-28 00:40:33] (step=0002259) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.04389817333851535, LR: 0.001
[2025-07-28 00:40:37] (step=0002260) Train Loss: 0.3026, Train Steps/Sec: 0.28, Epoch: 0.04391760590750097, LR: 0.001
[2025-07-28 00:40:40] (step=0002261) Train Loss: 0.2527, Train Steps/Sec: 0.28, Epoch: 0.04393703847648659, LR: 0.001
[2025-07-28 00:40:44] (step=0002262) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.04395647104547221, LR: 0.001
[2025-07-28 00:40:48] (step=0002263) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.04397590361445783, LR: 0.001
[2025-07-28 00:40:51] (step=0002264) Train Loss: 0.2107, Train Steps/Sec: 0.28, Epoch: 0.04399533618344345, LR: 0.001
[2025-07-28 00:40:55] (step=0002265) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.04401476875242907, LR: 0.001
[2025-07-28 00:40:58] (step=0002266) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.04403420132141469, LR: 0.001
[2025-07-28 00:41:02] (step=0002267) Train Loss: 0.3282, Train Steps/Sec: 0.28, Epoch: 0.04405363389040031, LR: 0.001
[2025-07-28 00:41:06] (step=0002268) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.04407306645938593, LR: 0.001
[2025-07-28 00:41:09] (step=0002269) Train Loss: 0.2695, Train Steps/Sec: 0.28, Epoch: 0.04409249902837155, LR: 0.001
[2025-07-28 00:41:13] (step=0002270) Train Loss: 0.3045, Train Steps/Sec: 0.28, Epoch: 0.04411193159735717, LR: 0.001
[2025-07-28 00:41:17] (step=0002271) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.04413136416634279, LR: 0.001
[2025-07-28 00:41:20] (step=0002272) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.04415079673532841, LR: 0.001
[2025-07-28 00:41:24] (step=0002273) Train Loss: 0.2599, Train Steps/Sec: 0.28, Epoch: 0.04417022930431403, LR: 0.001
[2025-07-28 00:41:27] (step=0002274) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.04418966187329965, LR: 0.001
[2025-07-28 00:41:31] (step=0002275) Train Loss: 0.2953, Train Steps/Sec: 0.28, Epoch: 0.04420909444228527, LR: 0.001
[2025-07-28 00:41:35] (step=0002276) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.04422852701127089, LR: 0.001
[2025-07-28 00:41:38] (step=0002277) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.04424795958025651, LR: 0.001
[2025-07-28 00:41:42] (step=0002278) Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.04426739214924213, LR: 0.001
[2025-07-28 00:41:46] (step=0002279) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.04428682471822775, LR: 0.001
[2025-07-28 00:41:49] (step=0002280) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.04430625728721337, LR: 0.001
[2025-07-28 00:41:53] (step=0002281) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.04432568985619899, LR: 0.001
[2025-07-28 00:41:56] (step=0002282) Train Loss: 0.1939, Train Steps/Sec: 0.28, Epoch: 0.04434512242518461, LR: 0.001
[2025-07-28 00:42:00] (step=0002283) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.04436455499417023, LR: 0.001
[2025-07-28 00:42:04] (step=0002284) Train Loss: 0.2744, Train Steps/Sec: 0.28, Epoch: 0.04438398756315585, LR: 0.001
[2025-07-28 00:42:07] (step=0002285) Train Loss: 0.2176, Train Steps/Sec: 0.28, Epoch: 0.04440342013214147, LR: 0.001
[2025-07-28 00:42:11] (step=0002286) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.044422852701127086, LR: 0.001
[2025-07-28 00:42:15] (step=0002287) Train Loss: 0.1704, Train Steps/Sec: 0.28, Epoch: 0.04444228527011271, LR: 0.001
[2025-07-28 00:42:18] (step=0002288) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.04446171783909833, LR: 0.001
[2025-07-28 00:42:22] (step=0002289) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.044481150408083946, LR: 0.001
[2025-07-28 00:42:25] (step=0002290) Train Loss: 0.1912, Train Steps/Sec: 0.28, Epoch: 0.04450058297706957, LR: 0.001
[2025-07-28 00:42:29] (step=0002291) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.04452001554605519, LR: 0.001
[2025-07-28 00:42:33] (step=0002292) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.044539448115040806, LR: 0.001
[2025-07-28 00:42:36] (step=0002293) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.04455888068402643, LR: 0.001
[2025-07-28 00:42:40] (step=0002294) Train Loss: 0.2000, Train Steps/Sec: 0.28, Epoch: 0.04457831325301205, LR: 0.001
[2025-07-28 00:42:43] (step=0002295) Train Loss: 0.1668, Train Steps/Sec: 0.28, Epoch: 0.044597745821997666, LR: 0.001
[2025-07-28 00:42:47] (step=0002296) Train Loss: 0.2036, Train Steps/Sec: 0.28, Epoch: 0.04461717839098329, LR: 0.001
[2025-07-28 00:42:51] (step=0002297) Train Loss: 0.3021, Train Steps/Sec: 0.28, Epoch: 0.04463661095996891, LR: 0.001
[2025-07-28 00:42:54] (step=0002298) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.044656043528954525, LR: 0.001
[2025-07-28 00:42:58] (step=0002299) Train Loss: 0.2733, Train Steps/Sec: 0.28, Epoch: 0.04467547609794015, LR: 0.001
[2025-07-28 00:43:02] (step=0002300) Train Loss: 0.2701, Train Steps/Sec: 0.27, Epoch: 0.04469490866692577, LR: 0.001
[2025-07-28 00:43:05] (step=0002301) Train Loss: 0.2820, Train Steps/Sec: 0.28, Epoch: 0.044714341235911385, LR: 0.001
[2025-07-28 00:43:09] (step=0002302) Train Loss: 0.2649, Train Steps/Sec: 0.28, Epoch: 0.04473377380489701, LR: 0.001
[2025-07-28 00:43:13] (step=0002303) Train Loss: 0.2448, Train Steps/Sec: 0.28, Epoch: 0.04475320637388263, LR: 0.001
[2025-07-28 00:43:16] (step=0002304) Train Loss: 0.2794, Train Steps/Sec: 0.28, Epoch: 0.044772638942868245, LR: 0.001
[2025-07-28 00:43:20] (step=0002305) Train Loss: 0.1727, Train Steps/Sec: 0.28, Epoch: 0.04479207151185387, LR: 0.001
[2025-07-28 00:43:23] (step=0002306) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.04481150408083949, LR: 0.001
[2025-07-28 00:43:27] (step=0002307) Train Loss: 0.2178, Train Steps/Sec: 0.28, Epoch: 0.044830936649825105, LR: 0.001
[2025-07-28 00:43:31] (step=0002308) Train Loss: 0.3191, Train Steps/Sec: 0.28, Epoch: 0.04485036921881073, LR: 0.001
[2025-07-28 00:43:34] (step=0002309) Train Loss: 0.2568, Train Steps/Sec: 0.28, Epoch: 0.04486980178779635, LR: 0.001
[2025-07-28 00:43:38] (step=0002310) Train Loss: 0.2566, Train Steps/Sec: 0.28, Epoch: 0.044889234356781965, LR: 0.001
[2025-07-28 00:43:42] (step=0002311) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.04490866692576759, LR: 0.001
[2025-07-28 00:43:45] (step=0002312) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.04492809949475321, LR: 0.001
[2025-07-28 00:43:49] (step=0002313) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.044947532063738825, LR: 0.001
[2025-07-28 00:43:52] (step=0002314) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.04496696463272445, LR: 0.001
[2025-07-28 00:43:56] (step=0002315) Train Loss: 0.2580,
Train Steps/Sec: 0.28, Epoch: 0.04498639720171006, LR: 0.001 [2025-07-28 00:44:00] (step=0002316) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.045005829770695685, LR: 0.001 [2025-07-28 00:44:03] (step=0002317) Train Loss: 0.2525, Train Steps/Sec: 0.28, Epoch: 0.04502526233968131, LR: 0.001 [2025-07-28 00:44:07] (step=0002318) Train Loss: 0.2350, Train Steps/Sec: 0.28, Epoch: 0.04504469490866692, LR: 0.001 [2025-07-28 00:44:11] (step=0002319) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.045064127477652545, LR: 0.001 [2025-07-28 00:44:14] (step=0002320) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.04508356004663817, LR: 0.001 [2025-07-28 00:44:18] (step=0002321) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.04510299261562378, LR: 0.001 [2025-07-28 00:44:21] (step=0002322) Train Loss: 0.2393, Train Steps/Sec: 0.27, Epoch: 0.045122425184609405, LR: 0.001 [2025-07-28 00:44:25] (step=0002323) Train Loss: 0.2994, Train Steps/Sec: 0.27, Epoch: 0.04514185775359503, LR: 0.001 [2025-07-28 00:44:29] (step=0002324) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.04516129032258064, LR: 0.001 [2025-07-28 00:44:32] (step=0002325) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.045180722891566265, LR: 0.001 [2025-07-28 00:44:36] (step=0002326) Train Loss: 0.3408, Train Steps/Sec: 0.28, Epoch: 0.04520015546055189, LR: 0.001 [2025-07-28 00:44:40] (step=0002327) Train Loss: 0.2851, Train Steps/Sec: 0.28, Epoch: 0.0452195880295375, LR: 0.001 [2025-07-28 00:44:43] (step=0002328) Train Loss: 0.1699, Train Steps/Sec: 0.28, Epoch: 0.045239020598523125, LR: 0.001 [2025-07-28 00:44:47] (step=0002329) Train Loss: 0.2629, Train Steps/Sec: 0.28, Epoch: 0.04525845316750875, LR: 0.001 [2025-07-28 00:44:51] (step=0002330) Train Loss: 0.2784, Train Steps/Sec: 0.28, Epoch: 0.04527788573649436, LR: 0.001 [2025-07-28 00:44:54] (step=0002331) Train Loss: 0.2801, Train Steps/Sec: 0.28, Epoch: 0.045297318305479985, LR: 0.001 [2025-07-28 00:44:58] (step=0002332) Train 
Loss: 0.2933, Train Steps/Sec: 0.27, Epoch: 0.04531675087446561, LR: 0.001 [2025-07-28 00:45:01] (step=0002333) Train Loss: 0.2036, Train Steps/Sec: 0.27, Epoch: 0.04533618344345122, LR: 0.001 [2025-07-28 00:45:05] (step=0002334) Train Loss: 0.1937, Train Steps/Sec: 0.27, Epoch: 0.045355616012436845, LR: 0.001 [2025-07-28 00:45:09] (step=0002335) Train Loss: 0.2356, Train Steps/Sec: 0.27, Epoch: 0.04537504858142247, LR: 0.001 [2025-07-28 00:45:12] (step=0002336) Train Loss: 0.2925, Train Steps/Sec: 0.27, Epoch: 0.04539448115040808, LR: 0.001 [2025-07-28 00:45:16] (step=0002337) Train Loss: 0.2853, Train Steps/Sec: 0.28, Epoch: 0.045413913719393705, LR: 0.001 [2025-07-28 00:45:20] (step=0002338) Train Loss: 0.2984, Train Steps/Sec: 0.28, Epoch: 0.04543334628837933, LR: 0.001 [2025-07-28 00:45:23] (step=0002339) Train Loss: 0.2743, Train Steps/Sec: 0.27, Epoch: 0.04545277885736494, LR: 0.001 [2025-07-28 00:45:27] (step=0002340) Train Loss: 0.2841, Train Steps/Sec: 0.27, Epoch: 0.045472211426350564, LR: 0.001 [2025-07-28 00:45:31] (step=0002341) Train Loss: 0.3049, Train Steps/Sec: 0.27, Epoch: 0.04549164399533619, LR: 0.001 [2025-07-28 00:45:34] (step=0002342) Train Loss: 0.2902, Train Steps/Sec: 0.27, Epoch: 0.0455110765643218, LR: 0.001 [2025-07-28 00:45:38] (step=0002343) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.045530509133307424, LR: 0.001 [2025-07-28 00:45:41] (step=0002344) Train Loss: 0.2811, Train Steps/Sec: 0.27, Epoch: 0.04554994170229304, LR: 0.001 [2025-07-28 00:45:45] (step=0002345) Train Loss: 0.2494, Train Steps/Sec: 0.27, Epoch: 0.04556937427127866, LR: 0.001 [2025-07-28 00:45:49] (step=0002346) Train Loss: 0.2847, Train Steps/Sec: 0.27, Epoch: 0.045588806840264284, LR: 0.001 [2025-07-28 00:45:52] (step=0002347) Train Loss: 0.2477, Train Steps/Sec: 0.27, Epoch: 0.0456082394092499, LR: 0.001 [2025-07-28 00:45:56] (step=0002348) Train Loss: 0.1957, Train Steps/Sec: 0.27, Epoch: 0.04562767197823552, LR: 0.001 [2025-07-28 00:46:00] 
(step=0002349) Train Loss: 0.2652, Train Steps/Sec: 0.27, Epoch: 0.045647104547221144, LR: 0.001 [2025-07-28 00:46:03] (step=0002350) Train Loss: 0.2838, Train Steps/Sec: 0.27, Epoch: 0.04566653711620676, LR: 0.001 [2025-07-28 00:46:07] (step=0002351) Train Loss: 0.2912, Train Steps/Sec: 0.27, Epoch: 0.04568596968519238, LR: 0.001 [2025-07-28 00:46:11] (step=0002352) Train Loss: 0.3338, Train Steps/Sec: 0.28, Epoch: 0.045705402254178004, LR: 0.001 [2025-07-28 00:46:14] (step=0002353) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.04572483482316362, LR: 0.001 [2025-07-28 00:46:18] (step=0002354) Train Loss: 0.3351, Train Steps/Sec: 0.27, Epoch: 0.04574426739214924, LR: 0.001 [2025-07-28 00:46:22] (step=0002355) Train Loss: 0.2096, Train Steps/Sec: 0.28, Epoch: 0.045763699961134864, LR: 0.001 [2025-07-28 00:46:25] (step=0002356) Train Loss: 0.3468, Train Steps/Sec: 0.27, Epoch: 0.04578313253012048, LR: 0.001 [2025-07-28 00:46:29] (step=0002357) Train Loss: 0.2212, Train Steps/Sec: 0.27, Epoch: 0.0458025650991061, LR: 0.001 [2025-07-28 00:46:33] (step=0002358) Train Loss: 0.1810, Train Steps/Sec: 0.27, Epoch: 0.045821997668091724, LR: 0.001 [2025-07-28 00:46:36] (step=0002359) Train Loss: 0.2180, Train Steps/Sec: 0.27, Epoch: 0.04584143023707734, LR: 0.001 [2025-07-28 00:46:40] (step=0002360) Train Loss: 0.2437, Train Steps/Sec: 0.27, Epoch: 0.04586086280606296, LR: 0.001 [2025-07-28 00:46:43] (step=0002361) Train Loss: 0.1825, Train Steps/Sec: 0.27, Epoch: 0.045880295375048584, LR: 0.001 [2025-07-28 00:46:47] (step=0002362) Train Loss: 0.2983, Train Steps/Sec: 0.27, Epoch: 0.0458997279440342, LR: 0.001 [2025-07-28 00:46:51] (step=0002363) Train Loss: 0.2794, Train Steps/Sec: 0.27, Epoch: 0.04591916051301982, LR: 0.001 [2025-07-28 00:46:54] (step=0002364) Train Loss: 0.2227, Train Steps/Sec: 0.27, Epoch: 0.045938593082005444, LR: 0.001 [2025-07-28 00:46:58] (step=0002365) Train Loss: 0.3407, Train Steps/Sec: 0.27, Epoch: 0.04595802565099106, LR: 0.001 [2025-07-28 
00:47:02] (step=0002366) Train Loss: 0.2449, Train Steps/Sec: 0.27, Epoch: 0.04597745821997668, LR: 0.001 [2025-07-28 00:47:05] (step=0002367) Train Loss: 0.2371, Train Steps/Sec: 0.27, Epoch: 0.045996890788962304, LR: 0.001 [2025-07-28 00:47:09] (step=0002368) Train Loss: 0.2471, Train Steps/Sec: 0.27, Epoch: 0.04601632335794792, LR: 0.001 [2025-07-28 00:47:13] (step=0002369) Train Loss: 0.2178, Train Steps/Sec: 0.28, Epoch: 0.04603575592693354, LR: 0.001 [2025-07-28 00:47:16] (step=0002370) Train Loss: 0.2808, Train Steps/Sec: 0.27, Epoch: 0.046055188495919164, LR: 0.001 [2025-07-28 00:47:20] (step=0002371) Train Loss: 0.2768, Train Steps/Sec: 0.27, Epoch: 0.04607462106490478, LR: 0.001 [2025-07-28 00:47:24] (step=0002372) Train Loss: 0.2571, Train Steps/Sec: 0.27, Epoch: 0.0460940536338904, LR: 0.001 [2025-07-28 00:47:27] (step=0002373) Train Loss: 0.2714, Train Steps/Sec: 0.27, Epoch: 0.046113486202876024, LR: 0.001 [2025-07-28 00:47:31] (step=0002374) Train Loss: 0.2452, Train Steps/Sec: 0.27, Epoch: 0.04613291877186164, LR: 0.001 [2025-07-28 00:47:34] (step=0002375) Train Loss: 0.2525, Train Steps/Sec: 0.27, Epoch: 0.04615235134084726, LR: 0.001 [2025-07-28 00:47:38] (step=0002376) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.04617178390983288, LR: 0.001 [2025-07-28 00:47:42] (step=0002377) Train Loss: 0.2212, Train Steps/Sec: 0.27, Epoch: 0.0461912164788185, LR: 0.001 [2025-07-28 00:47:45] (step=0002378) Train Loss: 0.2073, Train Steps/Sec: 0.27, Epoch: 0.04621064904780412, LR: 0.001 [2025-07-28 00:47:49] (step=0002379) Train Loss: 0.2867, Train Steps/Sec: 0.27, Epoch: 0.04623008161678974, LR: 0.001 [2025-07-28 00:47:53] (step=0002380) Train Loss: 0.2095, Train Steps/Sec: 0.27, Epoch: 0.04624951418577536, LR: 0.001 [2025-07-28 00:47:56] (step=0002381) Train Loss: 0.2450, Train Steps/Sec: 0.27, Epoch: 0.04626894675476098, LR: 0.001 [2025-07-28 00:48:00] (step=0002382) Train Loss: 0.2779, Train Steps/Sec: 0.27, Epoch: 0.046288379323746597, LR: 0.001 
[2025-07-28 00:48:04] (step=0002383) Train Loss: 0.1968, Train Steps/Sec: 0.27, Epoch: 0.04630781189273222, LR: 0.001 [2025-07-28 00:48:07] (step=0002384) Train Loss: 0.1972, Train Steps/Sec: 0.27, Epoch: 0.04632724446171784, LR: 0.001 [2025-07-28 00:48:11] (step=0002385) Train Loss: 0.2262, Train Steps/Sec: 0.28, Epoch: 0.046346677030703456, LR: 0.001 [2025-07-28 00:48:14] (step=0002386) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.04636610959968908, LR: 0.001 [2025-07-28 00:48:18] (step=0002387) Train Loss: 0.2461, Train Steps/Sec: 0.27, Epoch: 0.0463855421686747, LR: 0.001 [2025-07-28 00:48:22] (step=0002388) Train Loss: 0.2467, Train Steps/Sec: 0.27, Epoch: 0.046404974737660316, LR: 0.001 [2025-07-28 00:48:25] (step=0002389) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.04642440730664594, LR: 0.001 [2025-07-28 00:48:29] (step=0002390) Train Loss: 0.2450, Train Steps/Sec: 0.27, Epoch: 0.04644383987563156, LR: 0.001 [2025-07-28 00:48:33] (step=0002391) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.046463272444617176, LR: 0.001 [2025-07-28 00:48:36] (step=0002392) Train Loss: 0.2043, Train Steps/Sec: 0.28, Epoch: 0.0464827050136028, LR: 0.001 [2025-07-28 00:48:40] (step=0002393) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.04650213758258842, LR: 0.001 [2025-07-28 00:48:44] (step=0002394) Train Loss: 0.1569, Train Steps/Sec: 0.28, Epoch: 0.046521570151574036, LR: 0.001 [2025-07-28 00:48:47] (step=0002395) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.04654100272055966, LR: 0.001 [2025-07-28 00:48:51] (step=0002396) Train Loss: 0.2483, Train Steps/Sec: 0.26, Epoch: 0.04656043528954528, LR: 0.001 [2025-07-28 00:48:55] (step=0002397) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.046579867858530896, LR: 0.001 [2025-07-28 00:48:58] (step=0002398) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.04659930042751652, LR: 0.001 [2025-07-28 00:49:02] (step=0002399) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.04661873299650214, 
LR: 0.001 [2025-07-28 00:49:05] (step=0002400) Train Loss: 0.2327, Train Steps/Sec: 0.27, Epoch: 0.046638165565487756, LR: 0.001 [2025-07-28 00:49:09] (step=0002401) Train Loss: 0.1894, Train Steps/Sec: 0.28, Epoch: 0.04665759813447338, LR: 0.001 [2025-07-28 00:49:13] (step=0002402) Train Loss: 0.2198, Train Steps/Sec: 0.28, Epoch: 0.046677030703459, LR: 0.001 [2025-07-28 00:49:16] (step=0002403) Train Loss: 0.1900, Train Steps/Sec: 0.28, Epoch: 0.046696463272444616, LR: 0.001 [2025-07-28 00:49:20] (step=0002404) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.04671589584143024, LR: 0.001 [2025-07-28 00:49:24] (step=0002405) Train Loss: 0.2927, Train Steps/Sec: 0.28, Epoch: 0.046735328410415854, LR: 0.001 [2025-07-28 00:49:27] (step=0002406) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.046754760979401476, LR: 0.001 [2025-07-28 00:49:31] (step=0002407) Train Loss: 0.1716, Train Steps/Sec: 0.28, Epoch: 0.0467741935483871, LR: 0.001 [2025-07-28 00:49:34] (step=0002408) Train Loss: 0.1997, Train Steps/Sec: 0.28, Epoch: 0.046793626117372714, LR: 0.001 [2025-07-28 00:49:38] (step=0002409) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.046813058686358336, LR: 0.001 [2025-07-28 00:49:42] (step=0002410) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.04683249125534396, LR: 0.001 [2025-07-28 00:49:45] (step=0002411) Train Loss: 0.1986, Train Steps/Sec: 0.28, Epoch: 0.046851923824329574, LR: 0.001 [2025-07-28 00:49:49] (step=0002412) Train Loss: 0.2500, Train Steps/Sec: 0.28, Epoch: 0.046871356393315196, LR: 0.001 [2025-07-28 00:49:53] (step=0002413) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.04689078896230082, LR: 0.001 [2025-07-28 00:49:56] (step=0002414) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.04691022153128643, LR: 0.001 [2025-07-28 00:50:00] (step=0002415) Train Loss: 0.2703, Train Steps/Sec: 0.28, Epoch: 0.046929654100272056, LR: 0.001 [2025-07-28 00:50:03] (step=0002416) Train Loss: 0.2971, Train Steps/Sec: 0.28, Epoch: 
0.04694908666925768, LR: 0.001 [2025-07-28 00:50:07] (step=0002417) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.04696851923824329, LR: 0.001 [2025-07-28 00:50:11] (step=0002418) Train Loss: 0.1898, Train Steps/Sec: 0.28, Epoch: 0.046987951807228916, LR: 0.001 [2025-07-28 00:50:14] (step=0002419) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.04700738437621454, LR: 0.001 [2025-07-28 00:50:18] (step=0002420) Train Loss: 0.1546, Train Steps/Sec: 0.28, Epoch: 0.04702681694520015, LR: 0.001 [2025-07-28 00:50:22] (step=0002421) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.047046249514185776, LR: 0.001 [2025-07-28 00:50:25] (step=0002422) Train Loss: 0.2830, Train Steps/Sec: 0.28, Epoch: 0.0470656820831714, LR: 0.001 [2025-07-28 00:50:29] (step=0002423) Train Loss: 0.2868, Train Steps/Sec: 0.28, Epoch: 0.04708511465215701, LR: 0.001 [2025-07-28 00:50:32] (step=0002424) Train Loss: 0.1823, Train Steps/Sec: 0.28, Epoch: 0.047104547221142636, LR: 0.001 [2025-07-28 00:50:36] (step=0002425) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.04712397979012826, LR: 0.001 [2025-07-28 00:50:40] (step=0002426) Train Loss: 0.2967, Train Steps/Sec: 0.28, Epoch: 0.04714341235911387, LR: 0.001 [2025-07-28 00:50:43] (step=0002427) Train Loss: 0.1904, Train Steps/Sec: 0.28, Epoch: 0.047162844928099495, LR: 0.001 [2025-07-28 00:50:47] (step=0002428) Train Loss: 0.1843, Train Steps/Sec: 0.28, Epoch: 0.04718227749708512, LR: 0.001 [2025-07-28 00:50:51] (step=0002429) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.04720171006607073, LR: 0.001 [2025-07-28 00:50:54] (step=0002430) Train Loss: 0.2783, Train Steps/Sec: 0.28, Epoch: 0.047221142635056355, LR: 0.001 [2025-07-28 00:50:58] (step=0002431) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.04724057520404198, LR: 0.001 [2025-07-28 00:51:01] (step=0002432) Train Loss: 0.2593, Train Steps/Sec: 0.28, Epoch: 0.04726000777302759, LR: 0.001 [2025-07-28 00:51:05] (step=0002433) Train Loss: 0.1951, Train Steps/Sec: 
0.28, Epoch: 0.047279440342013215, LR: 0.001 [2025-07-28 00:51:09] (step=0002434) Train Loss: 0.2744, Train Steps/Sec: 0.28, Epoch: 0.04729887291099883, LR: 0.001 [2025-07-28 00:51:12] (step=0002435) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.04731830547998445, LR: 0.001 [2025-07-28 00:51:16] (step=0002436) Train Loss: 0.3016, Train Steps/Sec: 0.28, Epoch: 0.047337738048970075, LR: 0.001 [2025-07-28 00:51:20] (step=0002437) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.04735717061795569, LR: 0.001 [2025-07-28 00:51:23] (step=0002438) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.04737660318694131, LR: 0.001 [2025-07-28 00:51:27] (step=0002439) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.047396035755926935, LR: 0.001 [2025-07-28 00:51:30] (step=0002440) Train Loss: 0.3013, Train Steps/Sec: 0.28, Epoch: 0.04741546832491255, LR: 0.001 [2025-07-28 00:51:34] (step=0002441) Train Loss: 0.3186, Train Steps/Sec: 0.28, Epoch: 0.04743490089389817, LR: 0.001 [2025-07-28 00:51:38] (step=0002442) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.047454333462883795, LR: 0.001 [2025-07-28 00:51:41] (step=0002443) Train Loss: 0.2180, Train Steps/Sec: 0.27, Epoch: 0.04747376603186941, LR: 0.001 [2025-07-28 00:51:45] (step=0002444) Train Loss: 0.2629, Train Steps/Sec: 0.27, Epoch: 0.04749319860085503, LR: 0.001 [2025-07-28 00:51:49] (step=0002445) Train Loss: 0.1345, Train Steps/Sec: 0.28, Epoch: 0.047512631169840655, LR: 0.001 [2025-07-28 00:51:52] (step=0002446) Train Loss: 0.3055, Train Steps/Sec: 0.28, Epoch: 0.04753206373882627, LR: 0.001 [2025-07-28 00:51:56] (step=0002447) Train Loss: 0.1969, Train Steps/Sec: 0.28, Epoch: 0.04755149630781189, LR: 0.001 [2025-07-28 00:52:00] (step=0002448) Train Loss: 0.2439, Train Steps/Sec: 0.28, Epoch: 0.047570928876797515, LR: 0.001 [2025-07-28 00:52:03] (step=0002449) Train Loss: 0.2823, Train Steps/Sec: 0.28, Epoch: 0.04759036144578313, LR: 0.001 [2025-07-28 00:52:07] (step=0002450) Train Loss: 0.2661, 
Train Steps/Sec: 0.28, Epoch: 0.04760979401476875, LR: 0.001 [2025-07-28 00:52:10] (step=0002451) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.047629226583754375, LR: 0.001 [2025-07-28 00:52:14] (step=0002452) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.04764865915273999, LR: 0.001 [2025-07-28 00:52:18] (step=0002453) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.04766809172172561, LR: 0.001 [2025-07-28 00:52:21] (step=0002454) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.047687524290711235, LR: 0.001 [2025-07-28 00:52:25] (step=0002455) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.04770695685969685, LR: 0.001 [2025-07-28 00:52:29] (step=0002456) Train Loss: 0.3504, Train Steps/Sec: 0.28, Epoch: 0.04772638942868247, LR: 0.001 [2025-07-28 00:52:32] (step=0002457) Train Loss: 0.1879, Train Steps/Sec: 0.28, Epoch: 0.047745821997668095, LR: 0.001 [2025-07-28 00:52:36] (step=0002458) Train Loss: 0.2207, Train Steps/Sec: 0.28, Epoch: 0.04776525456665371, LR: 0.001 [2025-07-28 00:52:39] (step=0002459) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.04778468713563933, LR: 0.001 [2025-07-28 00:52:43] (step=0002460) Train Loss: 0.2634, Train Steps/Sec: 0.28, Epoch: 0.047804119704624955, LR: 0.001 [2025-07-28 00:52:47] (step=0002461) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.04782355227361057, LR: 0.001 [2025-07-28 00:52:50] (step=0002462) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.04784298484259619, LR: 0.001 [2025-07-28 00:52:54] (step=0002463) Train Loss: 0.1834, Train Steps/Sec: 0.28, Epoch: 0.04786241741158181, LR: 0.001 [2025-07-28 00:52:58] (step=0002464) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.04788184998056743, LR: 0.001 [2025-07-28 00:53:01] (step=0002465) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.04790128254955305, LR: 0.001 [2025-07-28 00:53:05] (step=0002466) Train Loss: 0.1490, Train Steps/Sec: 0.28, Epoch: 0.04792071511853867, LR: 0.001 [2025-07-28 00:53:08] (step=0002467) Train 
Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.04794014768752429, LR: 0.001 [2025-07-28 00:53:12] (step=0002468) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.04795958025650991, LR: 0.001 [2025-07-28 00:53:16] (step=0002469) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.04797901282549553, LR: 0.001 [2025-07-28 00:53:19] (step=0002470) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.04799844539448115, LR: 0.001 [2025-07-28 00:53:23] (step=0002471) Train Loss: 0.2796, Train Steps/Sec: 0.28, Epoch: 0.04801787796346677, LR: 0.001 [2025-07-28 00:53:27] (step=0002472) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.04803731053245239, LR: 0.001 [2025-07-28 00:53:30] (step=0002473) Train Loss: 0.2100, Train Steps/Sec: 0.28, Epoch: 0.04805674310143801, LR: 0.001 [2025-07-28 00:53:34] (step=0002474) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.04807617567042363, LR: 0.001 [2025-07-28 00:53:37] (step=0002475) Train Loss: 0.1895, Train Steps/Sec: 0.28, Epoch: 0.04809560823940925, LR: 0.001 [2025-07-28 00:53:41] (step=0002476) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.04811504080839487, LR: 0.001 [2025-07-28 00:53:45] (step=0002477) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.04813447337738049, LR: 0.001 [2025-07-28 00:53:48] (step=0002478) Train Loss: 0.1945, Train Steps/Sec: 0.28, Epoch: 0.04815390594636611, LR: 0.001 [2025-07-28 00:53:52] (step=0002479) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.04817333851535173, LR: 0.001 [2025-07-28 00:53:55] (step=0002480) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.04819277108433735, LR: 0.001 [2025-07-28 00:53:59] (step=0002481) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.04821220365332297, LR: 0.001 [2025-07-28 00:54:03] (step=0002482) Train Loss: 0.3083, Train Steps/Sec: 0.28, Epoch: 0.04823163622230859, LR: 0.001 [2025-07-28 00:54:06] (step=0002483) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.04825106879129421, LR: 0.001 [2025-07-28 00:54:10] (step=0002484) 
Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.04827050136027983, LR: 0.001 [2025-07-28 00:54:14] (step=0002485) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.04828993392926545, LR: 0.001 [2025-07-28 00:54:17] (step=0002486) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.04830936649825107, LR: 0.001 [2025-07-28 00:54:21] (step=0002487) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.04832879906723669, LR: 0.001 [2025-07-28 00:54:25] (step=0002488) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.04834823163622231, LR: 0.001 [2025-07-28 00:54:28] (step=0002489) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.04836766420520793, LR: 0.001 [2025-07-28 00:54:32] (step=0002490) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.04838709677419355, LR: 0.001 [2025-07-28 00:54:35] (step=0002491) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.04840652934317917, LR: 0.001 [2025-07-28 00:54:39] (step=0002492) Train Loss: 0.2597, Train Steps/Sec: 0.26, Epoch: 0.04842596191216479, LR: 0.001 [2025-07-28 00:54:43] (step=0002493) Train Loss: 0.2272, Train Steps/Sec: 0.27, Epoch: 0.04844539448115041, LR: 0.001 [2025-07-28 00:54:46] (step=0002494) Train Loss: 0.2622, Train Steps/Sec: 0.28, Epoch: 0.04846482705013603, LR: 0.001 [2025-07-28 00:54:50] (step=0002495) Train Loss: 0.1848, Train Steps/Sec: 0.28, Epoch: 0.048484259619121645, LR: 0.001 [2025-07-28 00:54:54] (step=0002496) Train Loss: 0.2604, Train Steps/Sec: 0.28, Epoch: 0.04850369218810727, LR: 0.001 [2025-07-28 00:54:57] (step=0002497) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.04852312475709289, LR: 0.001 [2025-07-28 00:55:01] (step=0002498) Train Loss: 0.2245, Train Steps/Sec: 0.28, Epoch: 0.048542557326078505, LR: 0.001 [2025-07-28 00:55:05] (step=0002499) Train Loss: 0.2693, Train Steps/Sec: 0.28, Epoch: 0.04856198989506413, LR: 0.001 [2025-07-28 00:55:08] (step=0002500) Train Loss: 0.2946, Train Steps/Sec: 0.28, Epoch: 0.04858142246404975, LR: 0.001 [2025-07-28 00:55:12] 
(step=0002501) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.048600855033035364, LR: 0.001 [2025-07-28 00:55:16] (step=0002502) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.04862028760202099, LR: 0.001 [2025-07-28 00:55:19] (step=0002503) Train Loss: 0.2732, Train Steps/Sec: 0.27, Epoch: 0.04863972017100661, LR: 0.001 [2025-07-28 00:55:23] (step=0002504) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.048659152739992224, LR: 0.001 [2025-07-28 00:55:26] (step=0002505) Train Loss: 0.2113, Train Steps/Sec: 0.27, Epoch: 0.04867858530897785, LR: 0.001 [2025-07-28 00:55:30] (step=0002506) Train Loss: 0.2274, Train Steps/Sec: 0.28, Epoch: 0.04869801787796347, LR: 0.001 [2025-07-28 00:55:34] (step=0002507) Train Loss: 0.2549, Train Steps/Sec: 0.27, Epoch: 0.048717450446949084, LR: 0.001 [2025-07-28 00:55:37] (step=0002508) Train Loss: 0.2880, Train Steps/Sec: 0.28, Epoch: 0.04873688301593471, LR: 0.001 [2025-07-28 00:55:41] (step=0002509) Train Loss: 0.2099, Train Steps/Sec: 0.27, Epoch: 0.04875631558492033, LR: 0.001 [2025-07-28 00:55:45] (step=0002510) Train Loss: 0.2850, Train Steps/Sec: 0.27, Epoch: 0.048775748153905944, LR: 0.001 [2025-07-28 00:55:48] (step=0002511) Train Loss: 0.2793, Train Steps/Sec: 0.27, Epoch: 0.04879518072289157, LR: 0.001 [2025-07-28 00:55:52] (step=0002512) Train Loss: 0.3020, Train Steps/Sec: 0.27, Epoch: 0.04881461329187719, LR: 0.001 [2025-07-28 00:55:56] (step=0002513) Train Loss: 0.2630, Train Steps/Sec: 0.27, Epoch: 0.048834045860862804, LR: 0.001 [2025-07-28 00:55:59] (step=0002514) Train Loss: 0.2616, Train Steps/Sec: 0.27, Epoch: 0.048853478429848426, LR: 0.001 [2025-07-28 00:56:03] (step=0002515) Train Loss: 0.2362, Train Steps/Sec: 0.27, Epoch: 0.04887291099883405, LR: 0.001 [2025-07-28 00:56:06] (step=0002516) Train Loss: 0.2385, Train Steps/Sec: 0.27, Epoch: 0.048892343567819664, LR: 0.001 [2025-07-28 00:56:10] (step=0002517) Train Loss: 0.3241, Train Steps/Sec: 0.27, Epoch: 0.048911776136805286, LR: 0.001 
[2025-07-28 00:56:14] (step=0002518) Train Loss: 0.2875, Train Steps/Sec: 0.27, Epoch: 0.04893120870579091, LR: 0.001 [2025-07-28 00:56:17] (step=0002519) Train Loss: 0.1679, Train Steps/Sec: 0.27, Epoch: 0.048950641274776524, LR: 0.001 [2025-07-28 00:56:21] (step=0002520) Train Loss: 0.2261, Train Steps/Sec: 0.27, Epoch: 0.048970073843762146, LR: 0.001 [2025-07-28 00:56:25] (step=0002521) Train Loss: 0.2140, Train Steps/Sec: 0.27, Epoch: 0.04898950641274777, LR: 0.001 [2025-07-28 00:56:28] (step=0002522) Train Loss: 0.2632, Train Steps/Sec: 0.27, Epoch: 0.049008938981733384, LR: 0.001 [2025-07-28 00:56:32] (step=0002523) Train Loss: 0.2350, Train Steps/Sec: 0.27, Epoch: 0.049028371550719006, LR: 0.001 [2025-07-28 00:56:36] (step=0002524) Train Loss: 0.2479, Train Steps/Sec: 0.27, Epoch: 0.04904780411970462, LR: 0.001 [2025-07-28 00:56:39] (step=0002525) Train Loss: 0.2263, Train Steps/Sec: 0.27, Epoch: 0.049067236688690244, LR: 0.001 [2025-07-28 00:56:43] (step=0002526) Train Loss: 0.3543, Train Steps/Sec: 0.27, Epoch: 0.049086669257675866, LR: 0.001 [2025-07-28 00:56:47] (step=0002527) Train Loss: 0.2285, Train Steps/Sec: 0.27, Epoch: 0.04910610182666148, LR: 0.001 [2025-07-28 00:56:50] (step=0002528) Train Loss: 0.2951, Train Steps/Sec: 0.27, Epoch: 0.049125534395647104, LR: 0.001 [2025-07-28 00:56:54] (step=0002529) Train Loss: 0.2576, Train Steps/Sec: 0.27, Epoch: 0.049144966964632726, LR: 0.001 [2025-07-28 00:56:57] (step=0002530) Train Loss: 0.2167, Train Steps/Sec: 0.27, Epoch: 0.04916439953361834, LR: 0.001 [2025-07-28 00:57:01] (step=0002531) Train Loss: 0.2382, Train Steps/Sec: 0.27, Epoch: 0.049183832102603964, LR: 0.001 [2025-07-28 00:57:05] (step=0002532) Train Loss: 0.2862, Train Steps/Sec: 0.27, Epoch: 0.049203264671589586, LR: 0.001 [2025-07-28 00:57:08] (step=0002533) Train Loss: 0.2566, Train Steps/Sec: 0.27, Epoch: 0.0492226972405752, LR: 0.001 [2025-07-28 00:57:12] (step=0002534) Train Loss: 0.2661, Train Steps/Sec: 0.27, Epoch: 
0.049242129809560824, LR: 0.001 [2025-07-28 00:57:16] (step=0002535) Train Loss: 0.2264, Train Steps/Sec: 0.27, Epoch: 0.049261562378546446, LR: 0.001 [2025-07-28 00:57:19] (step=0002536) Train Loss: 0.2319, Train Steps/Sec: 0.27, Epoch: 0.04928099494753206, LR: 0.001 [2025-07-28 00:57:23] (step=0002537) Train Loss: 0.2447, Train Steps/Sec: 0.27, Epoch: 0.049300427516517684, LR: 0.001 [2025-07-28 00:57:27] (step=0002538) Train Loss: 0.2585, Train Steps/Sec: 0.27, Epoch: 0.049319860085503306, LR: 0.001 [2025-07-28 00:57:30] (step=0002539) Train Loss: 0.2009, Train Steps/Sec: 0.27, Epoch: 0.04933929265448892, LR: 0.001 [2025-07-28 00:57:34] (step=0002540) Train Loss: 0.2553, Train Steps/Sec: 0.26, Epoch: 0.049358725223474544, LR: 0.001 [2025-07-28 00:57:38] (step=0002541) Train Loss: 0.2058, Train Steps/Sec: 0.27, Epoch: 0.049378157792460166, LR: 0.001 [2025-07-28 00:57:41] (step=0002542) Train Loss: 0.2790, Train Steps/Sec: 0.27, Epoch: 0.04939759036144578, LR: 0.001 [2025-07-28 00:57:45] (step=0002543) Train Loss: 0.2953, Train Steps/Sec: 0.28, Epoch: 0.049417022930431403, LR: 0.001 [2025-07-28 00:57:49] (step=0002544) Train Loss: 0.2975, Train Steps/Sec: 0.27, Epoch: 0.049436455499417026, LR: 0.001 [2025-07-28 00:57:52] (step=0002545) Train Loss: 0.2561, Train Steps/Sec: 0.27, Epoch: 0.04945588806840264, LR: 0.001 [2025-07-28 00:57:56] (step=0002546) Train Loss: 0.1851, Train Steps/Sec: 0.27, Epoch: 0.04947532063738826, LR: 0.001 [2025-07-28 00:58:00] (step=0002547) Train Loss: 0.1810, Train Steps/Sec: 0.27, Epoch: 0.049494753206373886, LR: 0.001 [2025-07-28 00:58:03] (step=0002548) Train Loss: 0.2133, Train Steps/Sec: 0.27, Epoch: 0.0495141857753595, LR: 0.001 [2025-07-28 00:58:07] (step=0002549) Train Loss: 0.2809, Train Steps/Sec: 0.27, Epoch: 0.04953361834434512, LR: 0.001 [2025-07-28 00:58:10] (step=0002550) Train Loss: 0.2419, Train Steps/Sec: 0.27, Epoch: 0.049553050913330746, LR: 0.001 [2025-07-28 00:58:14] (step=0002551) Train Loss: 0.2700, Train 
Steps/Sec: 0.28, Epoch: 0.04957248348231636, LR: 0.001 [2025-07-28 00:58:18] (step=0002552) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.04959191605130198, LR: 0.001 [2025-07-28 00:58:21] (step=0002553) Train Loss: 0.3418, Train Steps/Sec: 0.28, Epoch: 0.0496113486202876, LR: 0.001 [2025-07-28 00:58:25] (step=0002554) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.04963078118927322, LR: 0.001 [2025-07-28 00:58:29] (step=0002555) Train Loss: 0.2044, Train Steps/Sec: 0.27, Epoch: 0.04965021375825884, LR: 0.001 [2025-07-28 00:58:32] (step=0002556) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.04966964632724446, LR: 0.001 [2025-07-28 00:58:36] (step=0002557) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.04968907889623008, LR: 0.001 [2025-07-28 00:58:39] (step=0002558) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.0497085114652157, LR: 0.001 [2025-07-28 00:58:43] (step=0002559) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.04972794403420132, LR: 0.001 [2025-07-28 00:58:47] (step=0002560) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.04974737660318694, LR: 0.001 [2025-07-28 00:58:50] (step=0002561) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.04976680917217256, LR: 0.001 [2025-07-28 00:58:54] (step=0002562) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.04978624174115818, LR: 0.001 [2025-07-28 00:58:58] (step=0002563) Train Loss: 0.2871, Train Steps/Sec: 0.28, Epoch: 0.0498056743101438, LR: 0.001 [2025-07-28 00:59:01] (step=0002564) Train Loss: 0.2449, Train Steps/Sec: 0.27, Epoch: 0.04982510687912942, LR: 0.001 [2025-07-28 00:59:05] (step=0002565) Train Loss: 0.1534, Train Steps/Sec: 0.27, Epoch: 0.04984453944811504, LR: 0.001 [2025-07-28 00:59:09] (step=0002566) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.04986397201710066, LR: 0.001 [2025-07-28 00:59:12] (step=0002567) Train Loss: 0.2036, Train Steps/Sec: 0.27, Epoch: 0.04988340458608628, LR: 0.001 [2025-07-28 00:59:16] (step=0002568) Train Loss: 0.2737, 
Train Steps/Sec: 0.27, Epoch: 0.0499028371550719, LR: 0.001 [2025-07-28 00:59:19] (step=0002569) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.04992226972405752, LR: 0.001 [2025-07-28 00:59:23] (step=0002570) Train Loss: 0.2983, Train Steps/Sec: 0.28, Epoch: 0.04994170229304314, LR: 0.001 [2025-07-28 00:59:27] (step=0002571) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.04996113486202876, LR: 0.001 [2025-07-28 00:59:30] (step=0002572) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.04998056743101438, LR: 0.001 [2025-07-28 00:59:34] (step=0002573) Train Loss: 0.2611, Train Steps/Sec: 0.28, Epoch: 0.05, LR: 0.001 [2025-07-28 00:59:38] (step=0002574) Train Loss: 0.2848, Train Steps/Sec: 0.28, Epoch: 0.05001943256898562, LR: 0.001 [2025-07-28 00:59:41] (step=0002575) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.05003886513797124, LR: 0.001 [2025-07-28 00:59:45] (step=0002576) Train Loss: 0.3260, Train Steps/Sec: 0.28, Epoch: 0.05005829770695686, LR: 0.001 [2025-07-28 00:59:49] (step=0002577) Train Loss: 0.2223, Train Steps/Sec: 0.28, Epoch: 0.05007773027594248, LR: 0.001 [2025-07-28 00:59:52] (step=0002578) Train Loss: 0.2885, Train Steps/Sec: 0.28, Epoch: 0.0500971628449281, LR: 0.001 [2025-07-28 00:59:56] (step=0002579) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.05011659541391372, LR: 0.001 [2025-07-28 00:59:59] (step=0002580) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.05013602798289934, LR: 0.001 [2025-07-28 01:00:03] (step=0002581) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.05015546055188496, LR: 0.001 [2025-07-28 01:00:07] (step=0002582) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.050174893120870576, LR: 0.001 [2025-07-28 01:00:10] (step=0002583) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.0501943256898562, LR: 0.001 [2025-07-28 01:00:14] (step=0002584) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.05021375825884182, LR: 0.001 [2025-07-28 01:00:17] (step=0002585) Train Loss: 0.3604, Train 
Steps/Sec: 0.28, Epoch: 0.050233190827827436, LR: 0.001 [2025-07-28 01:00:21] (step=0002586) Train Loss: 0.2800, Train Steps/Sec: 0.28, Epoch: 0.05025262339681306, LR: 0.001 [2025-07-28 01:00:25] (step=0002587) Train Loss: 0.2871, Train Steps/Sec: 0.28, Epoch: 0.05027205596579868, LR: 0.001 [2025-07-28 01:00:28] (step=0002588) Train Loss: 0.2471, Train Steps/Sec: 0.27, Epoch: 0.050291488534784295, LR: 0.001 [2025-07-28 01:00:32] (step=0002589) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.05031092110376992, LR: 0.001 [2025-07-28 01:00:36] (step=0002590) Train Loss: 0.2938, Train Steps/Sec: 0.28, Epoch: 0.05033035367275554, LR: 0.001 [2025-07-28 01:00:39] (step=0002591) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.050349786241741155, LR: 0.001 [2025-07-28 01:00:43] (step=0002592) Train Loss: 0.2989, Train Steps/Sec: 0.28, Epoch: 0.05036921881072678, LR: 0.001 [2025-07-28 01:00:47] (step=0002593) Train Loss: 0.1848, Train Steps/Sec: 0.28, Epoch: 0.0503886513797124, LR: 0.001 [2025-07-28 01:00:50] (step=0002594) Train Loss: 0.2872, Train Steps/Sec: 0.28, Epoch: 0.050408083948698015, LR: 0.001 [2025-07-28 01:00:54] (step=0002595) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.05042751651768364, LR: 0.001 [2025-07-28 01:00:57] (step=0002596) Train Loss: 0.3015, Train Steps/Sec: 0.28, Epoch: 0.05044694908666926, LR: 0.001 [2025-07-28 01:01:01] (step=0002597) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.050466381655654875, LR: 0.001 [2025-07-28 01:01:05] (step=0002598) Train Loss: 0.1777, Train Steps/Sec: 0.28, Epoch: 0.0504858142246405, LR: 0.001 [2025-07-28 01:01:08] (step=0002599) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.05050524679362612, LR: 0.001 [2025-07-28 01:01:12] (step=0002600) Train Loss: 0.2907, Train Steps/Sec: 0.28, Epoch: 0.050524679362611735, LR: 0.001 [2025-07-28 01:01:16] (step=0002601) Train Loss: 0.2909, Train Steps/Sec: 0.28, Epoch: 0.05054411193159736, LR: 0.001 [2025-07-28 01:01:19] (step=0002602) Train Loss: 
0.2815, Train Steps/Sec: 0.28, Epoch: 0.05056354450058298, LR: 0.001 [2025-07-28 01:01:23] (step=0002603) Train Loss: 0.2782, Train Steps/Sec: 0.28, Epoch: 0.050582977069568595, LR: 0.001 [2025-07-28 01:01:26] (step=0002604) Train Loss: 0.3040, Train Steps/Sec: 0.28, Epoch: 0.05060240963855422, LR: 0.001 [2025-07-28 01:01:30] (step=0002605) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.05062184220753984, LR: 0.001 [2025-07-28 01:01:34] (step=0002606) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.050641274776525455, LR: 0.001 [2025-07-28 01:01:37] (step=0002607) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.05066070734551108, LR: 0.001 [2025-07-28 01:01:41] (step=0002608) Train Loss: 0.1986, Train Steps/Sec: 0.28, Epoch: 0.0506801399144967, LR: 0.001 [2025-07-28 01:01:44] (step=0002609) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.050699572483482315, LR: 0.001 [2025-07-28 01:01:48] (step=0002610) Train Loss: 0.2709, Train Steps/Sec: 0.28, Epoch: 0.05071900505246794, LR: 0.001 [2025-07-28 01:01:52] (step=0002611) Train Loss: 0.1357, Train Steps/Sec: 0.27, Epoch: 0.05073843762145356, LR: 0.001 [2025-07-28 01:01:55] (step=0002612) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.050757870190439175, LR: 0.001 [2025-07-28 01:01:59] (step=0002613) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 0.0507773027594248, LR: 0.001 [2025-07-28 01:02:34] Found latest checkpoint at /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0002000 [2025-07-28 01:02:34] Experiment directory created at /nvme-data/Komal/documents/results/VisualCloze/lora/depth [2025-07-28 01:02:35] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889 [2025-07-28 01:03:19] Trainable parameters: 4,718,592 (lora) [2025-07-28 01:03:19]
Total parameters in the model: 3,762,072,592 (lora) [2025-07-28 01:03:43] Dataset contains 205,841 [2025-07-28 01:03:43] Training for 2000 epochs... [2025-07-28 01:03:43] Beginning epoch 0... [2025-07-28 01:03:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1, "loss": 0.339937686920166, "memory_gb": 7.721559047698975, "step_time_ms": 4476.490497589111, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:03:50] (step=0000001) Train Loss: 0.3323, Train Steps/Sec: 0.15, Epoch: 1.9432568985619897e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:03:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2, "loss": 0.23381412029266357, "memory_gb": 7.722414016723633, "step_time_ms": 3264.3017768859863, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:03:53] (step=0000002) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 3.8865137971239795e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:03:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3, "loss": 0.15675032138824463, "memory_gb": 7.721559524536133, "step_time_ms": 3290.933132171631, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:03:57] (step=0000003) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 5.82977069568597e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4, "loss": 0.2622736096382141, "memory_gb": 7.722414016723633, "step_time_ms": 3513.122081756592, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:01] (step=0000004) Train Loss: 0.2305, Train Steps/Sec: 0.26, Epoch: 7.773027594247959e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5, "loss": 0.19606687128543854, "memory_gb": 7.721559524536133, "step_time_ms": 3304.333448410034, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:05] (step=0000005) Train Loss: 0.2809, Train Steps/Sec: 0.28, Epoch: 9.71628449280995e-05, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-28 01:04:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6, "loss": 0.3230525255203247, "memory_gb": 7.715639114379883, "step_time_ms": 3290.501594543457, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:08] (step=0000006) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 0.0001165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7, "loss": 0.2864382266998291, "memory_gb": 7.722414016723633, "step_time_ms": 3321.094036102295, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:12] (step=0000007) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.0001360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8, "loss": 0.25621068477630615, "memory_gb": 7.721559524536133, "step_time_ms": 3337.097406387329, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:15] (step=0000008) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.00015546055188495918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9, "loss": 0.2907955050468445, "memory_gb": 7.721559524536133, "step_time_ms": 3321.8843936920166, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:19] (step=0000009) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.00017489312087057908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10, "loss": 0.2312738001346588, "memory_gb": 7.721559524536133, "step_time_ms": 3351.53865814209, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:23] (step=0000010) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.000194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11, "loss": 0.24025234580039978, "memory_gb": 7.721559524536133, "step_time_ms": 3356.550931930542, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:26] (step=0000011) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.0002137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12, "loss": 0.2627186179161072, "memory_gb": 7.721559524536133, "step_time_ms": 3365.831136703491, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:30] (step=0000012) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.0002331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13, "loss": 0.21885211765766144, "memory_gb": 7.721559524536133, "step_time_ms": 3380.0671100616455, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:33] (step=0000013) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.0002526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14, "loss": 0.302106648683548, "memory_gb": 7.721559524536133, "step_time_ms": 3393.385648727417, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:37] (step=0000014) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.0002720559657986786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15, "loss": 0.2663143277168274, "memory_gb": 7.721559524536133, "step_time_ms": 3398.7908363342285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:41] (step=0000015) Train Loss: 0.2781, Train Steps/Sec: 0.28, Epoch: 0.0002914885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 16, "loss": 0.33297035098075867, "memory_gb": 7.721559524536133, "step_time_ms": 3396.4478969573975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:44] (step=0000016) Train Loss: 0.3176, Train Steps/Sec: 0.28, Epoch: 0.00031092110376991836, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 17, "loss": 0.4000062346458435, "memory_gb": 7.721559524536133, "step_time_ms": 3403.7904739379883, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:48] (step=0000017) Train Loss: 0.2899, Train Steps/Sec: 0.28, Epoch: 0.00033035367275553826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 18, "loss": 0.2604101300239563, "memory_gb": 7.721559524536133, "step_time_ms": 3412.5006198883057, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:52] (step=0000018) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.00034978624174115817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 19, "loss": 0.25125426054000854, "memory_gb": 7.721559524536133, "step_time_ms": 3412.740707397461, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:55] (step=0000019) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.00036921881072677807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:04:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 20, "loss": 0.28867000341415405, "memory_gb": 7.715639114379883, "step_time_ms": 3378.2870769500732, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:04:59] (step=0000020) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.000388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 21, "loss": 0.18565605580806732, "memory_gb": 7.722414016723633, "step_time_ms": 3409.1291427612305, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:02] (step=0000021) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.0004080839486980179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 22, "loss": 0.3136715888977051, "memory_gb": 7.721559524536133, 
"step_time_ms": 3419.132947921753, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:06] (step=0000022) Train Loss: 0.2932, Train Steps/Sec: 0.28, Epoch: 0.0004275165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 23, "loss": 0.19316482543945312, "memory_gb": 7.721559524536133, "step_time_ms": 3423.922061920166, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:10] (step=0000023) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.0004469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 24, "loss": 0.15652358531951904, "memory_gb": 7.721559524536133, "step_time_ms": 3416.341543197632, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:13] (step=0000024) Train Loss: 0.3387, Train Steps/Sec: 0.28, Epoch: 0.0004663816556548776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 25, "loss": 0.19425973296165466, "memory_gb": 7.721559524536133, "step_time_ms": 3435.7211589813232, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:17] (step=0000025) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.0004858142246404975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 26, "loss": 0.2759854197502136, "memory_gb": 7.721559524536133, "step_time_ms": 3413.6950969696045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:21] (step=0000026) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.0005052467936261174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 27, "loss": 0.23435881733894348, "memory_gb": 7.721559524536133, "step_time_ms": 3434.900999069214, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:24] (step=0000027) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 
0.0005246793626117373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 28, "loss": 0.21785025298595428, "memory_gb": 7.721559524536133, "step_time_ms": 3430.9263229370117, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:28] (step=0000028) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.0005441119315973572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 29, "loss": 0.27744871377944946, "memory_gb": 7.721559524536133, "step_time_ms": 3433.290481567383, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:31] (step=0000029) Train Loss: 0.2904, Train Steps/Sec: 0.27, Epoch: 0.0005635445005829771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 30, "loss": 0.19128993153572083, "memory_gb": 7.721559524536133, "step_time_ms": 3427.2570610046387, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:35] (step=0000030) Train Loss: 0.2389, Train Steps/Sec: 0.27, Epoch: 0.000582977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 31, "loss": 0.29514792561531067, "memory_gb": 7.721559524536133, "step_time_ms": 3424.4790077209473, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:39] (step=0000031) Train Loss: 0.2498, Train Steps/Sec: 0.27, Epoch: 0.0006024096385542169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 32, "loss": 0.2644522786140442, "memory_gb": 7.721559524536133, "step_time_ms": 3419.872999191284, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:42] (step=0000032) Train Loss: 0.2367, Train Steps/Sec: 0.28, Epoch: 0.0006218422075398367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 33, "loss": 0.2308114767074585, 
"memory_gb": 7.721559524536133, "step_time_ms": 3429.584264755249, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:46] (step=0000033) Train Loss: 0.2837, Train Steps/Sec: 0.27, Epoch: 0.0006412747765254566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 34, "loss": 0.2853633761405945, "memory_gb": 7.721559524536133, "step_time_ms": 3425.509214401245, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:50] (step=0000034) Train Loss: 0.2327, Train Steps/Sec: 0.27, Epoch: 0.0006607073455110765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 35, "loss": 0.24578271806240082, "memory_gb": 7.721559524536133, "step_time_ms": 3428.6513328552246, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:53] (step=0000035) Train Loss: 0.2713, Train Steps/Sec: 0.28, Epoch: 0.0006801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:05:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 36, "loss": 0.28081202507019043, "memory_gb": 7.721559524536133, "step_time_ms": 3429.8157691955566, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:05:57] (step=0000036) Train Loss: 0.3071, Train Steps/Sec: 0.27, Epoch: 0.0006995724834823163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 37, "loss": 0.19943930208683014, "memory_gb": 7.721559524536133, "step_time_ms": 3448.0576515197754, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:01] (step=0000037) Train Loss: 0.1849, Train Steps/Sec: 0.28, Epoch: 0.0007190050524679362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 38, "loss": 0.23485392332077026, "memory_gb": 7.721559524536133, "step_time_ms": 3442.3694610595703, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:04] (step=0000038) Train Loss: 0.2298, 
Train Steps/Sec: 0.28, Epoch: 0.0007384376214535561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 39, "loss": 0.2814399003982544, "memory_gb": 7.721559524536133, "step_time_ms": 3447.8864669799805, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:08] (step=0000039) Train Loss: 0.2756, Train Steps/Sec: 0.27, Epoch: 0.000757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 40, "loss": 0.2326139360666275, "memory_gb": 7.721559524536133, "step_time_ms": 3447.6802349090576, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:11] (step=0000040) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.000777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 41, "loss": 0.32369911670684814, "memory_gb": 7.721559524536133, "step_time_ms": 3454.7929763793945, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:15] (step=0000041) Train Loss: 0.2903, Train Steps/Sec: 0.28, Epoch: 0.0007967353284104159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 42, "loss": 0.3200656473636627, "memory_gb": 7.721559524536133, "step_time_ms": 3452.604055404663, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:19] (step=0000042) Train Loss: 0.3112, Train Steps/Sec: 0.28, Epoch: 0.0008161678973960358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 43, "loss": 0.3475222885608673, "memory_gb": 7.721559524536133, "step_time_ms": 3462.6975059509277, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:22] (step=0000043) Train Loss: 0.3710, Train Steps/Sec: 0.28, Epoch: 0.0008356004663816557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 44, "loss": 
0.2426302134990692, "memory_gb": 7.721559524536133, "step_time_ms": 3470.5650806427, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:26] (step=0000044) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.0008550330353672756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 45, "loss": 0.2637980580329895, "memory_gb": 7.721559524536133, "step_time_ms": 3480.8828830718994, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:30] (step=0000045) Train Loss: 0.2858, Train Steps/Sec: 0.26, Epoch: 0.0008744656043528955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 46, "loss": 0.29035308957099915, "memory_gb": 7.721559524536133, "step_time_ms": 3470.9548950195312, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:33] (step=0000046) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.0008938981733385154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 47, "loss": 0.25794264674186707, "memory_gb": 7.721559524536133, "step_time_ms": 3482.9013347625732, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:37] (step=0000047) Train Loss: 0.2272, Train Steps/Sec: 0.27, Epoch: 0.0009133307423241353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 48, "loss": 0.2851806879043579, "memory_gb": 7.721559524536133, "step_time_ms": 3473.154306411743, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:41] (step=0000048) Train Loss: 0.3141, Train Steps/Sec: 0.28, Epoch: 0.0009327633113097552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 49, "loss": 0.25453805923461914, "memory_gb": 7.721559524536133, "step_time_ms": 3456.650495529175, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:44] (step=0000049) 
Train Loss: 0.2739, Train Steps/Sec: 0.28, Epoch: 0.0009521958802953751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 50, "loss": 0.25828105211257935, "memory_gb": 7.721559524536133, "step_time_ms": 3495.8949089050293, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:48] (step=0000050) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.000971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 51, "loss": 0.21718591451644897, "memory_gb": 7.721559524536133, "step_time_ms": 3480.9563159942627, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:52] (step=0000051) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.000991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 52, "loss": 0.24920432269573212, "memory_gb": 7.715639114379883, "step_time_ms": 3602.651357650757, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:55] (step=0000052) Train Loss: 0.2579, Train Steps/Sec: 0.27, Epoch: 0.0010104935872522348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 53, "loss": 0.3779979646205902, "memory_gb": 7.722414016723633, "step_time_ms": 3474.311113357544, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:06:59] (step=0000053) Train Loss: 0.3300, Train Steps/Sec: 0.28, Epoch: 0.0010299261562378547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 54, "loss": 0.24879775941371918, "memory_gb": 7.721559524536133, "step_time_ms": 3484.6653938293457, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:02] (step=0000054) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.0010493587252234746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:06] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 55, "loss": 0.1665349006652832, "memory_gb": 7.721559524536133, "step_time_ms": 3479.696035385132, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:06] (step=0000055) Train Loss: 0.2454, Train Steps/Sec: 0.28, Epoch: 0.0010687912942090945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 56, "loss": 0.3277041018009186, "memory_gb": 7.721559524536133, "step_time_ms": 3480.0877571105957, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:10] (step=0000056) Train Loss: 0.3325, Train Steps/Sec: 0.28, Epoch: 0.0010882238631947144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 57, "loss": 0.29854971170425415, "memory_gb": 7.721559524536133, "step_time_ms": 3483.51788520813, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:13] (step=0000057) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.0011076564321803343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 58, "loss": 0.17249050736427307, "memory_gb": 7.721559524536133, "step_time_ms": 3464.1568660736084, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:17] (step=0000058) Train Loss: 0.2445, Train Steps/Sec: 0.27, Epoch: 0.0011270890011659542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 59, "loss": 0.25381430983543396, "memory_gb": 7.721559524536133, "step_time_ms": 3555.0687313079834, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:21] (step=0000059) Train Loss: 0.2584, Train Steps/Sec: 0.27, Epoch: 0.0011465215701515741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 60, "loss": 0.3623574376106262, "memory_gb": 7.721559524536133, "step_time_ms": 3549.7918128967285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
01:07:24] (step=0000060) Train Loss: 0.3347, Train Steps/Sec: 0.27, Epoch: 0.001165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 61, "loss": 0.3256654739379883, "memory_gb": 7.721559524536133, "step_time_ms": 3452.845811843872, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:28] (step=0000061) Train Loss: 0.3108, Train Steps/Sec: 0.27, Epoch: 0.001185386708122814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 62, "loss": 0.24547627568244934, "memory_gb": 7.721559524536133, "step_time_ms": 3515.5155658721924, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:32] (step=0000062) Train Loss: 0.1854, Train Steps/Sec: 0.27, Epoch: 0.0012048192771084338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 63, "loss": 0.1699802279472351, "memory_gb": 7.721559524536133, "step_time_ms": 3468.2486057281494, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:35] (step=0000063) Train Loss: 0.1770, Train Steps/Sec: 0.27, Epoch: 0.0012242518460940535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 64, "loss": 0.2460261732339859, "memory_gb": 7.721559524536133, "step_time_ms": 3463.899850845337, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:39] (step=0000064) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.0012436844150796734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 65, "loss": 0.17398250102996826, "memory_gb": 7.721559524536133, "step_time_ms": 3487.8408908843994, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:43] (step=0000065) Train Loss: 0.1984, Train Steps/Sec: 0.27, Epoch: 0.0012631169840652933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:47] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 66, "loss": 0.3211126923561096, "memory_gb": 7.721559524536133, "step_time_ms": 3542.691230773926, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:47] (step=0000066) Train Loss: 0.2902, Train Steps/Sec: 0.26, Epoch: 0.0012825495530509132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 67, "loss": 0.23047974705696106, "memory_gb": 7.721559524536133, "step_time_ms": 3521.585702896118, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:50] (step=0000067) Train Loss: 0.3034, Train Steps/Sec: 0.27, Epoch: 0.0013019821220365331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 68, "loss": 0.2725827097892761, "memory_gb": 7.721559524536133, "step_time_ms": 3558.220863342285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:54] (step=0000068) Train Loss: 0.2560, Train Steps/Sec: 0.27, Epoch: 0.001321414691022153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:07:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 69, "loss": 0.30088895559310913, "memory_gb": 7.721559524536133, "step_time_ms": 3544.8434352874756, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:07:58] (step=0000069) Train Loss: 0.2620, Train Steps/Sec: 0.27, Epoch: 0.001340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:08:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 70, "loss": 0.18660566210746765, "memory_gb": 7.721559524536133, "step_time_ms": 3625.6673336029053, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:08:01] (step=0000070) Train Loss: 0.1976, Train Steps/Sec: 0.27, Epoch: 0.0013602798289933929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:08:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 71, "loss": 0.26766443252563477, "memory_gb": 7.721559524536133, "step_time_ms": 3488.783359527588, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 01:08:05] (step=0000071) Train Loss: 0.2774, Train Steps/Sec: 0.27, Epoch: 0.0013797123979790128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:08:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 72, "loss": 0.3305189311504364, "memory_gb": 7.721559524536133, "step_time_ms": 3508.2967281341553, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:08:09] (step=0000072) Train Loss: 0.3127, Train Steps/Sec: 0.27, Epoch: 0.0013991449669646327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:08:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 73, "loss": 0.2807726263999939, "memory_gb": 7.721559524536133, "step_time_ms": 3524.125099182129, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:08:13] (step=0000073) Train Loss: 0.2758, Train Steps/Sec: 0.27, Epoch: 0.0014185775359502526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:08:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 74, "loss": 0.29193028807640076, "memory_gb": 7.721559524536133, "step_time_ms": 3535.1967811584473, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:08:16] (step=0000074) Train Loss: 0.3271, Train Steps/Sec: 0.27, Epoch: 0.0014380101049358725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:08:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 75, "loss": 0.2076706886291504, "memory_gb": 7.721559524536133, "step_time_ms": 3480.212450027466, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:08:20] (step=0000075) Train Loss: 0.1887, Train Steps/Sec: 0.27, Epoch: 0.0014574426739214924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 01:08:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 76, "loss": 0.23656316101551056, "memory_gb": 7.721559524536133, "step_time_ms": 3510.0460052490234, "trainable_params": 4718592, "method": "lora"} [2025-07-28 01:08:24] (step=0000076) Train Loss: 0.2050, Train Steps/Sec: 0.27, Epoch: 0.0014768752429071123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 01:08:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 77, "loss": 0.14857937395572662, "memory_gb": 7.721559524536133, "step_time_ms": 3529.1783809661865, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:28] (step=0000077) Train Loss: 0.2127, Train Steps/Sec: 0.27, Epoch: 0.0014963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 78, "loss": 0.2911931872367859, "memory_gb": 7.721559524536133, "step_time_ms": 3580.509901046753, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:31] (step=0000078) Train Loss: 0.2540, Train Steps/Sec: 0.27, Epoch: 0.001515740380878352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 79, "loss": 0.32118114829063416, "memory_gb": 7.721559524536133, "step_time_ms": 3484.0002059936523, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:35] (step=0000079) Train Loss: 0.2863, Train Steps/Sec: 0.27, Epoch: 0.001535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 80, "loss": 0.3146507143974304, "memory_gb": 7.721559524536133, "step_time_ms": 3483.7729930877686, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:39] (step=0000080) Train Loss: 0.2599, Train Steps/Sec: 0.27, Epoch: 0.001554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 81, "loss": 0.28479939699172974, "memory_gb": 7.721559524536133, "step_time_ms": 3558.0642223358154, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:43] (step=0000081) Train Loss: 0.2422, Train Steps/Sec: 0.27, Epoch: 0.0015740380878352118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 82, "loss": 0.2660136818885803, "memory_gb": 7.721559524536133, "step_time_ms": 3525.103807449341, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:46] (step=0000082) Train Loss: 0.2567, Train Steps/Sec: 0.26, Epoch: 0.0015934706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 83, "loss": 0.35458338260650635, "memory_gb": 7.721559524536133, "step_time_ms": 3462.9781246185303, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:50] (step=0000083) Train Loss: 0.3340, Train Steps/Sec: 0.27, Epoch: 0.0016129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 84, "loss": 0.35886478424072266, "memory_gb": 7.721559524536133, "step_time_ms": 3480.213165283203, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:54] (step=0000084) Train Loss: 0.2775, Train Steps/Sec: 0.27, Epoch: 0.0016323357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:08:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 85, "loss": 0.1952989101409912, "memory_gb": 7.721559524536133, "step_time_ms": 3533.071279525757, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:08:58] (step=0000085) Train Loss: 0.2664, Train Steps/Sec: 0.23, Epoch: 0.0016517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:09:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 86, "loss": 0.20392459630966187, "memory_gb": 7.721559524536133, "step_time_ms": 3451.0669708251953, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:09:02] (step=0000086) Train Loss: 0.2051, Train Steps/Sec: 0.27, Epoch: 0.0016712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 87, "loss": 0.11053426563739777, "memory_gb": 7.721559524536133, "step_time_ms": 3471.6339111328125, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:09:05] (step=0000087) Train Loss: 0.1651, Train Steps/Sec: 0.27, Epoch: 0.0016906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:09:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 88, "loss": 0.19796788692474365, "memory_gb": 7.721559524536133, "step_time_ms": 3510.694742202759, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:09:09] (step=0000088) Train Loss: 0.1982, Train Steps/Sec: 0.27, Epoch: 0.0017100660707345511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:09:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 89, "loss": 0.28282731771469116, "memory_gb": 7.721559524536133, "step_time_ms": 3580.5370807647705, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:09:13] (step=0000089) Train Loss: 0.2205, Train Steps/Sec: 0.26, Epoch: 0.001729498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:09:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 90, "loss": 0.3241238296031952, "memory_gb": 7.721559524536133, "step_time_ms": 3525.254487991333, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:09:17] (step=0000090) Train Loss: 0.3185, Train Steps/Sec: 0.27, Epoch: 0.001748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 01:09:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 91, "loss": 0.32599782943725586, "memory_gb": 7.721559524536133, "step_time_ms": 3457.685708999634, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 01:09:20] (step=0000091) Train Loss: 0.2970, Train Steps/Sec: 0.27, Epoch: 0.0017683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 03:48:36] Found latest checkpoint at /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0002000
[2025-07-28 03:48:36] Experiment directory created at /nvme-data/Komal/documents/results/VisualCloze/lora/depth
[2025-07-28 03:48:37] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 03:48:38] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 13:31:38] Found latest checkpoint at /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0002000
[2025-07-28 13:31:38] Experiment directory created at /nvme-data/Komal/documents/results/VisualCloze/lora/depth
[2025-07-28 13:31:38] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 13:31:39] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 13:32:24] Trainable parameters: 4,718,592 (lora)
[2025-07-28 13:32:24] Total parameters in the model: 3,762,072,592 (lora)
[2025-07-28 13:32:48] Dataset contains 205,841
[2025-07-28 13:32:50] Training for 2000 epochs...
[2025-07-28 13:32:50] Beginning epoch 0...
[2025-07-28 13:33:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1, "loss": 0.19856391847133636, "memory_gb": 7.721559047698975, "step_time_ms": 9090.243816375732, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:01] (step=0000001) Train Loss: 0.2352, Train Steps/Sec: 0.09, Epoch: 1.9432568985619897e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:33:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2, "loss": 0.14027594029903412, "memory_gb": 7.721559524536133, "step_time_ms": 7355.839014053345, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:09] (step=0000002) Train Loss: 0.2565, Train Steps/Sec: 0.13, Epoch: 3.8865137971239795e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:33:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3, "loss": 0.2545120120048523, "memory_gb": 7.721559524536133, "step_time_ms": 7517.899513244629, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:17] (step=0000003) Train Loss: 0.2976, Train Steps/Sec: 0.12, Epoch: 5.82977069568597e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:33:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4, "loss": 0.2811778783798218, "memory_gb": 7.721559524536133, "step_time_ms": 7744.851589202881, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:25] (step=0000004) Train Loss: 0.2379, Train Steps/Sec: 0.12, Epoch: 7.773027594247959e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:33:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5, "loss": 0.23897185921669006, "memory_gb": 7.721559524536133, "step_time_ms": 7424.535036087036, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:33] (step=0000005) Train Loss: 0.2791, Train Steps/Sec: 0.12, Epoch: 9.71628449280995e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:33:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6, "loss": 0.22341135144233704, "memory_gb": 7.721559524536133, "step_time_ms": 7407.861709594727, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:41] (step=0000006) Train Loss: 0.2436, Train Steps/Sec: 0.13, Epoch: 0.0001165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7, "loss": 0.31852343678474426, "memory_gb": 7.721559524536133, "step_time_ms": 7466.874361038208, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:49] (step=0000007) Train Loss: 0.3535, Train Steps/Sec: 0.12, Epoch: 0.0001360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8, "loss": 0.35782110691070557, "memory_gb": 7.721559524536133, "step_time_ms": 7387.878179550171, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:33:57] (step=0000008) Train Loss: 0.2955, Train Steps/Sec: 0.13, Epoch: 0.00015546055188495918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9, "loss": 0.17958782613277435, "memory_gb": 7.721559524536133, "step_time_ms": 7394.68789100647, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:34:05] (step=0000009) Train Loss: 0.2302, Train Steps/Sec: 0.13, Epoch: 0.00017489312087057908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10, "loss": 0.22838637232780457, "memory_gb": 7.721559524536133, "step_time_ms": 7439.704179763794, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:34:13] (step=0000010) Train Loss: 0.2110, Train Steps/Sec: 0.13, Epoch: 0.000194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:34:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11, "loss": 0.3194667100906372, "memory_gb": 7.721559524536133, "step_time_ms": 7406.871318817139, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:34:21] (step=0000011) Train Loss: 0.3190, Train Steps/Sec: 0.12, Epoch: 0.0002137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:34:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12, "loss": 0.15950626134872437, "memory_gb": 7.721559524536133, "step_time_ms": 7419.852018356323, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:34:29] (step=0000012) Train Loss: 0.2107, Train Steps/Sec: 0.12, Epoch: 0.0002331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:34:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13, "loss": 0.3256032466888428, "memory_gb": 7.721559524536133, "step_time_ms": 7421.065330505371, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:34:37] (step=0000013) Train Loss: 0.2845, Train Steps/Sec: 0.12, Epoch: 0.0002526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14, "loss": 0.22042277455329895, "memory_gb": 7.721559524536133, "step_time_ms": 7380.727052688599, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:34:45] (step=0000014) Train Loss: 0.2480, Train Steps/Sec: 0.12, Epoch: 0.0002720559657986786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:34:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 15, "loss": 0.2776176333427429, "memory_gb": 7.721559524536133, "step_time_ms": 7288.150310516357, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:34:53] (step=0000015) Train Loss: 0.2466, Train Steps/Sec: 0.13, Epoch: 0.0002914885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 16, "loss": 0.3453969657421112, "memory_gb": 7.721559524536133, "step_time_ms": 7434.518575668335, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:01] (step=0000016) Train Loss: 0.3387, Train Steps/Sec: 0.12, Epoch: 0.00031092110376991836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 17, "loss": 0.36116892099380493, "memory_gb": 7.721559524536133, "step_time_ms": 4840.571165084839, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:07] (step=0000017) Train Loss: 0.3451, Train Steps/Sec: 0.18, Epoch: 0.00033035367275553826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 18, "loss": 0.30594250559806824, "memory_gb": 7.721559524536133, "step_time_ms": 7408.357620239258, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:15] (step=0000018) Train Loss: 0.2587, Train Steps/Sec: 0.12, Epoch: 0.00034978624174115817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 19, "loss": 0.30189570784568787, "memory_gb": 7.721559524536133, "step_time_ms": 7318.52388381958, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:23] (step=0000019) Train Loss: 0.2774, Train Steps/Sec: 0.13, Epoch: 0.00036921881072677807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 20, "loss": 0.3217572867870331, "memory_gb": 7.721559524536133, "step_time_ms": 7345.236301422119, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:31] (step=0000020) Train Loss: 0.2578, Train Steps/Sec: 0.13, Epoch: 0.000388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 21, "loss": 0.24542313814163208, "memory_gb": 7.721559524536133, "step_time_ms": 7370.414733886719, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:39] (step=0000021) Train Loss: 0.2792, Train Steps/Sec: 0.13, Epoch: 0.0004080839486980179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 22, "loss": 0.2606618404388428, "memory_gb": 7.721559524536133, "step_time_ms": 7307.555913925171, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:47] (step=0000022) Train Loss: 0.2004, Train Steps/Sec: 0.13, Epoch: 0.0004275165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:35:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 23, "loss": 0.2839497923851013, "memory_gb": 7.715639114379883, "step_time_ms": 7339.513063430786, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:35:55] (step=0000023) Train Loss: 0.2997, Train Steps/Sec: 0.12, Epoch: 0.0004469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 24, "loss": 0.26701676845550537, "memory_gb": 7.715639114379883, "step_time_ms": 7360.454559326172, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:03] (step=0000024) Train Loss: 0.2329, Train Steps/Sec: 0.12, Epoch: 0.0004663816556548776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 25, "loss": 0.24265585839748383, "memory_gb": 7.721559524536133, "step_time_ms": 7268.307447433472, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:11] (step=0000025) Train Loss: 0.2283, Train Steps/Sec: 0.13, Epoch: 0.0004858142246404975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 26, "loss": 0.3512459397315979, "memory_gb": 7.721559524536133, "step_time_ms": 7312.545537948608, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:19] (step=0000026) Train Loss: 0.3158, Train Steps/Sec: 0.13, Epoch: 0.0005052467936261174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 27, "loss": 0.2741711735725403, "memory_gb": 7.721559524536133, "step_time_ms": 7333.219051361084, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:27] (step=0000027) Train Loss: 0.2498, Train Steps/Sec: 0.13, Epoch: 0.0005246793626117373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 28, "loss": 0.19468438625335693, "memory_gb": 7.721559524536133, "step_time_ms": 7275.821685791016, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:35] (step=0000028) Train Loss: 0.2088, Train Steps/Sec: 0.13, Epoch: 0.0005441119315973572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 29, "loss": 0.276754766702652, "memory_gb": 7.721559524536133, "step_time_ms": 7325.8538246154785, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:43] (step=0000029) Train Loss: 0.2548, Train Steps/Sec: 0.13, Epoch: 0.0005635445005829771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 30, "loss": 0.20507220923900604, "memory_gb": 7.721559524536133, "step_time_ms": 7350.601434707642, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:51] (step=0000030) Train Loss: 0.2079, Train Steps/Sec: 0.13, Epoch: 0.000582977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:36:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 31, "loss": 0.2006029188632965, "memory_gb": 7.721559524536133, "step_time_ms": 7320.893049240112, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:36:59] (step=0000031) Train Loss: 0.2442, Train Steps/Sec: 0.12, Epoch: 0.0006024096385542169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:37:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 32, "loss": 0.22050288319587708, "memory_gb": 7.721559524536133, "step_time_ms": 7306.015491485596, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:37:07] (step=0000032) Train Loss: 0.2104, Train Steps/Sec: 0.13, Epoch: 0.0006218422075398367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:37:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 33, "loss": 0.19410480558872223, "memory_gb": 7.721559524536133, "step_time_ms": 7344.637870788574, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:37:15] (step=0000033) Train Loss: 0.2289, Train Steps/Sec: 0.13, Epoch: 0.0006412747765254566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:37:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 34, "loss": 0.29005369544029236, "memory_gb": 7.721559524536133, "step_time_ms": 7266.115427017212, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:37:23] (step=0000034) Train Loss: 0.2110, Train Steps/Sec: 0.13, Epoch: 0.0006607073455110765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:37:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 35, "loss": 0.22233876585960388, "memory_gb": 7.721559524536133, "step_time_ms": 7284.701824188232, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:37:31] (step=0000035) Train Loss: 0.2341, Train Steps/Sec: 0.13, Epoch: 0.0006801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:37:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 36, "loss": 0.27239537239074707, "memory_gb": 7.721559524536133, "step_time_ms": 7329.119443893433, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:37:39] (step=0000036) Train Loss: 0.2340, Train Steps/Sec: 0.13, Epoch: 0.0006995724834823163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:37:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 37, "loss": 0.3029818832874298, "memory_gb": 7.721559524536133, "step_time_ms": 7241.412401199341, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:37:47] (step=0000037) Train Loss: 0.2518, Train Steps/Sec: 0.13, Epoch: 0.0007190050524679362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:37:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 38, "loss": 0.24444982409477234, "memory_gb": 7.721559524536133, "step_time_ms": 7311.808824539185, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:37:55] (step=0000038) Train Loss: 0.2321, Train Steps/Sec: 0.13, Epoch: 0.0007384376214535561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 39, "loss": 0.1585378348827362, "memory_gb": 7.721559524536133, "step_time_ms": 7388.403654098511, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:03] (step=0000039) Train Loss: 0.1806, Train Steps/Sec: 0.12, Epoch: 0.000757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 40, "loss": 0.31016629934310913, "memory_gb": 7.721559524536133, "step_time_ms": 7294.957399368286, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:11] (step=0000040) Train Loss: 0.2303, Train Steps/Sec: 0.13, Epoch: 0.000777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 41, "loss": 0.2796105146408081, "memory_gb": 7.721559524536133, "step_time_ms": 7320.769309997559, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:19] (step=0000041) Train Loss: 0.2506, Train Steps/Sec: 0.13, Epoch: 0.0007967353284104159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 42, "loss": 0.1274777352809906, "memory_gb": 7.721559524536133, "step_time_ms": 7357.769727706909, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:27] (step=0000042) Train Loss: 0.1779, Train Steps/Sec: 0.13, Epoch: 0.0008161678973960358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 43, "loss": 0.26135003566741943, "memory_gb": 7.721559524536133, "step_time_ms": 7308.342218399048, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:35] (step=0000043) Train Loss: 0.2902, Train Steps/Sec: 0.13, Epoch: 0.0008356004663816557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 44, "loss": 0.3506247103214264, "memory_gb": 7.721559524536133, "step_time_ms": 7212.211847305298, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:43] (step=0000044) Train Loss: 0.2763, Train Steps/Sec: 0.13, Epoch: 0.0008550330353672756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 45, "loss": 0.3411334753036499, "memory_gb": 7.721559524536133, "step_time_ms": 7350.163698196411, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:51] (step=0000045) Train Loss: 0.2446, Train Steps/Sec: 0.12, Epoch: 0.0008744656043528955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:38:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 46, "loss": 0.3309897780418396, "memory_gb": 7.721559524536133, "step_time_ms": 4704.729080200195, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:38:57] (step=0000046) Train Loss: 0.3464, Train Steps/Sec: 0.17, Epoch: 0.0008938981733385154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:39:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 47, "loss": 0.29476481676101685, "memory_gb": 7.721559524536133, "step_time_ms": 7328.320503234863, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:39:05] (step=0000047) Train Loss: 0.2731, Train Steps/Sec: 0.12, Epoch: 0.0009133307423241353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:39:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 48, "loss": 0.17478451132774353, "memory_gb": 7.721559524536133, "step_time_ms": 7070.53804397583, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:39:13] (step=0000048) Train Loss: 0.2790, Train Steps/Sec: 0.13, Epoch: 0.0009327633113097552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 49, "loss": 0.26767200231552124, "memory_gb": 7.721559524536133, "step_time_ms": 7352.406024932861, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:39:21] (step=0000049) Train Loss: 0.2611, Train Steps/Sec: 0.12, Epoch: 0.0009521958802953751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:39:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 50, "loss": 0.24415263533592224, "memory_gb": 7.721559524536133, "step_time_ms": 7365.142822265625, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:39:29] (step=0000050) Train Loss: 0.2063, Train Steps/Sec: 0.12, Epoch: 0.000971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:39:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 51, "loss": 0.2496618777513504, "memory_gb": 7.721559524536133, "step_time_ms": 7247.650623321533, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:39:37] (step=0000051) Train Loss: 0.2809, Train Steps/Sec: 0.13, Epoch: 0.000991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:39:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 52, "loss": 0.20441031455993652, "memory_gb": 7.721559524536133, "step_time_ms": 7448.822498321533, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:39:45] (step=0000052) Train Loss: 0.2364, Train Steps/Sec: 0.12, Epoch: 0.0010104935872522348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:39:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 53, "loss": 0.1727057248353958, "memory_gb": 7.721559524536133, "step_time_ms": 7385.494232177734, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:39:53] (step=0000053) Train Loss: 0.1468, Train Steps/Sec: 0.12, Epoch: 0.0010299261562378547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 54, "loss": 0.2861213982105255, "memory_gb": 7.721559524536133, "step_time_ms": 7276.334047317505, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:01] (step=0000054) Train Loss: 0.2884, Train Steps/Sec: 0.13, Epoch: 0.0010493587252234746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 55, "loss": 0.1048416867852211, "memory_gb": 7.721559524536133, "step_time_ms": 7318.175315856934, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:09] (step=0000055) Train Loss: 0.1672, Train Steps/Sec: 0.13, Epoch: 0.0010687912942090945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 56, "loss": 0.21235138177871704, "memory_gb": 7.721559524536133, "step_time_ms": 7366.719007492065, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:17] (step=0000056) Train Loss: 0.2707, Train Steps/Sec: 0.12, Epoch: 0.0010882238631947144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 57, "loss": 0.29895201325416565, "memory_gb": 7.721559524536133, "step_time_ms": 7311.45167350769, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:25] (step=0000057) Train Loss: 0.2791, Train Steps/Sec: 0.12, Epoch: 0.0011076564321803343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 58, "loss": 0.20891699194908142, "memory_gb": 7.721559524536133, "step_time_ms": 7353.089809417725, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:33] (step=0000058) Train Loss: 0.2447, Train Steps/Sec: 0.12, Epoch: 0.0011270890011659542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 59, "loss": 0.17378917336463928, "memory_gb": 7.721559524536133, "step_time_ms": 7334.122657775879, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:41] (step=0000059) Train Loss: 0.1777, Train Steps/Sec: 0.12, Epoch: 0.0011465215701515741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 60, "loss": 0.20007598400115967, "memory_gb": 7.721559524536133, "step_time_ms": 7278.343439102173, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:49] (step=0000060) Train Loss: 0.2483, Train Steps/Sec: 0.12, Epoch: 0.001165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:40:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 61, "loss": 0.22218115627765656, "memory_gb": 7.721559524536133, "step_time_ms": 7313.600540161133, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:40:57] (step=0000061) Train Loss: 0.2447, Train Steps/Sec: 0.13, Epoch: 0.001185386708122814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:41:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 62, "loss": 0.19290706515312195, "memory_gb": 7.721559524536133, "step_time_ms": 7380.833387374878, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:41:05] (step=0000062) Train Loss: 0.2287, Train Steps/Sec: 0.12, Epoch: 0.0012048192771084338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:41:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 63, "loss": 0.31826457381248474, "memory_gb": 7.721559524536133, "step_time_ms": 7273.633003234863, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:41:13] (step=0000063) Train Loss: 0.3035, Train Steps/Sec: 0.13, Epoch: 0.0012242518460940535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:41:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 64, "loss": 0.27593812346458435, "memory_gb": 7.721559524536133, "step_time_ms": 7296.91219329834, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:41:21] (step=0000064) Train Loss: 0.2964, Train Steps/Sec: 0.13, Epoch: 0.0012436844150796734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:41:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 65, "loss": 0.3090028166770935, "memory_gb": 7.721559524536133, "step_time_ms": 7331.291437149048, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:41:29] (step=0000065) Train Loss: 0.3104, Train Steps/Sec: 0.13, Epoch: 0.0012631169840652933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:41:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 66, "loss": 0.2858107089996338, "memory_gb": 7.721559524536133, "step_time_ms": 7245.896577835083, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:41:37] (step=0000066) Train Loss: 0.2563, Train Steps/Sec: 0.13, Epoch: 0.0012825495530509132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:41:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 67, "loss": 0.3679664731025696, "memory_gb": 7.721559524536133, "step_time_ms": 7332.486152648926, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:41:45] (step=0000067) Train Loss: 0.2613, Train Steps/Sec: 0.12, Epoch: 0.0013019821220365331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:41:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 68, "loss": 0.3386029601097107, "memory_gb": 7.721559524536133, "step_time_ms": 7328.211784362793, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:41:53] (step=0000068) Train Loss: 0.3092, Train Steps/Sec: 0.13, Epoch: 0.001321414691022153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 69, "loss": 0.16646558046340942, "memory_gb": 7.721559524536133, "step_time_ms": 7270.93768119812, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:01] (step=0000069) Train Loss: 0.2249, Train Steps/Sec: 0.12, Epoch: 0.001340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 70, "loss": 0.25415414571762085, "memory_gb": 7.721559524536133, "step_time_ms": 7356.194734573364, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:09] (step=0000070) Train Loss: 0.2275, Train Steps/Sec: 0.12, Epoch: 0.0013602798289933929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 71, "loss": 0.2261860966682434, "memory_gb": 7.721559524536133, "step_time_ms": 7372.807264328003, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:17] (step=0000071) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.0013797123979790128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 72, "loss": 0.08817970752716064, "memory_gb": 7.721559524536133, "step_time_ms": 7298.287153244019, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:25] (step=0000072) Train Loss: 0.1505, Train Steps/Sec: 0.13, Epoch: 0.0013991449669646327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 73, "loss": 0.2589040994644165, "memory_gb": 7.721559524536133, "step_time_ms": 7136.971473693848, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:33] (step=0000073) Train Loss: 0.2770, Train Steps/Sec: 0.13, Epoch: 0.0014185775359502526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 74, "loss": 0.19248425960540771, "memory_gb": 7.721559524536133, "step_time_ms": 7335.424184799194, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:41] (step=0000074) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.0014380101049358725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 75, "loss": 0.2527018189430237, "memory_gb": 7.721559524536133, "step_time_ms": 4975.586175918579, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:47] (step=0000075) Train Loss: 0.2614, Train Steps/Sec: 0.18, Epoch: 0.0014574426739214924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:42:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 76, "loss": 0.23065751791000366, "memory_gb": 7.721559524536133, "step_time_ms": 7346.792221069336, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:42:55] (step=0000076) Train Loss: 0.3057, Train Steps/Sec: 0.13, Epoch: 0.0014768752429071123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:43:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 77, "loss": 0.3667227029800415, "memory_gb": 7.721559524536133, "step_time_ms": 7266.467809677124, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:43:03] (step=0000077) Train Loss: 0.3322, Train Steps/Sec: 0.13, Epoch: 0.0014963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:43:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 78, "loss": 0.2741358280181885, "memory_gb": 7.721559524536133, "step_time_ms": 7359.1368198394775, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:43:11] (step=0000078) Train Loss: 0.2199, Train Steps/Sec: 0.12, Epoch: 0.001515740380878352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:43:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 79, "loss": 0.3065854609012604, "memory_gb": 7.721559524536133, "step_time_ms": 7342.353820800781, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:43:19] (step=0000079) Train Loss: 0.2780, Train Steps/Sec: 0.13, Epoch: 0.001535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:43:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 80, "loss": 0.3938297629356384, "memory_gb": 7.721559524536133, "step_time_ms": 7312.790155410767, "trainable_params": 4718592, "method": "lora"}
[2025-07-28
13:43:27] (step=0000080) Train Loss: 0.3124, Train Steps/Sec: 0.13, Epoch: 0.001554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:43:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 81, "loss": 0.2216058075428009, "memory_gb": 7.721559524536133, "step_time_ms": 7316.799163818359, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:43:35] (step=0000081) Train Loss: 0.2007, Train Steps/Sec: 0.13, Epoch: 0.0015740380878352118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:43:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 82, "loss": 0.30646809935569763, "memory_gb": 7.721559524536133, "step_time_ms": 7318.16291809082, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:43:43] (step=0000082) Train Loss: 0.3083, Train Steps/Sec: 0.12, Epoch: 0.0015934706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:43:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 83, "loss": 0.24364686012268066, "memory_gb": 7.721559524536133, "step_time_ms": 7280.68470954895, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:43:51] (step=0000083) Train Loss: 0.2166, Train Steps/Sec: 0.13, Epoch: 0.0016129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:43:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 84, "loss": 0.26301535964012146, "memory_gb": 7.721559524536133, "step_time_ms": 7354.51078414917, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:43:59] (step=0000084) Train Loss: 0.2493, Train Steps/Sec: 0.12, Epoch: 0.0016323357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:44:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 85, "loss": 0.23089717328548431, "memory_gb": 7.721559524536133, "step_time_ms": 7336.664199829102, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:44:07] (step=0000085) Train Loss: 0.2390, Train Steps/Sec: 0.12, Epoch: 0.0016517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:44:15] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 86, "loss": 0.34666407108306885, "memory_gb": 7.721559524536133, "step_time_ms": 7281.517267227173, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:44:15] (step=0000086) Train Loss: 0.3031, Train Steps/Sec: 0.13, Epoch: 0.0016712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:44:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 87, "loss": 0.2170710563659668, "memory_gb": 7.721559524536133, "step_time_ms": 7346.3239669799805, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:44:23] (step=0000087) Train Loss: 0.2220, Train Steps/Sec: 0.12, Epoch: 0.0016906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:44:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 88, "loss": 0.21945631504058838, "memory_gb": 7.721559524536133, "step_time_ms": 7312.156200408936, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:44:31] (step=0000088) Train Loss: 0.2181, Train Steps/Sec: 0.12, Epoch: 0.0017100660707345511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:44:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 89, "loss": 0.27434325218200684, "memory_gb": 7.721559524536133, "step_time_ms": 7262.401103973389, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:44:39] (step=0000089) Train Loss: 0.2178, Train Steps/Sec: 0.13, Epoch: 0.001729498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:44:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 90, "loss": 0.2061372697353363, "memory_gb": 7.721559524536133, "step_time_ms": 7321.6400146484375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:44:47] (step=0000090) Train Loss: 0.1915, Train Steps/Sec: 0.12, Epoch: 0.001748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:44:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 91, "loss": 0.19284819066524506, "memory_gb": 7.721559524536133, "step_time_ms": 7353.950262069702, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 13:44:55] (step=0000091) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.0017683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:45:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 92, "loss": 0.22794130444526672, "memory_gb": 7.721559524536133, "step_time_ms": 7274.765491485596, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:45:03] (step=0000092) Train Loss: 0.2178, Train Steps/Sec: 0.13, Epoch: 0.0017877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 93, "loss": 0.3501286208629608, "memory_gb": 7.721559524536133, "step_time_ms": 7454.888105392456, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:45:11] (step=0000093) Train Loss: 0.3297, Train Steps/Sec: 0.13, Epoch: 0.0018072289156626507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:45:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 94, "loss": 0.1758650839328766, "memory_gb": 7.721559524536133, "step_time_ms": 7375.859022140503, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:45:19] (step=0000094) Train Loss: 0.1930, Train Steps/Sec: 0.12, Epoch: 0.0018266614846482706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:45:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 95, "loss": 0.2662169933319092, "memory_gb": 7.721559524536133, "step_time_ms": 7274.9621868133545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:45:27] (step=0000095) Train Loss: 0.2422, Train Steps/Sec: 0.13, Epoch: 0.0018460940536338905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 13:45:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 96, "loss": 0.19380193948745728, "memory_gb": 7.721559524536133, "step_time_ms": 7294.976234436035, "trainable_params": 4718592, "method": "lora"} [2025-07-28 13:45:35] (step=0000096) Train Loss: 0.2456, Train Steps/Sec: 0.13, Epoch: 0.0018655266226195104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 13:45:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 97, "loss": 0.3246355652809143, "memory_gb": 7.721559524536133, "step_time_ms": 7341.218709945679, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:45:43] (step=0000097) Train Loss: 0.2525, Train Steps/Sec: 0.13, Epoch: 0.0018849591916051303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:45:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 98, "loss": 0.23764875531196594, "memory_gb": 7.721559524536133, "step_time_ms": 7289.5143032073975, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:45:51] (step=0000098) Train Loss: 0.2107, Train Steps/Sec: 0.13, Epoch: 0.0019043917605907502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:45:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 99, "loss": 0.253124475479126, "memory_gb": 7.721559524536133, "step_time_ms": 7306.342124938965, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:45:59] (step=0000099) Train Loss: 0.2924, Train Steps/Sec: 0.13, Epoch: 0.00192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:46:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 100, "loss": 0.19019249081611633, "memory_gb": 7.721559524536133, "step_time_ms": 7348.460674285889, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:46:07] (step=0000100) Train Loss: 0.2130, Train Steps/Sec: 0.12, Epoch: 0.00194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:46:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 101, "loss": 0.2807874083518982, "memory_gb": 7.721559524536133, "step_time_ms": 7306.769847869873, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:46:15] (step=0000101) Train Loss: 0.2616, Train Steps/Sec: 0.12, Epoch: 0.0019626894675476097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:46:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 102, "loss": 0.2293873280286789, "memory_gb": 7.721559524536133, "step_time_ms": 7203.951358795166, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:46:23] (step=0000102) Train Loss: 0.2564, Train Steps/Sec: 0.13, Epoch: 0.00198212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:46:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 103, "loss": 0.18360741436481476, "memory_gb": 7.721559524536133, "step_time_ms": 7333.9784145355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:46:31] (step=0000103) Train Loss: 0.1679, Train Steps/Sec: 0.13, Epoch: 0.0020015546055188495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:46:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 104, "loss": 0.3066064715385437, "memory_gb": 7.721559524536133, "step_time_ms": 5140.095949172974, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:46:36] (step=0000104) Train Loss: 0.2926, Train Steps/Sec: 0.19, Epoch: 0.0020209871745044696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:46:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 105, "loss": 0.28633955121040344, "memory_gb": 7.721559524536133, "step_time_ms": 7255.064964294434, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:46:44] (step=0000105) Train Loss: 0.2603, Train Steps/Sec: 0.13, Epoch: 0.0020404197434900893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:46:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 106, "loss": 0.11875084042549133, "memory_gb": 7.721559524536133, "step_time_ms": 7255.497455596924, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:46:52] (step=0000106) Train Loss: 0.1268, Train Steps/Sec: 0.13, Epoch: 0.0020598523124757094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 107, "loss": 0.35940033197402954, "memory_gb": 7.721559524536133, "step_time_ms": 7350.677251815796, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:00] (step=0000107) Train Loss: 0.3208, Train Steps/Sec: 0.12, Epoch: 0.002079284881461329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 108, "loss": 0.2783970534801483, "memory_gb": 7.715639114379883, "step_time_ms": 7261.880397796631, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:08] (step=0000108) Train Loss: 0.2636, Train Steps/Sec: 0.13, Epoch: 0.002098717450446949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 109, "loss": 0.2067691683769226, "memory_gb": 7.721559524536133, "step_time_ms": 7328.466176986694, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:16] (step=0000109) Train Loss: 0.2272, Train Steps/Sec: 0.12, Epoch: 0.002118150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 110, "loss": 0.3715851306915283, "memory_gb": 7.715639114379883, "step_time_ms": 7353.200674057007, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:24] (step=0000110) Train Loss: 0.2863, Train Steps/Sec: 0.12, Epoch: 0.002137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 111, "loss": 0.22938419878482819, "memory_gb": 7.721559524536133, "step_time_ms": 7297.715425491333, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:32] (step=0000111) Train Loss: 0.2821, Train Steps/Sec: 0.12, Epoch: 0.0021570151574038087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 112, "loss": 0.2609632611274719, "memory_gb": 7.721559524536133, "step_time_ms": 7271.2907791137695, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:40] (step=0000112) Train Loss: 0.2548, Train Steps/Sec: 0.13, Epoch: 0.002176447726389429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 113, "loss": 0.3409082591533661, "memory_gb": 7.721559524536133, "step_time_ms": 7405.972719192505, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:48] (step=0000113) Train Loss: 0.2576, Train Steps/Sec: 0.12, Epoch: 0.0021958802953750485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:47:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 114, "loss": 0.24031144380569458, "memory_gb": 7.721559524536133, "step_time_ms": 7275.982856750488, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:47:56] (step=0000114) Train Loss: 0.2227, Train Steps/Sec: 0.13, Epoch: 0.0022153128643606686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:48:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 115, "loss": 0.4184340238571167, "memory_gb": 7.721559524536133, "step_time_ms": 7060.636281967163, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:48:04] (step=0000115) Train Loss: 0.3517, Train Steps/Sec: 0.13, Epoch: 0.0022347454333462883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:48:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 116, "loss": 0.28038355708122253, "memory_gb": 7.721559524536133, "step_time_ms": 7360.918998718262, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:48:12] (step=0000116) Train Loss: 0.2835, Train Steps/Sec: 0.12, Epoch: 0.0022541780023319084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:48:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 117, "loss": 0.34224995970726013, "memory_gb": 7.721559524536133, "step_time_ms": 7294.541120529175, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:48:20] (step=0000117) Train Loss: 0.2965, Train Steps/Sec: 0.13, Epoch: 0.002273610571317528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 118, "loss": 0.366329550743103, "memory_gb": 7.721559524536133, "step_time_ms": 7301.880836486816, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:48:28] (step=0000118) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.0022930431403031483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:48:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 119, "loss": 0.29994696378707886, "memory_gb": 7.721559524536133, "step_time_ms": 7360.568046569824, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:48:36] (step=0000119) Train Loss: 0.2832, Train Steps/Sec: 0.12, Epoch: 0.002312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:48:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 120, "loss": 0.2637939155101776, "memory_gb": 7.721559524536133, "step_time_ms": 7375.704526901245, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:48:44] (step=0000120) Train Loss: 0.2874, Train Steps/Sec: 0.12, Epoch: 0.002331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:48:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 121, "loss": 0.17858180403709412, "memory_gb": 7.721559524536133, "step_time_ms": 7293.493747711182, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:48:52] (step=0000121) Train Loss: 0.2376, Train Steps/Sec: 0.13, Epoch: 0.0023513408472600078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 122, "loss": 0.23799151182174683, "memory_gb": 7.721559524536133, "step_time_ms": 7467.322111129761, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:00] (step=0000122) Train Loss: 0.2774, Train Steps/Sec: 0.12, Epoch: 0.002370773416245628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 123, "loss": 0.26118600368499756, "memory_gb": 7.721559524536133, "step_time_ms": 7353.910446166992, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:08] (step=0000123) Train Loss: 0.2391, Train Steps/Sec: 0.13, Epoch: 0.0023902059852312476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 124, "loss": 0.29710155725479126, "memory_gb": 7.721559524536133, "step_time_ms": 7299.256086349487, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:16] (step=0000124) Train Loss: 0.3058, Train Steps/Sec: 0.13, Epoch: 0.0024096385542168677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 125, "loss": 0.3717457056045532, "memory_gb": 7.721559524536133, "step_time_ms": 7411.109685897827, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:24] (step=0000125) Train Loss: 0.3103, Train Steps/Sec: 0.12, Epoch: 0.0024290711232024874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 126, "loss": 0.179173082113266, "memory_gb": 7.721559524536133, "step_time_ms": 7331.1216831207275, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:32] (step=0000126) Train Loss: 0.2002, Train Steps/Sec: 0.13, Epoch: 0.002448503692188107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 127, "loss": 0.2878264784812927, "memory_gb": 7.721559524536133, "step_time_ms": 7344.172716140747, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:41] (step=0000127) Train Loss: 0.2624, Train Steps/Sec: 0.12, Epoch: 0.002467936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 128, "loss": 0.2480638027191162, "memory_gb": 7.721559524536133, "step_time_ms": 7426.483154296875, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:49] (step=0000128) Train Loss: 0.2186, Train Steps/Sec: 0.12, Epoch: 0.002487368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:49:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 129, "loss": 0.2258828729391098, "memory_gb": 7.721559524536133, "step_time_ms": 7329.319953918457, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:49:57] (step=0000129) Train Loss: 0.2200, Train Steps/Sec: 0.13, Epoch: 0.002506801399144967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 13:50:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 130, "loss": 0.23347386717796326, "memory_gb": 7.721559524536133, "step_time_ms": 7297.749042510986, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 13:50:05] (step=0000130) Train Loss: 0.2973, Train Steps/Sec: 0.13, Epoch: 0.0025262339681305867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 14:52:25] Found latest checkpoint at /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0002000
[2025-07-28 14:52:25] Experiment directory created at /nvme-data/Komal/documents/results/VisualCloze/lora/depth
[2025-07-28 14:52:26] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 14:52:26] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 17:56:10] Found latest checkpoint at /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0002000
[2025-07-28 17:56:10] Experiment directory created at /nvme-data/Komal/documents/results/VisualCloze/lora/depth
[2025-07-28 17:56:10] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 17:56:10] Downloaded model to /nvme-data/Komal/huggingface/hub/models--Shitao--OmniGen-v1/snapshots/58e249c7c7634423c0ba41c34a774af79aa87889
[2025-07-28 17:56:55] Trainable parameters: 4,718,592 (lora)
[2025-07-28 17:56:55] Total parameters in the model: 3,762,072,592 (lora)
[2025-07-28 17:57:19] Dataset contains 205,841
[2025-07-28 17:57:21] Training for 2000 epochs...
[2025-07-28 17:57:21] Beginning epoch 0...
[2025-07-28 17:57:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1, "loss": 0.3330778181552887, "memory_gb": 7.721559047698975, "step_time_ms": 9019.688844680786, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:57:32] (step=0000001) Train Loss: 0.3240, Train Steps/Sec: 0.09, Epoch: 1.9432568985619897e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:57:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2, "loss": 0.26144829392433167, "memory_gb": 7.721559524536133, "step_time_ms": 7268.237113952637, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:57:40] (step=0000002) Train Loss: 0.2849, Train Steps/Sec: 0.13, Epoch: 3.8865137971239795e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:57:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3, "loss": 0.213661327958107, "memory_gb": 7.721559524536133, "step_time_ms": 6029.462099075317, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:57:46] (step=0000003) Train Loss: 0.2430, Train Steps/Sec: 0.16, Epoch: 5.82977069568597e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:57:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4, "loss": 0.2625368535518646, "memory_gb": 7.721559524536133, "step_time_ms": 7206.452131271362, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:57:54] (step=0000004) Train Loss: 0.2327, Train Steps/Sec: 0.13, Epoch: 7.773027594247959e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 5, "loss": 0.23385097086429596, "memory_gb": 7.721559524536133, "step_time_ms": 7389.6284103393555, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:02] (step=0000005) Train Loss: 0.2515, Train Steps/Sec: 0.13, Epoch: 9.71628449280995e-05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6, "loss": 0.3664487600326538, "memory_gb": 7.721559524536133, "step_time_ms": 7521.553993225098, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:10] (step=0000006) Train Loss: 0.3414, Train Steps/Sec: 0.12, Epoch: 0.0001165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7, "loss": 0.2240256667137146, "memory_gb": 7.721559524536133, "step_time_ms": 7401.164293289185, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:18] (step=0000007) Train Loss: 0.2469, Train Steps/Sec: 0.13, Epoch: 0.0001360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8, "loss": 0.2708398401737213, "memory_gb": 7.721559524536133, "step_time_ms": 7421.164274215698, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:26] (step=0000008) Train Loss: 0.2525, Train Steps/Sec: 0.13, Epoch: 0.00015546055188495918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9, "loss": 0.2696707248687744, "memory_gb": 7.721559524536133, "step_time_ms": 7459.409475326538, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:34] (step=0000009) Train Loss: 0.2738, Train Steps/Sec: 0.12, Epoch: 0.00017489312087057908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10, "loss": 0.28505879640579224, "memory_gb": 7.721559524536133, "step_time_ms": 7425.406694412231, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:42] (step=0000010) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.000194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11, "loss": 0.2416115254163742, "memory_gb": 7.721559524536133, "step_time_ms": 7421.3385581970215, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:50] (step=0000011) Train Loss: 0.2875, Train Steps/Sec: 0.12, Epoch: 0.0002137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:58:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12, "loss": 0.1734612137079239, "memory_gb": 7.721559524536133, "step_time_ms": 7479.779243469238, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:58:58] (step=0000012) Train Loss: 0.2364, Train Steps/Sec: 0.12, Epoch: 0.0002331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:59:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13, "loss": 0.13270074129104614, "memory_gb": 7.721559524536133, "step_time_ms": 7394.507884979248, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:59:06] (step=0000013) Train Loss: 0.1588, Train Steps/Sec: 0.12, Epoch: 0.0002526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:59:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14, "loss": 0.23944473266601562, "memory_gb": 7.721559524536133, "step_time_ms": 7374.418258666992, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:59:14] (step=0000014) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.0002720559657986786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:59:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15, "loss": 0.36665594577789307, "memory_gb": 7.721559524536133, "step_time_ms": 7452.602863311768, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:59:22] (step=0000015) Train Loss: 0.2900, Train Steps/Sec: 0.12, Epoch: 0.0002914885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:59:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 16, "loss": 0.22666992247104645, "memory_gb": 7.721559524536133, "step_time_ms": 7393.158197402954, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:59:30] (step=0000016) Train Loss: 0.2398, Train Steps/Sec: 0.12, Epoch: 0.00031092110376991836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:59:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 17, "loss": 0.16857081651687622, "memory_gb": 7.721559524536133, "step_time_ms": 7426.649808883667, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:59:38] (step=0000017) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.00033035367275553826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 18, "loss": 0.18646341562271118, "memory_gb": 7.721559524536133, "step_time_ms": 7476.085424423218, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:59:46] (step=0000018) Train Loss: 0.1888, Train Steps/Sec: 0.12, Epoch: 0.00034978624174115817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 17:59:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 19, "loss": 0.17689935863018036, "memory_gb": 7.721559524536133, "step_time_ms": 7373.616933822632, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 17:59:54] (step=0000019) Train Loss: 0.2442, Train Steps/Sec: 0.13, Epoch: 0.00036921881072677807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 20, "loss": 0.3180862069129944, "memory_gb": 7.721559524536133, "step_time_ms": 7372.220754623413, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:02] (step=0000020) Train Loss: 0.3285, Train Steps/Sec: 0.13, Epoch: 0.000388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 21, "loss": 0.2744840681552887, "memory_gb": 7.721559524536133, "step_time_ms": 7443.49479675293, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:10] (step=0000021) Train Loss: 0.2738, Train Steps/Sec: 0.12, Epoch: 0.0004080839486980179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 22, "loss": 0.2160215824842453, "memory_gb": 7.721559524536133, "step_time_ms": 7329.544305801392, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:18] (step=0000022) Train Loss: 0.2921, Train Steps/Sec: 0.13, Epoch: 0.0004275165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 23, "loss": 0.23879870772361755, "memory_gb": 7.721559524536133, "step_time_ms": 7348.877429962158, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:26] (step=0000023) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.0004469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 24, "loss": 0.1784689724445343, "memory_gb": 7.721559524536133, "step_time_ms": 7444.704055786133, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:35] (step=0000024) Train Loss: 0.2268, Train Steps/Sec: 0.12, Epoch: 0.0004663816556548776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 25, "loss": 0.157613605260849, "memory_gb": 7.721559524536133, "step_time_ms": 7396.971940994263, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:43] (step=0000025) Train Loss: 0.1885, Train Steps/Sec: 0.12, Epoch: 0.0004858142246404975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 26, "loss": 0.17819231748580933, "memory_gb": 7.721559524536133, "step_time_ms": 7331.063270568848, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:51] (step=0000026) Train Loss: 0.1564, Train Steps/Sec: 0.12, Epoch: 0.0005052467936261174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:00:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 27, "loss": 0.2888820469379425, "memory_gb": 7.721559524536133, "step_time_ms": 7431.251287460327, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:00:59] (step=0000027) Train Loss: 0.2796, Train Steps/Sec: 0.12, Epoch: 0.0005246793626117373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:01:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 28, "loss": 0.27903884649276733, "memory_gb": 7.721559524536133, "step_time_ms": 7373.578310012817, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:01:07] (step=0000028) Train Loss: 0.2681, Train Steps/Sec: 0.12, Epoch: 0.0005441119315973572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:01:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 29, "loss": 0.20898553729057312, "memory_gb": 7.721559524536133, "step_time_ms": 7314.051389694214, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:01:15] (step=0000029) Train Loss: 0.2348, Train Steps/Sec: 0.13, Epoch: 0.0005635445005829771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:01:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 30, "loss": 0.2869052588939667, "memory_gb": 7.721559524536133, "step_time_ms": 7399.941205978394, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:01:23] (step=0000030) Train Loss: 0.3051, Train Steps/Sec: 0.12, Epoch: 0.000582977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:01:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 31, "loss": 0.32952019572257996, "memory_gb": 7.721559524536133, "step_time_ms": 7165.1976108551025, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:01:30] (step=0000031) Train Loss: 0.2401, Train Steps/Sec: 0.13, Epoch: 0.0006024096385542169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:01:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 32, "loss": 0.28601402044296265, "memory_gb": 7.721559524536133, "step_time_ms": 6333.6451053619385, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:01:37] (step=0000032) Train Loss: 0.2910, Train Steps/Sec: 0.15, Epoch: 0.0006218422075398367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 33, "loss": 0.30704137682914734, "memory_gb": 7.721559524536133, "step_time_ms": 6549.8669147491455, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:01:45] (step=0000033) Train Loss: 0.3081, Train Steps/Sec: 0.14, Epoch: 0.0006412747765254566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:01:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 34, "loss": 0.25054866075515747, "memory_gb": 7.721559524536133, "step_time_ms": 7248.371601104736, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:01:52] (step=0000034) Train Loss: 0.3334, Train Steps/Sec: 0.13, Epoch: 0.0006607073455110765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 35, "loss": 0.2642165422439575, "memory_gb": 7.721559524536133, "step_time_ms": 7326.16662979126, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:00] (step=0000035) Train Loss: 0.2222, Train Steps/Sec: 0.12, Epoch: 0.0006801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 36, "loss": 0.32495564222335815, "memory_gb": 7.721559524536133, "step_time_ms": 7244.161367416382, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:08] (step=0000036) Train Loss: 0.2858, Train Steps/Sec: 0.13, Epoch: 0.0006995724834823163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 37, "loss": 0.18668225407600403, "memory_gb": 7.721559524536133, "step_time_ms": 7280.040740966797, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:16] (step=0000037) Train Loss: 0.2066, Train Steps/Sec: 0.13, Epoch: 0.0007190050524679362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 38, "loss": 0.2454148828983307, "memory_gb": 7.721559524536133, "step_time_ms": 7186.214208602905, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:24] (step=0000038) Train Loss: 0.2406, Train Steps/Sec: 0.13, Epoch: 0.0007384376214535561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 39, "loss": 0.2638227045536041, "memory_gb": 7.721559524536133, "step_time_ms": 7224.343776702881, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:32] (step=0000039) Train Loss: 0.2526, Train Steps/Sec: 0.12, Epoch: 0.000757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 40, "loss": 0.18669509887695312, "memory_gb": 7.721559524536133, "step_time_ms": 7279.682397842407, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:40] (step=0000040) Train Loss: 0.2169, Train Steps/Sec: 0.13, Epoch: 0.000777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 41, "loss": 0.14391997456550598, "memory_gb": 7.721559524536133, "step_time_ms": 7347.792387008667, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:48] (step=0000041) Train Loss: 0.1884, Train Steps/Sec: 0.13, Epoch: 0.0007967353284104159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:02:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 42, "loss": 0.173164501786232, "memory_gb": 7.721559524536133, "step_time_ms": 7281.283140182495, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:02:56] (step=0000042) Train Loss: 0.2069, Train Steps/Sec: 0.13, Epoch: 0.0008161678973960358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:03:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 43, "loss": 0.29831570386886597, "memory_gb": 7.721559524536133, "step_time_ms": 7314.238786697388, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:03:04] (step=0000043) Train Loss: 0.2754, Train Steps/Sec: 0.13, Epoch: 0.0008356004663816557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 18:03:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 44, "loss": 0.28033676743507385, "memory_gb": 7.721559524536133, "step_time_ms": 7337.585926055908, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 18:03:12] (step=0000044) Train Loss: 0.2947, Train Steps/Sec:
0.12, Epoch: 0.0008550330353672756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:03:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 45, "loss": 0.3135179281234741, "memory_gb": 7.721559524536133, "step_time_ms": 7345.356225967407, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:03:20] (step=0000045) Train Loss: 0.2643, Train Steps/Sec: 0.12, Epoch: 0.0008744656043528955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 46, "loss": 0.3593050539493561, "memory_gb": 7.721559524536133, "step_time_ms": 7308.750867843628, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:03:28] (step=0000046) Train Loss: 0.3674, Train Steps/Sec: 0.13, Epoch: 0.0008938981733385154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:03:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 47, "loss": 0.3624342679977417, "memory_gb": 7.721559524536133, "step_time_ms": 7346.603870391846, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:03:36] (step=0000047) Train Loss: 0.3053, Train Steps/Sec: 0.12, Epoch: 0.0009133307423241353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:03:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 48, "loss": 0.21901841461658478, "memory_gb": 7.721559524536133, "step_time_ms": 7313.331604003906, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:03:44] (step=0000048) Train Loss: 0.2496, Train Steps/Sec: 0.12, Epoch: 0.0009327633113097552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:03:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 49, "loss": 0.24436822533607483, "memory_gb": 7.721559524536133, "step_time_ms": 7305.408239364624, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:03:52] (step=0000049) Train Loss: 0.2029, Train Steps/Sec: 0.12, Epoch: 0.0009521958802953751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 50, "loss": 0.22189533710479736, 
"memory_gb": 7.721559524536133, "step_time_ms": 7362.431764602661, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:00] (step=0000050) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.000971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 51, "loss": 0.21140359342098236, "memory_gb": 7.721559524536133, "step_time_ms": 7276.416063308716, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:08] (step=0000051) Train Loss: 0.2408, Train Steps/Sec: 0.12, Epoch: 0.000991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 52, "loss": 0.20585963129997253, "memory_gb": 7.721559524536133, "step_time_ms": 7422.662973403931, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:16] (step=0000052) Train Loss: 0.2302, Train Steps/Sec: 0.13, Epoch: 0.0010104935872522348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 53, "loss": 0.19354157149791718, "memory_gb": 7.721559524536133, "step_time_ms": 7355.3009033203125, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:24] (step=0000053) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.0010299261562378547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 54, "loss": 0.23527032136917114, "memory_gb": 7.721559524536133, "step_time_ms": 7279.905557632446, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:32] (step=0000054) Train Loss: 0.2599, Train Steps/Sec: 0.13, Epoch: 0.0010493587252234746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 55, "loss": 0.17002452909946442, "memory_gb": 7.721559524536133, "step_time_ms": 7339.089155197144, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:41] (step=0000055) Train Loss: 0.1971, 
Train Steps/Sec: 0.12, Epoch: 0.0010687912942090945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 56, "loss": 0.2639520764350891, "memory_gb": 7.721559524536133, "step_time_ms": 7343.621015548706, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:49] (step=0000056) Train Loss: 0.2954, Train Steps/Sec: 0.12, Epoch: 0.0010882238631947144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 57, "loss": 0.30389857292175293, "memory_gb": 7.721559524536133, "step_time_ms": 7242.013692855835, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:04:57] (step=0000057) Train Loss: 0.3050, Train Steps/Sec: 0.13, Epoch: 0.0011076564321803343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 58, "loss": 0.29223495721817017, "memory_gb": 7.721559524536133, "step_time_ms": 7274.834394454956, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:05] (step=0000058) Train Loss: 0.2523, Train Steps/Sec: 0.13, Epoch: 0.0011270890011659542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 59, "loss": 0.23104557394981384, "memory_gb": 7.721559524536133, "step_time_ms": 7361.202239990234, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:13] (step=0000059) Train Loss: 0.2554, Train Steps/Sec: 0.12, Epoch: 0.0011465215701515741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 60, "loss": 0.285961389541626, "memory_gb": 7.721559524536133, "step_time_ms": 7139.707803726196, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:21] (step=0000060) Train Loss: 0.2371, Train Steps/Sec: 0.13, Epoch: 0.001165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 61, "loss": 
0.2517350912094116, "memory_gb": 7.721559524536133, "step_time_ms": 6724.430084228516, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:28] (step=0000061) Train Loss: 0.2328, Train Steps/Sec: 0.14, Epoch: 0.001185386708122814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 62, "loss": 0.1621696650981903, "memory_gb": 7.721559524536133, "step_time_ms": 5630.7783126831055, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:34] (step=0000062) Train Loss: 0.2260, Train Steps/Sec: 0.16, Epoch: 0.0012048192771084338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 63, "loss": 0.3252037465572357, "memory_gb": 7.721559524536133, "step_time_ms": 7319.035768508911, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:42] (step=0000063) Train Loss: 0.2420, Train Steps/Sec: 0.12, Epoch: 0.0012242518460940535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 64, "loss": 0.20036444067955017, "memory_gb": 7.721559524536133, "step_time_ms": 7343.780040740967, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:50] (step=0000064) Train Loss: 0.2784, Train Steps/Sec: 0.12, Epoch: 0.0012436844150796734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 65, "loss": 0.3020925521850586, "memory_gb": 7.721559524536133, "step_time_ms": 7270.732879638672, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:05:58] (step=0000065) Train Loss: 0.2889, Train Steps/Sec: 0.13, Epoch: 0.0012631169840652933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:06:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 66, "loss": 0.16955271363258362, "memory_gb": 7.721559524536133, "step_time_ms": 7294.097900390625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:06:06] (step=0000066) Train 
Loss: 0.2154, Train Steps/Sec: 0.12, Epoch: 0.0012825495530509132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:06:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 67, "loss": 0.35270559787750244, "memory_gb": 7.721559524536133, "step_time_ms": 7333.1544399261475, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:06:14] (step=0000067) Train Loss: 0.3211, Train Steps/Sec: 0.12, Epoch: 0.0013019821220365331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:06:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 68, "loss": 0.23752541840076447, "memory_gb": 7.721559524536133, "step_time_ms": 7264.851808547974, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:06:22] (step=0000068) Train Loss: 0.2511, Train Steps/Sec: 0.13, Epoch: 0.001321414691022153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 69, "loss": 0.16972634196281433, "memory_gb": 7.721559524536133, "step_time_ms": 7270.195960998535, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:06:30] (step=0000069) Train Loss: 0.1563, Train Steps/Sec: 0.12, Epoch: 0.001340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:06:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 70, "loss": 0.157131165266037, "memory_gb": 7.721559524536133, "step_time_ms": 7348.433494567871, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:06:38] (step=0000070) Train Loss: 0.2052, Train Steps/Sec: 0.12, Epoch: 0.0013602798289933929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:06:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 71, "loss": 0.20889770984649658, "memory_gb": 7.721559524536133, "step_time_ms": 7252.447843551636, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:06:46] (step=0000071) Train Loss: 0.2428, Train Steps/Sec: 0.13, Epoch: 0.0013797123979790128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:06:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 
72, "loss": 0.16898460686206818, "memory_gb": 7.721559524536133, "step_time_ms": 7297.110795974731, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:06:54] (step=0000072) Train Loss: 0.2241, Train Steps/Sec: 0.13, Epoch: 0.0013991449669646327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 73, "loss": 0.21221525967121124, "memory_gb": 7.721559524536133, "step_time_ms": 7145.309925079346, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:02] (step=0000073) Train Loss: 0.2429, Train Steps/Sec: 0.13, Epoch: 0.0014185775359502526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 74, "loss": 0.17310887575149536, "memory_gb": 7.721559524536133, "step_time_ms": 7277.789115905762, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:10] (step=0000074) Train Loss: 0.2569, Train Steps/Sec: 0.13, Epoch: 0.0014380101049358725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 75, "loss": 0.19082382321357727, "memory_gb": 7.721559524536133, "step_time_ms": 7362.671613693237, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:18] (step=0000075) Train Loss: 0.2323, Train Steps/Sec: 0.12, Epoch: 0.0014574426739214924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 76, "loss": 0.11717362701892853, "memory_gb": 7.721559524536133, "step_time_ms": 7304.036378860474, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:26] (step=0000076) Train Loss: 0.1867, Train Steps/Sec: 0.13, Epoch: 0.0014768752429071123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 77, "loss": 0.2594881057739258, "memory_gb": 7.721559524536133, "step_time_ms": 7259.358167648315, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:33] 
(step=0000077) Train Loss: 0.2350, Train Steps/Sec: 0.13, Epoch: 0.0014963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 78, "loss": 0.3203013241291046, "memory_gb": 7.721559524536133, "step_time_ms": 7415.162086486816, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:41] (step=0000078) Train Loss: 0.2827, Train Steps/Sec: 0.13, Epoch: 0.001515740380878352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 79, "loss": 0.30135849118232727, "memory_gb": 7.721559524536133, "step_time_ms": 7366.073131561279, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:49] (step=0000079) Train Loss: 0.2977, Train Steps/Sec: 0.13, Epoch: 0.001535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:07:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 80, "loss": 0.22141751646995544, "memory_gb": 7.721559524536133, "step_time_ms": 7336.262464523315, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:07:58] (step=0000080) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.001554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:08:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 81, "loss": 0.23520970344543457, "memory_gb": 7.721559524536133, "step_time_ms": 7446.11382484436, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:08:06] (step=0000081) Train Loss: 0.2094, Train Steps/Sec: 0.12, Epoch: 0.0015740380878352118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 82, "loss": 0.26623839139938354, "memory_gb": 7.721559524536133, "step_time_ms": 7376.085042953491, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:08:14] (step=0000082) Train Loss: 0.2693, Train Steps/Sec: 0.12, Epoch: 0.0015934706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:08:22] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 83, "loss": 0.41986897587776184, "memory_gb": 7.721559524536133, "step_time_ms": 7309.287786483765, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:08:22] (step=0000083) Train Loss: 0.3895, Train Steps/Sec: 0.12, Epoch: 0.0016129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:08:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 84, "loss": 0.29202207922935486, "memory_gb": 7.721559524536133, "step_time_ms": 7416.878223419189, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:08:30] (step=0000084) Train Loss: 0.2960, Train Steps/Sec: 0.12, Epoch: 0.0016323357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:08:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 85, "loss": 0.16525563597679138, "memory_gb": 7.721559524536133, "step_time_ms": 7394.491910934448, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:08:38] (step=0000085) Train Loss: 0.1934, Train Steps/Sec: 0.12, Epoch: 0.0016517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:08:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 86, "loss": 0.23741689324378967, "memory_gb": 7.721559524536133, "step_time_ms": 7340.601682662964, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:08:46] (step=0000086) Train Loss: 0.3027, Train Steps/Sec: 0.12, Epoch: 0.0016712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 87, "loss": 0.2144116461277008, "memory_gb": 7.721559524536133, "step_time_ms": 7421.806335449219, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:08:54] (step=0000087) Train Loss: 0.2029, Train Steps/Sec: 0.13, Epoch: 0.0016906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 88, "loss": 0.18878108263015747, "memory_gb": 7.721559524536133, "step_time_ms": 7449.177503585815, "trainable_params": 4718592, "method": "lora"} 
[2025-07-28 18:09:02] (step=0000088) Train Loss: 0.1930, Train Steps/Sec: 0.12, Epoch: 0.0017100660707345511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 89, "loss": 0.2583012580871582, "memory_gb": 7.721559524536133, "step_time_ms": 7195.984840393066, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:09:10] (step=0000089) Train Loss: 0.2562, Train Steps/Sec: 0.13, Epoch: 0.001729498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 90, "loss": 0.20873427391052246, "memory_gb": 7.721559524536133, "step_time_ms": 7434.0410232543945, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:09:18] (step=0000090) Train Loss: 0.2109, Train Steps/Sec: 0.12, Epoch: 0.001748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 91, "loss": 0.17256489396095276, "memory_gb": 7.721559524536133, "step_time_ms": 5152.074098587036, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:09:23] (step=0000091) Train Loss: 0.1662, Train Steps/Sec: 0.18, Epoch: 0.0017683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 92, "loss": 0.299679160118103, "memory_gb": 7.721559524536133, "step_time_ms": 7484.609365463257, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:09:31] (step=0000092) Train Loss: 0.3336, Train Steps/Sec: 0.12, Epoch: 0.0017877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 93, "loss": 0.21669533848762512, "memory_gb": 7.721559524536133, "step_time_ms": 7552.677154541016, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:09:40] (step=0000093) Train Loss: 0.2167, Train Steps/Sec: 0.12, Epoch: 0.0018072289156626507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:47] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 94, "loss": 0.23719388246536255, "memory_gb": 7.721559524536133, "step_time_ms": 7340.860366821289, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:09:47] (step=0000094) Train Loss: 0.2163, Train Steps/Sec: 0.13, Epoch: 0.0018266614846482706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:09:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 95, "loss": 0.26218533515930176, "memory_gb": 7.721559524536133, "step_time_ms": 7421.556711196899, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:09:56] (step=0000095) Train Loss: 0.2307, Train Steps/Sec: 0.12, Epoch: 0.0018460940536338905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 96, "loss": 0.247325599193573, "memory_gb": 7.721559524536133, "step_time_ms": 7368.017911911011, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:10:04] (step=0000096) Train Loss: 0.2142, Train Steps/Sec: 0.13, Epoch: 0.0018655266226195104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 97, "loss": 0.1648729145526886, "memory_gb": 7.721559524536133, "step_time_ms": 7342.02241897583, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:10:12] (step=0000097) Train Loss: 0.1778, Train Steps/Sec: 0.12, Epoch: 0.0018849591916051303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 98, "loss": 0.20151910185813904, "memory_gb": 7.721559524536133, "step_time_ms": 7458.954811096191, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:10:20] (step=0000098) Train Loss: 0.1897, Train Steps/Sec: 0.12, Epoch: 0.0019043917605907502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 99, "loss": 0.22304806113243103, "memory_gb": 7.721559524536133, "step_time_ms": 7465.339422225952, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 18:10:28] (step=0000099) Train Loss: 0.2008, Train Steps/Sec: 0.12, Epoch: 0.00192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 100, "loss": 0.20517833530902863, "memory_gb": 7.721559524536133, "step_time_ms": 7430.748701095581, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:10:36] (step=0000100) Train Loss: 0.2335, Train Steps/Sec: 0.13, Epoch: 0.00194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 101, "loss": 0.21643862128257751, "memory_gb": 7.721559524536133, "step_time_ms": 7521.424293518066, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:10:44] (step=0000101) Train Loss: 0.1745, Train Steps/Sec: 0.13, Epoch: 0.0019626894675476097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 102, "loss": 0.18105165660381317, "memory_gb": 7.721559524536133, "step_time_ms": 7465.126276016235, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:10:52] (step=0000102) Train Loss: 0.2001, Train Steps/Sec: 0.13, Epoch: 0.00198212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:10:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 103, "loss": 0.19883757829666138, "memory_gb": 7.721559524536133, "step_time_ms": 7513.422966003418, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:10:59] (step=0000103) Train Loss: 0.2117, Train Steps/Sec: 0.13, Epoch: 0.0020015546055188495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:11:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 104, "loss": 0.275675892829895, "memory_gb": 7.721559524536133, "step_time_ms": 7571.848630905151, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:11:07] (step=0000104) Train Loss: 0.2708, Train Steps/Sec: 0.13, Epoch: 0.0020209871745044696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 18:11:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 105, "loss": 0.28954070806503296, "memory_gb": 7.721559524536133, "step_time_ms": 7591.753005981445, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:11:15] (step=0000105) Train Loss: 0.3008, Train Steps/Sec: 0.13, Epoch: 0.0020404197434900893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:11:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 106, "loss": 0.2715284824371338, "memory_gb": 7.721559524536133, "step_time_ms": 7571.946859359741, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:11:23] (step=0000106) Train Loss: 0.2610, Train Steps/Sec: 0.13, Epoch: 0.0020598523124757094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:11:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 107, "loss": 0.28690552711486816, "memory_gb": 7.721559524536133, "step_time_ms": 7587.555646896362, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:11:31] (step=0000107) Train Loss: 0.2754, Train Steps/Sec: 0.13, Epoch: 0.002079284881461329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:11:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 108, "loss": 0.19000378251075745, "memory_gb": 7.721559524536133, "step_time_ms": 7564.434289932251, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:11:39] (step=0000108) Train Loss: 0.2715, Train Steps/Sec: 0.13, Epoch: 0.002098717450446949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:11:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 109, "loss": 0.3530789315700531, "memory_gb": 7.721559524536133, "step_time_ms": 7506.59966468811, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:11:47] (step=0000109) Train Loss: 0.3177, Train Steps/Sec: 0.13, Epoch: 0.002118150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:11:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 110, "loss": 0.22505417466163635, "memory_gb": 7.721559524536133, "step_time_ms": 7501.842021942139, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 18:11:55] (step=0000110) Train Loss: 0.2281, Train Steps/Sec: 0.13, Epoch: 0.002137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 111, "loss": 0.24020372331142426, "memory_gb": 7.721559524536133, "step_time_ms": 7500.630140304565, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:03] (step=0000111) Train Loss: 0.2849, Train Steps/Sec: 0.13, Epoch: 0.0021570151574038087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 112, "loss": 0.22619563341140747, "memory_gb": 7.721559524536133, "step_time_ms": 7445.794582366943, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:11] (step=0000112) Train Loss: 0.2007, Train Steps/Sec: 0.13, Epoch: 0.002176447726389429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 113, "loss": 0.20409977436065674, "memory_gb": 7.721559524536133, "step_time_ms": 7458.245515823364, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:19] (step=0000113) Train Loss: 0.2307, Train Steps/Sec: 0.13, Epoch: 0.0021958802953750485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 114, "loss": 0.27569133043289185, "memory_gb": 7.721559524536133, "step_time_ms": 7501.948595046997, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:27] (step=0000114) Train Loss: 0.2557, Train Steps/Sec: 0.13, Epoch: 0.0022153128643606686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 115, "loss": 0.29605409502983093, "memory_gb": 7.721559524536133, "step_time_ms": 7426.423072814941, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:35] (step=0000115) Train Loss: 0.2500, Train Steps/Sec: 0.13, Epoch: 0.0022347454333462883, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 116, "loss": 0.31852924823760986, "memory_gb": 7.721559524536133, "step_time_ms": 7514.292240142822, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:43] (step=0000116) Train Loss: 0.2765, Train Steps/Sec: 0.12, Epoch: 0.0022541780023319084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 117, "loss": 0.20677891373634338, "memory_gb": 7.721559524536133, "step_time_ms": 7511.12174987793, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:51] (step=0000117) Train Loss: 0.1939, Train Steps/Sec: 0.12, Epoch: 0.002273610571317528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:12:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 118, "loss": 0.2160826027393341, "memory_gb": 7.721559524536133, "step_time_ms": 7337.864398956299, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:12:59] (step=0000118) Train Loss: 0.2134, Train Steps/Sec: 0.13, Epoch: 0.0022930431403031483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:13:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 119, "loss": 0.15745897591114044, "memory_gb": 7.721559524536133, "step_time_ms": 7153.786420822144, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:13:06] (step=0000119) Train Loss: 0.2218, Train Steps/Sec: 0.14, Epoch: 0.002312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:13:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 120, "loss": 0.2123843878507614, "memory_gb": 7.721559524536133, "step_time_ms": 5687.739610671997, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:13:13] (step=0000120) Train Loss: 0.2044, Train Steps/Sec: 0.16, Epoch: 0.002331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:13:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 121, "loss": 0.253734290599823, "memory_gb": 7.721559524536133, 
"step_time_ms": 7462.59331703186, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:13:21] (step=0000121) Train Loss: 0.2785, Train Steps/Sec: 0.12, Epoch: 0.0023513408472600078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:13:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 122, "loss": 0.15914906561374664, "memory_gb": 7.721559524536133, "step_time_ms": 7476.1223793029785, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:13:29] (step=0000122) Train Loss: 0.2357, Train Steps/Sec: 0.13, Epoch: 0.002370773416245628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:13:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 123, "loss": 0.31151875853538513, "memory_gb": 7.721559524536133, "step_time_ms": 7403.46360206604, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:13:37] (step=0000123) Train Loss: 0.2314, Train Steps/Sec: 0.13, Epoch: 0.0023902059852312476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:13:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 124, "loss": 0.25656914710998535, "memory_gb": 7.721559524536133, "step_time_ms": 7434.469938278198, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:13:45] (step=0000124) Train Loss: 0.2723, Train Steps/Sec: 0.12, Epoch: 0.0024096385542168677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:13:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 125, "loss": 0.30480891466140747, "memory_gb": 7.721559524536133, "step_time_ms": 7479.938507080078, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:13:53] (step=0000125) Train Loss: 0.2624, Train Steps/Sec: 0.12, Epoch: 0.0024290711232024874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 126, "loss": 0.30002784729003906, "memory_gb": 7.721559524536133, "step_time_ms": 7415.016889572144, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:01] (step=0000126) Train Loss: 0.2754, Train Steps/Sec: 0.13, Epoch: 
0.002448503692188107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 127, "loss": 0.334614634513855, "memory_gb": 7.721559524536133, "step_time_ms": 7406.353235244751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:09] (step=0000127) Train Loss: 0.2646, Train Steps/Sec: 0.13, Epoch: 0.002467936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 128, "loss": 0.2596715986728668, "memory_gb": 7.721559524536133, "step_time_ms": 7474.039554595947, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:17] (step=0000128) Train Loss: 0.2249, Train Steps/Sec: 0.12, Epoch: 0.002487368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 129, "loss": 0.12944412231445312, "memory_gb": 7.721559524536133, "step_time_ms": 7459.848165512085, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:25] (step=0000129) Train Loss: 0.1772, Train Steps/Sec: 0.12, Epoch: 0.002506801399144967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 130, "loss": 0.2256413996219635, "memory_gb": 7.721559524536133, "step_time_ms": 7530.429363250732, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:33] (step=0000130) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.0025262339681305867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 131, "loss": 0.2884518504142761, "memory_gb": 7.721559524536133, "step_time_ms": 7503.815174102783, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:41] (step=0000131) Train Loss: 0.2884, Train Steps/Sec: 0.13, Epoch: 0.002545666537116207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 132, "loss": 0.21946626901626587, "memory_gb": 
7.721559524536133, "step_time_ms": 7414.147853851318, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:49] (step=0000132) Train Loss: 0.2452, Train Steps/Sec: 0.13, Epoch: 0.0025650991061018265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:14:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 133, "loss": 0.2067601978778839, "memory_gb": 7.721559524536133, "step_time_ms": 7433.511257171631, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:14:57] (step=0000133) Train Loss: 0.1856, Train Steps/Sec: 0.12, Epoch: 0.0025845316750874466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:15:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 134, "loss": 0.16041795909404755, "memory_gb": 7.721559524536133, "step_time_ms": 7466.978073120117, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:15:05] (step=0000134) Train Loss: 0.1501, Train Steps/Sec: 0.12, Epoch: 0.0026039642440730663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:15:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 135, "loss": 0.2219078689813614, "memory_gb": 7.721559524536133, "step_time_ms": 7386.183023452759, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:15:13] (step=0000135) Train Loss: 0.2396, Train Steps/Sec: 0.13, Epoch: 0.0026233968130586864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:15:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 136, "loss": 0.1995117962360382, "memory_gb": 7.721559524536133, "step_time_ms": 7478.189468383789, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:15:21] (step=0000136) Train Loss: 0.2652, Train Steps/Sec: 0.13, Epoch: 0.002642829382044306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 137, "loss": 0.2923789620399475, "memory_gb": 7.721559524536133, "step_time_ms": 7518.797159194946, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:15:29] (step=0000137) Train Loss: 0.2478, Train Steps/Sec: 
0.12, Epoch: 0.0026622619510299262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:15:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 138, "loss": 0.38095229864120483, "memory_gb": 7.721559524536133, "step_time_ms": 7440.5152797698975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:15:37] (step=0000138) Train Loss: 0.2952, Train Steps/Sec: 0.13, Epoch: 0.002681694520015546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:15:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 139, "loss": 0.32313108444213867, "memory_gb": 7.721559524536133, "step_time_ms": 7497.230291366577, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:15:45] (step=0000139) Train Loss: 0.2974, Train Steps/Sec: 0.12, Epoch: 0.002701127089001166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:15:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 140, "loss": 0.2560212016105652, "memory_gb": 7.721559524536133, "step_time_ms": 7524.898529052734, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:15:53] (step=0000140) Train Loss: 0.2745, Train Steps/Sec: 0.13, Epoch: 0.0027205596579867857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 141, "loss": 0.24361209571361542, "memory_gb": 7.721559524536133, "step_time_ms": 7420.86935043335, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:01] (step=0000141) Train Loss: 0.2671, Train Steps/Sec: 0.13, Epoch: 0.002739992226972406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 142, "loss": 0.22363422811031342, "memory_gb": 7.721559524536133, "step_time_ms": 7450.45804977417, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:09] (step=0000142) Train Loss: 0.2224, Train Steps/Sec: 0.12, Epoch: 0.0027594247959580255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 143, "loss": 
0.2535204291343689, "memory_gb": 7.721559524536133, "step_time_ms": 7496.0503578186035, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:17] (step=0000143) Train Loss: 0.2069, Train Steps/Sec: 0.13, Epoch: 0.0027788573649436456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 144, "loss": 0.2947269082069397, "memory_gb": 7.721559524536133, "step_time_ms": 7429.870843887329, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:25] (step=0000144) Train Loss: 0.3028, Train Steps/Sec: 0.12, Epoch: 0.0027982899339292653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 145, "loss": 0.1769900619983673, "memory_gb": 7.721559524536133, "step_time_ms": 7502.13623046875, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:33] (step=0000145) Train Loss: 0.2261, Train Steps/Sec: 0.12, Epoch: 0.0028177225029148855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 146, "loss": 0.29083627462387085, "memory_gb": 7.721559524536133, "step_time_ms": 7569.283962249756, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:41] (step=0000146) Train Loss: 0.2522, Train Steps/Sec: 0.12, Epoch: 0.002837155071900505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 147, "loss": 0.21858619153499603, "memory_gb": 7.721559524536133, "step_time_ms": 7400.76756477356, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:49] (step=0000147) Train Loss: 0.2612, Train Steps/Sec: 0.13, Epoch: 0.0028565876408861253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:16:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 148, "loss": 0.27861785888671875, "memory_gb": 7.721559524536133, "step_time_ms": 7057.399034500122, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:16:56] (step=0000148) 
Train Loss: 0.1979, Train Steps/Sec: 0.14, Epoch: 0.002876020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 149, "loss": 0.28302517533302307, "memory_gb": 7.721559524536133, "step_time_ms": 6059.103727340698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:03] (step=0000149) Train Loss: 0.2085, Train Steps/Sec: 0.16, Epoch: 0.002895452778857365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 150, "loss": 0.2511742115020752, "memory_gb": 7.721559524536133, "step_time_ms": 7560.3766441345215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:11] (step=0000150) Train Loss: 0.2635, Train Steps/Sec: 0.12, Epoch: 0.0029148853478429848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 151, "loss": 0.13398173451423645, "memory_gb": 7.721559524536133, "step_time_ms": 7568.510293960571, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:19] (step=0000151) Train Loss: 0.1791, Train Steps/Sec: 0.12, Epoch: 0.002934317916828605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 152, "loss": 0.23045089840888977, "memory_gb": 7.721559524536133, "step_time_ms": 7525.261402130127, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:27] (step=0000152) Train Loss: 0.2268, Train Steps/Sec: 0.12, Epoch: 0.0029537504858142246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 153, "loss": 0.2774631977081299, "memory_gb": 7.721559524536133, "step_time_ms": 7594.46382522583, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:35] (step=0000153) Train Loss: 0.2353, Train Steps/Sec: 0.12, Epoch: 0.0029731830547998447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:43] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 154, "loss": 0.18364675343036652, "memory_gb": 7.721559524536133, "step_time_ms": 7641.345024108887, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:43] (step=0000154) Train Loss: 0.2393, Train Steps/Sec: 0.12, Epoch: 0.0029926156237854644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 155, "loss": 0.19374728202819824, "memory_gb": 7.721559524536133, "step_time_ms": 7560.33182144165, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:51] (step=0000155) Train Loss: 0.2471, Train Steps/Sec: 0.13, Epoch: 0.0030120481927710845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:17:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 156, "loss": 0.24382688105106354, "memory_gb": 7.721559524536133, "step_time_ms": 7543.5168743133545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:17:59] (step=0000156) Train Loss: 0.2238, Train Steps/Sec: 0.12, Epoch: 0.003031480761756704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 157, "loss": 0.21249136328697205, "memory_gb": 7.721559524536133, "step_time_ms": 7648.982524871826, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:18:07] (step=0000157) Train Loss: 0.2092, Train Steps/Sec: 0.12, Epoch: 0.0030509133307423243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:18:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 158, "loss": 0.3049654960632324, "memory_gb": 7.721559524536133, "step_time_ms": 7592.979192733765, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:18:15] (step=0000158) Train Loss: 0.2618, Train Steps/Sec: 0.12, Epoch: 0.003070345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:18:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 159, "loss": 0.244717076420784, "memory_gb": 7.721559524536133, "step_time_ms": 7547.5757122039795, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
18:18:23] (step=0000159) Train Loss: 0.2885, Train Steps/Sec: 0.13, Epoch: 0.003089778468713564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:18:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 160, "loss": 0.26353099942207336, "memory_gb": 7.721559524536133, "step_time_ms": 7638.192653656006, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:18:31] (step=0000160) Train Loss: 0.2674, Train Steps/Sec: 0.13, Epoch: 0.003109211037699184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 161, "loss": 0.18871240317821503, "memory_gb": 7.721559524536133, "step_time_ms": 7576.378583908081, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:18:39] (step=0000161) Train Loss: 0.1853, Train Steps/Sec: 0.13, Epoch: 0.003128643606684804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:18:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 162, "loss": 0.2720540761947632, "memory_gb": 7.721559524536133, "step_time_ms": 7601.999044418335, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:18:47] (step=0000162) Train Loss: 0.3066, Train Steps/Sec: 0.12, Epoch: 0.0031480761756704236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:18:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 163, "loss": 0.2724977433681488, "memory_gb": 7.721559524536133, "step_time_ms": 7605.636835098267, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:18:55] (step=0000163) Train Loss: 0.2769, Train Steps/Sec: 0.12, Epoch: 0.0031675087446560437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 164, "loss": 0.22719815373420715, "memory_gb": 7.721559524536133, "step_time_ms": 7471.587419509888, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:19:03] (step=0000164) Train Loss: 0.2070, Train Steps/Sec: 0.12, Epoch: 0.0031869413136416634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:11] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 165, "loss": 0.2773497700691223, "memory_gb": 7.721559524536133, "step_time_ms": 7531.444549560547, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:19:11] (step=0000165) Train Loss: 0.2521, Train Steps/Sec: 0.12, Epoch: 0.0032063738826272835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 166, "loss": 0.2929798662662506, "memory_gb": 7.721559524536133, "step_time_ms": 7553.936004638672, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:19:19] (step=0000166) Train Loss: 0.2087, Train Steps/Sec: 0.12, Epoch: 0.0032258064516129032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 167, "loss": 0.21940909326076508, "memory_gb": 7.721559524536133, "step_time_ms": 7511.222839355469, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:19:27] (step=0000167) Train Loss: 0.2268, Train Steps/Sec: 0.13, Epoch: 0.003245239020598523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 168, "loss": 0.23898138105869293, "memory_gb": 7.721559524536133, "step_time_ms": 7536.024332046509, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:19:35] (step=0000168) Train Loss: 0.2248, Train Steps/Sec: 0.12, Epoch: 0.003264671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 169, "loss": 0.28453993797302246, "memory_gb": 7.721559524536133, "step_time_ms": 7581.707239151001, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:19:43] (step=0000169) Train Loss: 0.2687, Train Steps/Sec: 0.12, Epoch: 0.0032841041585697627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 170, "loss": 0.32203376293182373, "memory_gb": 7.721559524536133, "step_time_ms": 7470.778703689575, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 18:19:51] (step=0000170) Train Loss: 0.2934, Train Steps/Sec: 0.12, Epoch: 0.003303536727555383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:19:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 171, "loss": 0.23101051151752472, "memory_gb": 7.721559524536133, "step_time_ms": 7495.282411575317, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:19:59] (step=0000171) Train Loss: 0.2116, Train Steps/Sec: 0.13, Epoch: 0.0033229692965410025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:20:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 172, "loss": 0.3484323024749756, "memory_gb": 7.721559524536133, "step_time_ms": 7589.39790725708, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:20:07] (step=0000172) Train Loss: 0.3048, Train Steps/Sec: 0.12, Epoch: 0.0033424018655266226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:20:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 173, "loss": 0.15665897727012634, "memory_gb": 7.721559524536133, "step_time_ms": 7569.251775741577, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:20:15] (step=0000173) Train Loss: 0.1521, Train Steps/Sec: 0.12, Epoch: 0.0033618344345122423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:20:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 174, "loss": 0.15185779333114624, "memory_gb": 7.721559524536133, "step_time_ms": 7458.374500274658, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:20:23] (step=0000174) Train Loss: 0.2017, Train Steps/Sec: 0.13, Epoch: 0.0033812670034978625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:20:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 175, "loss": 0.21098095178604126, "memory_gb": 7.721559524536133, "step_time_ms": 7514.371395111084, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:20:31] (step=0000175) Train Loss: 0.2010, Train Steps/Sec: 0.12, Epoch: 0.003400699572483482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 18:20:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 176, "loss": 0.26392024755477905, "memory_gb": 7.721559524536133, "step_time_ms": 7363.085985183716, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:20:39] (step=0000176) Train Loss: 0.2634, Train Steps/Sec: 0.13, Epoch: 0.0034201321414691023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:20:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 177, "loss": 0.27000030875205994, "memory_gb": 7.721559524536133, "step_time_ms": 7019.2248821258545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:20:47] (step=0000177) Train Loss: 0.2827, Train Steps/Sec: 0.14, Epoch: 0.003439564710454722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:20:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 178, "loss": 0.3325067162513733, "memory_gb": 7.721559524536133, "step_time_ms": 5557.622909545898, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:20:53] (step=0000178) Train Loss: 0.2741, Train Steps/Sec: 0.16, Epoch: 0.003458997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 179, "loss": 0.2508588135242462, "memory_gb": 7.721559524536133, "step_time_ms": 7486.722946166992, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:01] (step=0000179) Train Loss: 0.2571, Train Steps/Sec: 0.12, Epoch: 0.0034784298484259618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 180, "loss": 0.2739037275314331, "memory_gb": 7.721559524536133, "step_time_ms": 7503.4894943237305, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:09] (step=0000180) Train Loss: 0.2729, Train Steps/Sec: 0.12, Epoch: 0.003497862417411582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 181, "loss": 0.3660714030265808, "memory_gb": 7.721559524536133, "step_time_ms": 7545.406341552734, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:17] (step=0000181) Train Loss: 0.3159, Train Steps/Sec: 0.13, Epoch: 0.0035172949863972016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 182, "loss": 0.279680460691452, "memory_gb": 7.721559524536133, "step_time_ms": 7501.846790313721, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:25] (step=0000182) Train Loss: 0.2678, Train Steps/Sec: 0.12, Epoch: 0.0035367275553828217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 183, "loss": 0.1459459662437439, "memory_gb": 7.721559524536133, "step_time_ms": 7506.994009017944, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:33] (step=0000183) Train Loss: 0.2086, Train Steps/Sec: 0.12, Epoch: 0.0035561601243684414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 184, "loss": 0.31243184208869934, "memory_gb": 7.721559524536133, "step_time_ms": 7445.2502727508545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:41] (step=0000184) Train Loss: 0.3334, Train Steps/Sec: 0.12, Epoch: 0.0035755926933540615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 185, "loss": 0.16259944438934326, "memory_gb": 7.721559524536133, "step_time_ms": 7426.932573318481, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:49] (step=0000185) Train Loss: 0.1867, Train Steps/Sec: 0.12, Epoch: 0.003595025262339681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:21:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 186, "loss": 0.18030057847499847, "memory_gb": 7.721559524536133, "step_time_ms": 7468.41287612915, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:21:57] (step=0000186) Train Loss: 0.1654, Train Steps/Sec: 0.12, Epoch: 0.0036144578313253013, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:22:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 187, "loss": 0.22940494120121002, "memory_gb": 7.721559524536133, "step_time_ms": 7405.344247817993, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:22:05] (step=0000187) Train Loss: 0.1894, Train Steps/Sec: 0.12, Epoch: 0.003633890400310921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:22:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 188, "loss": 0.24684149026870728, "memory_gb": 7.721559524536133, "step_time_ms": 7479.646921157837, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:22:13] (step=0000188) Train Loss: 0.2194, Train Steps/Sec: 0.13, Epoch: 0.003653322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:22:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 189, "loss": 0.13432317972183228, "memory_gb": 7.715639114379883, "step_time_ms": 7479.673862457275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:22:21] (step=0000189) Train Loss: 0.1798, Train Steps/Sec: 0.12, Epoch: 0.003672755538282161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:22:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 190, "loss": 0.18843436241149902, "memory_gb": 7.721559524536133, "step_time_ms": 7455.130815505981, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:22:29] (step=0000190) Train Loss: 0.2655, Train Steps/Sec: 0.12, Epoch: 0.003692188107267781, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:22:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 191, "loss": 0.29568177461624146, "memory_gb": 7.721559524536133, "step_time_ms": 7414.533615112305, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:22:37] (step=0000191) Train Loss: 0.2878, Train Steps/Sec: 0.13, Epoch: 0.0037116206762534006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:22:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 192, "loss": 0.31396961212158203, "memory_gb": 7.721559524536133, 
"step_time_ms": 7472.254753112793, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:22:45] (step=0000192) Train Loss: 0.2856, Train Steps/Sec: 0.12, Epoch: 0.0037310532452390207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:22:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 193, "loss": 0.3005138337612152, "memory_gb": 7.721559524536133, "step_time_ms": 7452.540159225464, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:22:53] (step=0000193) Train Loss: 0.3044, Train Steps/Sec: 0.12, Epoch: 0.0037504858142246404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 194, "loss": 0.2996774911880493, "memory_gb": 7.721559524536133, "step_time_ms": 7422.956943511963, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:01] (step=0000194) Train Loss: 0.2416, Train Steps/Sec: 0.12, Epoch: 0.0037699183832102605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 195, "loss": 0.32089993357658386, "memory_gb": 7.721559524536133, "step_time_ms": 7465.221643447876, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:09] (step=0000195) Train Loss: 0.2646, Train Steps/Sec: 0.12, Epoch: 0.0037893509521958802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 196, "loss": 0.16797134280204773, "memory_gb": 7.721559524536133, "step_time_ms": 7352.685213088989, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:17] (step=0000196) Train Loss: 0.2139, Train Steps/Sec: 0.13, Epoch: 0.0038087835211815003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 197, "loss": 0.21125580370426178, "memory_gb": 7.721559524536133, "step_time_ms": 7407.667636871338, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:25] (step=0000197) Train Loss: 0.2471, Train Steps/Sec: 0.13, Epoch: 
0.00382821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 198, "loss": 0.19508576393127441, "memory_gb": 7.721559524536133, "step_time_ms": 7459.526062011719, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:33] (step=0000198) Train Loss: 0.1966, Train Steps/Sec: 0.12, Epoch: 0.00384764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 199, "loss": 0.3312471807003021, "memory_gb": 7.721559524536133, "step_time_ms": 7340.048551559448, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:42] (step=0000199) Train Loss: 0.2951, Train Steps/Sec: 0.13, Epoch: 0.00386708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 200, "loss": 0.25934478640556335, "memory_gb": 7.721559524536133, "step_time_ms": 7117.647647857666, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:50] (step=0000200) Train Loss: 0.2318, Train Steps/Sec: 0.13, Epoch: 0.00388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:23:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 201, "loss": 0.2084348499774933, "memory_gb": 7.721559524536133, "step_time_ms": 7407.530307769775, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:23:58] (step=0000201) Train Loss: 0.2555, Train Steps/Sec: 0.12, Epoch: 0.0039059463661095997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 202, "loss": 0.2713736295700073, "memory_gb": 7.721559524536133, "step_time_ms": 7347.2137451171875, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:06] (step=0000202) Train Loss: 0.2896, Train Steps/Sec: 0.13, Epoch: 0.003925378935095219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 203, "loss": 0.30674317479133606, "memory_gb": 
7.721559524536133, "step_time_ms": 7423.014879226685, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:14] (step=0000203) Train Loss: 0.2979, Train Steps/Sec: 0.13, Epoch: 0.00394481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 204, "loss": 0.23806416988372803, "memory_gb": 7.721559524536133, "step_time_ms": 7470.444202423096, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:22] (step=0000204) Train Loss: 0.2668, Train Steps/Sec: 0.12, Epoch: 0.00396424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 205, "loss": 0.29966121912002563, "memory_gb": 7.721559524536133, "step_time_ms": 7324.336051940918, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:29] (step=0000205) Train Loss: 0.2755, Train Steps/Sec: 0.13, Epoch: 0.003983676642052079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 206, "loss": 0.25706008076667786, "memory_gb": 7.721559524536133, "step_time_ms": 7053.058862686157, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:37] (step=0000206) Train Loss: 0.2820, Train Steps/Sec: 0.14, Epoch: 0.004003109211037699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 207, "loss": 0.32341906428337097, "memory_gb": 7.721559524536133, "step_time_ms": 5142.50111579895, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:43] (step=0000207) Train Loss: 0.3357, Train Steps/Sec: 0.17, Epoch: 0.0040225417800233195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 208, "loss": 0.214075967669487, "memory_gb": 7.721559524536133, "step_time_ms": 7584.501504898071, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:51] (step=0000208) Train Loss: 0.1870, Train Steps/Sec: 
0.12, Epoch: 0.004041974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:24:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 209, "loss": 0.257750004529953, "memory_gb": 7.721559524536133, "step_time_ms": 7523.993015289307, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:24:59] (step=0000209) Train Loss: 0.2803, Train Steps/Sec: 0.12, Epoch: 0.004061406917994559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:25:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 210, "loss": 0.20927192270755768, "memory_gb": 7.715639114379883, "step_time_ms": 7448.524475097656, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:25:07] (step=0000210) Train Loss: 0.2299, Train Steps/Sec: 0.12, Epoch: 0.004080839486980179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:25:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 211, "loss": 0.21083450317382812, "memory_gb": 7.721559524536133, "step_time_ms": 7591.042757034302, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:25:15] (step=0000211) Train Loss: 0.2179, Train Steps/Sec: 0.12, Epoch: 0.004100272055965798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:25:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 212, "loss": 0.24526508152484894, "memory_gb": 7.721559524536133, "step_time_ms": 7483.885288238525, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:25:23] (step=0000212) Train Loss: 0.2903, Train Steps/Sec: 0.13, Epoch: 0.004119704624951419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:25:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 213, "loss": 0.19488033652305603, "memory_gb": 7.721559524536133, "step_time_ms": 7420.46332359314, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:25:31] (step=0000213) Train Loss: 0.2245, Train Steps/Sec: 0.12, Epoch: 0.0041391371939370385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:25:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 214, "loss": 0.21374019980430603, 
"memory_gb": 7.721559524536133, "step_time_ms": 7510.691404342651, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:25:39] (step=0000214) Train Loss: 0.2680, Train Steps/Sec: 0.13, Epoch: 0.004158569762922658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:25:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 215, "loss": 0.19657976925373077, "memory_gb": 7.721559524536133, "step_time_ms": 7533.214330673218, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:25:47] (step=0000215) Train Loss: 0.1814, Train Steps/Sec: 0.12, Epoch: 0.004178002331908278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:25:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 216, "loss": 0.13017278909683228, "memory_gb": 7.721559524536133, "step_time_ms": 7478.142261505127, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:25:55] (step=0000216) Train Loss: 0.1483, Train Steps/Sec: 0.13, Epoch: 0.004197434900893898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 217, "loss": 0.24015820026397705, "memory_gb": 7.721559524536133, "step_time_ms": 7582.1373462677, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:03] (step=0000217) Train Loss: 0.2354, Train Steps/Sec: 0.13, Epoch: 0.004216867469879518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 218, "loss": 0.2567160725593567, "memory_gb": 7.721559524536133, "step_time_ms": 7583.71901512146, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:11] (step=0000218) Train Loss: 0.2643, Train Steps/Sec: 0.13, Epoch: 0.004236300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 219, "loss": 0.2121877372264862, "memory_gb": 7.721559524536133, "step_time_ms": 7562.146902084351, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:19] (step=0000219) Train Loss: 0.2692, Train 
Steps/Sec: 0.12, Epoch: 0.0042557326078507575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 220, "loss": 0.19980758428573608, "memory_gb": 7.721559524536133, "step_time_ms": 7581.358432769775, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:27] (step=0000220) Train Loss: 0.1599, Train Steps/Sec: 0.13, Epoch: 0.004275165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 221, "loss": 0.2591289281845093, "memory_gb": 7.721559524536133, "step_time_ms": 7619.572877883911, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:35] (step=0000221) Train Loss: 0.3104, Train Steps/Sec: 0.12, Epoch: 0.004294597745821998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 222, "loss": 0.38418424129486084, "memory_gb": 7.721559524536133, "step_time_ms": 7509.522438049316, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:43] (step=0000222) Train Loss: 0.3570, Train Steps/Sec: 0.12, Epoch: 0.004314030314807617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 223, "loss": 0.17916211485862732, "memory_gb": 7.721559524536133, "step_time_ms": 7576.998710632324, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:51] (step=0000223) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.004333462883793237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 224, "loss": 0.2143276482820511, "memory_gb": 7.721559524536133, "step_time_ms": 7615.633487701416, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:26:59] (step=0000224) Train Loss: 0.2131, Train Steps/Sec: 0.12, Epoch: 0.004352895452778858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:27:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 225, "loss": 
0.31010448932647705, "memory_gb": 7.721559524536133, "step_time_ms": 7511.8584632873535, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:27:07] (step=0000225) Train Loss: 0.3507, Train Steps/Sec: 0.13, Epoch: 0.004372328021764477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:27:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 226, "loss": 0.2524460554122925, "memory_gb": 7.721559524536133, "step_time_ms": 7576.096773147583, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:27:15] (step=0000226) Train Loss: 0.1990, Train Steps/Sec: 0.12, Epoch: 0.004391760590750097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:27:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 227, "loss": 0.19415618479251862, "memory_gb": 7.721559524536133, "step_time_ms": 7566.251277923584, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:27:23] (step=0000227) Train Loss: 0.1756, Train Steps/Sec: 0.12, Epoch: 0.004411193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 228, "loss": 0.17366084456443787, "memory_gb": 7.721559524536133, "step_time_ms": 7514.6214962005615, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:27:32] (step=0000228) Train Loss: 0.2236, Train Steps/Sec: 0.12, Epoch: 0.004430625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:27:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 229, "loss": 0.19229485094547272, "memory_gb": 7.721559524536133, "step_time_ms": 7727.422475814819, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:27:40] (step=0000229) Train Loss: 0.1652, Train Steps/Sec: 0.12, Epoch: 0.004450058297706957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:27:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 230, "loss": 0.31754565238952637, "memory_gb": 7.721559524536133, "step_time_ms": 7611.018657684326, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:27:48] (step=0000230) 
Train Loss: 0.2795, Train Steps/Sec: 0.12, Epoch: 0.004469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:27:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 231, "loss": 0.1927809715270996, "memory_gb": 7.721559524536133, "step_time_ms": 7503.695249557495, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:27:56] (step=0000231) Train Loss: 0.2984, Train Steps/Sec: 0.12, Epoch: 0.004488923435678196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 232, "loss": 0.23985764384269714, "memory_gb": 7.721559524536133, "step_time_ms": 7535.685777664185, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:04] (step=0000232) Train Loss: 0.1848, Train Steps/Sec: 0.12, Epoch: 0.004508356004663817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 233, "loss": 0.3713711202144623, "memory_gb": 7.721559524536133, "step_time_ms": 7575.428485870361, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:12] (step=0000233) Train Loss: 0.2692, Train Steps/Sec: 0.12, Epoch: 0.004527788573649437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 234, "loss": 0.19368760287761688, "memory_gb": 7.721559524536133, "step_time_ms": 7302.49810218811, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:20] (step=0000234) Train Loss: 0.2403, Train Steps/Sec: 0.13, Epoch: 0.004547221142635056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 235, "loss": 0.21496066451072693, "memory_gb": 7.721559524536133, "step_time_ms": 7329.853534698486, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:27] (step=0000235) Train Loss: 0.2513, Train Steps/Sec: 0.13, Epoch: 0.004566653711620676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:34] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 236, "loss": 0.3891306519508362, "memory_gb": 7.721559524536133, "step_time_ms": 5289.46852684021, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:34] (step=0000236) Train Loss: 0.3701, Train Steps/Sec: 0.16, Epoch: 0.0045860862806062965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 237, "loss": 0.28353363275527954, "memory_gb": 7.721559524536133, "step_time_ms": 7492.661237716675, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:42] (step=0000237) Train Loss: 0.2306, Train Steps/Sec: 0.12, Epoch: 0.004605518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 238, "loss": 0.34805697202682495, "memory_gb": 7.721559524536133, "step_time_ms": 7495.864152908325, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:50] (step=0000238) Train Loss: 0.3305, Train Steps/Sec: 0.12, Epoch: 0.004624951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:28:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 239, "loss": 0.2854536771774292, "memory_gb": 7.721559524536133, "step_time_ms": 7416.354179382324, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:28:58] (step=0000239) Train Loss: 0.2310, Train Steps/Sec: 0.13, Epoch: 0.004644383987563156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:29:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 240, "loss": 0.259648859500885, "memory_gb": 7.721559524536133, "step_time_ms": 7555.030107498169, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:29:06] (step=0000240) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.004663816556548776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:29:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 241, "loss": 0.2931545376777649, "memory_gb": 7.721559524536133, "step_time_ms": 7523.144006729126, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:29:14] 
(step=0000241) Train Loss: 0.2625, Train Steps/Sec: 0.12, Epoch: 0.004683249125534396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:29:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 242, "loss": 0.27426546812057495, "memory_gb": 7.721559524536133, "step_time_ms": 7463.950634002686, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:29:22] (step=0000242) Train Loss: 0.2582, Train Steps/Sec: 0.12, Epoch: 0.0047026816945200155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:29:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 243, "loss": 0.29878801107406616, "memory_gb": 7.721559524536133, "step_time_ms": 7535.586595535278, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:29:30] (step=0000243) Train Loss: 0.2821, Train Steps/Sec: 0.12, Epoch: 0.004722114263505635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:29:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 244, "loss": 0.18848839402198792, "memory_gb": 7.721559524536133, "step_time_ms": 7492.646932601929, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:29:38] (step=0000244) Train Loss: 0.2771, Train Steps/Sec: 0.12, Epoch: 0.004741546832491256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:29:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 245, "loss": 0.23992708325386047, "memory_gb": 7.721559524536133, "step_time_ms": 7425.487756729126, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:29:46] (step=0000245) Train Loss: 0.2734, Train Steps/Sec: 0.12, Epoch: 0.0047609794014768754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:29:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 246, "loss": 0.31378814578056335, "memory_gb": 7.715639114379883, "step_time_ms": 7459.003925323486, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:29:54] (step=0000246) Train Loss: 0.2585, Train Steps/Sec: 0.12, Epoch: 0.004780411970462495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:02] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 247, "loss": 0.2352513074874878, "memory_gb": 7.721559524536133, "step_time_ms": 7428.694009780884, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:30:02] (step=0000247) Train Loss: 0.2414, Train Steps/Sec: 0.13, Epoch: 0.004799844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 248, "loss": 0.1456538885831833, "memory_gb": 7.721559524536133, "step_time_ms": 7403.130292892456, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:30:10] (step=0000248) Train Loss: 0.1941, Train Steps/Sec: 0.13, Epoch: 0.004819277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 249, "loss": 0.16460812091827393, "memory_gb": 7.721559524536133, "step_time_ms": 7494.884967803955, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:30:18] (step=0000249) Train Loss: 0.2326, Train Steps/Sec: 0.12, Epoch: 0.004838709677419355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 250, "loss": 0.24289484322071075, "memory_gb": 7.721559524536133, "step_time_ms": 7492.387056350708, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:30:26] (step=0000250) Train Loss: 0.2766, Train Steps/Sec: 0.12, Epoch: 0.004858142246404975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 251, "loss": 0.25320151448249817, "memory_gb": 7.721559524536133, "step_time_ms": 7410.568952560425, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:30:34] (step=0000251) Train Loss: 0.2919, Train Steps/Sec: 0.12, Epoch: 0.004877574815390594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 252, "loss": 0.2310183048248291, "memory_gb": 7.721559524536133, "step_time_ms": 7471.102952957153, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 18:30:42] (step=0000252) Train Loss: 0.2570, Train Steps/Sec: 0.12, Epoch: 0.004897007384376214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 253, "loss": 0.28664520382881165, "memory_gb": 7.721559524536133, "step_time_ms": 7444.316625595093, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:30:50] (step=0000253) Train Loss: 0.2949, Train Steps/Sec: 0.12, Epoch: 0.004916439953361835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:30:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 254, "loss": 0.22948944568634033, "memory_gb": 7.721559524536133, "step_time_ms": 7380.281686782837, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:30:58] (step=0000254) Train Loss: 0.1809, Train Steps/Sec: 0.12, Epoch: 0.004935872522347454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:31:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 255, "loss": 0.23081433773040771, "memory_gb": 7.721559524536133, "step_time_ms": 7507.2691440582275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:31:06] (step=0000255) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.004955305091333074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:31:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 256, "loss": 0.31919336318969727, "memory_gb": 7.721559524536133, "step_time_ms": 7532.510995864868, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:31:15] (step=0000256) Train Loss: 0.2923, Train Steps/Sec: 0.12, Epoch: 0.004974737660318694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:31:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 257, "loss": 0.31050431728363037, "memory_gb": 7.721559524536133, "step_time_ms": 7450.531721115112, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:31:23] (step=0000257) Train Loss: 0.2959, Train Steps/Sec: 0.12, Epoch: 0.004994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 18:31:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 258, "loss": 0.3572092652320862, "memory_gb": 7.721559524536133, "step_time_ms": 7543.722152709961, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:31:31] (step=0000258) Train Loss: 0.3086, Train Steps/Sec: 0.12, Epoch: 0.005013602798289934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:31:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 259, "loss": 0.24435395002365112, "memory_gb": 7.721559524536133, "step_time_ms": 7508.730888366699, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:31:39] (step=0000259) Train Loss: 0.2854, Train Steps/Sec: 0.12, Epoch: 0.005033035367275554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:31:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 260, "loss": 0.24795226752758026, "memory_gb": 7.721559524536133, "step_time_ms": 7471.405744552612, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:31:47] (step=0000260) Train Loss: 0.2366, Train Steps/Sec: 0.12, Epoch: 0.005052467936261173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:31:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 261, "loss": 0.2625769376754761, "memory_gb": 7.721559524536133, "step_time_ms": 7466.017961502075, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:31:55] (step=0000261) Train Loss: 0.2740, Train Steps/Sec: 0.12, Epoch: 0.005071900505246794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:32:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 262, "loss": 0.23571382462978363, "memory_gb": 7.721559524536133, "step_time_ms": 7477.9651165008545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:03] (step=0000262) Train Loss: 0.2708, Train Steps/Sec: 0.12, Epoch: 0.005091333074232414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:32:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 263, "loss": 0.1796359121799469, "memory_gb": 7.721559524536133, "step_time_ms": 7265.61427116394, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:11] (step=0000263) Train Loss: 0.1856, Train Steps/Sec: 0.13, Epoch: 0.005110765643218033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:32:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 264, "loss": 0.29892969131469727, "memory_gb": 7.721559524536133, "step_time_ms": 7320.825099945068, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:19] (step=0000264) Train Loss: 0.2772, Train Steps/Sec: 0.13, Epoch: 0.005130198212203653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:32:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 265, "loss": 0.26724785566329956, "memory_gb": 7.721559524536133, "step_time_ms": 5440.70839881897, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:25] (step=0000265) Train Loss: 0.2188, Train Steps/Sec: 0.17, Epoch: 0.0051496307811892735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:32:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 266, "loss": 0.3075309991836548, "memory_gb": 7.721559524536133, "step_time_ms": 7478.497743606567, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:33] (step=0000266) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.005169063350174893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:32:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 267, "loss": 0.2911860942840576, "memory_gb": 7.721559524536133, "step_time_ms": 7506.167650222778, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:41] (step=0000267) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.005188495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:32:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 268, "loss": 0.27752184867858887, "memory_gb": 7.721559524536133, "step_time_ms": 7412.087678909302, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:49] (step=0000268) Train Loss: 0.3166, Train Steps/Sec: 0.12, Epoch: 0.005207928488146133, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 18:32:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 269, "loss": 0.21961642801761627, "memory_gb": 7.721559524536133, "step_time_ms": 7601.238012313843, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:32:57] (step=0000269) Train Loss: 0.2095, Train Steps/Sec: 0.12, Epoch: 0.005227361057131753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:33:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 270, "loss": 0.29552143812179565, "memory_gb": 7.721559524536133, "step_time_ms": 7511.636018753052, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:33:05] (step=0000270) Train Loss: 0.2717, Train Steps/Sec: 0.12, Epoch: 0.005246793626117373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:33:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 271, "loss": 0.2819455564022064, "memory_gb": 7.721559524536133, "step_time_ms": 7409.59906578064, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:33:13] (step=0000271) Train Loss: 0.2929, Train Steps/Sec: 0.12, Epoch: 0.0052662261951029925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:33:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 272, "loss": 0.18866167962551117, "memory_gb": 7.721559524536133, "step_time_ms": 7540.755271911621, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:33:21] (step=0000272) Train Loss: 0.2048, Train Steps/Sec: 0.12, Epoch: 0.005285658764088612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:33:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 273, "loss": 0.27967363595962524, "memory_gb": 7.721559524536133, "step_time_ms": 7536.653518676758, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:33:29] (step=0000273) Train Loss: 0.2590, Train Steps/Sec: 0.12, Epoch: 0.005305091333074233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:33:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 274, "loss": 0.3076661229133606, "memory_gb": 7.721559524536133, "step_time_ms": 
7439.8157596588135, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:33:37] (step=0000274) Train Loss: 0.3201, Train Steps/Sec: 0.12, Epoch: 0.0053245239020598524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:33:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 275, "loss": 0.13568688929080963, "memory_gb": 7.721559524536133, "step_time_ms": 7481.794834136963, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:33:45] (step=0000275) Train Loss: 0.1775, Train Steps/Sec: 0.12, Epoch: 0.005343956471045472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:33:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 276, "loss": 0.2547053098678589, "memory_gb": 7.721559524536133, "step_time_ms": 7523.215055465698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:33:53] (step=0000276) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.005363389040031092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 277, "loss": 0.16513322293758392, "memory_gb": 7.721559524536133, "step_time_ms": 7263.602495193481, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:01] (step=0000277) Train Loss: 0.2040, Train Steps/Sec: 0.13, Epoch: 0.005382821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 278, "loss": 0.22461427748203278, "memory_gb": 7.721559524536133, "step_time_ms": 7523.798704147339, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:09] (step=0000278) Train Loss: 0.2668, Train Steps/Sec: 0.12, Epoch: 0.005402254178002332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 279, "loss": 0.3388440012931824, "memory_gb": 7.721559524536133, "step_time_ms": 7502.784252166748, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:17] (step=0000279) Train Loss: 0.2513, Train Steps/Sec: 0.12, Epoch: 0.005421686746987952, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 280, "loss": 0.22811394929885864, "memory_gb": 7.721559524536133, "step_time_ms": 7437.118291854858, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:25] (step=0000280) Train Loss: 0.2008, Train Steps/Sec: 0.12, Epoch: 0.005441119315973571, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 281, "loss": 0.19141297042369843, "memory_gb": 7.721559524536133, "step_time_ms": 7541.804552078247, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:33] (step=0000281) Train Loss: 0.2735, Train Steps/Sec: 0.12, Epoch: 0.005460551884959192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 282, "loss": 0.189959317445755, "memory_gb": 7.721559524536133, "step_time_ms": 7509.800434112549, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:41] (step=0000282) Train Loss: 0.2578, Train Steps/Sec: 0.12, Epoch: 0.005479984453944812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 283, "loss": 0.2528935372829437, "memory_gb": 7.721559524536133, "step_time_ms": 7485.831260681152, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:49] (step=0000283) Train Loss: 0.2013, Train Steps/Sec: 0.13, Epoch: 0.005499417022930431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:34:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 284, "loss": 0.3194204866886139, "memory_gb": 7.721559524536133, "step_time_ms": 7541.866064071655, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:34:57] (step=0000284) Train Loss: 0.2695, Train Steps/Sec: 0.12, Epoch: 0.005518849591916051, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:35:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 285, "loss": 0.31824782490730286, "memory_gb": 7.721559524536133, 
"step_time_ms": 7523.070335388184, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:35:06] (step=0000285) Train Loss: 0.2929, Train Steps/Sec: 0.12, Epoch: 0.005538282160901672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:35:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 286, "loss": 0.19953018426895142, "memory_gb": 7.721559524536133, "step_time_ms": 7485.6531620025635, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:35:14] (step=0000286) Train Loss: 0.2315, Train Steps/Sec: 0.12, Epoch: 0.005557714729887291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:35:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 287, "loss": 0.30819132924079895, "memory_gb": 7.721559524536133, "step_time_ms": 7633.932828903198, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:35:22] (step=0000287) Train Loss: 0.3285, Train Steps/Sec: 0.12, Epoch: 0.005577147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:35:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 288, "loss": 0.23057794570922852, "memory_gb": 7.721559524536133, "step_time_ms": 7482.697010040283, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:35:30] (step=0000288) Train Loss: 0.2961, Train Steps/Sec: 0.12, Epoch: 0.005596579867858531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:35:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 289, "loss": 0.15834689140319824, "memory_gb": 7.721559524536133, "step_time_ms": 7513.111352920532, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:35:38] (step=0000289) Train Loss: 0.2382, Train Steps/Sec: 0.12, Epoch: 0.005616012436844151, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:35:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 290, "loss": 0.20782221853733063, "memory_gb": 7.721559524536133, "step_time_ms": 7604.735851287842, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:35:46] (step=0000290) Train Loss: 0.2272, Train Steps/Sec: 0.12, Epoch: 
0.005635445005829771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:35:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 291, "loss": 0.20293235778808594, "memory_gb": 7.721559524536133, "step_time_ms": 7587.515592575073, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:35:54] (step=0000291) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.005654877574815391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 292, "loss": 0.18861395120620728, "memory_gb": 7.721559524536133, "step_time_ms": 7423.570156097412, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:02] (step=0000292) Train Loss: 0.2145, Train Steps/Sec: 0.13, Epoch: 0.00567431014380101, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 293, "loss": 0.22704385221004486, "memory_gb": 7.721559524536133, "step_time_ms": 7607.651233673096, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:10] (step=0000293) Train Loss: 0.2014, Train Steps/Sec: 0.12, Epoch: 0.00569374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 294, "loss": 0.29377424716949463, "memory_gb": 7.721559524536133, "step_time_ms": 5414.446592330933, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:16] (step=0000294) Train Loss: 0.2844, Train Steps/Sec: 0.17, Epoch: 0.0057131752817722505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 295, "loss": 0.26224443316459656, "memory_gb": 7.721559524536133, "step_time_ms": 7595.044851303101, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:24] (step=0000295) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.00573260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 296, "loss": 0.2655065655708313, "memory_gb": 
7.721559524536133, "step_time_ms": 7515.451431274414, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:32] (step=0000296) Train Loss: 0.2273, Train Steps/Sec: 0.12, Epoch: 0.00575204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 297, "loss": 0.3358769416809082, "memory_gb": 7.721559524536133, "step_time_ms": 7553.1275272369385, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:40] (step=0000297) Train Loss: 0.3157, Train Steps/Sec: 0.13, Epoch: 0.00577147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 298, "loss": 0.2552902102470398, "memory_gb": 7.721559524536133, "step_time_ms": 7631.675720214844, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:48] (step=0000298) Train Loss: 0.2475, Train Steps/Sec: 0.12, Epoch: 0.00579090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:36:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 299, "loss": 0.17448721826076508, "memory_gb": 7.721559524536133, "step_time_ms": 7541.18013381958, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:36:56] (step=0000299) Train Loss: 0.1963, Train Steps/Sec: 0.12, Epoch: 0.00581033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:37:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 300, "loss": 0.2779313027858734, "memory_gb": 7.721559524536133, "step_time_ms": 7504.522085189819, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:37:04] (step=0000300) Train Loss: 0.2437, Train Steps/Sec: 0.13, Epoch: 0.0058297706956859695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:37:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 301, "loss": 0.32365816831588745, "memory_gb": 7.721559524536133, "step_time_ms": 7569.597244262695, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:37:12] (step=0000301) Train Loss: 0.3023, Train Steps/Sec: 0.12, 
Epoch: 0.005849203264671589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:37:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 302, "loss": 0.18670523166656494, "memory_gb": 7.721559524536133, "step_time_ms": 7554.02684211731, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:37:20] (step=0000302) Train Loss: 0.1981, Train Steps/Sec: 0.12, Epoch: 0.00586863583365721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:37:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 303, "loss": 0.34672266244888306, "memory_gb": 7.721559524536133, "step_time_ms": 7517.76123046875, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:37:28] (step=0000303) Train Loss: 0.3409, Train Steps/Sec: 0.12, Epoch: 0.0058880684026428294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:37:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 304, "loss": 0.21779799461364746, "memory_gb": 7.721559524536133, "step_time_ms": 7600.051164627075, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:37:36] (step=0000304) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.005907500971628449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:37:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 305, "loss": 0.23786640167236328, "memory_gb": 7.721559524536133, "step_time_ms": 7600.2044677734375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:37:44] (step=0000305) Train Loss: 0.1988, Train Steps/Sec: 0.12, Epoch: 0.005926933540614069, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:37:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 306, "loss": 0.20849069952964783, "memory_gb": 7.721559524536133, "step_time_ms": 7502.565860748291, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:37:52] (step=0000306) Train Loss: 0.2797, Train Steps/Sec: 0.12, Epoch: 0.005946366109599689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 307, "loss": 0.32468992471694946, 
"memory_gb": 7.721559524536133, "step_time_ms": 7582.376956939697, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:00] (step=0000307) Train Loss: 0.2835, Train Steps/Sec: 0.12, Epoch: 0.005965798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 308, "loss": 0.20603561401367188, "memory_gb": 7.721559524536133, "step_time_ms": 7519.443988800049, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:08] (step=0000308) Train Loss: 0.2718, Train Steps/Sec: 0.12, Epoch: 0.005985231247570929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 309, "loss": 0.30338382720947266, "memory_gb": 7.721559524536133, "step_time_ms": 7532.89794921875, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:17] (step=0000309) Train Loss: 0.2960, Train Steps/Sec: 0.12, Epoch: 0.006004663816556548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 310, "loss": 0.1700131893157959, "memory_gb": 7.721559524536133, "step_time_ms": 7565.895318984985, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:25] (step=0000310) Train Loss: 0.2261, Train Steps/Sec: 0.12, Epoch: 0.006024096385542169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 311, "loss": 0.30127060413360596, "memory_gb": 7.721559524536133, "step_time_ms": 7498.976945877075, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:33] (step=0000311) Train Loss: 0.3156, Train Steps/Sec: 0.12, Epoch: 0.006043528954527789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 312, "loss": 0.28670719265937805, "memory_gb": 7.721559524536133, "step_time_ms": 7481.837749481201, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:41] (step=0000312) Train Loss: 0.3013, Train 
Steps/Sec: 0.13, Epoch: 0.006062961523513408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 313, "loss": 0.23672428727149963, "memory_gb": 7.721559524536133, "step_time_ms": 7557.906866073608, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:49] (step=0000313) Train Loss: 0.2087, Train Steps/Sec: 0.12, Epoch: 0.006082394092499028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:38:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 314, "loss": 0.24990889430046082, "memory_gb": 7.721559524536133, "step_time_ms": 7484.30323600769, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:38:57] (step=0000314) Train Loss: 0.2324, Train Steps/Sec: 0.12, Epoch: 0.006101826661484649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:39:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 315, "loss": 0.20452383160591125, "memory_gb": 7.721559524536133, "step_time_ms": 7419.694900512695, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:39:05] (step=0000315) Train Loss: 0.2934, Train Steps/Sec: 0.12, Epoch: 0.006121259230470268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:39:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 316, "loss": 0.23966923356056213, "memory_gb": 7.721559524536133, "step_time_ms": 7615.578174591064, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:39:13] (step=0000316) Train Loss: 0.2501, Train Steps/Sec: 0.12, Epoch: 0.006140691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 317, "loss": 0.276409387588501, "memory_gb": 7.721559524536133, "step_time_ms": 7453.920125961304, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:39:21] (step=0000317) Train Loss: 0.2950, Train Steps/Sec: 0.12, Epoch: 0.006160124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:39:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 318, "loss": 
0.3211938440799713, "memory_gb": 7.721559524536133, "step_time_ms": 7408.507585525513, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:39:29] (step=0000318) Train Loss: 0.2309, Train Steps/Sec: 0.13, Epoch: 0.006179556937427128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:39:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 319, "loss": 0.2512177526950836, "memory_gb": 7.721559524536133, "step_time_ms": 7474.430799484253, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:39:37] (step=0000319) Train Loss: 0.2295, Train Steps/Sec: 0.12, Epoch: 0.006198989506412748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:39:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 320, "loss": 0.35418105125427246, "memory_gb": 7.721559524536133, "step_time_ms": 7489.772319793701, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:39:45] (step=0000320) Train Loss: 0.2923, Train Steps/Sec: 0.12, Epoch: 0.006218422075398368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:39:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 321, "loss": 0.2308025062084198, "memory_gb": 7.721559524536133, "step_time_ms": 7335.321426391602, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:39:53] (step=0000321) Train Loss: 0.2108, Train Steps/Sec: 0.13, Epoch: 0.006237854644383987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 322, "loss": 0.2344765067100525, "memory_gb": 7.721559524536133, "step_time_ms": 7512.683391571045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:01] (step=0000322) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.006257287213369608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 323, "loss": 0.37764298915863037, "memory_gb": 7.721559524536133, "step_time_ms": 5112.304449081421, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:07] (step=0000323) Train 
Loss: 0.2944, Train Steps/Sec: 0.17, Epoch: 0.0062767197823552275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 324, "loss": 0.2109108567237854, "memory_gb": 7.721559524536133, "step_time_ms": 7480.7374477386475, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:15] (step=0000324) Train Loss: 0.1979, Train Steps/Sec: 0.12, Epoch: 0.006296152351340847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 325, "loss": 0.17927458882331848, "memory_gb": 7.721559524536133, "step_time_ms": 7486.490964889526, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:23] (step=0000325) Train Loss: 0.1960, Train Steps/Sec: 0.12, Epoch: 0.006315584920326467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 326, "loss": 0.3298383951187134, "memory_gb": 7.721559524536133, "step_time_ms": 7379.058837890625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:31] (step=0000326) Train Loss: 0.2940, Train Steps/Sec: 0.12, Epoch: 0.0063350174893120875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 327, "loss": 0.2140336036682129, "memory_gb": 7.721559524536133, "step_time_ms": 7454.744100570679, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:39] (step=0000327) Train Loss: 0.2282, Train Steps/Sec: 0.12, Epoch: 0.006354450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 328, "loss": 0.29277169704437256, "memory_gb": 7.721559524536133, "step_time_ms": 7460.967063903809, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:47] (step=0000328) Train Loss: 0.3036, Train Steps/Sec: 0.12, Epoch: 0.006373882627283327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:40:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 
329, "loss": 0.2490919828414917, "memory_gb": 7.721559524536133, "step_time_ms": 7377.598285675049, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:40:55] (step=0000329) Train Loss: 0.2456, Train Steps/Sec: 0.13, Epoch: 0.0063933151962689465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:41:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 330, "loss": 0.18588782846927643, "memory_gb": 7.721559524536133, "step_time_ms": 7489.687204360962, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:41:03] (step=0000330) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.006412747765254567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:41:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 331, "loss": 0.29974639415740967, "memory_gb": 7.721559524536133, "step_time_ms": 7458.8305950164795, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:41:11] (step=0000331) Train Loss: 0.2555, Train Steps/Sec: 0.12, Epoch: 0.006432180334240187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:41:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 332, "loss": 0.19464728236198425, "memory_gb": 7.721559524536133, "step_time_ms": 7417.253494262695, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:41:19] (step=0000332) Train Loss: 0.1984, Train Steps/Sec: 0.12, Epoch: 0.0064516129032258064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:41:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 333, "loss": 0.2281622290611267, "memory_gb": 7.721559524536133, "step_time_ms": 7522.142648696899, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:41:28] (step=0000333) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.006471045472211426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:41:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 334, "loss": 0.2809748649597168, "memory_gb": 7.721559524536133, "step_time_ms": 7515.089988708496, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:41:36] 
(step=0000334) Train Loss: 0.2551, Train Steps/Sec: 0.12, Epoch: 0.006490478041197046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:41:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 335, "loss": 0.2272336781024933, "memory_gb": 7.721559524536133, "step_time_ms": 7435.158729553223, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:41:44] (step=0000335) Train Loss: 0.2646, Train Steps/Sec: 0.13, Epoch: 0.006509910610182666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:41:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 336, "loss": 0.2687681317329407, "memory_gb": 7.721559524536133, "step_time_ms": 7527.016639709473, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:41:52] (step=0000336) Train Loss: 0.2936, Train Steps/Sec: 0.12, Epoch: 0.006529343179168286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 337, "loss": 0.140386164188385, "memory_gb": 7.721559524536133, "step_time_ms": 7517.503976821899, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:00] (step=0000337) Train Loss: 0.1971, Train Steps/Sec: 0.12, Epoch: 0.006548775748153906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 338, "loss": 0.2171148657798767, "memory_gb": 7.721559524536133, "step_time_ms": 7411.646366119385, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:08] (step=0000338) Train Loss: 0.2625, Train Steps/Sec: 0.13, Epoch: 0.0065682083171395254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 339, "loss": 0.3346259295940399, "memory_gb": 7.715639114379883, "step_time_ms": 7466.383695602417, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:16] (step=0000339) Train Loss: 0.2945, Train Steps/Sec: 0.12, Epoch: 0.006587640886125146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:24] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 340, "loss": 0.20579886436462402, "memory_gb": 7.721559524536133, "step_time_ms": 7467.664003372192, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:24] (step=0000340) Train Loss: 0.2319, Train Steps/Sec: 0.12, Epoch: 0.006607073455110766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 341, "loss": 0.31596639752388, "memory_gb": 7.721559524536133, "step_time_ms": 7382.325172424316, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:32] (step=0000341) Train Loss: 0.2736, Train Steps/Sec: 0.12, Epoch: 0.006626506024096385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 342, "loss": 0.3167136013507843, "memory_gb": 7.721559524536133, "step_time_ms": 7499.017953872681, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:40] (step=0000342) Train Loss: 0.3165, Train Steps/Sec: 0.12, Epoch: 0.006645938593082005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 343, "loss": 0.17439007759094238, "memory_gb": 7.721559524536133, "step_time_ms": 7489.509582519531, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:48] (step=0000343) Train Loss: 0.1800, Train Steps/Sec: 0.12, Epoch: 0.006665371162067626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:42:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 344, "loss": 0.2089369297027588, "memory_gb": 7.721559524536133, "step_time_ms": 7213.444471359253, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:42:56] (step=0000344) Train Loss: 0.2257, Train Steps/Sec: 0.13, Epoch: 0.006684803731053245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 345, "loss": 0.17773567140102386, "memory_gb": 7.721559524536133, "step_time_ms": 7488.64221572876, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
18:43:04] (step=0000345) Train Loss: 0.2132, Train Steps/Sec: 0.12, Epoch: 0.006704236300038865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 346, "loss": 0.24952803552150726, "memory_gb": 7.721559524536133, "step_time_ms": 7484.609127044678, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:43:12] (step=0000346) Train Loss: 0.2645, Train Steps/Sec: 0.12, Epoch: 0.006723668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 347, "loss": 0.22402644157409668, "memory_gb": 7.721559524536133, "step_time_ms": 7440.904378890991, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:43:20] (step=0000347) Train Loss: 0.2493, Train Steps/Sec: 0.12, Epoch: 0.006743101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 348, "loss": 0.16899648308753967, "memory_gb": 7.721559524536133, "step_time_ms": 7571.125030517578, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:43:28] (step=0000348) Train Loss: 0.2296, Train Steps/Sec: 0.12, Epoch: 0.006762534006995725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 349, "loss": 0.2757059335708618, "memory_gb": 7.721559524536133, "step_time_ms": 7516.941070556641, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:43:36] (step=0000349) Train Loss: 0.2774, Train Steps/Sec: 0.12, Epoch: 0.006781966575981345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 350, "loss": 0.15582512319087982, "memory_gb": 7.721559524536133, "step_time_ms": 7291.112422943115, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:43:44] (step=0000350) Train Loss: 0.1912, Train Steps/Sec: 0.13, Epoch: 0.006801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:52] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 351, "loss": 0.27824345231056213, "memory_gb": 7.721559524536133, "step_time_ms": 7492.014408111572, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:43:52] (step=0000351) Train Loss: 0.2465, Train Steps/Sec: 0.13, Epoch: 0.006820831713952585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:43:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 352, "loss": 0.21148496866226196, "memory_gb": 7.721559524536133, "step_time_ms": 5142.178535461426, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:43:58] (step=0000352) Train Loss: 0.2964, Train Steps/Sec: 0.16, Epoch: 0.0068402642829382045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:44:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 353, "loss": 0.1720501184463501, "memory_gb": 7.721559524536133, "step_time_ms": 7504.746913909912, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:44:06] (step=0000353) Train Loss: 0.1545, Train Steps/Sec: 0.12, Epoch: 0.006859696851923824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:44:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 354, "loss": 0.4255039691925049, "memory_gb": 7.721559524536133, "step_time_ms": 7515.903472900391, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:44:14] (step=0000354) Train Loss: 0.3342, Train Steps/Sec: 0.12, Epoch: 0.006879129420909444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:44:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 355, "loss": 0.1797764003276825, "memory_gb": 7.721559524536133, "step_time_ms": 7432.788133621216, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:44:22] (step=0000355) Train Loss: 0.2115, Train Steps/Sec: 0.12, Epoch: 0.0068985619898950645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:44:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 356, "loss": 0.22685471177101135, "memory_gb": 7.721559524536133, "step_time_ms": 7514.887094497681, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 18:44:31] (step=0000356) Train Loss: 0.2754, Train Steps/Sec: 0.12, Epoch: 0.006917994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:44:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 357, "loss": 0.17441290616989136, "memory_gb": 7.721559524536133, "step_time_ms": 7644.748687744141, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:44:39] (step=0000357) Train Loss: 0.2799, Train Steps/Sec: 0.12, Epoch: 0.006937427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:44:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 358, "loss": 0.2017994374036789, "memory_gb": 7.721559524536133, "step_time_ms": 7402.973413467407, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:44:47] (step=0000358) Train Loss: 0.2585, Train Steps/Sec: 0.12, Epoch: 0.0069568596968519235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:44:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 359, "loss": 0.23891395330429077, "memory_gb": 7.721559524536133, "step_time_ms": 7513.797283172607, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:44:55] (step=0000359) Train Loss: 0.2597, Train Steps/Sec: 0.12, Epoch: 0.006976292265837544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:45:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 360, "loss": 0.20331326127052307, "memory_gb": 7.721559524536133, "step_time_ms": 7501.118421554565, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:03] (step=0000360) Train Loss: 0.1913, Train Steps/Sec: 0.12, Epoch: 0.006995724834823164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 361, "loss": 0.28068631887435913, "memory_gb": 7.721559524536133, "step_time_ms": 7432.368040084839, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:11] (step=0000361) Train Loss: 0.2355, Train Steps/Sec: 0.12, Epoch: 0.0070151574038087834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 18:45:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 362, "loss": 0.22166745364665985, "memory_gb": 7.715639114379883, "step_time_ms": 7455.259561538696, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:19] (step=0000362) Train Loss: 0.2494, Train Steps/Sec: 0.12, Epoch: 0.007034589972794403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:45:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 363, "loss": 0.23568867146968842, "memory_gb": 7.721559524536133, "step_time_ms": 7493.343114852905, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:27] (step=0000363) Train Loss: 0.2769, Train Steps/Sec: 0.12, Epoch: 0.007054022541780024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:45:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 364, "loss": 0.2179080694913864, "memory_gb": 7.721559524536133, "step_time_ms": 7436.140060424805, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:35] (step=0000364) Train Loss: 0.1987, Train Steps/Sec: 0.13, Epoch: 0.007073455110765643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:45:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 365, "loss": 0.24246273934841156, "memory_gb": 7.721559524536133, "step_time_ms": 7512.129783630371, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:43] (step=0000365) Train Loss: 0.2557, Train Steps/Sec: 0.12, Epoch: 0.007092887679751263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:45:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 366, "loss": 0.274760365486145, "memory_gb": 7.721559524536133, "step_time_ms": 7519.252777099609, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:51] (step=0000366) Train Loss: 0.2180, Train Steps/Sec: 0.12, Epoch: 0.007112320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:45:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 367, "loss": 0.18715475499629974, "memory_gb": 7.721559524536133, "step_time_ms": 7464.917421340942, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 18:45:59] (step=0000367) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.007131752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:46:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 368, "loss": 0.25439611077308655, "memory_gb": 7.721559524536133, "step_time_ms": 7583.334445953369, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:46:07] (step=0000368) Train Loss: 0.2469, Train Steps/Sec: 0.12, Epoch: 0.007151185386708123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:46:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 369, "loss": 0.16394659876823425, "memory_gb": 7.721559524536133, "step_time_ms": 7528.301239013672, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:46:15] (step=0000369) Train Loss: 0.2159, Train Steps/Sec: 0.13, Epoch: 0.007170617955693743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:46:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 370, "loss": 0.21453019976615906, "memory_gb": 7.721559524536133, "step_time_ms": 7491.380214691162, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:46:23] (step=0000370) Train Loss: 0.2177, Train Steps/Sec: 0.12, Epoch: 0.007190050524679362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:46:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 371, "loss": 0.25486093759536743, "memory_gb": 7.721559524536133, "step_time_ms": 7560.801267623901, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:46:31] (step=0000371) Train Loss: 0.2724, Train Steps/Sec: 0.12, Epoch: 0.007209483093664983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:46:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 372, "loss": 0.26084062457084656, "memory_gb": 7.721559524536133, "step_time_ms": 7560.680389404297, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:46:39] (step=0000372) Train Loss: 0.1795, Train Steps/Sec: 0.12, Epoch: 0.007228915662650603, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 18:46:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 373, "loss": 0.24531406164169312, "memory_gb": 7.721559524536133, "step_time_ms": 7512.101888656616, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:46:47] (step=0000373) Train Loss: 0.2900, Train Steps/Sec: 0.13, Epoch: 0.007248348231636222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:46:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 374, "loss": 0.2529473304748535, "memory_gb": 7.721559524536133, "step_time_ms": 7601.446866989136, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:46:55] (step=0000374) Train Loss: 0.2114, Train Steps/Sec: 0.12, Epoch: 0.007267780800621842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 375, "loss": 0.2525833249092102, "memory_gb": 7.721559524536133, "step_time_ms": 7593.69421005249, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:03] (step=0000375) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.007287213369607462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 376, "loss": 0.2734355330467224, "memory_gb": 7.721559524536133, "step_time_ms": 7560.835838317871, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:11] (step=0000376) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.007306645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 377, "loss": 0.22302129864692688, "memory_gb": 7.721559524536133, "step_time_ms": 7567.718982696533, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:19] (step=0000377) Train Loss: 0.2747, Train Steps/Sec: 0.12, Epoch: 0.007326078507578702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 378, "loss": 0.2569849491119385, "memory_gb": 7.721559524536133, "step_time_ms": 
7535.357236862183, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:27] (step=0000378) Train Loss: 0.2394, Train Steps/Sec: 0.12, Epoch: 0.007345511076564322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 379, "loss": 0.23723752796649933, "memory_gb": 7.721559524536133, "step_time_ms": 7445.017099380493, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:35] (step=0000379) Train Loss: 0.2472, Train Steps/Sec: 0.13, Epoch: 0.007364943645549941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 380, "loss": 0.16561445593833923, "memory_gb": 7.721559524536133, "step_time_ms": 7581.670045852661, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:43] (step=0000380) Train Loss: 0.2372, Train Steps/Sec: 0.13, Epoch: 0.007384376214535562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 381, "loss": 0.14560964703559875, "memory_gb": 7.721559524536133, "step_time_ms": 5149.480104446411, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:49] (step=0000381) Train Loss: 0.2070, Train Steps/Sec: 0.16, Epoch: 0.0074038087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:47:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 382, "loss": 0.17755916714668274, "memory_gb": 7.721559524536133, "step_time_ms": 7554.224729537964, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:47:58] (step=0000382) Train Loss: 0.2060, Train Steps/Sec: 0.12, Epoch: 0.007423241352506801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:48:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 383, "loss": 0.24669131636619568, "memory_gb": 7.721559524536133, "step_time_ms": 7490.614891052246, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:48:06] (step=0000383) Train Loss: 0.2689, Train Steps/Sec: 0.12, Epoch: 
0.007442673921492421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:48:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 384, "loss": 0.1567726582288742, "memory_gb": 7.721559524536133, "step_time_ms": 7442.136287689209, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:48:14] (step=0000384) Train Loss: 0.1408, Train Steps/Sec: 0.12, Epoch: 0.0074621064904780415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:48:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 385, "loss": 0.1922139823436737, "memory_gb": 7.721559524536133, "step_time_ms": 7516.714334487915, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:48:22] (step=0000385) Train Loss: 0.2550, Train Steps/Sec: 0.12, Epoch: 0.007481539059463661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:48:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 386, "loss": 0.14625027775764465, "memory_gb": 7.721559524536133, "step_time_ms": 7480.297803878784, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:48:30] (step=0000386) Train Loss: 0.2048, Train Steps/Sec: 0.12, Epoch: 0.007500971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:48:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 387, "loss": 0.3221167325973511, "memory_gb": 7.721559524536133, "step_time_ms": 7444.344282150269, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:48:38] (step=0000387) Train Loss: 0.2807, Train Steps/Sec: 0.12, Epoch: 0.0075204041974349005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 388, "loss": 0.16957244277000427, "memory_gb": 7.721559524536133, "step_time_ms": 7557.904243469238, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:48:46] (step=0000388) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.007539836766420521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:48:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 389, "loss": 0.22708064317703247, 
"memory_gb": 7.721559524536133, "step_time_ms": 7535.658597946167, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:48:54] (step=0000389) Train Loss: 0.2070, Train Steps/Sec: 0.12, Epoch: 0.007559269335406141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 390, "loss": 0.28700095415115356, "memory_gb": 7.721559524536133, "step_time_ms": 7492.15841293335, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:02] (step=0000390) Train Loss: 0.2894, Train Steps/Sec: 0.12, Epoch: 0.0075787019043917605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 391, "loss": 0.23093506693840027, "memory_gb": 7.721559524536133, "step_time_ms": 7540.649175643921, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:10] (step=0000391) Train Loss: 0.2092, Train Steps/Sec: 0.12, Epoch: 0.00759813447337738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 392, "loss": 0.30084872245788574, "memory_gb": 7.721559524536133, "step_time_ms": 7482.88631439209, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:18] (step=0000392) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.007617567042363001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 393, "loss": 0.32957127690315247, "memory_gb": 7.721559524536133, "step_time_ms": 7439.869403839111, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:26] (step=0000393) Train Loss: 0.3322, Train Steps/Sec: 0.12, Epoch: 0.00763699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 394, "loss": 0.2453613579273224, "memory_gb": 7.721559524536133, "step_time_ms": 7453.322649002075, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:34] (step=0000394) Train Loss: 0.2520, Train 
Steps/Sec: 0.12, Epoch: 0.00765643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 395, "loss": 0.3236967921257019, "memory_gb": 7.721559524536133, "step_time_ms": 7471.918344497681, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:42] (step=0000395) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.00767586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 396, "loss": 0.30517834424972534, "memory_gb": 7.721559524536133, "step_time_ms": 7429.494857788086, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:51] (step=0000396) Train Loss: 0.2780, Train Steps/Sec: 0.12, Epoch: 0.00769529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:49:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 397, "loss": 0.33332890272140503, "memory_gb": 7.721559524536133, "step_time_ms": 7532.030344009399, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:49:59] (step=0000397) Train Loss: 0.2730, Train Steps/Sec: 0.12, Epoch: 0.0077147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:50:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 398, "loss": 0.2596887946128845, "memory_gb": 7.721559524536133, "step_time_ms": 7517.669200897217, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:50:07] (step=0000398) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.00773416245627672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:50:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 399, "loss": 0.2384040355682373, "memory_gb": 7.721559524536133, "step_time_ms": 7417.841672897339, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:50:15] (step=0000399) Train Loss: 0.2703, Train Steps/Sec: 0.13, Epoch: 0.007753595025262339, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:50:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 400, "loss": 
0.2866485118865967, "memory_gb": 7.721559524536133, "step_time_ms": 7440.046310424805, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:50:23] (step=0000400) Train Loss: 0.2776, Train Steps/Sec: 0.12, Epoch: 0.00777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:50:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 401, "loss": 0.37757307291030884, "memory_gb": 7.721559524536133, "step_time_ms": 7487.924575805664, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:50:31] (step=0000401) Train Loss: 0.3341, Train Steps/Sec: 0.12, Epoch: 0.00779246016323358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:50:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 402, "loss": 0.27545493841171265, "memory_gb": 7.721559524536133, "step_time_ms": 7434.5338344573975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:50:39] (step=0000402) Train Loss: 0.2839, Train Steps/Sec: 0.12, Epoch: 0.007811892732219199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:50:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 403, "loss": 0.21495231986045837, "memory_gb": 7.721559524536133, "step_time_ms": 7474.544525146484, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:50:47] (step=0000403) Train Loss: 0.2632, Train Steps/Sec: 0.12, Epoch: 0.00783132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:50:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 404, "loss": 0.3114014267921448, "memory_gb": 7.721559524536133, "step_time_ms": 7526.543855667114, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:50:55] (step=0000404) Train Loss: 0.2629, Train Steps/Sec: 0.12, Epoch: 0.007850757870190439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 405, "loss": 0.2760854959487915, "memory_gb": 7.721559524536133, "step_time_ms": 7619.9047565460205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:03] (step=0000405) Train 
Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.00787019043917606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 406, "loss": 0.3688720464706421, "memory_gb": 7.721559524536133, "step_time_ms": 7505.311489105225, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:11] (step=0000406) Train Loss: 0.3029, Train Steps/Sec: 0.13, Epoch: 0.00788962300816168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 407, "loss": 0.30342400074005127, "memory_gb": 7.721559524536133, "step_time_ms": 7513.78607749939, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:19] (step=0000407) Train Loss: 0.3228, Train Steps/Sec: 0.12, Epoch: 0.007909055577147299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 408, "loss": 0.14260582625865936, "memory_gb": 7.721559524536133, "step_time_ms": 7293.931722640991, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:27] (step=0000408) Train Loss: 0.2133, Train Steps/Sec: 0.13, Epoch: 0.00792848814613292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 409, "loss": 0.2972255051136017, "memory_gb": 7.721559524536133, "step_time_ms": 7449.614763259888, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:34] (step=0000409) Train Loss: 0.2297, Train Steps/Sec: 0.13, Epoch: 0.007947920715118538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 410, "loss": 0.20625469088554382, "memory_gb": 7.721559524536133, "step_time_ms": 5182.349920272827, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:40] (step=0000410) Train Loss: 0.2251, Train Steps/Sec: 0.17, Epoch: 0.007967353284104159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 411, 
"loss": 0.2534763813018799, "memory_gb": 7.721559524536133, "step_time_ms": 7485.886335372925, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:48] (step=0000411) Train Loss: 0.2515, Train Steps/Sec: 0.12, Epoch: 0.007986785853089779, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:51:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 412, "loss": 0.2830239534378052, "memory_gb": 7.721559524536133, "step_time_ms": 7305.534362792969, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:51:56] (step=0000412) Train Loss: 0.2274, Train Steps/Sec: 0.13, Epoch: 0.008006218422075398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:52:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 413, "loss": 0.2496396154165268, "memory_gb": 7.721559524536133, "step_time_ms": 7412.804126739502, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:52:04] (step=0000413) Train Loss: 0.2680, Train Steps/Sec: 0.13, Epoch: 0.008025650991061018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:52:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 414, "loss": 0.19600336253643036, "memory_gb": 7.721559524536133, "step_time_ms": 7519.0269947052, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:52:12] (step=0000414) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.008045083560046639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:52:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 415, "loss": 0.201688751578331, "memory_gb": 7.721559524536133, "step_time_ms": 7485.665082931519, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:52:20] (step=0000415) Train Loss: 0.2336, Train Steps/Sec: 0.12, Epoch: 0.008064516129032258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:52:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 416, "loss": 0.2588244676589966, "memory_gb": 7.721559524536133, "step_time_ms": 7418.530464172363, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:52:28] (step=0000416) 
Train Loss: 0.2386, Train Steps/Sec: 0.12, Epoch: 0.008083948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:52:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 417, "loss": 0.18893399834632874, "memory_gb": 7.721559524536133, "step_time_ms": 7474.000453948975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:52:37] (step=0000417) Train Loss: 0.2581, Train Steps/Sec: 0.12, Epoch: 0.008103381267003497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:52:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 418, "loss": 0.22811488807201385, "memory_gb": 7.721559524536133, "step_time_ms": 7432.916879653931, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:52:45] (step=0000418) Train Loss: 0.2633, Train Steps/Sec: 0.12, Epoch: 0.008122813835989118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:52:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 419, "loss": 0.2427024245262146, "memory_gb": 7.721559524536133, "step_time_ms": 7446.920871734619, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:52:53] (step=0000419) Train Loss: 0.2141, Train Steps/Sec: 0.12, Epoch: 0.008142246404974738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 420, "loss": 0.21756017208099365, "memory_gb": 7.721559524536133, "step_time_ms": 7519.542455673218, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:53:01] (step=0000420) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.008161678973960357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 421, "loss": 0.24493151903152466, "memory_gb": 7.721559524536133, "step_time_ms": 7469.163417816162, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:53:09] (step=0000421) Train Loss: 0.2414, Train Steps/Sec: 0.12, Epoch: 0.008181111542945978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:17] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 422, "loss": 0.1797730177640915, "memory_gb": 7.721559524536133, "step_time_ms": 7469.085693359375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:53:17] (step=0000422) Train Loss: 0.2430, Train Steps/Sec: 0.12, Epoch: 0.008200544111931597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 423, "loss": 0.10721701383590698, "memory_gb": 7.721559524536133, "step_time_ms": 7527.595520019531, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:53:25] (step=0000423) Train Loss: 0.1693, Train Steps/Sec: 0.12, Epoch: 0.008219976680917217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 424, "loss": 0.16381993889808655, "memory_gb": 7.721559524536133, "step_time_ms": 7483.552932739258, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:53:33] (step=0000424) Train Loss: 0.1687, Train Steps/Sec: 0.12, Epoch: 0.008239409249902838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 425, "loss": 0.2097010314464569, "memory_gb": 7.721559524536133, "step_time_ms": 7457.535982131958, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:53:41] (step=0000425) Train Loss: 0.2015, Train Steps/Sec: 0.12, Epoch: 0.008258841818888456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 426, "loss": 0.35823461413383484, "memory_gb": 7.721559524536133, "step_time_ms": 7519.6521282196045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:53:49] (step=0000426) Train Loss: 0.3462, Train Steps/Sec: 0.12, Epoch: 0.008278274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:53:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 427, "loss": 0.30708375573158264, "memory_gb": 7.721559524536133, "step_time_ms": 7515.072822570801, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
18:53:57] (step=0000427) Train Loss: 0.2975, Train Steps/Sec: 0.12, Epoch: 0.008297706956859698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:54:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 428, "loss": 0.26337695121765137, "memory_gb": 7.721559524536133, "step_time_ms": 7448.064088821411, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:54:05] (step=0000428) Train Loss: 0.2951, Train Steps/Sec: 0.12, Epoch: 0.008317139525845316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:54:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 429, "loss": 0.22748833894729614, "memory_gb": 7.721559524536133, "step_time_ms": 7538.060903549194, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:54:13] (step=0000429) Train Loss: 0.1744, Train Steps/Sec: 0.12, Epoch: 0.008336572094830937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:54:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 430, "loss": 0.2770991325378418, "memory_gb": 7.721559524536133, "step_time_ms": 7480.154514312744, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:54:21] (step=0000430) Train Loss: 0.2952, Train Steps/Sec: 0.12, Epoch: 0.008356004663816556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:54:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 431, "loss": 0.3972146213054657, "memory_gb": 7.721559524536133, "step_time_ms": 7413.553476333618, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:54:29] (step=0000431) Train Loss: 0.3156, Train Steps/Sec: 0.12, Epoch: 0.008375437232802176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:54:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 432, "loss": 0.20690372586250305, "memory_gb": 7.721559524536133, "step_time_ms": 7518.23353767395, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:54:37] (step=0000432) Train Loss: 0.1922, Train Steps/Sec: 0.12, Epoch: 0.008394869801787797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:54:45] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 433, "loss": 0.31847259402275085, "memory_gb": 7.721559524536133, "step_time_ms": 7464.3871784210205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:54:45] (step=0000433) Train Loss: 0.2820, Train Steps/Sec: 0.13, Epoch: 0.008414302370773416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:54:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 434, "loss": 0.3660271167755127, "memory_gb": 7.721559524536133, "step_time_ms": 7400.623083114624, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:54:53] (step=0000434) Train Loss: 0.2974, Train Steps/Sec: 0.12, Epoch: 0.008433734939759036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 435, "loss": 0.22960983216762543, "memory_gb": 7.721559524536133, "step_time_ms": 7477.094411849976, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:55:01] (step=0000435) Train Loss: 0.2502, Train Steps/Sec: 0.12, Epoch: 0.008453167508744657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 436, "loss": 0.1383819580078125, "memory_gb": 7.721559524536133, "step_time_ms": 7456.88533782959, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:55:10] (step=0000436) Train Loss: 0.2354, Train Steps/Sec: 0.12, Epoch: 0.008472600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 437, "loss": 0.36572566628456116, "memory_gb": 7.721559524536133, "step_time_ms": 7237.034797668457, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:55:17] (step=0000437) Train Loss: 0.2826, Train Steps/Sec: 0.13, Epoch: 0.008492032646715896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 438, "loss": 0.2629137337207794, "memory_gb": 7.721559524536133, "step_time_ms": 7485.947608947754, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 18:55:25] (step=0000438) Train Loss: 0.3111, Train Steps/Sec: 0.12, Epoch: 0.008511465215701515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 439, "loss": 0.22537922859191895, "memory_gb": 7.721559524536133, "step_time_ms": 5094.527721405029, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:55:31] (step=0000439) Train Loss: 0.2920, Train Steps/Sec: 0.18, Epoch: 0.008530897784687136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 440, "loss": 0.2777401804924011, "memory_gb": 7.721559524536133, "step_time_ms": 7516.717910766602, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:55:39] (step=0000440) Train Loss: 0.2462, Train Steps/Sec: 0.12, Epoch: 0.008550330353672756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 441, "loss": 0.2927217483520508, "memory_gb": 7.721559524536133, "step_time_ms": 7435.513734817505, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:55:47] (step=0000441) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.008569762922658375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:55:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 442, "loss": 0.3265117406845093, "memory_gb": 7.721559524536133, "step_time_ms": 7459.3870639801025, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:55:55] (step=0000442) Train Loss: 0.3245, Train Steps/Sec: 0.12, Epoch: 0.008589195491643995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:56:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 443, "loss": 0.29317712783813477, "memory_gb": 7.721559524536133, "step_time_ms": 7622.032642364502, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:03] (step=0000443) Train Loss: 0.2074, Train Steps/Sec: 0.12, Epoch: 0.008608628060629616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 18:56:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 444, "loss": 0.18203622102737427, "memory_gb": 7.721559524536133, "step_time_ms": 7457.656621932983, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:11] (step=0000444) Train Loss: 0.2805, Train Steps/Sec: 0.13, Epoch: 0.008628060629615235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:56:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 445, "loss": 0.25602030754089355, "memory_gb": 7.721559524536133, "step_time_ms": 7423.191785812378, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:19] (step=0000445) Train Loss: 0.2640, Train Steps/Sec: 0.13, Epoch: 0.008647493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:56:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 446, "loss": 0.2983524203300476, "memory_gb": 7.721559524536133, "step_time_ms": 7566.829681396484, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:27] (step=0000446) Train Loss: 0.2463, Train Steps/Sec: 0.12, Epoch: 0.008666925767586474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:56:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 447, "loss": 0.18876810371875763, "memory_gb": 7.721559524536133, "step_time_ms": 7505.815029144287, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:35] (step=0000447) Train Loss: 0.2259, Train Steps/Sec: 0.12, Epoch: 0.008686358336572095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:56:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 448, "loss": 0.2968137860298157, "memory_gb": 7.721559524536133, "step_time_ms": 7472.153425216675, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:43] (step=0000448) Train Loss: 0.2776, Train Steps/Sec: 0.12, Epoch: 0.008705790905557715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:56:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 449, "loss": 0.20425787568092346, "memory_gb": 7.721559524536133, "step_time_ms": 7550.786733627319, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:51] (step=0000449) Train Loss: 0.2314, Train Steps/Sec: 0.12, Epoch: 0.008725223474543334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:56:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 450, "loss": 0.31002217531204224, "memory_gb": 7.721559524536133, "step_time_ms": 7543.4887409210205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:56:59] (step=0000450) Train Loss: 0.2610, Train Steps/Sec: 0.12, Epoch: 0.008744656043528955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:57:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 451, "loss": 0.2210865616798401, "memory_gb": 7.721559524536133, "step_time_ms": 7486.9585037231445, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:57:07] (step=0000451) Train Loss: 0.2282, Train Steps/Sec: 0.12, Epoch: 0.008764088612514575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:57:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 452, "loss": 0.20415928959846497, "memory_gb": 7.721559524536133, "step_time_ms": 7640.530347824097, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:57:16] (step=0000452) Train Loss: 0.1842, Train Steps/Sec: 0.12, Epoch: 0.008783521181500194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:57:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 453, "loss": 0.21252138912677765, "memory_gb": 7.721559524536133, "step_time_ms": 7588.789701461792, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:57:24] (step=0000453) Train Loss: 0.2150, Train Steps/Sec: 0.12, Epoch: 0.008802953750485815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:57:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 454, "loss": 0.19605711102485657, "memory_gb": 7.715639114379883, "step_time_ms": 7445.137500762939, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:57:32] (step=0000454) Train Loss: 0.1707, Train Steps/Sec: 0.12, Epoch: 0.008822386319471433, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:57:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 455, "loss": 0.25360655784606934, "memory_gb": 7.721559524536133, "step_time_ms": 7557.020902633667, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:57:40] (step=0000455) Train Loss: 0.2794, Train Steps/Sec: 0.12, Epoch: 0.008841818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:57:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 456, "loss": 0.17162391543388367, "memory_gb": 7.721559524536133, "step_time_ms": 7440.3395652771, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:57:48] (step=0000456) Train Loss: 0.2288, Train Steps/Sec: 0.13, Epoch: 0.008861251457442675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:57:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 457, "loss": 0.2948821783065796, "memory_gb": 7.721559524536133, "step_time_ms": 7507.505893707275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:57:56] (step=0000457) Train Loss: 0.3084, Train Steps/Sec: 0.12, Epoch: 0.008880684026428293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:58:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 458, "loss": 0.228500634431839, "memory_gb": 7.721559524536133, "step_time_ms": 7645.571947097778, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:58:04] (step=0000458) Train Loss: 0.2385, Train Steps/Sec: 0.12, Epoch: 0.008900116595413914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:58:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 459, "loss": 0.21645566821098328, "memory_gb": 7.721559524536133, "step_time_ms": 7648.717164993286, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:58:12] (step=0000459) Train Loss: 0.2175, Train Steps/Sec: 0.12, Epoch: 0.008919549164399533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:58:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 460, "loss": 0.24691665172576904, "memory_gb": 7.721559524536133, "step_time_ms": 
7574.4006633758545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:58:20] (step=0000460) Train Loss: 0.2383, Train Steps/Sec: 0.12, Epoch: 0.008938981733385153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:58:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 461, "loss": 0.30396491289138794, "memory_gb": 7.721559524536133, "step_time_ms": 7608.5193157196045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:58:28] (step=0000461) Train Loss: 0.3049, Train Steps/Sec: 0.12, Epoch: 0.008958414302370774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:58:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 462, "loss": 0.18211092054843903, "memory_gb": 7.721559524536133, "step_time_ms": 7599.072694778442, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:58:36] (step=0000462) Train Loss: 0.2375, Train Steps/Sec: 0.12, Epoch: 0.008977846871356393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:58:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 463, "loss": 0.2407507449388504, "memory_gb": 7.721559524536133, "step_time_ms": 7559.169769287109, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:58:44] (step=0000463) Train Loss: 0.2759, Train Steps/Sec: 0.12, Epoch: 0.008997279440342013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:58:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 464, "loss": 0.4302072525024414, "memory_gb": 7.721559524536133, "step_time_ms": 7570.700407028198, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:58:52] (step=0000464) Train Loss: 0.3383, Train Steps/Sec: 0.12, Epoch: 0.009016712009327634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 465, "loss": 0.2781420350074768, "memory_gb": 7.721559524536133, "step_time_ms": 7593.668460845947, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:00] (step=0000465) Train Loss: 0.2950, Train Steps/Sec: 0.12, Epoch: 0.009036144578313253, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 466, "loss": 0.2143452763557434, "memory_gb": 7.721559524536133, "step_time_ms": 7422.104597091675, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:08] (step=0000466) Train Loss: 0.2822, Train Steps/Sec: 0.13, Epoch: 0.009055577147298873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 467, "loss": 0.37515944242477417, "memory_gb": 7.721559524536133, "step_time_ms": 7623.449325561523, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:16] (step=0000467) Train Loss: 0.3114, Train Steps/Sec: 0.12, Epoch: 0.009075009716284492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 468, "loss": 0.2705722153186798, "memory_gb": 7.721559524536133, "step_time_ms": 5297.914266586304, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:22] (step=0000468) Train Loss: 0.3190, Train Steps/Sec: 0.18, Epoch: 0.009094442285270113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 469, "loss": 0.23673900961875916, "memory_gb": 7.721559524536133, "step_time_ms": 7622.845411300659, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:30] (step=0000469) Train Loss: 0.2220, Train Steps/Sec: 0.12, Epoch: 0.009113874854255733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 470, "loss": 0.2657349407672882, "memory_gb": 7.721559524536133, "step_time_ms": 7582.890748977661, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:38] (step=0000470) Train Loss: 0.2241, Train Steps/Sec: 0.12, Epoch: 0.009133307423241352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 471, "loss": 0.2138959914445877, "memory_gb": 7.715639114379883, 
"step_time_ms": 7508.349895477295, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:46] (step=0000471) Train Loss: 0.2238, Train Steps/Sec: 0.12, Epoch: 0.009152739992226972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 18:59:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 472, "loss": 0.1807672679424286, "memory_gb": 7.721559524536133, "step_time_ms": 7569.626569747925, "trainable_params": 4718592, "method": "lora"} [2025-07-28 18:59:54] (step=0000472) Train Loss: 0.1851, Train Steps/Sec: 0.12, Epoch: 0.009172172561212593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 473, "loss": 0.30098623037338257, "memory_gb": 7.721559524536133, "step_time_ms": 7458.740711212158, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:02] (step=0000473) Train Loss: 0.3482, Train Steps/Sec: 0.12, Epoch: 0.009191605130198212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 474, "loss": 0.2948484420776367, "memory_gb": 7.721559524536133, "step_time_ms": 7455.420732498169, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:10] (step=0000474) Train Loss: 0.2919, Train Steps/Sec: 0.12, Epoch: 0.009211037699183832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 475, "loss": 0.2355649173259735, "memory_gb": 7.721559524536133, "step_time_ms": 7518.622875213623, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:18] (step=0000475) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.009230470268169451, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 476, "loss": 0.314548522233963, "memory_gb": 7.721559524536133, "step_time_ms": 7448.332786560059, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:26] (step=0000476) Train Loss: 0.2731, Train Steps/Sec: 0.12, Epoch: 
0.009249902837155072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 477, "loss": 0.30437928438186646, "memory_gb": 7.721559524536133, "step_time_ms": 7415.90428352356, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:34] (step=0000477) Train Loss: 0.3007, Train Steps/Sec: 0.13, Epoch: 0.009269335406140692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 478, "loss": 0.15414974093437195, "memory_gb": 7.721559524536133, "step_time_ms": 7499.533414840698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:42] (step=0000478) Train Loss: 0.2593, Train Steps/Sec: 0.12, Epoch: 0.009288767975126311, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 479, "loss": 0.1905108243227005, "memory_gb": 7.721559524536133, "step_time_ms": 7249.404668807983, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:50] (step=0000479) Train Loss: 0.2056, Train Steps/Sec: 0.13, Epoch: 0.009308200544111932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:00:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 480, "loss": 0.2203453779220581, "memory_gb": 7.721559524536133, "step_time_ms": 7420.459508895874, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:00:58] (step=0000480) Train Loss: 0.2269, Train Steps/Sec: 0.12, Epoch: 0.009327633113097552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:01:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 481, "loss": 0.18630658090114594, "memory_gb": 7.721559524536133, "step_time_ms": 7503.825902938843, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:01:06] (step=0000481) Train Loss: 0.2066, Train Steps/Sec: 0.12, Epoch: 0.009347065682083171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:01:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 482, "loss": 0.30115121603012085, "memory_gb": 
7.721559524536133, "step_time_ms": 7431.148052215576, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:01:14] (step=0000482) Train Loss: 0.3177, Train Steps/Sec: 0.12, Epoch: 0.009366498251068792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:01:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 483, "loss": 0.20016062259674072, "memory_gb": 7.721559524536133, "step_time_ms": 7417.733907699585, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:01:22] (step=0000483) Train Loss: 0.2325, Train Steps/Sec: 0.13, Epoch: 0.00938593082005441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:01:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 484, "loss": 0.1817520558834076, "memory_gb": 7.721559524536133, "step_time_ms": 7543.368577957153, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:01:31] (step=0000484) Train Loss: 0.2307, Train Steps/Sec: 0.12, Epoch: 0.009405363389040031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:01:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 485, "loss": 0.2084377557039261, "memory_gb": 7.721559524536133, "step_time_ms": 7456.986427307129, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:01:39] (step=0000485) Train Loss: 0.2085, Train Steps/Sec: 0.12, Epoch: 0.009424795958025652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:01:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 486, "loss": 0.24669736623764038, "memory_gb": 7.721559524536133, "step_time_ms": 7465.5914306640625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:01:47] (step=0000486) Train Loss: 0.2458, Train Steps/Sec: 0.12, Epoch: 0.00944422852701127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:01:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 487, "loss": 0.2276095151901245, "memory_gb": 7.721559524536133, "step_time_ms": 7537.769317626953, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:01:55] (step=0000487) Train Loss: 0.2322, Train Steps/Sec: 
0.12, Epoch: 0.009463661095996891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 488, "loss": 0.3004244267940521, "memory_gb": 7.721559524536133, "step_time_ms": 7481.926202774048, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:03] (step=0000488) Train Loss: 0.2439, Train Steps/Sec: 0.13, Epoch: 0.009483093664982511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 489, "loss": 0.25833049416542053, "memory_gb": 7.721559524536133, "step_time_ms": 7471.835136413574, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:11] (step=0000489) Train Loss: 0.2710, Train Steps/Sec: 0.12, Epoch: 0.00950252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 490, "loss": 0.3313274681568146, "memory_gb": 7.721559524536133, "step_time_ms": 7534.94668006897, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:19] (step=0000490) Train Loss: 0.2777, Train Steps/Sec: 0.12, Epoch: 0.009521958802953751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 491, "loss": 0.3843839466571808, "memory_gb": 7.721559524536133, "step_time_ms": 7478.053331375122, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:27] (step=0000491) Train Loss: 0.3377, Train Steps/Sec: 0.12, Epoch: 0.00954139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 492, "loss": 0.20303308963775635, "memory_gb": 7.721559524536133, "step_time_ms": 7410.876750946045, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:35] (step=0000492) Train Loss: 0.2509, Train Steps/Sec: 0.13, Epoch: 0.00956082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 493, "loss": 0.29740750789642334, "memory_gb": 7.721559524536133, "step_time_ms": 7626.05881690979, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:43] (step=0000493) Train Loss: 0.2950, Train Steps/Sec: 0.12, Epoch: 0.00958025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 494, "loss": 0.263103723526001, "memory_gb": 7.721559524536133, "step_time_ms": 7444.387912750244, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:51] (step=0000494) Train Loss: 0.2172, Train Steps/Sec: 0.13, Epoch: 0.00959968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:02:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 495, "loss": 0.3056964576244354, "memory_gb": 7.721559524536133, "step_time_ms": 7348.169803619385, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:02:59] (step=0000495) Train Loss: 0.2344, Train Steps/Sec: 0.13, Epoch: 0.00961912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:03:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 496, "loss": 0.25234928727149963, "memory_gb": 7.721559524536133, "step_time_ms": 7536.232471466064, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:03:07] (step=0000496) Train Loss: 0.2939, Train Steps/Sec: 0.12, Epoch: 0.00963855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:03:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 497, "loss": 0.22609522938728333, "memory_gb": 7.721559524536133, "step_time_ms": 5205.697774887085, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:03:13] (step=0000497) Train Loss: 0.2302, Train Steps/Sec: 0.17, Epoch: 0.00965798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:03:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 498, "loss": 0.22466136515140533, "memory_gb": 7.721559524536133, "step_time_ms": 7530.311584472656, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:03:21] (step=0000498) Train Loss: 0.2574, Train Steps/Sec: 0.12, Epoch: 0.00967741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:03:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 499, "loss": 0.1675933599472046, "memory_gb": 7.721559524536133, "step_time_ms": 7438.916444778442, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:03:29] (step=0000499) Train Loss: 0.2013, Train Steps/Sec: 0.12, Epoch: 0.009696851923824329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:03:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 500, "loss": 0.3209952116012573, "memory_gb": 7.721559524536133, "step_time_ms": 7446.954727172852, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:03:37] (step=0000500) Train Loss: 0.2625, Train Steps/Sec: 0.13, Epoch: 0.00971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:03:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 501, "loss": 0.28635162115097046, "memory_gb": 7.721559524536133, "step_time_ms": 7571.668863296509, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:03:45] (step=0000501) Train Loss: 0.2587, Train Steps/Sec: 0.12, Epoch: 0.00973571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:03:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 502, "loss": 0.23680074512958527, "memory_gb": 7.721559524536133, "step_time_ms": 7465.1196002960205, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:03:53] (step=0000502) Train Loss: 0.2492, Train Steps/Sec: 0.13, Epoch: 0.009755149630781189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 503, "loss": 0.2707083225250244, "memory_gb": 7.715639114379883, "step_time_ms": 7409.202814102173, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:01] (step=0000503) Train Loss: 0.2301, Train Steps/Sec: 0.12, Epoch: 0.00977458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 504, "loss": 0.2349206805229187, "memory_gb": 7.721559524536133, "step_time_ms": 7562.086582183838, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:09] (step=0000504) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.009794014768752428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 505, "loss": 0.31774866580963135, "memory_gb": 7.721559524536133, "step_time_ms": 7461.722373962402, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:17] (step=0000505) Train Loss: 0.2660, Train Steps/Sec: 0.12, Epoch: 0.009813447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 506, "loss": 0.24231219291687012, "memory_gb": 7.721559524536133, "step_time_ms": 7442.812204360962, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:25] (step=0000506) Train Loss: 0.2614, Train Steps/Sec: 0.13, Epoch: 0.00983287990672367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 507, "loss": 0.25777286291122437, "memory_gb": 7.721559524536133, "step_time_ms": 7540.12131690979, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:33] (step=0000507) Train Loss: 0.2721, Train Steps/Sec: 0.12, Epoch: 0.009852312475709288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 508, "loss": 0.2291705161333084, "memory_gb": 7.721559524536133, "step_time_ms": 7387.937307357788, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:41] (step=0000508) Train Loss: 0.2746, Train Steps/Sec: 0.12, Epoch: 0.009871745044694909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 509, "loss": 0.3270219564437866, "memory_gb": 7.715639114379883, "step_time_ms": 7395.92719078064, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:49] (step=0000509) Train Loss: 0.2689, Train Steps/Sec: 0.12, Epoch: 0.00989117761368053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 510, "loss": 0.2835707664489746, "memory_gb": 7.721559524536133, "step_time_ms": 7495.033979415894, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:04:58] (step=0000510) Train Loss: 0.2682, Train Steps/Sec: 0.12, Epoch: 0.009910610182666148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:05:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 511, "loss": 0.17205160856246948, "memory_gb": 7.721559524536133, "step_time_ms": 7432.679891586304, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:05:06] (step=0000511) Train Loss: 0.1667, Train Steps/Sec: 0.12, Epoch: 0.009930042751651769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:05:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 512, "loss": 0.20063477754592896, "memory_gb": 7.721559524536133, "step_time_ms": 7470.817565917969, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:05:14] (step=0000512) Train Loss: 0.1728, Train Steps/Sec: 0.12, Epoch: 0.009949475320637387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 513, "loss": 0.23051266372203827, "memory_gb": 7.721559524536133, "step_time_ms": 7494.368553161621, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:05:22] (step=0000513) Train Loss: 0.2752, Train Steps/Sec: 0.12, Epoch: 0.009968907889623008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:05:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 514, "loss": 0.23098446428775787, "memory_gb": 7.721559524536133, "step_time_ms": 7459.175825119019, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:05:30] (step=0000514) Train Loss: 0.2259, Train Steps/Sec: 0.12, Epoch: 0.009988340458608629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:05:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 515, "loss": 0.21441026031970978, "memory_gb": 7.721559524536133, "step_time_ms": 7470.081806182861, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:05:38] (step=0000515) Train Loss: 0.2290, Train Steps/Sec: 0.13, Epoch: 0.010007773027594247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:05:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 516, "loss": 0.33084338903427124, "memory_gb": 7.721559524536133, "step_time_ms": 7498.480558395386, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:05:46] (step=0000516) Train Loss: 0.2865, Train Steps/Sec: 0.12, Epoch: 0.010027205596579868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:05:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 517, "loss": 0.2943386137485504, "memory_gb": 7.721559524536133, "step_time_ms": 7502.465009689331, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:05:54] (step=0000517) Train Loss: 0.2703, Train Steps/Sec: 0.12, Epoch: 0.010046638165565488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 518, "loss": 0.3199407756328583, "memory_gb": 7.721559524536133, "step_time_ms": 7453.032970428467, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:02] (step=0000518) Train Loss: 0.3046, Train Steps/Sec: 0.13, Epoch: 0.010066070734551107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 519, "loss": 0.29549455642700195, "memory_gb": 7.721559524536133, "step_time_ms": 7535.175561904907, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:10] (step=0000519) Train Loss: 0.3058, Train Steps/Sec: 0.12, Epoch: 0.010085503303536728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 520, "loss": 0.20302855968475342, "memory_gb": 7.721559524536133, "step_time_ms": 7488.141775131226, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:18] (step=0000520) Train Loss: 0.2791, Train Steps/Sec: 0.12, Epoch: 0.010104935872522347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 521, "loss": 0.24876268208026886, "memory_gb": 7.721559524536133, "step_time_ms": 7475.122928619385, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:26] (step=0000521) Train Loss: 0.2182, Train Steps/Sec: 0.12, Epoch: 0.010124368441507967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 522, "loss": 0.2630009651184082, "memory_gb": 7.721559524536133, "step_time_ms": 7553.101301193237, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:34] (step=0000522) Train Loss: 0.2647, Train Steps/Sec: 0.12, Epoch: 0.010143801010493588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 523, "loss": 0.2991054952144623, "memory_gb": 7.721559524536133, "step_time_ms": 7551.907777786255, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:42] (step=0000523) Train Loss: 0.2867, Train Steps/Sec: 0.12, Epoch: 0.010163233579479207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 524, "loss": 0.23756466805934906, "memory_gb": 7.721559524536133, "step_time_ms": 7411.820650100708, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:50] (step=0000524) Train Loss: 0.2421, Train Steps/Sec: 0.12, Epoch: 0.010182666148464827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:06:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 525, "loss": 0.3004646301269531, "memory_gb": 7.721559524536133, "step_time_ms": 7566.736459732056, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:06:58] (step=0000525) Train Loss: 0.3114, Train Steps/Sec: 0.12, Epoch: 0.010202098717450448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:07:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 526, "loss": 0.20560872554779053, "memory_gb": 7.721559524536133, "step_time_ms": 5477.825403213501, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:07:04] (step=0000526) Train Loss: 0.2065, Train Steps/Sec: 0.17, Epoch: 0.010221531286436067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:07:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 527, "loss": 0.17932705581188202, "memory_gb": 7.721559524536133, "step_time_ms": 7576.430082321167, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:07:12] (step=0000527) Train Loss: 0.2222, Train Steps/Sec: 0.12, Epoch: 0.010240963855421687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:07:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 528, "loss": 0.22509709000587463, "memory_gb": 7.721559524536133, "step_time_ms": 7530.362367630005, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:07:20] (step=0000528) Train Loss: 0.2863, Train Steps/Sec: 0.12, Epoch: 0.010260396424407306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:07:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 529, "loss": 0.3114144802093506, "memory_gb": 7.721559524536133, "step_time_ms": 7450.577259063721, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:07:28] (step=0000529) Train Loss: 0.2776, Train Steps/Sec: 0.13, Epoch: 0.010279828993392926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:07:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 530, "loss": 0.23481397330760956, "memory_gb": 7.721559524536133, "step_time_ms": 7556.921482086182, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:07:36] (step=0000530) Train Loss: 0.2179, Train Steps/Sec: 0.12, Epoch: 0.010299261562378547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:07:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 531, "loss": 0.2202327400445938, "memory_gb": 7.721559524536133, "step_time_ms": 7515.2013301849365, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:07:45] (step=0000531) Train Loss: 0.1989, Train Steps/Sec: 0.12, Epoch: 0.010318694131364166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:07:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 532, "loss": 0.2927897870540619, "memory_gb": 7.721559524536133, "step_time_ms": 7534.237861633301, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:07:53] (step=0000532) Train Loss: 0.2263, Train Steps/Sec: 0.13, Epoch: 0.010338126700349786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 533, "loss": 0.2244354784488678, "memory_gb": 7.721559524536133, "step_time_ms": 7615.514278411865, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:01] (step=0000533) Train Loss: 0.2225, Train Steps/Sec: 0.12, Epoch: 0.010357559269335407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 534, "loss": 0.2088548094034195, "memory_gb": 7.721559524536133, "step_time_ms": 7511.301517486572, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:09] (step=0000534) Train Loss: 0.2049, Train Steps/Sec: 0.12, Epoch: 0.010376991838321026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 535, "loss": 0.14994937181472778, "memory_gb": 7.721559524536133, "step_time_ms": 7441.175937652588, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:17] (step=0000535) Train Loss: 0.1991, Train Steps/Sec: 0.12, Epoch: 0.010396424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 536, "loss": 0.25056320428848267, "memory_gb": 7.721559524536133, "step_time_ms": 7614.404916763306, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:25] (step=0000536) Train Loss: 0.2243, Train Steps/Sec: 0.12, Epoch: 0.010415856976292265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 537, "loss": 0.2739763855934143, "memory_gb": 7.721559524536133, "step_time_ms": 7556.643724441528, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:33] (step=0000537) Train Loss: 0.3020, Train Steps/Sec: 0.13, Epoch: 0.010435289545277886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 538, "loss": 0.24626536667346954, "memory_gb": 7.721559524536133, "step_time_ms": 7509.032726287842, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:41] (step=0000538) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.010454722114263506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 539, "loss": 0.16771294176578522, "memory_gb": 7.721559524536133, "step_time_ms": 7601.006507873535, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:49] (step=0000539) Train Loss: 0.1923, Train Steps/Sec: 0.12, Epoch: 0.010474154683249125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:08:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 540, "loss": 0.32009363174438477, "memory_gb": 7.721559524536133, "step_time_ms": 7644.180536270142, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:08:57] (step=0000540) Train Loss: 0.2889, Train Steps/Sec: 0.12, Epoch: 0.010493587252234746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 541, "loss": 0.3424534797668457, "memory_gb": 7.721559524536133, "step_time_ms": 7525.427579879761, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:09:05] (step=0000541) Train Loss: 0.2911, Train Steps/Sec: 0.13, Epoch: 0.010513019821220366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:09:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 542, "loss": 0.23839804530143738, "memory_gb": 7.721559524536133, "step_time_ms": 7564.615249633789, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:09:13] (step=0000542) Train Loss: 0.2312, Train Steps/Sec: 0.12, Epoch: 0.010532452390205985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:09:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 543, "loss": 0.14519557356834412, "memory_gb": 7.721559524536133, "step_time_ms": 7531.930208206177, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:09:21] (step=0000543) Train Loss: 0.2000, Train Steps/Sec: 0.12, Epoch: 0.010551884959191606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:09:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 544, "loss": 0.1621442437171936, "memory_gb": 7.721559524536133, "step_time_ms": 7488.405466079712, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:09:29] (step=0000544) Train Loss: 0.1768, Train Steps/Sec: 0.12, Epoch: 0.010571317528177224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 545, "loss": 0.21876737475395203, "memory_gb": 7.721559524536133, "step_time_ms": 7584.0582847595215, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:09:37] (step=0000545) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.010590750097162845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:09:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 546, "loss": 0.2598974406719208, "memory_gb": 7.721559524536133, "step_time_ms": 7491.872310638428, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:09:45] (step=0000546) Train Loss: 0.2408, Train Steps/Sec: 0.13, Epoch: 0.010610182666148466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:09:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 547, "loss": 0.3266739845275879, "memory_gb": 7.721559524536133, "step_time_ms": 7264.208793640137, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:09:53] (step=0000547) Train Loss: 0.3316, Train Steps/Sec: 0.13, Epoch: 0.010629615235134084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 548, "loss": 0.2693786919116974, "memory_gb": 7.721559524536133, "step_time_ms": 7551.916837692261, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:01] (step=0000548) Train Loss: 0.2769, Train Steps/Sec: 0.12, Epoch: 0.010649047804119705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 549, "loss": 0.3166493773460388, "memory_gb": 7.721559524536133, "step_time_ms": 7538.6316776275635, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:09] (step=0000549) Train Loss: 0.3139, Train Steps/Sec: 0.12, Epoch: 0.010668480373105324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 550, "loss": 0.21664535999298096, "memory_gb": 7.721559524536133, "step_time_ms": 7540.308237075806, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:17] (step=0000550) Train Loss: 0.2414, Train Steps/Sec: 0.12, Epoch: 0.010687912942090944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 551, "loss": 0.33136701583862305, "memory_gb": 7.721559524536133, "step_time_ms": 7629.84561920166, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:25] (step=0000551) Train Loss: 0.2533, Train Steps/Sec: 0.12, Epoch: 0.010707345511076565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 552, "loss": 0.2634877860546112, "memory_gb": 7.721559524536133, "step_time_ms": 7530.900239944458, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:33] (step=0000552) Train Loss: 0.2224, Train Steps/Sec: 0.12, Epoch: 0.010726778080062184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 553, "loss": 0.2124800682067871, "memory_gb": 7.721559524536133, "step_time_ms": 7359.950542449951, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:41] (step=0000553) Train Loss: 0.2290, Train Steps/Sec: 0.13, Epoch: 0.010746210649047804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 554, "loss": 0.28623664379119873, "memory_gb": 7.721559524536133, "step_time_ms": 7575.710296630859, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:49] (step=0000554) Train Loss: 0.2967, Train Steps/Sec: 0.12, Epoch: 0.010765643218033425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:10:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 555, "loss": 0.3286606967449188, "memory_gb": 7.721559524536133, "step_time_ms": 4945.121288299561, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:10:55] (step=0000555) Train Loss: 0.2961, Train Steps/Sec: 0.18, Epoch: 0.010785075787019044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 556, "loss": 0.29150211811065674, "memory_gb": 7.721559524536133, "step_time_ms": 7558.128118515015, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:03] (step=0000556) Train Loss: 0.2632, Train Steps/Sec: 0.12, Epoch: 0.010804508356004664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 557, "loss": 0.2094583958387375, "memory_gb": 7.721559524536133, "step_time_ms": 7532.344579696655, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:11] (step=0000557) Train Loss: 0.2401, Train Steps/Sec: 0.12, Epoch: 0.010823940924990283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 558, "loss": 0.14264057576656342, "memory_gb": 7.721559524536133, "step_time_ms": 7471.725702285767, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:19] (step=0000558) Train Loss: 0.2091, Train Steps/Sec: 0.12, Epoch: 0.010843373493975903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 559, "loss": 0.25914984941482544, "memory_gb": 7.721559524536133, "step_time_ms": 7530.3590297698975, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:27] (step=0000559) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.010862806062961524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 560, "loss": 0.28632789850234985, "memory_gb": 7.721559524536133, "step_time_ms": 7451.075553894043, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:35] (step=0000560) Train Loss: 0.3419, Train Steps/Sec: 0.12, Epoch: 0.010882238631947143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 561, "loss": 0.2307073324918747, "memory_gb": 7.721559524536133, "step_time_ms": 7448.7152099609375, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:43] (step=0000561) Train Loss: 0.2435, Train Steps/Sec: 0.13, Epoch: 0.010901671200932763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 562, "loss": 0.3005523681640625, "memory_gb": 7.721559524536133, "step_time_ms": 7525.291442871094, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:51] (step=0000562) Train Loss: 0.2740, Train Steps/Sec: 0.12, Epoch: 0.010921103769918384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:11:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 563, "loss": 0.23234903812408447, "memory_gb": 7.721559524536133, "step_time_ms": 7449.309349060059, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:11:59] (step=0000563) Train Loss: 0.2625, Train Steps/Sec: 0.12, Epoch: 0.010940536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:12:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 564, "loss": 0.19746355712413788, "memory_gb": 7.721559524536133, "step_time_ms": 7487.721681594849, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:12:07] (step=0000564) Train Loss: 0.2098, Train Steps/Sec: 0.12, Epoch: 0.010959968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:12:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 565, "loss": 0.2468872368335724, "memory_gb": 7.721559524536133, "step_time_ms": 7467.9155349731445, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:12:15] (step=0000565) Train Loss: 0.2419, Train Steps/Sec: 0.12, Epoch: 0.010979401476875242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:12:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 566, "loss": 0.2766842544078827, "memory_gb": 7.721559524536133, "step_time_ms": 7460.731029510498, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:12:23] (step=0000566) Train Loss: 0.2760, Train Steps/Sec: 0.13, Epoch: 0.010998834045860863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:12:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 567, "loss": 0.14394308626651764, "memory_gb": 7.721559524536133, "step_time_ms": 7470.553874969482, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:12:32] (step=0000567) Train Loss: 0.1631, Train Steps/Sec: 0.12, Epoch: 0.011018266614846483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:12:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 568, "loss": 0.23311838507652283, "memory_gb": 7.721559524536133, "step_time_ms": 7522.326707839966, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:12:40] (step=0000568) Train Loss: 0.1932, Train Steps/Sec: 0.12, Epoch: 0.011037699183832102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:12:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 569, "loss": 0.29803597927093506, "memory_gb": 7.721559524536133, "step_time_ms": 7442.988395690918, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:12:48] (step=0000569) Train Loss: 0.2720, Train Steps/Sec: 0.12, Epoch: 0.011057131752817723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:12:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 570, "loss": 0.32372766733169556, "memory_gb": 7.721559524536133, "step_time_ms": 7464.825868606567, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:12:56] (step=0000570) Train Loss: 0.2302, Train Steps/Sec: 0.12, Epoch: 0.011076564321803343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:13:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 571, "loss": 0.3289707899093628, "memory_gb": 7.721559524536133, "step_time_ms": 7510.894060134888, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:13:04] (step=0000571) Train Loss: 0.2731, Train Steps/Sec: 0.12, Epoch: 0.011095996890788962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:13:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 572, "loss": 0.3025363087654114, "memory_gb": 7.721559524536133, "step_time_ms": 7437.715291976929, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:13:12] (step=0000572) Train Loss: 0.2927, Train Steps/Sec: 0.12, Epoch: 0.011115429459774583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 573, "loss": 0.18498972058296204, "memory_gb": 7.721559524536133, "step_time_ms": 7451.286792755127, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:13:20] (step=0000573) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.011134862028760201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:13:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 574, "loss": 0.1971016824245453, "memory_gb": 7.721559524536133, "step_time_ms": 7567.528963088989, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:13:28] (step=0000574) Train Loss: 0.2064, Train Steps/Sec: 0.12, Epoch: 0.011154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:13:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 575, "loss": 0.19475814700126648, "memory_gb": 7.721559524536133, "step_time_ms": 7415.543794631958, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:13:36] (step=0000575) Train Loss: 0.2021, Train Steps/Sec: 0.13, Epoch: 0.011173727166731443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:13:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 576, "loss": 0.2693742513656616, "memory_gb": 7.721559524536133, "step_time_ms": 7441.401720046997, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:13:44] (step=0000576) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.011193159735717061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 577, "loss": 0.2839500308036804, "memory_gb": 7.721559524536133, "step_time_ms": 7394.885063171387, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:13:52] (step=0000577) Train Loss: 0.2449, Train Steps/Sec: 0.12, Epoch: 0.011212592304702682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 578, "loss": 0.26650968194007874, "memory_gb": 7.715639114379883, "step_time_ms": 7357.916831970215, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:00] (step=0000578) Train Loss: 0.2652, Train Steps/Sec: 0.12, Epoch: 0.011232024873688302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 579, "loss": 0.27224817872047424, "memory_gb": 7.721559524536133, "step_time_ms": 7365.885257720947, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:08] (step=0000579) Train Loss: 0.2088, Train Steps/Sec: 0.13, Epoch: 0.011251457442673921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 580, "loss": 0.18088799715042114, "memory_gb": 7.721559524536133, "step_time_ms": 7309.8626136779785, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:16] (step=0000580) Train Loss: 0.1657, Train Steps/Sec: 0.12, Epoch: 0.011270890011659542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 581, "loss": 0.17801257967948914, "memory_gb": 7.721559524536133, "step_time_ms": 7446.78807258606, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:24] (step=0000581) Train Loss: 0.1827, Train Steps/Sec: 0.12, Epoch: 0.01129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 582, "loss": 0.22680938243865967, "memory_gb": 7.721559524536133, "step_time_ms": 7235.976457595825, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:32] (step=0000582) Train Loss: 0.2798, Train Steps/Sec: 0.13, Epoch: 0.011309755149630781, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 583, "loss": 0.3198356032371521, "memory_gb": 7.721559524536133, "step_time_ms": 7511.379241943359, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:40] (step=0000583) Train Loss: 0.2676, Train Steps/Sec: 0.12, Epoch: 0.011329187718616402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 584, "loss": 0.24271047115325928, "memory_gb": 7.721559524536133, "step_time_ms": 4795.687198638916, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:46] (step=0000584) Train Loss: 0.2489, Train Steps/Sec: 0.19, Epoch: 0.01134862028760202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:14:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 585, "loss": 0.3437623381614685, "memory_gb": 7.721559524536133, "step_time_ms": 7503.246307373047, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:14:54] (step=0000585) Train Loss: 0.3302, Train Steps/Sec: 0.12, Epoch: 0.011368052856587641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 586, "loss": 0.2669256031513214, "memory_gb": 7.721559524536133, "step_time_ms": 7472.092866897583, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:02] (step=0000586) Train Loss: 0.1999, Train Steps/Sec: 0.12, Epoch: 0.01138748542557326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 587, "loss": 0.22154583036899567, "memory_gb": 7.721559524536133, "step_time_ms": 7509.730577468872, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:10] (step=0000587) Train Loss: 0.2219, Train Steps/Sec: 0.12, Epoch: 0.01140691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 588, "loss": 0.24299341440200806, "memory_gb": 7.721559524536133, "step_time_ms": 7582.8211307525635, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:18] (step=0000588) Train Loss: 0.2565, Train Steps/Sec: 0.12, Epoch: 0.011426350563544501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 589, "loss": 0.40493667125701904, "memory_gb": 7.721559524536133, "step_time_ms": 7482.426404953003, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:26] (step=0000589) Train Loss: 0.3297, Train Steps/Sec: 0.12, Epoch: 0.01144578313253012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 590, "loss": 0.39285778999328613, "memory_gb": 7.721559524536133, "step_time_ms": 7527.440547943115, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:34] (step=0000590) Train Loss: 0.3129, Train Steps/Sec: 0.13, Epoch: 0.01146521570151574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 591, "loss": 0.23017622530460358, "memory_gb": 7.721559524536133, "step_time_ms": 7597.407102584839, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:42] (step=0000591) Train Loss: 0.2244, Train Steps/Sec: 0.12, Epoch: 0.011484648270501361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 592, "loss": 0.2927144467830658, "memory_gb": 7.721559524536133, "step_time_ms": 7489.820241928101, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:50] (step=0000592) Train Loss: 0.3504, Train Steps/Sec: 0.12, Epoch: 0.01150408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:15:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 593, "loss": 0.2818336486816406, "memory_gb": 7.721559524536133, "step_time_ms": 7558.26735496521, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:15:58] (step=0000593) Train Loss: 0.2823, Train Steps/Sec: 0.12, Epoch: 0.0115235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:16:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 594, "loss": 0.20574520528316498, "memory_gb": 7.721559524536133, "step_time_ms": 7616.635799407959, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:16:06] (step=0000594) Train Loss: 0.2275, Train Steps/Sec: 0.12, Epoch: 0.01154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:16:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 595, "loss": 0.38026994466781616, "memory_gb": 7.721559524536133, "step_time_ms": 7471.513748168945, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:16:14] (step=0000595) Train Loss: 0.2994, Train Steps/Sec: 0.12, Epoch: 0.01156237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:16:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 596, "loss": 0.23836514353752136, "memory_gb": 7.721559524536133, "step_time_ms": 7524.45650100708, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:16:22] (step=0000596) Train Loss: 0.1897, Train Steps/Sec: 0.12, Epoch: 0.01158181111542946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 597, "loss": 0.20768442749977112, "memory_gb": 7.721559524536133, "step_time_ms": 7595.6151485443115, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:16:30] (step=0000597) Train Loss: 0.2284, Train Steps/Sec: 0.12, Epoch: 0.011601243684415079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:16:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 598, "loss": 0.32994410395622253, "memory_gb": 7.721559524536133, "step_time_ms": 7498.294115066528, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:16:38] (step=0000598) Train Loss: 0.2926, Train Steps/Sec: 0.12, Epoch: 0.0116206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:16:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 599, "loss": 0.21876633167266846, "memory_gb": 7.721559524536133, "step_time_ms": 7618.462085723877, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:16:46] (step=0000599) Train Loss: 0.2756, Train Steps/Sec: 0.12, Epoch: 0.01164010882238632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:16:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 600, "loss": 0.20108233392238617, "memory_gb": 7.721559524536133, "step_time_ms": 7611.760854721069, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:16:55] (step=0000600) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.011659541391371939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:17:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 601, "loss": 0.20547287166118622, "memory_gb": 7.721559524536133, "step_time_ms": 7574.365615844727, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:17:03] (step=0000601) Train Loss: 0.2572, Train Steps/Sec: 0.12, Epoch: 0.01167897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:17:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 602, "loss": 0.23558072745800018, "memory_gb": 7.721559524536133, "step_time_ms": 7587.944507598877, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:17:11] (step=0000602) Train Loss: 0.2668, Train
Steps/Sec: 0.12, Epoch: 0.011698406529343178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:17:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 603, "loss": 0.1760469377040863, "memory_gb": 7.721559524536133, "step_time_ms": 7545.023679733276, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:17:19] (step=0000603) Train Loss: 0.2374, Train Steps/Sec: 0.12, Epoch: 0.011717839098328799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 604, "loss": 0.26625189185142517, "memory_gb": 7.721559524536133, "step_time_ms": 7470.164775848389, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:17:27] (step=0000604) Train Loss: 0.2441, Train Steps/Sec: 0.12, Epoch: 0.01173727166731442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:17:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 605, "loss": 0.26721352338790894, "memory_gb": 7.721559524536133, "step_time_ms": 7223.328113555908, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:17:35] (step=0000605) Train Loss: 0.2463, Train Steps/Sec: 0.12, Epoch: 0.011756704236300038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:17:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 606, "loss": 0.2572786808013916, "memory_gb": 7.721559524536133, "step_time_ms": 7501.2476444244385, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:17:43] (step=0000606) Train Loss: 0.3119, Train Steps/Sec: 0.12, Epoch: 0.011776136805285659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:17:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 607, "loss": 0.27940377593040466, "memory_gb": 7.721559524536133, "step_time_ms": 7440.342426300049, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:17:51] (step=0000607) Train Loss: 0.2880, Train Steps/Sec: 0.12, Epoch: 0.01179556937427128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:17:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 608, "loss": 
0.24216143786907196, "memory_gb": 7.721559524536133, "step_time_ms": 7568.657159805298, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:17:59] (step=0000608) Train Loss: 0.2693, Train Steps/Sec: 0.12, Epoch: 0.011815001943256898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 609, "loss": 0.24985437095165253, "memory_gb": 7.721559524536133, "step_time_ms": 7585.378170013428, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:18:07] (step=0000609) Train Loss: 0.2599, Train Steps/Sec: 0.12, Epoch: 0.011834434512242519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:18:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 610, "loss": 0.25551682710647583, "memory_gb": 7.721559524536133, "step_time_ms": 7267.101287841797, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:18:15] (step=0000610) Train Loss: 0.2620, Train Steps/Sec: 0.13, Epoch: 0.011853867081228138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:18:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 611, "loss": 0.26426148414611816, "memory_gb": 7.721559524536133, "step_time_ms": 7400.982141494751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:18:23] (step=0000611) Train Loss: 0.2383, Train Steps/Sec: 0.13, Epoch: 0.011873299650213758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:18:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 612, "loss": 0.23793576657772064, "memory_gb": 7.721559524536133, "step_time_ms": 7506.615161895752, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:18:31] (step=0000612) Train Loss: 0.2213, Train Steps/Sec: 0.12, Epoch: 0.011892732219199379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:18:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 613, "loss": 0.28415030241012573, "memory_gb": 7.721559524536133, "step_time_ms": 5175.498485565186, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:18:36] (step=0000613) 
Train Loss: 0.1871, Train Steps/Sec: 0.18, Epoch: 0.011912164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:18:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 614, "loss": 0.192297101020813, "memory_gb": 7.721559524536133, "step_time_ms": 7582.972049713135, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:18:45] (step=0000614) Train Loss: 0.2164, Train Steps/Sec: 0.12, Epoch: 0.011931597357170618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:18:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 615, "loss": 0.2690655589103699, "memory_gb": 7.715639114379883, "step_time_ms": 7452.502489089966, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:18:53] (step=0000615) Train Loss: 0.2475, Train Steps/Sec: 0.12, Epoch: 0.011951029926156239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 616, "loss": 0.1620776653289795, "memory_gb": 7.721559524536133, "step_time_ms": 7513.484954833984, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:01] (step=0000616) Train Loss: 0.1813, Train Steps/Sec: 0.12, Epoch: 0.011970462495141857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 617, "loss": 0.23714670538902283, "memory_gb": 7.721559524536133, "step_time_ms": 7575.403928756714, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:09] (step=0000617) Train Loss: 0.2357, Train Steps/Sec: 0.12, Epoch: 0.011989895064127478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 618, "loss": 0.17908638715744019, "memory_gb": 7.721559524536133, "step_time_ms": 7490.589618682861, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:17] (step=0000618) Train Loss: 0.1645, Train Steps/Sec: 0.12, Epoch: 0.012009327633113097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:25] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 619, "loss": 0.20198318362236023, "memory_gb": 7.721559524536133, "step_time_ms": 7538.512468338013, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:25] (step=0000619) Train Loss: 0.2183, Train Steps/Sec: 0.12, Epoch: 0.012028760202098717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 620, "loss": 0.2742438316345215, "memory_gb": 7.721559524536133, "step_time_ms": 7555.826902389526, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:33] (step=0000620) Train Loss: 0.2957, Train Steps/Sec: 0.12, Epoch: 0.012048192771084338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 621, "loss": 0.17924806475639343, "memory_gb": 7.721559524536133, "step_time_ms": 7460.54744720459, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:41] (step=0000621) Train Loss: 0.2252, Train Steps/Sec: 0.13, Epoch: 0.012067625340069957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 622, "loss": 0.22985966503620148, "memory_gb": 7.721559524536133, "step_time_ms": 7431.350469589233, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:49] (step=0000622) Train Loss: 0.2283, Train Steps/Sec: 0.13, Epoch: 0.012087057909055577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:19:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 623, "loss": 0.1658962517976761, "memory_gb": 7.721559524536133, "step_time_ms": 7469.202995300293, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:19:57] (step=0000623) Train Loss: 0.2175, Train Steps/Sec: 0.13, Epoch: 0.012106490478041198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:20:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 624, "loss": 0.30723655223846436, "memory_gb": 7.721559524536133, "step_time_ms": 7476.736545562744, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:20:05] 
(step=0000624) Train Loss: 0.2940, Train Steps/Sec: 0.13, Epoch: 0.012125923047026817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:20:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 625, "loss": 0.1752922683954239, "memory_gb": 7.721559524536133, "step_time_ms": 7463.685750961304, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:20:13] (step=0000625) Train Loss: 0.2575, Train Steps/Sec: 0.13, Epoch: 0.012145355616012437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:20:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 626, "loss": 0.2614365220069885, "memory_gb": 7.721559524536133, "step_time_ms": 7497.145414352417, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:20:21] (step=0000626) Train Loss: 0.2504, Train Steps/Sec: 0.12, Epoch: 0.012164788184998056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:20:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 627, "loss": 0.20070859789848328, "memory_gb": 7.721559524536133, "step_time_ms": 7412.0941162109375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:20:29] (step=0000627) Train Loss: 0.2397, Train Steps/Sec: 0.13, Epoch: 0.012184220753983677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:20:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 628, "loss": 0.30998024344444275, "memory_gb": 7.721559524536133, "step_time_ms": 7467.576265335083, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:20:37] (step=0000628) Train Loss: 0.2965, Train Steps/Sec: 0.12, Epoch: 0.012203653322969297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 629, "loss": 0.26414790749549866, "memory_gb": 7.721559524536133, "step_time_ms": 7654.633045196533, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:20:45] (step=0000629) Train Loss: 0.2148, Train Steps/Sec: 0.12, Epoch: 0.012223085891954916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:20:53] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 630, "loss": 0.274016410112381, "memory_gb": 7.721559524536133, "step_time_ms": 7419.300556182861, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:20:53] (step=0000630) Train Loss: 0.2790, Train Steps/Sec: 0.12, Epoch: 0.012242518460940537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 631, "loss": 0.26592445373535156, "memory_gb": 7.721559524536133, "step_time_ms": 7465.867280960083, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:21:01] (step=0000631) Train Loss: 0.2617, Train Steps/Sec: 0.12, Epoch: 0.012261951029926155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 632, "loss": 0.27956485748291016, "memory_gb": 7.721559524536133, "step_time_ms": 7556.498765945435, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:21:09] (step=0000632) Train Loss: 0.2538, Train Steps/Sec: 0.12, Epoch: 0.012281383598911776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 633, "loss": 0.16760069131851196, "memory_gb": 7.721559524536133, "step_time_ms": 7457.424640655518, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:21:17] (step=0000633) Train Loss: 0.2474, Train Steps/Sec: 0.13, Epoch: 0.012300816167897397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 634, "loss": 0.2022339403629303, "memory_gb": 7.721559524536133, "step_time_ms": 7449.747323989868, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:21:25] (step=0000634) Train Loss: 0.2527, Train Steps/Sec: 0.13, Epoch: 0.012320248736883015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 635, "loss": 0.1755581647157669, "memory_gb": 7.721559524536133, "step_time_ms": 7509.4616413116455, "trainable_params": 4718592, "method": "lora"} 
[2025-07-28 19:21:33] (step=0000635) Train Loss: 0.1691, Train Steps/Sec: 0.12, Epoch: 0.012339681305868636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 636, "loss": 0.16505227982997894, "memory_gb": 7.721559524536133, "step_time_ms": 7434.647798538208, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:21:41] (step=0000636) Train Loss: 0.2041, Train Steps/Sec: 0.12, Epoch: 0.012359113874854256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 637, "loss": 0.3034210503101349, "memory_gb": 7.721559524536133, "step_time_ms": 7451.124906539917, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:21:49] (step=0000637) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.012378546443839875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:21:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 638, "loss": 0.3184433877468109, "memory_gb": 7.721559524536133, "step_time_ms": 7501.463174819946, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:21:57] (step=0000638) Train Loss: 0.2826, Train Steps/Sec: 0.12, Epoch: 0.012397979012825496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 639, "loss": 0.2405829280614853, "memory_gb": 7.721559524536133, "step_time_ms": 7401.937007904053, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:22:05] (step=0000639) Train Loss: 0.1971, Train Steps/Sec: 0.12, Epoch: 0.012417411581811115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 640, "loss": 0.1897619068622589, "memory_gb": 7.721559524536133, "step_time_ms": 7339.665174484253, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:22:13] (step=0000640) Train Loss: 0.2574, Train Steps/Sec: 0.13, Epoch: 0.012436844150796735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:21] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 641, "loss": 0.28162112832069397, "memory_gb": 7.721559524536133, "step_time_ms": 7498.290777206421, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:22:21] (step=0000641) Train Loss: 0.2890, Train Steps/Sec: 0.12, Epoch: 0.012456276719782356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 642, "loss": 0.35207968950271606, "memory_gb": 7.721559524536133, "step_time_ms": 5212.005376815796, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:22:27] (step=0000642) Train Loss: 0.3075, Train Steps/Sec: 0.18, Epoch: 0.012475709288767975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 643, "loss": 0.27148646116256714, "memory_gb": 7.721559524536133, "step_time_ms": 7491.528749465942, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:22:35] (step=0000643) Train Loss: 0.2332, Train Steps/Sec: 0.12, Epoch: 0.012495141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 644, "loss": 0.3020925223827362, "memory_gb": 7.721559524536133, "step_time_ms": 7411.1316204071045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:22:43] (step=0000644) Train Loss: 0.3005, Train Steps/Sec: 0.12, Epoch: 0.012514574426739216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 645, "loss": 0.19335712492465973, "memory_gb": 7.721559524536133, "step_time_ms": 7433.536052703857, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:22:51] (step=0000645) Train Loss: 0.1951, Train Steps/Sec: 0.12, Epoch: 0.012534006995724834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:22:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 646, "loss": 0.22227540612220764, "memory_gb": 7.721559524536133, "step_time_ms": 7457.893371582031, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 19:22:59] (step=0000646) Train Loss: 0.2404, Train Steps/Sec: 0.13, Epoch: 0.012553439564710455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:23:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 647, "loss": 0.17424212396144867, "memory_gb": 7.721559524536133, "step_time_ms": 7388.81516456604, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:23:07] (step=0000647) Train Loss: 0.1599, Train Steps/Sec: 0.12, Epoch: 0.012572872133696074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:23:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 648, "loss": 0.12674963474273682, "memory_gb": 7.721559524536133, "step_time_ms": 7432.482004165649, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:23:15] (step=0000648) Train Loss: 0.1906, Train Steps/Sec: 0.12, Epoch: 0.012592304702681694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:23:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 649, "loss": 0.2352287620306015, "memory_gb": 7.721559524536133, "step_time_ms": 7516.526222229004, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:23:23] (step=0000649) Train Loss: 0.3055, Train Steps/Sec: 0.12, Epoch: 0.012611737271667315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:23:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 650, "loss": 0.30078768730163574, "memory_gb": 7.721559524536133, "step_time_ms": 7435.635566711426, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:23:32] (step=0000650) Train Loss: 0.2670, Train Steps/Sec: 0.12, Epoch: 0.012631169840652934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:23:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 651, "loss": 0.31084585189819336, "memory_gb": 7.721559524536133, "step_time_ms": 7459.896802902222, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:23:40] (step=0000651) Train Loss: 0.2949, Train Steps/Sec: 0.12, Epoch: 0.012650602409638554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 19:23:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 652, "loss": 0.18348218500614166, "memory_gb": 7.721559524536133, "step_time_ms": 7518.601655960083, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:23:48] (step=0000652) Train Loss: 0.1902, Train Steps/Sec: 0.12, Epoch: 0.012670034978624175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:23:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 653, "loss": 0.17972147464752197, "memory_gb": 7.721559524536133, "step_time_ms": 7434.377193450928, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:23:56] (step=0000653) Train Loss: 0.2000, Train Steps/Sec: 0.12, Epoch: 0.012689467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:24:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 654, "loss": 0.21409901976585388, "memory_gb": 7.721559524536133, "step_time_ms": 7494.730710983276, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:24:04] (step=0000654) Train Loss: 0.2693, Train Steps/Sec: 0.12, Epoch: 0.012708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:24:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 655, "loss": 0.2763826251029968, "memory_gb": 7.721559524536133, "step_time_ms": 7495.770454406738, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:24:12] (step=0000655) Train Loss: 0.2753, Train Steps/Sec: 0.12, Epoch: 0.012728332685581033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:24:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 656, "loss": 0.4059538245201111, "memory_gb": 7.721559524536133, "step_time_ms": 7445.044994354248, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:24:20] (step=0000656) Train Loss: 0.2948, Train Steps/Sec: 0.13, Epoch: 0.012747765254566654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:24:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 657, "loss": 0.23299174010753632, "memory_gb": 7.721559524536133, "step_time_ms": 7496.549844741821, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:24:28] (step=0000657) Train Loss: 0.2650, Train Steps/Sec: 0.13, Epoch: 0.012767197823552274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:24:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 658, "loss": 0.15623052418231964, "memory_gb": 7.721559524536133, "step_time_ms": 7498.943328857422, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:24:36] (step=0000658) Train Loss: 0.1736, Train Steps/Sec: 0.12, Epoch: 0.012786630392537893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:24:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 659, "loss": 0.17390121519565582, "memory_gb": 7.721559524536133, "step_time_ms": 7456.978321075439, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:24:44] (step=0000659) Train Loss: 0.2307, Train Steps/Sec: 0.12, Epoch: 0.012806062961523514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:24:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 660, "loss": 0.35802674293518066, "memory_gb": 7.721559524536133, "step_time_ms": 7483.957290649414, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:24:52] (step=0000660) Train Loss: 0.2521, Train Steps/Sec: 0.12, Epoch: 0.012825495530509134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 661, "loss": 0.16499583423137665, "memory_gb": 7.721559524536133, "step_time_ms": 7574.996471405029, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:00] (step=0000661) Train Loss: 0.1772, Train Steps/Sec: 0.12, Epoch: 0.012844928099494753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 662, "loss": 0.26756808161735535, "memory_gb": 7.721559524536133, "step_time_ms": 7498.048543930054, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:08] (step=0000662) Train Loss: 0.3119, Train Steps/Sec: 0.13, Epoch: 0.012864360668480374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 663, "loss": 0.23053069412708282, "memory_gb": 7.721559524536133, "step_time_ms": 7488.938093185425, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:16] (step=0000663) Train Loss: 0.2406, Train Steps/Sec: 0.12, Epoch: 0.012883793237465992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 664, "loss": 0.2533315420150757, "memory_gb": 7.721559524536133, "step_time_ms": 7578.806400299072, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:24] (step=0000664) Train Loss: 0.2212, Train Steps/Sec: 0.12, Epoch: 0.012903225806451613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 665, "loss": 0.3050347566604614, "memory_gb": 7.721559524536133, "step_time_ms": 7482.200384140015, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:32] (step=0000665) Train Loss: 0.2600, Train Steps/Sec: 0.12, Epoch: 0.012922658375437233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 666, "loss": 0.1616535484790802, "memory_gb": 7.721559524536133, "step_time_ms": 7519.088268280029, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:40] (step=0000666) Train Loss: 0.2377, Train Steps/Sec: 0.12, Epoch: 0.012942090944422852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 667, "loss": 0.15602600574493408, "memory_gb": 7.721559524536133, "step_time_ms": 7593.90664100647, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:48] (step=0000667) Train Loss: 0.2097, Train Steps/Sec: 0.12, Epoch: 0.012961523513408473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:25:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 668, "loss": 0.20920586585998535, "memory_gb": 7.721559524536133, "step_time_ms": 7528.325080871582, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:25:56] (step=0000668) Train Loss: 0.1917, Train Steps/Sec: 0.13, Epoch: 0.012980956082394092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 669, "loss": 0.20262250304222107, "memory_gb": 7.721559524536133, "step_time_ms": 7543.763637542725, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:04] (step=0000669) Train Loss: 0.1917, Train Steps/Sec: 0.13, Epoch: 0.013000388651379712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 670, "loss": 0.2708195447921753, "memory_gb": 7.721559524536133, "step_time_ms": 7642.663240432739, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:12] (step=0000670) Train Loss: 0.2805, Train Steps/Sec: 0.12, Epoch: 0.013019821220365333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 671, "loss": 0.22593075037002563, "memory_gb": 7.721559524536133, "step_time_ms": 4971.318960189819, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:18] (step=0000671) Train Loss: 0.2380, Train Steps/Sec: 0.18, Epoch: 0.013039253789350952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 672, "loss": 0.313103049993515, "memory_gb": 7.721559524536133, "step_time_ms": 7552.381277084351, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:26] (step=0000672) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.013058686358336572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 673, "loss": 0.31096571683883667, "memory_gb": 7.721559524536133, "step_time_ms": 7435.044765472412, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:34] (step=0000673) Train Loss: 0.2402, Train Steps/Sec: 0.13, Epoch: 0.013078118927322193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 674, "loss": 0.26463931798934937, "memory_gb": 7.721559524536133, "step_time_ms": 7527.182102203369, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:42] (step=0000674) Train Loss: 0.2143, Train Steps/Sec: 0.12, Epoch: 0.013097551496307811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 675, "loss": 0.17910778522491455, "memory_gb": 7.721559524536133, "step_time_ms": 7596.171140670776, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:50] (step=0000675) Train Loss: 0.2034, Train Steps/Sec: 0.12, Epoch: 0.013116984065293432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:26:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 676, "loss": 0.27727624773979187, "memory_gb": 7.721559524536133, "step_time_ms": 7496.321439743042, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:26:58] (step=0000676) Train Loss: 0.2798, Train Steps/Sec: 0.12, Epoch: 0.013136416634279051, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 677, "loss": 0.19596624374389648, "memory_gb": 7.721559524536133, "step_time_ms": 7602.35333442688, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:27:06] (step=0000677) Train Loss: 0.1859, Train Steps/Sec: 0.12, Epoch: 0.013155849203264671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:27:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 678, "loss": 0.23539721965789795, "memory_gb": 7.721559524536133, "step_time_ms": 7621.156454086304, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:27:14] (step=0000678) Train Loss: 0.2108, Train Steps/Sec: 0.12, Epoch: 0.013175281772250292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:27:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 679, "loss": 0.21402007341384888, "memory_gb": 7.721559524536133, "step_time_ms": 7556.387901306152, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:27:22] (step=0000679) Train Loss: 0.2893, Train Steps/Sec: 0.12, Epoch: 0.01319471434123591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:27:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 680, "loss": 0.2653496265411377, "memory_gb": 7.721559524536133, "step_time_ms": 7561.692953109741, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:27:30] (step=0000680) Train Loss: 0.2234, Train Steps/Sec: 0.12, Epoch: 0.013214146910221531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:27:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 681, "loss": 0.2343336045742035, "memory_gb": 7.721559524536133, "step_time_ms": 7590.924263000488, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:27:38] (step=0000681) Train Loss: 0.2747, Train Steps/Sec: 0.12, Epoch: 0.013233579479207152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 682, "loss": 0.19972999393939972, "memory_gb": 7.721559524536133, "step_time_ms": 7294.121503829956, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:27:46] (step=0000682) Train Loss: 0.2409, Train Steps/Sec: 0.13, Epoch: 0.01325301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:27:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 683, "loss": 0.13632166385650635, "memory_gb": 7.721559524536133, "step_time_ms": 7556.31947517395, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:27:54] (step=0000683) Train Loss: 0.1506, Train Steps/Sec: 0.12, Epoch: 0.013272444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 684, "loss": 0.3535691499710083, "memory_gb": 7.721559524536133, "step_time_ms": 7628.834962844849, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:02] (step=0000684) Train Loss: 0.3476, Train Steps/Sec: 0.12, Epoch: 0.01329187718616401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 685, "loss": 0.2520996332168579, "memory_gb": 7.721559524536133, "step_time_ms": 7513.653516769409, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:10] (step=0000685) Train Loss: 0.2157, Train Steps/Sec: 0.12, Epoch: 0.01331130975514963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 686, "loss": 0.29325640201568604, "memory_gb": 7.721559524536133, "step_time_ms": 7574.199199676514, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:18] (step=0000686) Train Loss: 0.2934, Train Steps/Sec: 0.12, Epoch: 0.013330742324135251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 687, "loss": 0.13130950927734375, "memory_gb": 7.721559524536133, "step_time_ms": 7581.141710281372, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:26] (step=0000687) Train Loss: 0.2237, Train Steps/Sec: 0.12, Epoch: 0.01335017489312087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 688, "loss": 0.2508887052536011, "memory_gb": 7.721559524536133, "step_time_ms": 7507.106065750122, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:34] (step=0000688) Train Loss: 0.2475, Train Steps/Sec: 0.12, Epoch: 0.01336960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 689, "loss": 0.3188092112541199, "memory_gb": 7.721559524536133, "step_time_ms": 7504.325866699219, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:42] (step=0000689) Train Loss: 0.2859, Train Steps/Sec: 0.12, Epoch: 0.013389040031092111, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 690, "loss": 0.2432508021593094, "memory_gb": 7.721559524536133, "step_time_ms": 7513.795852661133, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:50] (step=0000690) Train Loss: 0.2270, Train Steps/Sec: 0.12, Epoch: 0.01340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:28:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 691, "loss": 0.3474041521549225, "memory_gb": 7.721559524536133, "step_time_ms": 7454.495429992676, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:28:58] (step=0000691) Train Loss: 0.2934, Train Steps/Sec: 0.12, Epoch: 0.01342790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:29:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 692, "loss": 0.17474579811096191, "memory_gb": 7.721559524536133, "step_time_ms": 7516.7646408081055, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:29:06] (step=0000692) Train Loss: 0.1645, Train Steps/Sec: 0.12, Epoch: 0.01344733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:29:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 693, "loss": 0.2354878932237625, "memory_gb": 7.721559524536133, "step_time_ms": 7543.6131954193115, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:29:15] (step=0000693) Train Loss: 0.2436, Train Steps/Sec: 0.12, Epoch: 0.01346677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:29:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 694, "loss": 0.11741792410612106, "memory_gb": 7.721559524536133, "step_time_ms": 7399.208307266235, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:29:23] (step=0000694) Train Loss: 0.1978, Train Steps/Sec: 0.12, Epoch: 0.01348620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:29:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 695, "loss": 0.30573907494544983, "memory_gb": 7.721559524536133, "step_time_ms": 7524.23095703125, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:29:31] (step=0000695) Train Loss: 0.2486, Train Steps/Sec: 0.12, 
Epoch: 0.01350563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:29:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 696, "loss": 0.28634995222091675, "memory_gb": 7.721559524536133, "step_time_ms": 7519.066095352173, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:29:39] (step=0000696) Train Loss: 0.2504, Train Steps/Sec: 0.12, Epoch: 0.01352506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:29:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 697, "loss": 0.21270830929279327, "memory_gb": 7.721559524536133, "step_time_ms": 7475.962162017822, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:29:47] (step=0000697) Train Loss: 0.2737, Train Steps/Sec: 0.12, Epoch: 0.01354450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:29:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 698, "loss": 0.20194220542907715, "memory_gb": 7.721559524536133, "step_time_ms": 7370.644569396973, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:29:55] (step=0000698) Train Loss: 0.1901, Train Steps/Sec: 0.13, Epoch: 0.01356393315196269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 699, "loss": 0.18265235424041748, "memory_gb": 7.721559524536133, "step_time_ms": 7534.51943397522, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:03] (step=0000699) Train Loss: 0.2021, Train Steps/Sec: 0.12, Epoch: 0.01358336572094831, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 700, "loss": 0.15779444575309753, "memory_gb": 7.721559524536133, "step_time_ms": 5488.682746887207, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:08] (step=0000700) Train Loss: 0.1995, Train Steps/Sec: 0.18, Epoch: 0.013602798289933929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 701, "loss": 0.26667970418930054, 
"memory_gb": 7.721559524536133, "step_time_ms": 7539.804458618164, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:16] (step=0000701) Train Loss: 0.2406, Train Steps/Sec: 0.12, Epoch: 0.013622230858919549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 702, "loss": 0.21793906390666962, "memory_gb": 7.721559524536133, "step_time_ms": 7486.110210418701, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:24] (step=0000702) Train Loss: 0.2458, Train Steps/Sec: 0.13, Epoch: 0.01364166342790517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 703, "loss": 0.303221732378006, "memory_gb": 7.721559524536133, "step_time_ms": 7533.951759338379, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:32] (step=0000703) Train Loss: 0.2674, Train Steps/Sec: 0.12, Epoch: 0.013661095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 704, "loss": 0.22855550050735474, "memory_gb": 7.721559524536133, "step_time_ms": 7547.595262527466, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:41] (step=0000704) Train Loss: 0.2171, Train Steps/Sec: 0.12, Epoch: 0.013680528565876409, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 705, "loss": 0.19587622582912445, "memory_gb": 7.721559524536133, "step_time_ms": 7439.18514251709, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:49] (step=0000705) Train Loss: 0.2649, Train Steps/Sec: 0.13, Epoch: 0.01369996113486203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:30:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 706, "loss": 0.24458318948745728, "memory_gb": 7.721559524536133, "step_time_ms": 7540.838241577148, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:30:57] (step=0000706) Train Loss: 0.2028, Train 
Steps/Sec: 0.12, Epoch: 0.013719393703847648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:31:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 707, "loss": 0.16994337737560272, "memory_gb": 7.721559524536133, "step_time_ms": 7550.528526306152, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:31:05] (step=0000707) Train Loss: 0.2002, Train Steps/Sec: 0.12, Epoch: 0.013738826272833269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:31:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 708, "loss": 0.25557461380958557, "memory_gb": 7.721559524536133, "step_time_ms": 7441.286087036133, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:31:13] (step=0000708) Train Loss: 0.2378, Train Steps/Sec: 0.13, Epoch: 0.013758258841818888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:31:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 709, "loss": 0.18888506293296814, "memory_gb": 7.721559524536133, "step_time_ms": 7499.58610534668, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:31:21] (step=0000709) Train Loss: 0.2140, Train Steps/Sec: 0.12, Epoch: 0.013777691410804508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:31:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 710, "loss": 0.3374127745628357, "memory_gb": 7.721559524536133, "step_time_ms": 7503.542184829712, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:31:29] (step=0000710) Train Loss: 0.2677, Train Steps/Sec: 0.12, Epoch: 0.013797123979790129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:31:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 711, "loss": 0.2751966118812561, "memory_gb": 7.721559524536133, "step_time_ms": 7402.8120040893555, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:31:37] (step=0000711) Train Loss: 0.2868, Train Steps/Sec: 0.13, Epoch: 0.013816556548775748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:31:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 712, "loss": 
0.24248665571212769, "memory_gb": 7.721559524536133, "step_time_ms": 7493.490219116211, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:31:45] (step=0000712) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.013835989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:31:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 713, "loss": 0.1538330316543579, "memory_gb": 7.721559524536133, "step_time_ms": 7467.816591262817, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:31:53] (step=0000713) Train Loss: 0.2035, Train Steps/Sec: 0.12, Epoch: 0.013855421686746987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 714, "loss": 0.28257516026496887, "memory_gb": 7.721559524536133, "step_time_ms": 7390.969276428223, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:01] (step=0000714) Train Loss: 0.2666, Train Steps/Sec: 0.12, Epoch: 0.013874854255732608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 715, "loss": 0.21485315263271332, "memory_gb": 7.721559524536133, "step_time_ms": 7487.239837646484, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:09] (step=0000715) Train Loss: 0.2407, Train Steps/Sec: 0.12, Epoch: 0.013894286824718228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 716, "loss": 0.16517937183380127, "memory_gb": 7.721559524536133, "step_time_ms": 7624.373435974121, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:17] (step=0000716) Train Loss: 0.2014, Train Steps/Sec: 0.12, Epoch: 0.013913719393703847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 717, "loss": 0.3841029703617096, "memory_gb": 7.721559524536133, "step_time_ms": 7398.444652557373, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:25] (step=0000717) 
Train Loss: 0.2862, Train Steps/Sec: 0.12, Epoch: 0.013933151962689468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 718, "loss": 0.23872461915016174, "memory_gb": 7.721559524536133, "step_time_ms": 7449.560165405273, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:33] (step=0000718) Train Loss: 0.2487, Train Steps/Sec: 0.12, Epoch: 0.013952584531675088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 719, "loss": 0.364529550075531, "memory_gb": 7.721559524536133, "step_time_ms": 7427.7637004852295, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:41] (step=0000719) Train Loss: 0.2992, Train Steps/Sec: 0.13, Epoch: 0.013972017100660707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 720, "loss": 0.2459823042154312, "memory_gb": 7.721559524536133, "step_time_ms": 7399.978160858154, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:49] (step=0000720) Train Loss: 0.2530, Train Steps/Sec: 0.13, Epoch: 0.013991449669646328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:32:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 721, "loss": 0.2603667378425598, "memory_gb": 7.721559524536133, "step_time_ms": 7419.723272323608, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:32:57] (step=0000721) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.014010882238631946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 722, "loss": 0.2650532126426697, "memory_gb": 7.721559524536133, "step_time_ms": 7479.484081268311, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:33:05] (step=0000722) Train Loss: 0.2561, Train Steps/Sec: 0.12, Epoch: 0.014030314807617567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:13] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 723, "loss": 0.24909551441669464, "memory_gb": 7.721559524536133, "step_time_ms": 7424.208402633667, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:33:13] (step=0000723) Train Loss: 0.2963, Train Steps/Sec: 0.13, Epoch: 0.014049747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 724, "loss": 0.2583200931549072, "memory_gb": 7.721559524536133, "step_time_ms": 7404.2909145355225, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:33:21] (step=0000724) Train Loss: 0.2896, Train Steps/Sec: 0.13, Epoch: 0.014069179945588806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 725, "loss": 0.2758660316467285, "memory_gb": 7.721559524536133, "step_time_ms": 7448.448181152344, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:33:29] (step=0000725) Train Loss: 0.2630, Train Steps/Sec: 0.12, Epoch: 0.014088612514574427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 726, "loss": 0.22555501759052277, "memory_gb": 7.721559524536133, "step_time_ms": 7416.465520858765, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:33:37] (step=0000726) Train Loss: 0.2259, Train Steps/Sec: 0.12, Epoch: 0.014108045083560047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 727, "loss": 0.19536253809928894, "memory_gb": 7.721559524536133, "step_time_ms": 7305.131673812866, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:33:45] (step=0000727) Train Loss: 0.1968, Train Steps/Sec: 0.13, Epoch: 0.014127477652545666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 728, "loss": 0.29227110743522644, "memory_gb": 7.721559524536133, "step_time_ms": 7491.514682769775, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
19:33:53] (step=0000728) Train Loss: 0.3141, Train Steps/Sec: 0.12, Epoch: 0.014146910221531287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:33:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 729, "loss": 0.26241832971572876, "memory_gb": 7.721559524536133, "step_time_ms": 5393.6755657196045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:33:59] (step=0000729) Train Loss: 0.3220, Train Steps/Sec: 0.18, Epoch: 0.014166342790516906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:34:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 730, "loss": 0.11522150039672852, "memory_gb": 7.721559524536133, "step_time_ms": 7462.710857391357, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:34:07] (step=0000730) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.014185775359502526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:34:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 731, "loss": 0.22455930709838867, "memory_gb": 7.721559524536133, "step_time_ms": 7408.669948577881, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:34:15] (step=0000731) Train Loss: 0.2006, Train Steps/Sec: 0.12, Epoch: 0.014205207928488147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:34:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 732, "loss": 0.12300299108028412, "memory_gb": 7.721559524536133, "step_time_ms": 7473.516225814819, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:34:23] (step=0000732) Train Loss: 0.1771, Train Steps/Sec: 0.12, Epoch: 0.014224640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:34:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 733, "loss": 0.2559505105018616, "memory_gb": 7.721559524536133, "step_time_ms": 7488.308668136597, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:34:31] (step=0000733) Train Loss: 0.2662, Train Steps/Sec: 0.13, Epoch: 0.014244073066459386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:34:39] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 734, "loss": 0.22296158969402313, "memory_gb": 7.721559524536133, "step_time_ms": 7421.601057052612, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:34:39] (step=0000734) Train Loss: 0.1924, Train Steps/Sec: 0.12, Epoch: 0.014263505635445007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:34:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 735, "loss": 0.22624728083610535, "memory_gb": 7.721559524536133, "step_time_ms": 7516.786336898804, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:34:47] (step=0000735) Train Loss: 0.2505, Train Steps/Sec: 0.12, Epoch: 0.014282938204430625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:34:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 736, "loss": 0.18396075069904327, "memory_gb": 7.721559524536133, "step_time_ms": 7492.095947265625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:34:55] (step=0000736) Train Loss: 0.2531, Train Steps/Sec: 0.13, Epoch: 0.014302370773416246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 737, "loss": 0.2566319704055786, "memory_gb": 7.721559524536133, "step_time_ms": 7445.81937789917, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:35:03] (step=0000737) Train Loss: 0.2250, Train Steps/Sec: 0.13, Epoch: 0.014321803342401865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 738, "loss": 0.2622584104537964, "memory_gb": 7.715639114379883, "step_time_ms": 7490.415334701538, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:35:11] (step=0000738) Train Loss: 0.2458, Train Steps/Sec: 0.12, Epoch: 0.014341235911387485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 739, "loss": 0.315898597240448, "memory_gb": 7.721559524536133, "step_time_ms": 7487.507343292236, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 19:35:19] (step=0000739) Train Loss: 0.2846, Train Steps/Sec: 0.13, Epoch: 0.014360668480373106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 740, "loss": 0.22016221284866333, "memory_gb": 7.721559524536133, "step_time_ms": 7439.364194869995, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:35:27] (step=0000740) Train Loss: 0.2316, Train Steps/Sec: 0.12, Epoch: 0.014380101049358725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 741, "loss": 0.16209210455417633, "memory_gb": 7.721559524536133, "step_time_ms": 7565.948009490967, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:35:35] (step=0000741) Train Loss: 0.1584, Train Steps/Sec: 0.12, Epoch: 0.014399533618344345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 742, "loss": 0.3023398518562317, "memory_gb": 7.721559524536133, "step_time_ms": 7507.3912143707275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:35:43] (step=0000742) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.014418966187329966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 743, "loss": 0.19032108783721924, "memory_gb": 7.721559524536133, "step_time_ms": 7463.167190551758, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:35:51] (step=0000743) Train Loss: 0.2538, Train Steps/Sec: 0.13, Epoch: 0.014438398756315585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:35:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 744, "loss": 0.2619675397872925, "memory_gb": 7.721559524536133, "step_time_ms": 7537.770986557007, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:35:59] (step=0000744) Train Loss: 0.2297, Train Steps/Sec: 0.12, Epoch: 0.014457831325301205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 19:36:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 745, "loss": 0.3622830808162689, "memory_gb": 7.721559524536133, "step_time_ms": 7519.932508468628, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:36:07] (step=0000745) Train Loss: 0.3070, Train Steps/Sec: 0.13, Epoch: 0.014477263894286824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:36:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 746, "loss": 0.13274449110031128, "memory_gb": 7.721559524536133, "step_time_ms": 7448.364019393921, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:36:15] (step=0000746) Train Loss: 0.2146, Train Steps/Sec: 0.13, Epoch: 0.014496696463272445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:36:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 747, "loss": 0.2840423285961151, "memory_gb": 7.721559524536133, "step_time_ms": 7574.43642616272, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:36:24] (step=0000747) Train Loss: 0.2366, Train Steps/Sec: 0.12, Epoch: 0.014516129032258065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:36:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 748, "loss": 0.3641916513442993, "memory_gb": 7.721559524536133, "step_time_ms": 7321.502685546875, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:36:31] (step=0000748) Train Loss: 0.3214, Train Steps/Sec: 0.13, Epoch: 0.014535561601243684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:36:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 749, "loss": 0.1781512200832367, "memory_gb": 7.721559524536133, "step_time_ms": 7525.150775909424, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:36:39] (step=0000749) Train Loss: 0.1609, Train Steps/Sec: 0.12, Epoch: 0.014554994170229305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 750, "loss": 0.22562681138515472, "memory_gb": 7.721559524536133, "step_time_ms": 7604.358673095703, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 19:36:48] (step=0000750) Train Loss: 0.2025, Train Steps/Sec: 0.12, Epoch: 0.014574426739214923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:36:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 751, "loss": 0.3167527914047241, "memory_gb": 7.721559524536133, "step_time_ms": 7530.84659576416, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:36:56] (step=0000751) Train Loss: 0.2703, Train Steps/Sec: 0.12, Epoch: 0.014593859308200544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:37:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 752, "loss": 0.2630685269832611, "memory_gb": 7.721559524536133, "step_time_ms": 7508.943557739258, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:04] (step=0000752) Train Loss: 0.2644, Train Steps/Sec: 0.13, Epoch: 0.014613291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:37:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 753, "loss": 0.1660374402999878, "memory_gb": 7.721559524536133, "step_time_ms": 7616.091728210449, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:12] (step=0000753) Train Loss: 0.1748, Train Steps/Sec: 0.12, Epoch: 0.014632724446171783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:37:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 754, "loss": 0.16780894994735718, "memory_gb": 7.721559524536133, "step_time_ms": 7570.4405307769775, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:20] (step=0000754) Train Loss: 0.2225, Train Steps/Sec: 0.12, Epoch: 0.014652157015157404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:37:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 755, "loss": 0.24097442626953125, "memory_gb": 7.721559524536133, "step_time_ms": 7524.18327331543, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:28] (step=0000755) Train Loss: 0.2374, Train Steps/Sec: 0.12, Epoch: 0.014671589584143024, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 19:37:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 756, "loss": 0.2579791843891144, "memory_gb": 7.721559524536133, "step_time_ms": 7681.297302246094, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:36] (step=0000756) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.014691022153128643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:37:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 757, "loss": 0.23867210745811462, "memory_gb": 7.721559524536133, "step_time_ms": 7592.567205429077, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:44] (step=0000757) Train Loss: 0.2490, Train Steps/Sec: 0.13, Epoch: 0.014710454722114264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:37:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 758, "loss": 0.2720663547515869, "memory_gb": 7.721559524536133, "step_time_ms": 6131.289482116699, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:50] (step=0000758) Train Loss: 0.2766, Train Steps/Sec: 0.16, Epoch: 0.014729887291099883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:37:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 759, "loss": 0.11868461221456528, "memory_gb": 7.721559524536133, "step_time_ms": 6656.030654907227, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:37:57] (step=0000759) Train Loss: 0.2006, Train Steps/Sec: 0.14, Epoch: 0.014749319860085503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:38:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 760, "loss": 0.17265143990516663, "memory_gb": 7.721559524536133, "step_time_ms": 7533.0564975738525, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:38:05] (step=0000760) Train Loss: 0.1521, Train Steps/Sec: 0.12, Epoch: 0.014768752429071124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:38:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 761, "loss": 0.2791706323623657, "memory_gb": 7.721559524536133, "step_time_ms": 
7591.388702392578, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:38:13] (step=0000761) Train Loss: 0.3034, Train Steps/Sec: 0.12, Epoch: 0.014788184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:38:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 762, "loss": 0.2788090705871582, "memory_gb": 7.721559524536133, "step_time_ms": 7579.842567443848, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:38:21] (step=0000762) Train Loss: 0.2103, Train Steps/Sec: 0.13, Epoch: 0.014807617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:38:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 763, "loss": 0.26653754711151123, "memory_gb": 7.721559524536133, "step_time_ms": 7541.443824768066, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:38:29] (step=0000763) Train Loss: 0.3269, Train Steps/Sec: 0.13, Epoch: 0.014827050136027984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:38:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 764, "loss": 0.29312580823898315, "memory_gb": 7.721559524536133, "step_time_ms": 7636.045455932617, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:38:37] (step=0000764) Train Loss: 0.2711, Train Steps/Sec: 0.12, Epoch: 0.014846482705013602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:38:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 765, "loss": 0.17758512496948242, "memory_gb": 7.721559524536133, "step_time_ms": 7533.49494934082, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:38:45] (step=0000765) Train Loss: 0.2003, Train Steps/Sec: 0.12, Epoch: 0.014865915273999223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:38:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 766, "loss": 0.236252561211586, "memory_gb": 7.721559524536133, "step_time_ms": 7503.8957595825195, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:38:53] (step=0000766) Train Loss: 0.2482, Train Steps/Sec: 0.12, Epoch: 0.014885347842984842, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 767, "loss": 0.11845620721578598, "memory_gb": 7.721559524536133, "step_time_ms": 7553.330421447754, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:01] (step=0000767) Train Loss: 0.1476, Train Steps/Sec: 0.12, Epoch: 0.014904780411970462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 768, "loss": 0.2835025191307068, "memory_gb": 7.721559524536133, "step_time_ms": 7446.185111999512, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:09] (step=0000768) Train Loss: 0.2133, Train Steps/Sec: 0.13, Epoch: 0.014924212980956083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 769, "loss": 0.26208508014678955, "memory_gb": 7.721559524536133, "step_time_ms": 7507.311820983887, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:17] (step=0000769) Train Loss: 0.2683, Train Steps/Sec: 0.13, Epoch: 0.014943645549941702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 770, "loss": 0.2989465296268463, "memory_gb": 7.721559524536133, "step_time_ms": 7570.203065872192, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:25] (step=0000770) Train Loss: 0.2606, Train Steps/Sec: 0.12, Epoch: 0.014963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 771, "loss": 0.2701634168624878, "memory_gb": 7.721559524536133, "step_time_ms": 7474.052667617798, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:33] (step=0000771) Train Loss: 0.2764, Train Steps/Sec: 0.12, Epoch: 0.014982510687912943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 772, "loss": 0.14633743464946747, "memory_gb": 7.721559524536133, "step_time_ms": 7481.946229934692, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:42] (step=0000772) Train Loss: 0.1886, Train Steps/Sec: 0.12, Epoch: 0.015001943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 773, "loss": 0.20433655381202698, "memory_gb": 7.721559524536133, "step_time_ms": 7536.74054145813, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:50] (step=0000773) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.015021375825884182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:39:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 774, "loss": 0.3557817339897156, "memory_gb": 7.721559524536133, "step_time_ms": 7458.656072616577, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:39:58] (step=0000774) Train Loss: 0.3377, Train Steps/Sec: 0.12, Epoch: 0.015040808394869801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:40:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 775, "loss": 0.2847253084182739, "memory_gb": 7.721559524536133, "step_time_ms": 7464.724779129028, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:40:06] (step=0000775) Train Loss: 0.2966, Train Steps/Sec: 0.12, Epoch: 0.015060240963855422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:40:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 776, "loss": 0.1849745661020279, "memory_gb": 7.721559524536133, "step_time_ms": 7532.002210617065, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:40:14] (step=0000776) Train Loss: 0.2196, Train Steps/Sec: 0.13, Epoch: 0.015079673532841042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:40:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 777, "loss": 0.2949613332748413, "memory_gb": 7.721559524536133, "step_time_ms": 7416.5356159210205, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:40:22] (step=0000777) Train Loss: 0.2573, Train Steps/Sec: 0.13, Epoch: 0.015099106101826661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 778, "loss": 0.28189724683761597, "memory_gb": 7.721559524536133, "step_time_ms": 7434.117555618286, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:40:30] (step=0000778) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.015118538670812282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:40:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 779, "loss": 0.2627027630805969, "memory_gb": 7.721559524536133, "step_time_ms": 7516.659736633301, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:40:38] (step=0000779) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.015137971239797902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:40:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 780, "loss": 0.2519221305847168, "memory_gb": 7.721559524536133, "step_time_ms": 7437.509536743164, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:40:46] (step=0000780) Train Loss: 0.2546, Train Steps/Sec: 0.13, Epoch: 0.015157403808783521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:40:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 781, "loss": 0.17722542583942413, "memory_gb": 7.721559524536133, "step_time_ms": 7421.60177230835, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:40:54] (step=0000781) Train Loss: 0.2314, Train Steps/Sec: 0.12, Epoch: 0.015176836377769141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 782, "loss": 0.22335855662822723, "memory_gb": 7.721559524536133, "step_time_ms": 7483.310222625732, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:02] (step=0000782) Train Loss: 0.2244, Train Steps/Sec: 0.12, Epoch: 0.01519626894675476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 783, "loss": 0.1982479691505432, "memory_gb": 7.721559524536133, "step_time_ms": 7407.658100128174, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:10] (step=0000783) Train Loss: 0.1982, Train Steps/Sec: 0.13, Epoch: 0.01521570151574038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 784, "loss": 0.28335317969322205, "memory_gb": 7.721559524536133, "step_time_ms": 7415.731430053711, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:18] (step=0000784) Train Loss: 0.2861, Train Steps/Sec: 0.12, Epoch: 0.015235134084726001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 785, "loss": 0.2358492910861969, "memory_gb": 7.721559524536133, "step_time_ms": 7505.990266799927, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:26] (step=0000785) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.01525456665371162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 786, "loss": 0.18628321588039398, "memory_gb": 7.721559524536133, "step_time_ms": 7362.998247146606, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:34] (step=0000786) Train Loss: 0.2741, Train Steps/Sec: 0.13, Epoch: 0.01527399922269724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 787, "loss": 0.20320095121860504, "memory_gb": 7.721559524536133, "step_time_ms": 6561.443567276001, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:41] (step=0000787) Train Loss: 0.1895, Train Steps/Sec: 0.15, Epoch: 0.015293431791682861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 788, "loss": 0.2333832085132599, "memory_gb": 7.721559524536133, "step_time_ms": 6157.408952713013, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:47] (step=0000788) Train Loss: 0.2041, Train Steps/Sec: 0.15, Epoch: 0.01531286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:41:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 789, "loss": 0.34810870885849, "memory_gb": 7.721559524536133, "step_time_ms": 7474.696636199951, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:41:55] (step=0000789) Train Loss: 0.2712, Train Steps/Sec: 0.12, Epoch: 0.0153322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:42:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 790, "loss": 0.21309715509414673, "memory_gb": 7.721559524536133, "step_time_ms": 7476.959705352783, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:42:03] (step=0000790) Train Loss: 0.3077, Train Steps/Sec: 0.12, Epoch: 0.01535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:42:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 791, "loss": 0.3019118905067444, "memory_gb": 7.721559524536133, "step_time_ms": 7453.713178634644, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:42:11] (step=0000791) Train Loss: 0.3114, Train Steps/Sec: 0.12, Epoch: 0.01537116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:42:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 792, "loss": 0.18912798166275024, "memory_gb": 7.721559524536133, "step_time_ms": 7439.539432525635, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:42:19] (step=0000792) Train Loss: 0.1743, Train Steps/Sec: 0.13, Epoch: 0.01539059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:42:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 793, "loss": 0.1902962177991867, "memory_gb": 7.721559524536133, "step_time_ms": 7533.236742019653, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:42:28] (step=0000793) Train Loss: 0.2050, Train Steps/Sec: 0.12, Epoch: 0.01541002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:42:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 794, "loss": 0.17921662330627441, "memory_gb": 7.721559524536133, "step_time_ms": 7432.285785675049, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:42:36] (step=0000794) Train Loss: 0.2225, Train Steps/Sec: 0.12, Epoch: 0.0154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:42:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 795, "loss": 0.2521494925022125, "memory_gb": 7.721559524536133, "step_time_ms": 7445.717573165894, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:42:44] (step=0000795) Train Loss: 0.2295, Train Steps/Sec: 0.12, Epoch: 0.015448892343567819, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:42:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 796, "loss": 0.22075115144252777, "memory_gb": 7.721559524536133, "step_time_ms": 7486.899375915527, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:42:52] (step=0000796) Train Loss: 0.2259, Train Steps/Sec: 0.12, Epoch: 0.01546832491255344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 797, "loss": 0.3338618576526642, "memory_gb": 7.721559524536133, "step_time_ms": 7467.263698577881, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:00] (step=0000797) Train Loss: 0.2839, Train Steps/Sec: 0.12, Epoch: 0.01548775748153906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 798, "loss": 0.1963118612766266, "memory_gb": 7.721559524536133, "step_time_ms": 7423.31600189209, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:08] (step=0000798) Train Loss: 0.2060, Train Steps/Sec: 0.12, Epoch: 0.015507190050524679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 799, "loss": 0.24409246444702148, "memory_gb": 7.721559524536133, "step_time_ms": 7498.7664222717285, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:16] (step=0000799) Train Loss: 0.2666, Train Steps/Sec: 0.12, Epoch: 0.0155266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 800, "loss": 0.3340081572532654, "memory_gb": 7.721559524536133, "step_time_ms": 7397.128582000732, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:24] (step=0000800) Train Loss: 0.2649, Train Steps/Sec: 0.13, Epoch: 0.01554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 801, "loss": 0.3018417954444885, "memory_gb": 7.721559524536133, "step_time_ms": 7398.4973430633545, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:32] (step=0000801) Train Loss: 0.2666, Train Steps/Sec: 0.13, Epoch: 0.015565487757481539, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 802, "loss": 0.2901047468185425, "memory_gb": 7.721559524536133, "step_time_ms": 7464.359998703003, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:40] (step=0000802) Train Loss: 0.2487, Train Steps/Sec: 0.12, Epoch: 0.01558492032646716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 803, "loss": 0.24668388068675995, "memory_gb": 7.721559524536133, "step_time_ms": 7405.893564224243, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:48] (step=0000803) Train Loss: 0.3164, Train Steps/Sec: 0.13, Epoch: 0.015604352895452778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:43:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 804, "loss": 0.31305184960365295, "memory_gb": 7.721559524536133, "step_time_ms": 7496.495008468628, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:43:56] (step=0000804) Train Loss: 0.3261, Train Steps/Sec: 0.12, Epoch: 0.015623785464438399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:44:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 805, "loss": 0.2947726845741272, "memory_gb": 7.721559524536133, "step_time_ms": 7617.29621887207, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:44:04] (step=0000805) Train Loss: 0.2990, Train Steps/Sec: 0.12, Epoch: 0.015643218033424017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:44:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 806, "loss": 0.12758013606071472, "memory_gb": 7.721559524536133, "step_time_ms": 7406.821727752686, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:44:12] (step=0000806) Train Loss: 0.2054, Train Steps/Sec: 0.12, Epoch: 0.01566265060240964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:44:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 807, "loss": 0.2642368674278259, "memory_gb": 7.721559524536133, "step_time_ms": 7428.306102752686, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:44:20] (step=0000807) Train Loss: 0.2654, Train Steps/Sec: 0.13, Epoch: 0.01568208317139526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:44:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 808, "loss": 0.2604609429836273, "memory_gb": 7.721559524536133, "step_time_ms": 7516.407489776611, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:44:28] (step=0000808) Train Loss: 0.2533, Train Steps/Sec: 0.12, Epoch: 0.015701515740380877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:44:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 809, "loss": 0.1996917724609375, "memory_gb": 7.721559524536133, "step_time_ms": 7432.717084884644, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:44:36] (step=0000809) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.0157209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:44:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 810, "loss": 0.3094286322593689, "memory_gb": 7.721559524536133, "step_time_ms": 7456.958770751953, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:44:44] (step=0000810) Train Loss: 0.2944, Train Steps/Sec: 0.13, Epoch: 0.01574038087835212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:44:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 811, "loss": 0.39276689291000366, "memory_gb": 7.721559524536133, "step_time_ms": 7506.568193435669, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:44:52] (step=0000811) Train Loss: 0.2748, Train Steps/Sec: 0.13, Epoch: 0.015759813447337737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 812, "loss": 0.18993085622787476, "memory_gb": 7.721559524536133, "step_time_ms": 7457.318305969238, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:00] (step=0000812) Train Loss: 0.2271, Train Steps/Sec: 0.13, Epoch: 0.01577924601632336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 813, "loss": 0.2848302125930786, "memory_gb": 7.721559524536133, "step_time_ms": 7426.182746887207, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:08] (step=0000813) Train Loss: 0.2692, Train Steps/Sec: 0.12, Epoch: 0.01579867858530898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 814, "loss": 0.28360727429389954, "memory_gb": 7.721559524536133, "step_time_ms": 7234.877347946167, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:16] (step=0000814) Train Loss: 0.2884, Train Steps/Sec: 0.13, Epoch: 0.015818111154294597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 815, "loss": 0.2731155753135681, "memory_gb": 7.721559524536133, "step_time_ms": 7320.727348327637, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:24] (step=0000815) Train Loss: 0.2754, Train Steps/Sec: 0.13, Epoch: 0.015837543723280216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 816, "loss": 0.3064331114292145, "memory_gb": 7.721559524536133, "step_time_ms": 7013.732433319092, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:31] (step=0000816) Train Loss: 0.2801, Train Steps/Sec: 0.14, Epoch: 0.01585697629226584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 817, "loss": 0.23221296072006226, "memory_gb": 7.721559524536133, "step_time_ms": 5926.185369491577, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:37] (step=0000817) Train Loss: 0.2679, Train Steps/Sec: 0.15, Epoch: 0.015876408861251457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 818, "loss": 0.20916640758514404, "memory_gb": 7.721559524536133, "step_time_ms": 7469.568729400635, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:46] (step=0000818) Train Loss: 0.2296, Train Steps/Sec: 0.12, Epoch: 0.015895841430237076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:45:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 819, "loss": 0.27612894773483276, "memory_gb": 7.721559524536133, "step_time_ms": 7478.71470451355, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:45:54] (step=0000819) Train Loss: 0.2297, Train Steps/Sec: 0.12, Epoch: 0.015915273999222698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 820, "loss": 0.2796940803527832, "memory_gb": 7.721559524536133, "step_time_ms": 7452.264070510864, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:02] (step=0000820) Train Loss: 0.2955, Train Steps/Sec: 0.13, Epoch: 0.015934706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 821, "loss": 0.17259705066680908, "memory_gb": 7.721559524536133, "step_time_ms": 7517.043828964233, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:10] (step=0000821) Train Loss: 0.2569, Train Steps/Sec: 0.12, Epoch: 0.015954139137193936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 822, "loss": 0.31640195846557617, "memory_gb": 7.721559524536133, "step_time_ms": 7505.81955909729, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:18] (step=0000822) Train Loss: 0.2368, Train Steps/Sec: 0.12, Epoch: 0.015973571706179558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 823, "loss": 0.17491650581359863, "memory_gb": 7.721559524536133, "step_time_ms": 7469.165802001953, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:26] (step=0000823) Train Loss: 0.2135, Train Steps/Sec: 0.12, Epoch: 0.015993004275165177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 824, "loss": 0.18812260031700134, "memory_gb": 7.721559524536133, "step_time_ms": 7520.762205123901, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:34] (step=0000824) Train Loss: 0.2012, Train Steps/Sec: 0.12, Epoch: 0.016012436844150796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 825, "loss": 0.31194478273391724, "memory_gb": 7.721559524536133, "step_time_ms": 7564.295291900635, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:42] (step=0000825) Train Loss: 0.2513, Train Steps/Sec: 0.12, Epoch: 0.016031869413136418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 826, "loss": 0.11092101782560349, "memory_gb": 7.721559524536133, "step_time_ms": 7485.736846923828, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:50] (step=0000826) Train Loss: 0.1464, Train Steps/Sec: 0.12, Epoch: 0.016051301982122037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:46:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 827, "loss": 0.34283530712127686, "memory_gb": 7.721559524536133, "step_time_ms": 7514.949560165405, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:46:58] (step=0000827) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.016070734551107656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:47:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 828, "loss": 0.23067574203014374, "memory_gb": 7.721559524536133, "step_time_ms": 7532.902240753174, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:47:06] (step=0000828) Train Loss: 0.2822, Train Steps/Sec: 0.12, Epoch: 0.016090167120093278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:47:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 829, "loss": 0.2215997278690338, "memory_gb": 7.721559524536133, "step_time_ms": 7506.01863861084, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:47:14] (step=0000829) Train Loss: 0.2382, Train Steps/Sec: 0.12, Epoch: 0.016109599689078897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:47:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 830, "loss": 0.23906229436397552, "memory_gb": 7.721559524536133, "step_time_ms": 7510.050296783447, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:47:22] (step=0000830) Train Loss: 0.2290, Train Steps/Sec: 0.12, Epoch: 0.016129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:47:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 831, "loss": 0.24863074719905853, "memory_gb": 7.721559524536133, "step_time_ms": 7509.803771972656, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:47:30] (step=0000831) Train Loss: 0.2039, Train Steps/Sec: 0.12, Epoch: 0.016148464827050135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:47:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 832, "loss": 0.18961751461029053, "memory_gb": 7.721559524536133, "step_time_ms": 7502.3252964019775, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:47:38] (step=0000832) Train Loss: 0.2045, Train Steps/Sec: 0.12, Epoch: 0.016167897396035757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:47:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 833, "loss": 0.37996208667755127, "memory_gb": 7.721559524536133, "step_time_ms": 7539.496183395386, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:47:46] (step=0000833) Train Loss: 0.2792, Train Steps/Sec: 0.12, Epoch: 0.016187329965021376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:47:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 834, "loss": 0.12958088517189026, "memory_gb": 7.721559524536133, "step_time_ms": 7589.640140533447, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:47:54] (step=0000834) Train Loss: 0.2127, Train Steps/Sec: 0.12, Epoch: 0.016206762534006994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 835, "loss": 0.3901123106479645, "memory_gb": 7.721559524536133, "step_time_ms": 7507.5178146362305, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:02] (step=0000835) Train Loss: 0.3670, Train Steps/Sec: 0.12, Epoch: 0.016226195102992617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 836, "loss": 0.2803872227668762, "memory_gb": 7.721559524536133, "step_time_ms": 7607.388496398926, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:11] (step=0000836) Train Loss: 0.2808, Train Steps/Sec: 0.12, Epoch: 0.016245627671978236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 837, "loss": 0.2055550515651703, "memory_gb": 7.721559524536133, "step_time_ms": 7684.068202972412, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:19] (step=0000837) Train Loss: 0.2386, Train Steps/Sec: 0.12, Epoch: 0.016265060240963854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 838, "loss": 0.21841441094875336, "memory_gb": 7.721559524536133, "step_time_ms": 7569.878101348877, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:27] (step=0000838) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.016284492809949477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 839, "loss": 0.24198302626609802, "memory_gb": 7.721559524536133, "step_time_ms": 7660.3498458862305, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:35] (step=0000839) Train Loss: 0.2508, Train Steps/Sec: 0.12, Epoch: 0.016303925378935095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 840, "loss": 0.35182633996009827, "memory_gb": 7.721559524536133, "step_time_ms": 7606.313705444336, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:43] (step=0000840) Train Loss: 0.3030, Train Steps/Sec: 0.12, Epoch: 0.016323357947920714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 841, "loss": 0.1470600813627243, "memory_gb": 7.721559524536133, "step_time_ms": 7565.756320953369, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:51] (step=0000841) Train Loss: 0.1557, Train Steps/Sec: 0.12, Epoch: 0.016342790516906337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:48:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 842, "loss": 0.22518528997898102, "memory_gb": 7.721559524536133, "step_time_ms": 7525.170564651489, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:48:59] (step=0000842) Train Loss: 0.2500, Train Steps/Sec: 0.12, Epoch: 0.016362223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:49:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 843, "loss": 0.25499534606933594, "memory_gb": 7.721559524536133, "step_time_ms": 7612.59126663208, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:49:07] (step=0000843) Train Loss: 0.2911, Train Steps/Sec: 0.12, Epoch: 0.016381655654877574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:49:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 844, "loss": 0.19047954678535461, "memory_gb": 7.721559524536133, "step_time_ms": 7431.856155395508, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:49:15] (step=0000844) Train Loss: 0.2434, Train Steps/Sec: 0.13, Epoch: 0.016401088223863193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:49:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 845, "loss": 0.28430062532424927, "memory_gb": 7.721559524536133, "step_time_ms": 7256.763935089111, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:49:22] (step=0000845) Train Loss: 0.2458, Train Steps/Sec: 0.13, Epoch: 0.016420520792848815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 846, "loss": 0.36901330947875977, "memory_gb": 7.721559524536133, "step_time_ms": 5904.212713241577, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:49:29] (step=0000846) Train Loss: 0.3232, Train Steps/Sec: 0.16, Epoch: 0.016439953361834434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:49:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 847, "loss": 0.26314619183540344, "memory_gb": 7.721559524536133, "step_time_ms": 7534.038066864014, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:49:37] (step=0000847) Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.016459385930820053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:49:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 848, "loss": 0.20544320344924927, "memory_gb": 7.721559524536133, "step_time_ms": 7573.779821395874, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:49:45] (step=0000848) Train Loss: 0.1996, Train Steps/Sec: 0.12, Epoch: 0.016478818499805675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:49:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 849, "loss": 0.22575126588344574, "memory_gb": 7.721559524536133, "step_time_ms": 7521.556854248047, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:49:53] (step=0000849) Train Loss: 0.2802, Train Steps/Sec: 0.12, Epoch: 0.016498251068791294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 850, "loss": 0.22959285974502563, "memory_gb": 7.721559524536133, "step_time_ms": 7488.444089889526, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:01] (step=0000850) Train Loss: 0.2496, Train Steps/Sec: 0.12, Epoch: 0.016517683637776913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 851, "loss": 0.27616631984710693, "memory_gb": 7.721559524536133, "step_time_ms": 7542.770147323608, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:09] (step=0000851) Train Loss: 0.2726, Train Steps/Sec: 0.12, Epoch: 0.016537116206762535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 852, "loss": 0.187738835811615, "memory_gb": 7.721559524536133, "step_time_ms": 7494.155645370483, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:17] (step=0000852) Train Loss: 0.1919, Train Steps/Sec: 0.12, Epoch: 0.016556548775748154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 853, "loss": 0.18594849109649658, "memory_gb": 7.721559524536133, "step_time_ms": 7503.302574157715, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:25] (step=0000853) Train Loss: 0.2103, Train Steps/Sec: 0.12, Epoch: 0.016575981344733773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 854, "loss": 0.17045101523399353, "memory_gb": 7.721559524536133, "step_time_ms": 7507.999897003174, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:33] (step=0000854) Train Loss: 0.1788, Train Steps/Sec: 0.12, Epoch: 0.016595413913719395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 855, "loss": 0.3103252947330475, "memory_gb": 7.721559524536133, "step_time_ms": 7415.656089782715, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:41] (step=0000855) Train Loss: 0.2986, Train Steps/Sec: 0.12, Epoch: 0.016614846482705014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 856, "loss": 0.3414306342601776, "memory_gb": 7.721559524536133, "step_time_ms": 7487.663269042969, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:49] (step=0000856) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.016634279051690633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:50:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 857, "loss": 0.3190652132034302, "memory_gb": 7.721559524536133, "step_time_ms": 7527.785539627075, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:50:57] (step=0000857) Train Loss: 0.3133, Train Steps/Sec: 0.12, Epoch: 0.016653711620676255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 858, "loss": 0.20487025380134583, "memory_gb": 7.721559524536133, "step_time_ms": 7434.815883636475, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:51:05] (step=0000858) Train Loss: 0.2135, Train Steps/Sec: 0.12, Epoch: 0.016673144189661874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:51:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 859, "loss": 0.25084570050239563, "memory_gb": 7.721559524536133, "step_time_ms": 7471.770763397217, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:51:13] (step=0000859) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.016692576758647493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:51:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 860, "loss": 0.3244767487049103, "memory_gb": 7.721559524536133, "step_time_ms": 7511.267185211182, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:51:21] (step=0000860) Train Loss: 0.2661, Train Steps/Sec: 0.12, Epoch: 0.01671200932763311, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:51:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 861, "loss": 0.27134376764297485, "memory_gb": 7.721559524536133, "step_time_ms": 7456.65717124939, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:51:29] (step=0000861) Train Loss: 0.2931, Train Steps/Sec: 0.13, Epoch: 0.016731441896618734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:51:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 862, "loss": 0.25359171628952026, "memory_gb": 7.721559524536133, "step_time_ms": 7462.5585079193115, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:51:37] (step=0000862) Train Loss: 0.2286, Train Steps/Sec: 0.12, Epoch: 0.016750874465604353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:51:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 863, "loss": 0.23009711503982544, "memory_gb": 7.721559524536133, "step_time_ms": 7489.389657974243, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:51:45] (step=0000863) Train Loss: 0.1987, Train Steps/Sec: 0.12, Epoch: 0.01677030703458997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:51:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 864, "loss": 0.22380486130714417, "memory_gb": 7.721559524536133, "step_time_ms": 7427.086353302002, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:51:53] (step=0000864) Train Loss: 0.2064, Train Steps/Sec: 0.12, Epoch: 0.016789739603575594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 865, "loss": 0.23548081517219543, "memory_gb": 7.721559524536133, "step_time_ms": 7442.471265792847, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:01] (step=0000865) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.016809172172561213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 866, "loss": 0.1693124622106552, "memory_gb": 7.721559524536133, "step_time_ms": 7468.68634223938, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:10] (step=0000866) Train Loss: 0.2062, Train Steps/Sec: 0.12, Epoch: 0.01682860474154683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 867, "loss": 0.2571631669998169, "memory_gb": 7.721559524536133, "step_time_ms": 7400.909662246704, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:18] (step=0000867) Train Loss: 0.2792, Train Steps/Sec: 0.12, Epoch: 0.016848037310532454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 868, "loss": 0.17674954235553741, "memory_gb": 7.721559524536133, "step_time_ms": 7436.691522598267, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:26] (step=0000868) Train Loss: 0.2141, Train Steps/Sec: 0.12, Epoch: 0.016867469879518072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 869, "loss": 0.23120298981666565, "memory_gb": 7.721559524536133, "step_time_ms": 7459.7272872924805, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:34] (step=0000869) Train Loss: 0.2188, Train Steps/Sec: 0.13, Epoch: 0.01688690244850369, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 870, "loss": 0.28169697523117065, "memory_gb": 7.721559524536133, "step_time_ms": 7414.038181304932, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:42] (step=0000870) Train Loss: 0.2984, Train Steps/Sec: 0.12, Epoch: 0.016906335017489314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 871, "loss": 0.21032756567001343, "memory_gb": 7.721559524536133, "step_time_ms": 7463.342189788818, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:50] (step=0000871) Train Loss: 0.2188, Train Steps/Sec: 0.12, Epoch: 0.016925767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:52:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 872, "loss": 0.22793042659759521, "memory_gb": 7.721559524536133, "step_time_ms": 7475.287437438965, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:52:58] (step=0000872) Train Loss: 0.2522, Train Steps/Sec: 0.12, Epoch: 0.01694520015546055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 873, "loss": 0.2376660704612732, "memory_gb": 7.721559524536133, "step_time_ms": 7284.165382385254, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:06] (step=0000873) Train Loss: 0.1878, Train Steps/Sec: 0.13, Epoch: 0.016964632724446174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 874, "loss": 0.2637583017349243, "memory_gb": 7.721559524536133, "step_time_ms": 7437.788248062134, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:14] (step=0000874) Train Loss: 0.2550, Train Steps/Sec: 0.13, Epoch: 0.016984065293431792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 875, "loss": 0.2533327639102936, "memory_gb": 7.721559524536133, "step_time_ms": 5442.32439994812, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:19] (step=0000875) Train Loss: 0.2158, Train Steps/Sec: 0.17, Epoch: 0.01700349786241741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 876, "loss": 0.19662800431251526, "memory_gb": 7.721559524536133, "step_time_ms": 7487.341642379761, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:27] (step=0000876) Train Loss: 0.2312, Train Steps/Sec: 0.12, Epoch: 0.01702293043140303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 877, "loss": 0.34285253286361694, "memory_gb": 7.721559524536133, "step_time_ms": 7460.200309753418, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:35] (step=0000877) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.017042363000388652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 878, "loss": 0.2727568745613098, "memory_gb": 7.721559524536133, "step_time_ms": 7411.852598190308, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:43] (step=0000878) Train Loss: 0.2649, Train Steps/Sec: 0.12, Epoch: 0.01706179556937427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 879, "loss": 0.11914848536252975, "memory_gb": 7.721559524536133, "step_time_ms": 7478.057861328125, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:51] (step=0000879) Train Loss: 0.2477, Train Steps/Sec: 0.12, Epoch: 0.01708122813835989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:53:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 880, "loss": 0.29977160692214966, "memory_gb": 7.721559524536133, "step_time_ms": 7221.548318862915, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:53:59] (step=0000880) Train Loss: 0.2218, Train Steps/Sec: 0.13, Epoch: 0.017100660707345512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 19:54:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 881, "loss": 0.22428032755851746, "memory_gb": 7.721559524536133, "step_time_ms": 7400.404453277588, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 19:54:07] (step=0000881) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 
0.01712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 882, "loss": 0.17744527757167816, "memory_gb": 7.721559524536133, "step_time_ms": 7497.053623199463, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:54:15] (step=0000882) Train Loss: 0.1834, Train Steps/Sec: 0.12, Epoch: 0.01713952584531675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:54:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 883, "loss": 0.23430660367012024, "memory_gb": 7.721559524536133, "step_time_ms": 7400.592088699341, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:54:23] (step=0000883) Train Loss: 0.2536, Train Steps/Sec: 0.12, Epoch: 0.017158958414302372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:54:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 884, "loss": 0.22404354810714722, "memory_gb": 7.721559524536133, "step_time_ms": 7435.105562210083, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:54:32] (step=0000884) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.01717839098328799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 885, "loss": 0.22163593769073486, "memory_gb": 7.721559524536133, "step_time_ms": 7501.506567001343, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:54:40] (step=0000885) Train Loss: 0.1782, Train Steps/Sec: 0.12, Epoch: 0.01719782355227361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 886, "loss": 0.18087691068649292, "memory_gb": 7.721559524536133, "step_time_ms": 7501.474618911743, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:54:48] (step=0000886) Train Loss: 0.1987, Train Steps/Sec: 0.13, Epoch: 0.017217256121259232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:54:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 887, "loss": 0.1408924013376236, "memory_gb": 
7.721559524536133, "step_time_ms": 7455.06477355957, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:54:56] (step=0000887) Train Loss: 0.2039, Train Steps/Sec: 0.12, Epoch: 0.01723668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:55:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 888, "loss": 0.39351701736450195, "memory_gb": 7.715639114379883, "step_time_ms": 7491.03307723999, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:55:04] (step=0000888) Train Loss: 0.3788, Train Steps/Sec: 0.12, Epoch: 0.01725612125923047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:55:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 889, "loss": 0.22280216217041016, "memory_gb": 7.715639114379883, "step_time_ms": 7430.987119674683, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:55:12] (step=0000889) Train Loss: 0.2581, Train Steps/Sec: 0.12, Epoch: 0.01727555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 890, "loss": 0.23443962633609772, "memory_gb": 7.721559524536133, "step_time_ms": 7467.468023300171, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:55:20] (step=0000890) Train Loss: 0.2100, Train Steps/Sec: 0.12, Epoch: 0.01729498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:55:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 891, "loss": 0.3164812922477722, "memory_gb": 7.721559524536133, "step_time_ms": 7537.343740463257, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:55:28] (step=0000891) Train Loss: 0.2839, Train Steps/Sec: 0.12, Epoch: 0.01731441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:55:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 892, "loss": 0.2719736695289612, "memory_gb": 7.721559524536133, "step_time_ms": 7679.1064739227295, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:55:36] (step=0000892) Train Loss: 0.2103, Train Steps/Sec: 0.12, 
Epoch: 0.01733385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:55:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 893, "loss": 0.24304243922233582, "memory_gb": 7.721559524536133, "step_time_ms": 7488.74831199646, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:55:44] (step=0000893) Train Loss: 0.2761, Train Steps/Sec: 0.12, Epoch: 0.01735328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:55:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 894, "loss": 0.29613280296325684, "memory_gb": 7.721559524536133, "step_time_ms": 7578.523635864258, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:55:52] (step=0000894) Train Loss: 0.2735, Train Steps/Sec: 0.12, Epoch: 0.01737271667314419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 895, "loss": 0.2569041848182678, "memory_gb": 7.721559524536133, "step_time_ms": 7638.86022567749, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:00] (step=0000895) Train Loss: 0.2963, Train Steps/Sec: 0.12, Epoch: 0.01739214924212981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 896, "loss": 0.214747816324234, "memory_gb": 7.721559524536133, "step_time_ms": 7540.451526641846, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:08] (step=0000896) Train Loss: 0.1753, Train Steps/Sec: 0.12, Epoch: 0.01741158181111543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 897, "loss": 0.23707124590873718, "memory_gb": 7.721559524536133, "step_time_ms": 7563.760757446289, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:16] (step=0000897) Train Loss: 0.2391, Train Steps/Sec: 0.12, Epoch: 0.01743101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 898, "loss": 0.24030724167823792, "memory_gb": 
7.721559524536133, "step_time_ms": 7495.00298500061, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:24] (step=0000898) Train Loss: 0.1923, Train Steps/Sec: 0.12, Epoch: 0.01745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 899, "loss": 0.2147454023361206, "memory_gb": 7.721559524536133, "step_time_ms": 7509.572505950928, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:32] (step=0000899) Train Loss: 0.2486, Train Steps/Sec: 0.13, Epoch: 0.01746987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 900, "loss": 0.31611600518226624, "memory_gb": 7.721559524536133, "step_time_ms": 7563.363790512085, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:40] (step=0000900) Train Loss: 0.2584, Train Steps/Sec: 0.12, Epoch: 0.01748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 901, "loss": 0.3157615065574646, "memory_gb": 7.721559524536133, "step_time_ms": 7592.53191947937, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:48] (step=0000901) Train Loss: 0.2313, Train Steps/Sec: 0.12, Epoch: 0.017508744656043528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:56:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 902, "loss": 0.2591809034347534, "memory_gb": 7.721559524536133, "step_time_ms": 7414.592742919922, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:56:56] (step=0000902) Train Loss: 0.2608, Train Steps/Sec: 0.13, Epoch: 0.01752817722502915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 903, "loss": 0.29315564036369324, "memory_gb": 7.721559524536133, "step_time_ms": 7585.424184799194, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:04] (step=0000903) Train Loss: 0.2421, Train Steps/Sec: 0.12, 
Epoch: 0.01754760979401477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 904, "loss": 0.27279943227767944, "memory_gb": 7.721559524536133, "step_time_ms": 5660.689353942871, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:10] (step=0000904) Train Loss: 0.2632, Train Steps/Sec: 0.17, Epoch: 0.017567042363000388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 905, "loss": 0.2045041024684906, "memory_gb": 7.721559524536133, "step_time_ms": 7571.483850479126, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:18] (step=0000905) Train Loss: 0.2323, Train Steps/Sec: 0.13, Epoch: 0.017586474931986007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 906, "loss": 0.2679005265235901, "memory_gb": 7.721559524536133, "step_time_ms": 7584.12504196167, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:26] (step=0000906) Train Loss: 0.2075, Train Steps/Sec: 0.12, Epoch: 0.01760590750097163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 907, "loss": 0.33256420493125916, "memory_gb": 7.721559524536133, "step_time_ms": 7498.512268066406, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:34] (step=0000907) Train Loss: 0.2656, Train Steps/Sec: 0.13, Epoch: 0.017625340069957248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 908, "loss": 0.24824100732803345, "memory_gb": 7.721559524536133, "step_time_ms": 7629.568576812744, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:42] (step=0000908) Train Loss: 0.2754, Train Steps/Sec: 0.12, Epoch: 0.017644772638942867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 909, "loss": 0.2692539095878601, 
"memory_gb": 7.721559524536133, "step_time_ms": 7566.0107135772705, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:50] (step=0000909) Train Loss: 0.1766, Train Steps/Sec: 0.12, Epoch: 0.01766420520792849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:57:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 910, "loss": 0.21293118596076965, "memory_gb": 7.721559524536133, "step_time_ms": 7517.189741134644, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:57:58] (step=0000910) Train Loss: 0.2800, Train Steps/Sec: 0.12, Epoch: 0.017683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 911, "loss": 0.2210230529308319, "memory_gb": 7.721559524536133, "step_time_ms": 7612.23030090332, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:58:06] (step=0000911) Train Loss: 0.1987, Train Steps/Sec: 0.12, Epoch: 0.017703070345899727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:58:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 912, "loss": 0.28650718927383423, "memory_gb": 7.721559524536133, "step_time_ms": 7520.650625228882, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:58:14] (step=0000912) Train Loss: 0.2423, Train Steps/Sec: 0.12, Epoch: 0.01772250291488535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:58:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 913, "loss": 0.25108951330184937, "memory_gb": 7.721559524536133, "step_time_ms": 7468.006134033203, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:58:22] (step=0000913) Train Loss: 0.2726, Train Steps/Sec: 0.12, Epoch: 0.017741935483870968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:58:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 914, "loss": 0.2485732138156891, "memory_gb": 7.721559524536133, "step_time_ms": 7511.269569396973, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:58:30] (step=0000914) Train Loss: 0.2479, Train 
Steps/Sec: 0.12, Epoch: 0.017761368052856587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 915, "loss": 0.2938969135284424, "memory_gb": 7.721559524536133, "step_time_ms": 7508.073091506958, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:58:38] (step=0000915) Train Loss: 0.2407, Train Steps/Sec: 0.12, Epoch: 0.01778080062184221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 916, "loss": 0.19393515586853027, "memory_gb": 7.721559524536133, "step_time_ms": 7437.877655029297, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:58:46] (step=0000916) Train Loss: 0.2333, Train Steps/Sec: 0.12, Epoch: 0.017800233190827828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:58:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 917, "loss": 0.25376325845718384, "memory_gb": 7.721559524536133, "step_time_ms": 7558.178424835205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:58:55] (step=0000917) Train Loss: 0.2139, Train Steps/Sec: 0.12, Epoch: 0.017819665759813447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 918, "loss": 0.2359699308872223, "memory_gb": 7.721559524536133, "step_time_ms": 7430.891752243042, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:03] (step=0000918) Train Loss: 0.1964, Train Steps/Sec: 0.12, Epoch: 0.017839098328799066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 919, "loss": 0.17683225870132446, "memory_gb": 7.721559524536133, "step_time_ms": 7401.207208633423, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:11] (step=0000919) Train Loss: 0.2254, Train Steps/Sec: 0.12, Epoch: 0.017858530897784688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 920, "loss": 
0.19133135676383972, "memory_gb": 7.721559524536133, "step_time_ms": 7530.762434005737, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:19] (step=0000920) Train Loss: 0.2346, Train Steps/Sec: 0.12, Epoch: 0.017877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 921, "loss": 0.19155874848365784, "memory_gb": 7.721559524536133, "step_time_ms": 7507.527589797974, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:27] (step=0000921) Train Loss: 0.1939, Train Steps/Sec: 0.12, Epoch: 0.017897396035755925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 922, "loss": 0.26577073335647583, "memory_gb": 7.721559524536133, "step_time_ms": 7421.482563018799, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:35] (step=0000922) Train Loss: 0.3111, Train Steps/Sec: 0.12, Epoch: 0.017916828604741548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 923, "loss": 0.20765884220600128, "memory_gb": 7.721559524536133, "step_time_ms": 7482.863187789917, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:43] (step=0000923) Train Loss: 0.2263, Train Steps/Sec: 0.12, Epoch: 0.017936261173727167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 924, "loss": 0.3321699798107147, "memory_gb": 7.721559524536133, "step_time_ms": 7451.860427856445, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:51] (step=0000924) Train Loss: 0.2971, Train Steps/Sec: 0.12, Epoch: 0.017955693742712785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 19:59:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 925, "loss": 0.21548284590244293, "memory_gb": 7.721559524536133, "step_time_ms": 7408.189058303833, "trainable_params": 4718592, "method": "lora"} [2025-07-28 19:59:59] (step=0000925) 
Train Loss: 0.2426, Train Steps/Sec: 0.12, Epoch: 0.017975126311698408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:00:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 926, "loss": 0.23867720365524292, "memory_gb": 7.721559524536133, "step_time_ms": 7461.801290512085, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:00:07] (step=0000926) Train Loss: 0.2799, Train Steps/Sec: 0.12, Epoch: 0.017994558880684026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 927, "loss": 0.2644094228744507, "memory_gb": 7.721559524536133, "step_time_ms": 7458.927631378174, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:00:15] (step=0000927) Train Loss: 0.2583, Train Steps/Sec: 0.12, Epoch: 0.018013991449669645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:00:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 928, "loss": 0.3016236126422882, "memory_gb": 7.721559524536133, "step_time_ms": 7459.046125411987, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:00:23] (step=0000928) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.018033424018655268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:00:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 929, "loss": 0.30463606119155884, "memory_gb": 7.721559524536133, "step_time_ms": 7504.4004917144775, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:00:31] (step=0000929) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.018052856587640886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:00:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 930, "loss": 0.2720524072647095, "memory_gb": 7.721559524536133, "step_time_ms": 7501.335859298706, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:00:39] (step=0000930) Train Loss: 0.2734, Train Steps/Sec: 0.12, Epoch: 0.018072289156626505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:00:47] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 931, "loss": 0.2389606088399887, "memory_gb": 7.721559524536133, "step_time_ms": 7321.531772613525, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:00:47] (step=0000931) Train Loss: 0.2585, Train Steps/Sec: 0.13, Epoch: 0.018091721725612128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 932, "loss": 0.23743827641010284, "memory_gb": 7.721559524536133, "step_time_ms": 7483.864068984985, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:00:55] (step=0000932) Train Loss: 0.2204, Train Steps/Sec: 0.13, Epoch: 0.018111154294597746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 933, "loss": 0.3103509545326233, "memory_gb": 7.721559524536133, "step_time_ms": 6406.445503234863, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:01:02] (step=0000933) Train Loss: 0.2970, Train Steps/Sec: 0.15, Epoch: 0.018130586863583365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 934, "loss": 0.26338931918144226, "memory_gb": 7.721559524536133, "step_time_ms": 7423.519849777222, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:01:10] (step=0000934) Train Loss: 0.2754, Train Steps/Sec: 0.12, Epoch: 0.018150019432568984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 935, "loss": 0.29212722182273865, "memory_gb": 7.721559524536133, "step_time_ms": 7471.603631973267, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:01:18] (step=0000935) Train Loss: 0.2359, Train Steps/Sec: 0.12, Epoch: 0.018169452001554606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 936, "loss": 0.20139998197555542, "memory_gb": 7.721559524536133, "step_time_ms": 7402.883768081665, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
20:01:26] (step=0000936) Train Loss: 0.2438, Train Steps/Sec: 0.12, Epoch: 0.018188884570540225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 937, "loss": 0.24243828654289246, "memory_gb": 7.721559524536133, "step_time_ms": 7453.734397888184, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:01:34] (step=0000937) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.018208317139525844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 938, "loss": 0.19032225012779236, "memory_gb": 7.721559524536133, "step_time_ms": 7464.849948883057, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:01:42] (step=0000938) Train Loss: 0.1707, Train Steps/Sec: 0.12, Epoch: 0.018227749708511466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 939, "loss": 0.2584075927734375, "memory_gb": 7.721559524536133, "step_time_ms": 7433.3813190460205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:01:50] (step=0000939) Train Loss: 0.2335, Train Steps/Sec: 0.12, Epoch: 0.018247182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:01:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 940, "loss": 0.20892742276191711, "memory_gb": 7.721559524536133, "step_time_ms": 7452.660799026489, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:01:58] (step=0000940) Train Loss: 0.2296, Train Steps/Sec: 0.12, Epoch: 0.018266614846482704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:02:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 941, "loss": 0.2677488327026367, "memory_gb": 7.721559524536133, "step_time_ms": 7556.354761123657, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:02:06] (step=0000941) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.018286047415468326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:02:14] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 942, "loss": 0.3197591304779053, "memory_gb": 7.721559524536133, "step_time_ms": 7447.76463508606, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:02:14] (step=0000942) Train Loss: 0.3220, Train Steps/Sec: 0.13, Epoch: 0.018305479984453945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:02:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 943, "loss": 0.20555543899536133, "memory_gb": 7.721559524536133, "step_time_ms": 7498.911619186401, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:02:22] (step=0000943) Train Loss: 0.2518, Train Steps/Sec: 0.12, Epoch: 0.018324912553439564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:02:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 944, "loss": 0.23293153941631317, "memory_gb": 7.721559524536133, "step_time_ms": 7527.068376541138, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:02:30] (step=0000944) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.018344345122425186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:02:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 945, "loss": 0.2568584680557251, "memory_gb": 7.721559524536133, "step_time_ms": 7281.913042068481, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:02:38] (step=0000945) Train Loss: 0.2529, Train Steps/Sec: 0.13, Epoch: 0.018363777691410805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:02:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 946, "loss": 0.1454872488975525, "memory_gb": 7.721559524536133, "step_time_ms": 7559.0269565582275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:02:46] (step=0000946) Train Loss: 0.1991, Train Steps/Sec: 0.12, Epoch: 0.018383210260396424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:02:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 947, "loss": 0.2521324157714844, "memory_gb": 7.721559524536133, "step_time_ms": 7583.308935165405, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 20:02:54] (step=0000947) Train Loss: 0.2980, Train Steps/Sec: 0.13, Epoch: 0.018402642829382046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:03:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 948, "loss": 0.262991726398468, "memory_gb": 7.721559524536133, "step_time_ms": 7535.227060317993, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:02] (step=0000948) Train Loss: 0.2117, Train Steps/Sec: 0.12, Epoch: 0.018422075398367665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 949, "loss": 0.2848604619503021, "memory_gb": 7.721559524536133, "step_time_ms": 7643.61047744751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:10] (step=0000949) Train Loss: 0.2388, Train Steps/Sec: 0.12, Epoch: 0.018441507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:03:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 950, "loss": 0.21439659595489502, "memory_gb": 7.721559524536133, "step_time_ms": 7666.198015213013, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:18] (step=0000950) Train Loss: 0.2233, Train Steps/Sec: 0.12, Epoch: 0.018460940536338902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:03:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 951, "loss": 0.21568739414215088, "memory_gb": 7.721559524536133, "step_time_ms": 7584.055423736572, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:27] (step=0000951) Train Loss: 0.2590, Train Steps/Sec: 0.12, Epoch: 0.018480373105324525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:03:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 952, "loss": 0.2313615381717682, "memory_gb": 7.721559524536133, "step_time_ms": 7567.928791046143, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:35] (step=0000952) Train Loss: 0.2304, Train Steps/Sec: 0.12, Epoch: 0.018499805674310144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 20:03:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 953, "loss": 0.17547962069511414, "memory_gb": 7.721559524536133, "step_time_ms": 7552.7331829071045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:43] (step=0000953) Train Loss: 0.2048, Train Steps/Sec: 0.12, Epoch: 0.018519238243295762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:03:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 954, "loss": 0.20539908111095428, "memory_gb": 7.721559524536133, "step_time_ms": 7546.307802200317, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:51] (step=0000954) Train Loss: 0.2351, Train Steps/Sec: 0.12, Epoch: 0.018538670812281385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:03:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 955, "loss": 0.2782668471336365, "memory_gb": 7.721559524536133, "step_time_ms": 7580.092191696167, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:03:59] (step=0000955) Train Loss: 0.2463, Train Steps/Sec: 0.12, Epoch: 0.018558103381267003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 956, "loss": 0.2627663314342499, "memory_gb": 7.721559524536133, "step_time_ms": 7601.820230484009, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:04:07] (step=0000956) Train Loss: 0.2317, Train Steps/Sec: 0.12, Epoch: 0.018577535950252622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:04:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 957, "loss": 0.2221435159444809, "memory_gb": 7.721559524536133, "step_time_ms": 7517.198085784912, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:04:15] (step=0000957) Train Loss: 0.2633, Train Steps/Sec: 0.13, Epoch: 0.018596968519238245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:04:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 958, "loss": 0.23561906814575195, "memory_gb": 7.721559524536133, "step_time_ms": 7564.561128616333, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 20:04:23] (step=0000958) Train Loss: 0.1940, Train Steps/Sec: 0.12, Epoch: 0.018616401088223863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:04:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 959, "loss": 0.35948580503463745, "memory_gb": 7.721559524536133, "step_time_ms": 7614.885330200195, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:04:31] (step=0000959) Train Loss: 0.3008, Train Steps/Sec: 0.12, Epoch: 0.018635833657209482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 960, "loss": 0.2577388882637024, "memory_gb": 7.721559524536133, "step_time_ms": 7396.574258804321, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:04:39] (step=0000960) Train Loss: 0.2366, Train Steps/Sec: 0.13, Epoch: 0.018655266226195105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:04:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 961, "loss": 0.14405503869056702, "memory_gb": 7.721559524536133, "step_time_ms": 7597.147703170776, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:04:47] (step=0000961) Train Loss: 0.1553, Train Steps/Sec: 0.13, Epoch: 0.018674698795180723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:04:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 962, "loss": 0.3696697950363159, "memory_gb": 7.721559524536133, "step_time_ms": 5411.981105804443, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:04:53] (step=0000962) Train Loss: 0.3811, Train Steps/Sec: 0.16, Epoch: 0.018694131364166342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:05:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 963, "loss": 0.2643772065639496, "memory_gb": 7.721559524536133, "step_time_ms": 7627.009153366089, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:01] (step=0000963) Train Loss: 0.2779, Train Steps/Sec: 0.12, Epoch: 0.01871356393315196, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 20:05:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 964, "loss": 0.1835649460554123, "memory_gb": 7.721559524536133, "step_time_ms": 7550.378084182739, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:09] (step=0000964) Train Loss: 0.1679, Train Steps/Sec: 0.12, Epoch: 0.018732996502137583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:05:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 965, "loss": 0.3459380269050598, "memory_gb": 7.721559524536133, "step_time_ms": 7489.60280418396, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:17] (step=0000965) Train Loss: 0.2311, Train Steps/Sec: 0.12, Epoch: 0.018752429071123202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:05:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 966, "loss": 0.27725863456726074, "memory_gb": 7.721559524536133, "step_time_ms": 7549.675941467285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:25] (step=0000966) Train Loss: 0.2596, Train Steps/Sec: 0.12, Epoch: 0.01877186164010882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 967, "loss": 0.2290635108947754, "memory_gb": 7.721559524536133, "step_time_ms": 7516.476631164551, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:33] (step=0000967) Train Loss: 0.2463, Train Steps/Sec: 0.12, Epoch: 0.018791294209094443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:05:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 968, "loss": 0.335854172706604, "memory_gb": 7.721559524536133, "step_time_ms": 7546.801805496216, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:41] (step=0000968) Train Loss: 0.3069, Train Steps/Sec: 0.12, Epoch: 0.018810726778080062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:05:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 969, "loss": 0.19149541854858398, "memory_gb": 7.721559524536133, "step_time_ms": 
7528.133869171143, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:49] (step=0000969) Train Loss: 0.1903, Train Steps/Sec: 0.12, Epoch: 0.01883015934706568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:05:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 970, "loss": 0.2695595920085907, "memory_gb": 7.721559524536133, "step_time_ms": 7490.813493728638, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:05:57] (step=0000970) Train Loss: 0.2465, Train Steps/Sec: 0.12, Epoch: 0.018849591916051303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 971, "loss": 0.28965622186660767, "memory_gb": 7.721559524536133, "step_time_ms": 7412.086963653564, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:06:05] (step=0000971) Train Loss: 0.3307, Train Steps/Sec: 0.12, Epoch: 0.018869024485036922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:06:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 972, "loss": 0.20262590050697327, "memory_gb": 7.721559524536133, "step_time_ms": 7463.333606719971, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:06:14] (step=0000972) Train Loss: 0.1984, Train Steps/Sec: 0.12, Epoch: 0.01888845705402254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:06:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 973, "loss": 0.23444867134094238, "memory_gb": 7.721559524536133, "step_time_ms": 7469.435691833496, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:06:22] (step=0000973) Train Loss: 0.2956, Train Steps/Sec: 0.12, Epoch: 0.018907889623008163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 974, "loss": 0.23956972360610962, "memory_gb": 7.721559524536133, "step_time_ms": 7421.677350997925, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:06:30] (step=0000974) Train Loss: 0.2302, Train Steps/Sec: 0.12, Epoch: 0.018927322191993782, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:06:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 975, "loss": 0.23908711969852448, "memory_gb": 7.721559524536133, "step_time_ms": 7416.668891906738, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:06:38] (step=0000975) Train Loss: 0.2727, Train Steps/Sec: 0.13, Epoch: 0.0189467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:06:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 976, "loss": 0.2443428933620453, "memory_gb": 7.721559524536133, "step_time_ms": 7483.511686325073, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:06:46] (step=0000976) Train Loss: 0.2515, Train Steps/Sec: 0.12, Epoch: 0.018966187329965023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:06:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 977, "loss": 0.2841772437095642, "memory_gb": 7.721559524536133, "step_time_ms": 7383.114814758301, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:06:54] (step=0000977) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.018985619898950642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 978, "loss": 0.20202624797821045, "memory_gb": 7.721559524536133, "step_time_ms": 7481.051921844482, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:02] (step=0000978) Train Loss: 0.2372, Train Steps/Sec: 0.12, Epoch: 0.01900505246793626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 979, "loss": 0.2580622732639313, "memory_gb": 7.715639114379883, "step_time_ms": 7510.381698608398, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:10] (step=0000979) Train Loss: 0.2091, Train Steps/Sec: 0.12, Epoch: 0.01902448503692188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 980, "loss": 0.21081843972206116, "memory_gb": 7.721559524536133, 
"step_time_ms": 7412.283897399902, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:18] (step=0000980) Train Loss: 0.2066, Train Steps/Sec: 0.12, Epoch: 0.019043917605907502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 981, "loss": 0.3074180781841278, "memory_gb": 7.721559524536133, "step_time_ms": 7537.90807723999, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:26] (step=0000981) Train Loss: 0.3056, Train Steps/Sec: 0.13, Epoch: 0.01906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 982, "loss": 0.24629884958267212, "memory_gb": 7.721559524536133, "step_time_ms": 7493.386030197144, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:34] (step=0000982) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.01908278274387874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 983, "loss": 0.23274561762809753, "memory_gb": 7.721559524536133, "step_time_ms": 7410.735607147217, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:42] (step=0000983) Train Loss: 0.1740, Train Steps/Sec: 0.12, Epoch: 0.01910221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 984, "loss": 0.2194443643093109, "memory_gb": 7.721559524536133, "step_time_ms": 7480.8173179626465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:50] (step=0000984) Train Loss: 0.2263, Train Steps/Sec: 0.12, Epoch: 0.01912164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:07:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 985, "loss": 0.2168109118938446, "memory_gb": 7.721559524536133, "step_time_ms": 7482.522487640381, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:07:58] (step=0000985) Train Loss: 0.2689, Train Steps/Sec: 0.12, Epoch: 
0.0191410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:08:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 986, "loss": 0.2846296727657318, "memory_gb": 7.715639114379883, "step_time_ms": 7449.720859527588, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:08:06] (step=0000986) Train Loss: 0.2236, Train Steps/Sec: 0.12, Epoch: 0.01916051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 987, "loss": 0.18604959547519684, "memory_gb": 7.721559524536133, "step_time_ms": 7489.507436752319, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:08:14] (step=0000987) Train Loss: 0.2217, Train Steps/Sec: 0.12, Epoch: 0.01917994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:08:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 988, "loss": 0.35417795181274414, "memory_gb": 7.721559524536133, "step_time_ms": 7513.932943344116, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:08:22] (step=0000988) Train Loss: 0.3016, Train Steps/Sec: 0.12, Epoch: 0.01919937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:08:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 989, "loss": 0.25056496262550354, "memory_gb": 7.721559524536133, "step_time_ms": 7274.84655380249, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:08:30] (step=0000989) Train Loss: 0.2766, Train Steps/Sec: 0.13, Epoch: 0.01921881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:08:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 990, "loss": 0.23997551202774048, "memory_gb": 7.715639114379883, "step_time_ms": 7215.721607208252, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:08:38] (step=0000990) Train Loss: 0.2712, Train Steps/Sec: 0.13, Epoch: 0.0192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:08:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 991, "loss": 0.26419878005981445, "memory_gb": 
7.721559524536133, "step_time_ms": 6022.378206253052, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:08:44] (step=0000991) Train Loss: 0.2780, Train Steps/Sec: 0.16, Epoch: 0.01925767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:08:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 992, "loss": 0.17722392082214355, "memory_gb": 7.721559524536133, "step_time_ms": 7478.529453277588, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:08:52] (step=0000992) Train Loss: 0.1627, Train Steps/Sec: 0.12, Epoch: 0.01927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 993, "loss": 0.12537431716918945, "memory_gb": 7.721559524536133, "step_time_ms": 7398.239612579346, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:00] (step=0000993) Train Loss: 0.1494, Train Steps/Sec: 0.12, Epoch: 0.01929654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 994, "loss": 0.1860465556383133, "memory_gb": 7.721559524536133, "step_time_ms": 7274.282455444336, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:08] (step=0000994) Train Loss: 0.2060, Train Steps/Sec: 0.13, Epoch: 0.01931597357170618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 995, "loss": 0.3345777988433838, "memory_gb": 7.721559524536133, "step_time_ms": 7412.951231002808, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:16] (step=0000995) Train Loss: 0.2810, Train Steps/Sec: 0.12, Epoch: 0.019335406140691798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 996, "loss": 0.25414448976516724, "memory_gb": 7.721559524536133, "step_time_ms": 7444.467544555664, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:24] (step=0000996) Train Loss: 0.2080, Train Steps/Sec: 0.12, 
Epoch: 0.01935483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 997, "loss": 0.2007739543914795, "memory_gb": 7.721559524536133, "step_time_ms": 7403.386116027832, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:32] (step=0000997) Train Loss: 0.2318, Train Steps/Sec: 0.12, Epoch: 0.01937427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 998, "loss": 0.27710598707199097, "memory_gb": 7.721559524536133, "step_time_ms": 7489.764451980591, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:40] (step=0000998) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.019393703847648658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 999, "loss": 0.17768292129039764, "memory_gb": 7.721559524536133, "step_time_ms": 7490.0946617126465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:48] (step=0000999) Train Loss: 0.2070, Train Steps/Sec: 0.13, Epoch: 0.01941313641663428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:09:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1000, "loss": 0.15440607070922852, "memory_gb": 7.721559524536133, "step_time_ms": 7421.451568603516, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:09:56] (step=0001000) Train Loss: 0.1689, Train Steps/Sec: 0.12, Epoch: 0.0194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:10:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1001, "loss": 0.20372994244098663, "memory_gb": 7.721559524536133, "step_time_ms": 7498.769283294678, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:10:04] (step=0001001) Train Loss: 0.2052, Train Steps/Sec: 0.12, Epoch: 0.019452001554605518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:10:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1002, "loss": 0.19669044017791748, 
"memory_gb": 7.721559524536133, "step_time_ms": 7470.0117111206055, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:10:12] (step=0001002) Train Loss: 0.2277, Train Steps/Sec: 0.12, Epoch: 0.01947143412359114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1003, "loss": 0.3221455216407776, "memory_gb": 7.721559524536133, "step_time_ms": 7442.4145221710205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:10:20] (step=0001003) Train Loss: 0.3013, Train Steps/Sec: 0.12, Epoch: 0.01949086669257676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:10:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1004, "loss": 0.20936083793640137, "memory_gb": 7.721559524536133, "step_time_ms": 7528.4693241119385, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:10:29] (step=0001004) Train Loss: 0.2464, Train Steps/Sec: 0.12, Epoch: 0.019510299261562378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:10:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1005, "loss": 0.23814581334590912, "memory_gb": 7.721559524536133, "step_time_ms": 7543.592214584351, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:10:37] (step=0001005) Train Loss: 0.2229, Train Steps/Sec: 0.12, Epoch: 0.019529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:10:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1006, "loss": 0.26157546043395996, "memory_gb": 7.721559524536133, "step_time_ms": 7435.470104217529, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:10:45] (step=0001006) Train Loss: 0.2420, Train Steps/Sec: 0.12, Epoch: 0.01954916439953362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:10:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1007, "loss": 0.1950015425682068, "memory_gb": 7.721559524536133, "step_time_ms": 7498.485803604126, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:10:53] (step=0001007) Train Loss: 0.2264, 
Train Steps/Sec: 0.12, Epoch: 0.019568596968519238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1008, "loss": 0.2235935628414154, "memory_gb": 7.721559524536133, "step_time_ms": 7575.7434368133545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:01] (step=0001008) Train Loss: 0.2359, Train Steps/Sec: 0.12, Epoch: 0.019588029537504856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1009, "loss": 0.2590155601501465, "memory_gb": 7.721559524536133, "step_time_ms": 7535.0236892700195, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:09] (step=0001009) Train Loss: 0.2666, Train Steps/Sec: 0.12, Epoch: 0.01960746210649048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1010, "loss": 0.24153006076812744, "memory_gb": 7.721559524536133, "step_time_ms": 7545.026540756226, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:17] (step=0001010) Train Loss: 0.2114, Train Steps/Sec: 0.12, Epoch: 0.019626894675476098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1011, "loss": 0.28177016973495483, "memory_gb": 7.721559524536133, "step_time_ms": 7548.288822174072, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:25] (step=0001011) Train Loss: 0.2846, Train Steps/Sec: 0.12, Epoch: 0.019646327244461716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1012, "loss": 0.2875494658946991, "memory_gb": 7.721559524536133, "step_time_ms": 7382.437229156494, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:33] (step=0001012) Train Loss: 0.3363, Train Steps/Sec: 0.12, Epoch: 0.01966575981344734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1013, 
"loss": 0.358870267868042, "memory_gb": 7.721559524536133, "step_time_ms": 7502.135276794434, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:41] (step=0001013) Train Loss: 0.2864, Train Steps/Sec: 0.12, Epoch: 0.019685192382432957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1014, "loss": 0.3007998466491699, "memory_gb": 7.721559524536133, "step_time_ms": 7604.478597640991, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:49] (step=0001014) Train Loss: 0.2353, Train Steps/Sec: 0.13, Epoch: 0.019704624951418576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:11:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1015, "loss": 0.3147968053817749, "memory_gb": 7.721559524536133, "step_time_ms": 7499.851226806641, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:11:57] (step=0001015) Train Loss: 0.2976, Train Steps/Sec: 0.12, Epoch: 0.0197240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1016, "loss": 0.18279722332954407, "memory_gb": 7.721559524536133, "step_time_ms": 7497.733116149902, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:05] (step=0001016) Train Loss: 0.1786, Train Steps/Sec: 0.12, Epoch: 0.019743490089389817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1017, "loss": 0.24809060990810394, "memory_gb": 7.721559524536133, "step_time_ms": 7547.196865081787, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:13] (step=0001017) Train Loss: 0.2789, Train Steps/Sec: 0.12, Epoch: 0.019762922658375436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1018, "loss": 0.29118967056274414, "memory_gb": 7.721559524536133, "step_time_ms": 7334.242820739746, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:21] 
(step=0001018) Train Loss: 0.2456, Train Steps/Sec: 0.13, Epoch: 0.01978235522736106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1019, "loss": 0.25119006633758545, "memory_gb": 7.721559524536133, "step_time_ms": 7552.46901512146, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:29] (step=0001019) Train Loss: 0.2841, Train Steps/Sec: 0.13, Epoch: 0.019801787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1020, "loss": 0.3066507577896118, "memory_gb": 7.721559524536133, "step_time_ms": 5053.377866744995, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:35] (step=0001020) Train Loss: 0.3439, Train Steps/Sec: 0.18, Epoch: 0.019821220365332296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1021, "loss": 0.2579060196876526, "memory_gb": 7.721559524536133, "step_time_ms": 7568.869113922119, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:43] (step=0001021) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.01984065293431792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1022, "loss": 0.2501061260700226, "memory_gb": 7.721559524536133, "step_time_ms": 7513.651609420776, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:51] (step=0001022) Train Loss: 0.2727, Train Steps/Sec: 0.12, Epoch: 0.019860085503303537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:12:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1023, "loss": 0.3898683786392212, "memory_gb": 7.721559524536133, "step_time_ms": 7458.332777023315, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:12:59] (step=0001023) Train Loss: 0.3750, Train Steps/Sec: 0.12, Epoch: 0.019879518072289156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:13:07] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 1024, "loss": 0.31795376539230347, "memory_gb": 7.721559524536133, "step_time_ms": 7573.0602741241455, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:13:07] (step=0001024) Train Loss: 0.3326, Train Steps/Sec: 0.12, Epoch: 0.019898950641274775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:13:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1025, "loss": 0.25617673993110657, "memory_gb": 7.721559524536133, "step_time_ms": 7490.651607513428, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:13:15] (step=0001025) Train Loss: 0.2615, Train Steps/Sec: 0.13, Epoch: 0.019918383210260397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:13:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1026, "loss": 0.2423669844865799, "memory_gb": 7.721559524536133, "step_time_ms": 7444.65184211731, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:13:23] (step=0001026) Train Loss: 0.2141, Train Steps/Sec: 0.12, Epoch: 0.019937815779246016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:13:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1027, "loss": 0.29673391580581665, "memory_gb": 7.721559524536133, "step_time_ms": 7577.300548553467, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:13:31] (step=0001027) Train Loss: 0.2772, Train Steps/Sec: 0.12, Epoch: 0.019957248348231635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:13:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1028, "loss": 0.2991877496242523, "memory_gb": 7.721559524536133, "step_time_ms": 7457.4596881866455, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:13:39] (step=0001028) Train Loss: 0.2668, Train Steps/Sec: 0.12, Epoch: 0.019976680917217257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:13:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1029, "loss": 0.25748366117477417, "memory_gb": 7.721559524536133, "step_time_ms": 7526.9775390625, "trainable_params": 4718592, "method": "lora"} 
[2025-07-28 20:13:47] (step=0001029) Train Loss: 0.2576, Train Steps/Sec: 0.12, Epoch: 0.019996113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:13:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1030, "loss": 0.14243707060813904, "memory_gb": 7.721559524536133, "step_time_ms": 7485.027551651001, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:13:55] (step=0001030) Train Loss: 0.2074, Train Steps/Sec: 0.12, Epoch: 0.020015546055188495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1031, "loss": 0.20068323612213135, "memory_gb": 7.721559524536133, "step_time_ms": 7494.418382644653, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:03] (step=0001031) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.020034978624174117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:14:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1032, "loss": 0.19532066583633423, "memory_gb": 7.721559524536133, "step_time_ms": 7461.925745010376, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:11] (step=0001032) Train Loss: 0.2344, Train Steps/Sec: 0.12, Epoch: 0.020054411193159736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:14:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1033, "loss": 0.23802146315574646, "memory_gb": 7.721559524536133, "step_time_ms": 7501.6279220581055, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:19] (step=0001033) Train Loss: 0.2519, Train Steps/Sec: 0.12, Epoch: 0.020073843762145355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:14:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1034, "loss": 0.2631318271160126, "memory_gb": 7.721559524536133, "step_time_ms": 7524.454593658447, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:27] (step=0001034) Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.020093276331130977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 
20:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1035, "loss": 0.25406235456466675, "memory_gb": 7.721559524536133, "step_time_ms": 7432.344913482666, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:35] (step=0001035) Train Loss: 0.2971, Train Steps/Sec: 0.12, Epoch: 0.020112708900116596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:14:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1036, "loss": 0.21694524586200714, "memory_gb": 7.721559524536133, "step_time_ms": 7499.579429626465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:43] (step=0001036) Train Loss: 0.2367, Train Steps/Sec: 0.12, Epoch: 0.020132141469102215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:14:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1037, "loss": 0.19551753997802734, "memory_gb": 7.721559524536133, "step_time_ms": 7428.369998931885, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:51] (step=0001037) Train Loss: 0.2213, Train Steps/Sec: 0.13, Epoch: 0.020151574038087837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:14:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1038, "loss": 0.2916998863220215, "memory_gb": 7.721559524536133, "step_time_ms": 7380.704879760742, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:14:59] (step=0001038) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.020171006607073456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:15:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1039, "loss": 0.322860449552536, "memory_gb": 7.721559524536133, "step_time_ms": 7455.105304718018, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:15:08] (step=0001039) Train Loss: 0.2913, Train Steps/Sec: 0.12, Epoch: 0.020190439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:15:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1040, "loss": 0.27592891454696655, "memory_gb": 7.721559524536133, "step_time_ms": 7425.102710723877, "trainable_params": 
4718592, "method": "lora"} [2025-07-28 20:15:16] (step=0001040) Train Loss: 0.3011, Train Steps/Sec: 0.12, Epoch: 0.020209871745044693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:15:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1041, "loss": 0.23738807439804077, "memory_gb": 7.721559524536133, "step_time_ms": 7447.256088256836, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:15:24] (step=0001041) Train Loss: 0.2338, Train Steps/Sec: 0.12, Epoch: 0.020229304314030316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:15:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1042, "loss": 0.2742345929145813, "memory_gb": 7.721559524536133, "step_time_ms": 7483.815908432007, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:15:32] (step=0001042) Train Loss: 0.2131, Train Steps/Sec: 0.12, Epoch: 0.020248736883015934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:15:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1043, "loss": 0.2770087718963623, "memory_gb": 7.721559524536133, "step_time_ms": 7420.656442642212, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:15:40] (step=0001043) Train Loss: 0.2637, Train Steps/Sec: 0.12, Epoch: 0.020268169452001553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:15:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1044, "loss": 0.31846198439598083, "memory_gb": 7.721559524536133, "step_time_ms": 7434.021234512329, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:15:48] (step=0001044) Train Loss: 0.3072, Train Steps/Sec: 0.12, Epoch: 0.020287602020987176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:15:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1045, "loss": 0.3272033929824829, "memory_gb": 7.721559524536133, "step_time_ms": 7449.47361946106, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:15:56] (step=0001045) Train Loss: 0.3092, Train Steps/Sec: 0.13, Epoch: 0.020307034589972794, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-28 20:16:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1046, "loss": 0.17897823452949524, "memory_gb": 7.715639114379883, "step_time_ms": 7419.636249542236, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:04] (step=0001046) Train Loss: 0.1973, Train Steps/Sec: 0.12, Epoch: 0.020326467158958413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1047, "loss": 0.19397485256195068, "memory_gb": 7.721559524536133, "step_time_ms": 7314.702749252319, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:12] (step=0001047) Train Loss: 0.2119, Train Steps/Sec: 0.13, Epoch: 0.020345899727944036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:16:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1048, "loss": 0.1315409243106842, "memory_gb": 7.721559524536133, "step_time_ms": 7531.616687774658, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:20] (step=0001048) Train Loss: 0.1660, Train Steps/Sec: 0.12, Epoch: 0.020365332296929654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1049, "loss": 0.14855721592903137, "memory_gb": 7.721559524536133, "step_time_ms": 5141.492128372192, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:25] (step=0001049) Train Loss: 0.2074, Train Steps/Sec: 0.18, Epoch: 0.020384764865915273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:16:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1050, "loss": 0.20099487900733948, "memory_gb": 7.721559524536133, "step_time_ms": 7498.703479766846, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:34] (step=0001050) Train Loss: 0.2332, Train Steps/Sec: 0.12, Epoch: 0.020404197434900895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:16:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1051, "loss": 0.23318663239479065, "memory_gb": 7.721559524536133, "step_time_ms": 
7423.2306480407715, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:42] (step=0001051) Train Loss: 0.2298, Train Steps/Sec: 0.12, Epoch: 0.020423630003886514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:16:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1052, "loss": 0.24719467759132385, "memory_gb": 7.721559524536133, "step_time_ms": 7445.06049156189, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:50] (step=0001052) Train Loss: 0.2583, Train Steps/Sec: 0.12, Epoch: 0.020443062572872133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:16:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1053, "loss": 0.23341737687587738, "memory_gb": 7.721559524536133, "step_time_ms": 7558.151721954346, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:16:58] (step=0001053) Train Loss: 0.2585, Train Steps/Sec: 0.12, Epoch: 0.020462495141857752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:17:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1054, "loss": 0.183390811085701, "memory_gb": 7.721559524536133, "step_time_ms": 7487.336158752441, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:17:06] (step=0001054) Train Loss: 0.2683, Train Steps/Sec: 0.12, Epoch: 0.020481927710843374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:17:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1055, "loss": 0.14296478033065796, "memory_gb": 7.721559524536133, "step_time_ms": 7464.136838912964, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:17:14] (step=0001055) Train Loss: 0.1790, Train Steps/Sec: 0.13, Epoch: 0.020501360279828993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:17:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1056, "loss": 0.22601251304149628, "memory_gb": 7.721559524536133, "step_time_ms": 7563.633918762207, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:17:22] (step=0001056) Train Loss: 0.2638, Train Steps/Sec: 0.12, Epoch: 
0.020520792848814612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:17:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1057, "loss": 0.21203388273715973, "memory_gb": 7.721559524536133, "step_time_ms": 7544.5556640625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:17:30] (step=0001057) Train Loss: 0.2437, Train Steps/Sec: 0.12, Epoch: 0.020540225417800234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1058, "loss": 0.34174275398254395, "memory_gb": 7.721559524536133, "step_time_ms": 7458.93931388855, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:17:38] (step=0001058) Train Loss: 0.2700, Train Steps/Sec: 0.12, Epoch: 0.020559657986785853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:17:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1059, "loss": 0.2203306257724762, "memory_gb": 7.721559524536133, "step_time_ms": 7544.867038726807, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:17:46] (step=0001059) Train Loss: 0.2065, Train Steps/Sec: 0.12, Epoch: 0.020579090555771472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:17:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1060, "loss": 0.24678057432174683, "memory_gb": 7.721559524536133, "step_time_ms": 7535.832643508911, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:17:54] (step=0001060) Train Loss: 0.2619, Train Steps/Sec: 0.12, Epoch: 0.020598523124757094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1061, "loss": 0.14821383357048035, "memory_gb": 7.721559524536133, "step_time_ms": 7512.043237686157, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:02] (step=0001061) Train Loss: 0.1782, Train Steps/Sec: 0.12, Epoch: 0.020617955693742713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1062, "loss": 0.18398644030094147, 
"memory_gb": 7.721559524536133, "step_time_ms": 7544.627904891968, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:10] (step=0001062) Train Loss: 0.2291, Train Steps/Sec: 0.12, Epoch: 0.02063738826272833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1063, "loss": 0.2210782766342163, "memory_gb": 7.721559524536133, "step_time_ms": 7547.987699508667, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:18] (step=0001063) Train Loss: 0.2018, Train Steps/Sec: 0.13, Epoch: 0.020656820831713954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1064, "loss": 0.2878086268901825, "memory_gb": 7.721559524536133, "step_time_ms": 7540.1763916015625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:26] (step=0001064) Train Loss: 0.2876, Train Steps/Sec: 0.12, Epoch: 0.020676253400699573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1065, "loss": 0.25770169496536255, "memory_gb": 7.721559524536133, "step_time_ms": 7578.204393386841, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:34] (step=0001065) Train Loss: 0.2158, Train Steps/Sec: 0.12, Epoch: 0.02069568596968519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1066, "loss": 0.26295536756515503, "memory_gb": 7.721559524536133, "step_time_ms": 7507.91597366333, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:42] (step=0001066) Train Loss: 0.2690, Train Steps/Sec: 0.12, Epoch: 0.020715118538670814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1067, "loss": 0.24937587976455688, "memory_gb": 7.721559524536133, "step_time_ms": 7454.179286956787, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:50] (step=0001067) Train Loss: 0.2749, 
Train Steps/Sec: 0.13, Epoch: 0.020734551107656433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:18:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1068, "loss": 0.32737812399864197, "memory_gb": 7.721559524536133, "step_time_ms": 7598.175525665283, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:18:59] (step=0001068) Train Loss: 0.2831, Train Steps/Sec: 0.12, Epoch: 0.02075398367664205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:19:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1069, "loss": 0.2226032316684723, "memory_gb": 7.721559524536133, "step_time_ms": 7711.012840270996, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:19:07] (step=0001069) Train Loss: 0.2212, Train Steps/Sec: 0.12, Epoch: 0.02077341624562767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:19:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1070, "loss": 0.2660020589828491, "memory_gb": 7.721559524536133, "step_time_ms": 7516.504764556885, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:19:15] (step=0001070) Train Loss: 0.2673, Train Steps/Sec: 0.12, Epoch: 0.020792848814613293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:19:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1071, "loss": 0.14788924157619476, "memory_gb": 7.721559524536133, "step_time_ms": 7571.630954742432, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:19:23] (step=0001071) Train Loss: 0.1603, Train Steps/Sec: 0.12, Epoch: 0.02081228138359891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:19:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1072, "loss": 0.17228063941001892, "memory_gb": 7.721559524536133, "step_time_ms": 7528.737545013428, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:19:31] (step=0001072) Train Loss: 0.2438, Train Steps/Sec: 0.12, Epoch: 0.02083171395258453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:19:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1073, "loss": 
0.35633593797683716, "memory_gb": 7.721559524536133, "step_time_ms": 7586.265802383423, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:19:39] (step=0001073) Train Loss: 0.3277, Train Steps/Sec: 0.12, Epoch: 0.020851146521570153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:19:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1074, "loss": 0.1915181279182434, "memory_gb": 7.721559524536133, "step_time_ms": 7599.162817001343, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:19:47] (step=0001074) Train Loss: 0.2449, Train Steps/Sec: 0.12, Epoch: 0.02087057909055577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:19:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1075, "loss": 0.28350961208343506, "memory_gb": 7.721559524536133, "step_time_ms": 7537.206411361694, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:19:55] (step=0001075) Train Loss: 0.2499, Train Steps/Sec: 0.12, Epoch: 0.02089001165954139, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1076, "loss": 0.2722470164299011, "memory_gb": 7.721559524536133, "step_time_ms": 7316.652536392212, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:03] (step=0001076) Train Loss: 0.2810, Train Steps/Sec: 0.13, Epoch: 0.020909444228527013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1077, "loss": 0.27352187037467957, "memory_gb": 7.721559524536133, "step_time_ms": 7528.435468673706, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:11] (step=0001077) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.02092887679751263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1078, "loss": 0.19430339336395264, "memory_gb": 7.721559524536133, "step_time_ms": 5034.78479385376, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:17] (step=0001078) 
Train Loss: 0.1920, Train Steps/Sec: 0.17, Epoch: 0.02094830936649825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1079, "loss": 0.24180841445922852, "memory_gb": 7.721559524536133, "step_time_ms": 7547.854900360107, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:25] (step=0001079) Train Loss: 0.2142, Train Steps/Sec: 0.12, Epoch: 0.020967741935483872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1080, "loss": 0.3506819009780884, "memory_gb": 7.721559524536133, "step_time_ms": 7384.852409362793, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:33] (step=0001080) Train Loss: 0.3202, Train Steps/Sec: 0.13, Epoch: 0.02098717450446949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1081, "loss": 0.20142558217048645, "memory_gb": 7.721559524536133, "step_time_ms": 7461.197376251221, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:41] (step=0001081) Train Loss: 0.1820, Train Steps/Sec: 0.12, Epoch: 0.02100660707345511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1082, "loss": 0.2549065351486206, "memory_gb": 7.721559524536133, "step_time_ms": 7511.75856590271, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:49] (step=0001082) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.021026039642440732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:20:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1083, "loss": 0.27567967772483826, "memory_gb": 7.721559524536133, "step_time_ms": 7421.524286270142, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:20:57] (step=0001083) Train Loss: 0.2913, Train Steps/Sec: 0.12, Epoch: 0.02104547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:21:05] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 1084, "loss": 0.2178701013326645, "memory_gb": 7.721559524536133, "step_time_ms": 7482.630968093872, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:21:05] (step=0001084) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.02106490478041197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:21:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1085, "loss": 0.15739858150482178, "memory_gb": 7.721559524536133, "step_time_ms": 7499.735116958618, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:21:13] (step=0001085) Train Loss: 0.1736, Train Steps/Sec: 0.12, Epoch: 0.02108433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:21:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1086, "loss": 0.2450563907623291, "memory_gb": 7.721559524536133, "step_time_ms": 7504.426956176758, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:21:21] (step=0001086) Train Loss: 0.2758, Train Steps/Sec: 0.12, Epoch: 0.02110376991838321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:21:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1087, "loss": 0.21422165632247925, "memory_gb": 7.721559524536133, "step_time_ms": 7411.777973175049, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:21:29] (step=0001087) Train Loss: 0.2248, Train Steps/Sec: 0.13, Epoch: 0.02112320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:21:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1088, "loss": 0.23068898916244507, "memory_gb": 7.715639114379883, "step_time_ms": 7438.594818115234, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:21:37] (step=0001088) Train Loss: 0.2515, Train Steps/Sec: 0.12, Epoch: 0.02114263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:21:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1089, "loss": 0.31847459077835083, "memory_gb": 7.721559524536133, "step_time_ms": 7419.769287109375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
20:21:45] (step=0001089) Train Loss: 0.2254, Train Steps/Sec: 0.13, Epoch: 0.02116206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:21:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1090, "loss": 0.21050970256328583, "memory_gb": 7.721559524536133, "step_time_ms": 7390.591621398926, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:21:53] (step=0001090) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 0.02118150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1091, "loss": 0.34286195039749146, "memory_gb": 7.721559524536133, "step_time_ms": 7464.74552154541, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:01] (step=0001091) Train Loss: 0.3153, Train Steps/Sec: 0.12, Epoch: 0.02120093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1092, "loss": 0.15943996608257294, "memory_gb": 7.721559524536133, "step_time_ms": 7429.79097366333, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:09] (step=0001092) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.02122036533229693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1093, "loss": 0.26231616735458374, "memory_gb": 7.721559524536133, "step_time_ms": 7438.8182163238525, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:17] (step=0001093) Train Loss: 0.3039, Train Steps/Sec: 0.12, Epoch: 0.02123979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1094, "loss": 0.17242467403411865, "memory_gb": 7.721559524536133, "step_time_ms": 7539.257049560547, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:25] (step=0001094) Train Loss: 0.1788, Train Steps/Sec: 0.12, Epoch: 0.02125923047026817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:33] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1095, "loss": 0.30843114852905273, "memory_gb": 7.721559524536133, "step_time_ms": 7445.7573890686035, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:33] (step=0001095) Train Loss: 0.2523, Train Steps/Sec: 0.12, Epoch: 0.02127866303925379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1096, "loss": 0.18254750967025757, "memory_gb": 7.721559524536133, "step_time_ms": 7466.213226318359, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:42] (step=0001096) Train Loss: 0.2709, Train Steps/Sec: 0.12, Epoch: 0.02129809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1097, "loss": 0.2589167058467865, "memory_gb": 7.721559524536133, "step_time_ms": 7523.386240005493, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:50] (step=0001097) Train Loss: 0.2929, Train Steps/Sec: 0.12, Epoch: 0.02131752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:22:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1098, "loss": 0.2273271530866623, "memory_gb": 7.721559524536133, "step_time_ms": 7509.531021118164, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:22:58] (step=0001098) Train Loss: 0.2095, Train Steps/Sec: 0.12, Epoch: 0.021336960746210647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:23:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1099, "loss": 0.28059524297714233, "memory_gb": 7.721559524536133, "step_time_ms": 7424.141883850098, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:23:06] (step=0001099) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.02135639331519627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:23:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1100, "loss": 0.2077222615480423, "memory_gb": 7.721559524536133, "step_time_ms": 7497.164964675903, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 20:23:14] (step=0001100) Train Loss: 0.2350, Train Steps/Sec: 0.12, Epoch: 0.02137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:23:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1101, "loss": 0.18111425638198853, "memory_gb": 7.721559524536133, "step_time_ms": 7474.127292633057, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:23:22] (step=0001101) Train Loss: 0.1870, Train Steps/Sec: 0.12, Epoch: 0.021395258453167507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:23:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1102, "loss": 0.28791046142578125, "memory_gb": 7.721559524536133, "step_time_ms": 7408.4577560424805, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:23:30] (step=0001102) Train Loss: 0.2719, Train Steps/Sec: 0.12, Epoch: 0.02141469102215313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:23:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1103, "loss": 0.24947993457317352, "memory_gb": 7.721559524536133, "step_time_ms": 7493.875026702881, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:23:38] (step=0001103) Train Loss: 0.2243, Train Steps/Sec: 0.12, Epoch: 0.02143412359113875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:23:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1104, "loss": 0.35503023862838745, "memory_gb": 7.721559524536133, "step_time_ms": 7486.2730503082275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:23:46] (step=0001104) Train Loss: 0.2723, Train Steps/Sec: 0.12, Epoch: 0.021453556160124367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:23:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1105, "loss": 0.16453078389167786, "memory_gb": 7.721559524536133, "step_time_ms": 7296.648979187012, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:23:54] (step=0001105) Train Loss: 0.2073, Train Steps/Sec: 0.13, Epoch: 0.02147298872910999, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 20:24:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1106, "loss": 0.26218706369400024, "memory_gb": 7.721559524536133, "step_time_ms": 7510.72883605957, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:02] (step=0001106) Train Loss: 0.2251, Train Steps/Sec: 0.13, Epoch: 0.02149242129809561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:24:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1107, "loss": 0.34398505091667175, "memory_gb": 7.721559524536133, "step_time_ms": 5335.920333862305, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:08] (step=0001107) Train Loss: 0.2546, Train Steps/Sec: 0.18, Epoch: 0.021511853867081227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:24:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1108, "loss": 0.30776703357696533, "memory_gb": 7.721559524536133, "step_time_ms": 7538.737058639526, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:16] (step=0001108) Train Loss: 0.2180, Train Steps/Sec: 0.12, Epoch: 0.02153128643606685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:24:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1109, "loss": 0.1552586704492569, "memory_gb": 7.721559524536133, "step_time_ms": 7468.883514404297, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:24] (step=0001109) Train Loss: 0.1602, Train Steps/Sec: 0.12, Epoch: 0.02155071900505247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:24:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1110, "loss": 0.298850953578949, "memory_gb": 7.721559524536133, "step_time_ms": 7514.818429946899, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:32] (step=0001110) Train Loss: 0.3174, Train Steps/Sec: 0.12, Epoch: 0.021570151574038087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:24:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1111, "loss": 0.221989244222641, "memory_gb": 7.721559524536133, "step_time_ms": 7511.192321777344, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:40] (step=0001111) Train Loss: 0.2064, Train Steps/Sec: 0.12, Epoch: 0.02158958414302371, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:24:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1112, "loss": 0.3200165033340454, "memory_gb": 7.721559524536133, "step_time_ms": 7453.362464904785, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:48] (step=0001112) Train Loss: 0.2412, Train Steps/Sec: 0.12, Epoch: 0.021609016712009328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:24:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1113, "loss": 0.22594577074050903, "memory_gb": 7.721559524536133, "step_time_ms": 7463.876724243164, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:24:56] (step=0001113) Train Loss: 0.2526, Train Steps/Sec: 0.12, Epoch: 0.021628449280994947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:25:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1114, "loss": 0.231724351644516, "memory_gb": 7.721559524536133, "step_time_ms": 7545.793533325195, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:25:04] (step=0001114) Train Loss: 0.2577, Train Steps/Sec: 0.12, Epoch: 0.021647881849980566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:25:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1115, "loss": 0.18210378289222717, "memory_gb": 7.721559524536133, "step_time_ms": 7526.772737503052, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:25:12] (step=0001115) Train Loss: 0.1628, Train Steps/Sec: 0.12, Epoch: 0.021667314418966188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:25:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1116, "loss": 0.1950226128101349, "memory_gb": 7.721559524536133, "step_time_ms": 7565.50407409668, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:25:20] (step=0001116) Train Loss: 0.2408, Train Steps/Sec: 0.12, Epoch: 0.021686746987951807, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 20:25:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1117, "loss": 0.31410789489746094, "memory_gb": 7.721559524536133, "step_time_ms": 7564.88037109375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:25:28] (step=0001117) Train Loss: 0.2420, Train Steps/Sec: 0.12, Epoch: 0.021706179556937426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:25:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1118, "loss": 0.3046531081199646, "memory_gb": 7.721559524536133, "step_time_ms": 7613.847255706787, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:25:36] (step=0001118) Train Loss: 0.2625, Train Steps/Sec: 0.12, Epoch: 0.021725612125923048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:25:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1119, "loss": 0.3362734317779541, "memory_gb": 7.721559524536133, "step_time_ms": 7488.916397094727, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:25:44] (step=0001119) Train Loss: 0.2813, Train Steps/Sec: 0.12, Epoch: 0.021745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:25:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1120, "loss": 0.3909616470336914, "memory_gb": 7.721559524536133, "step_time_ms": 7623.9588260650635, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:25:52] (step=0001120) Train Loss: 0.3255, Train Steps/Sec: 0.12, Epoch: 0.021764477263894286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1121, "loss": 0.26539546251296997, "memory_gb": 7.721559524536133, "step_time_ms": 7478.118181228638, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:00] (step=0001121) Train Loss: 0.1920, Train Steps/Sec: 0.12, Epoch: 0.021783909832879908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1122, "loss": 0.22460372745990753, "memory_gb": 7.721559524536133, "step_time_ms": 
7479.639768600464, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:08] (step=0001122) Train Loss: 0.3040, Train Steps/Sec: 0.12, Epoch: 0.021803342401865527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1123, "loss": 0.18660223484039307, "memory_gb": 7.721559524536133, "step_time_ms": 7557.143926620483, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:16] (step=0001123) Train Loss: 0.2219, Train Steps/Sec: 0.12, Epoch: 0.021822774970851146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1124, "loss": 0.2234221249818802, "memory_gb": 7.721559524536133, "step_time_ms": 7553.096771240234, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:24] (step=0001124) Train Loss: 0.2382, Train Steps/Sec: 0.13, Epoch: 0.021842207539836768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1125, "loss": 0.2600352466106415, "memory_gb": 7.721559524536133, "step_time_ms": 7468.970537185669, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:32] (step=0001125) Train Loss: 0.2394, Train Steps/Sec: 0.13, Epoch: 0.021861640108822387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1126, "loss": 0.23572401702404022, "memory_gb": 7.721559524536133, "step_time_ms": 7558.4716796875, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:40] (step=0001126) Train Loss: 0.2179, Train Steps/Sec: 0.12, Epoch: 0.021881072677808006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1127, "loss": 0.2852063775062561, "memory_gb": 7.721559524536133, "step_time_ms": 7527.286052703857, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:49] (step=0001127) Train Loss: 0.2898, Train Steps/Sec: 0.12, Epoch: 0.021900505246793624, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:26:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1128, "loss": 0.31415778398513794, "memory_gb": 7.721559524536133, "step_time_ms": 7490.338563919067, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:26:57] (step=0001128) Train Loss: 0.2779, Train Steps/Sec: 0.12, Epoch: 0.021919937815779247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1129, "loss": 0.2720264196395874, "memory_gb": 7.721559524536133, "step_time_ms": 7611.947774887085, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:05] (step=0001129) Train Loss: 0.2522, Train Steps/Sec: 0.12, Epoch: 0.021939370384764866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1130, "loss": 0.21336738765239716, "memory_gb": 7.721559524536133, "step_time_ms": 7561.481237411499, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:13] (step=0001130) Train Loss: 0.1865, Train Steps/Sec: 0.12, Epoch: 0.021958802953750484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1131, "loss": 0.1923075020313263, "memory_gb": 7.721559524536133, "step_time_ms": 7510.616064071655, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:21] (step=0001131) Train Loss: 0.2108, Train Steps/Sec: 0.12, Epoch: 0.021978235522736107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1132, "loss": 0.2541377544403076, "memory_gb": 7.721559524536133, "step_time_ms": 7632.81512260437, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:29] (step=0001132) Train Loss: 0.2394, Train Steps/Sec: 0.12, Epoch: 0.021997668091721725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1133, "loss": 0.35165250301361084, "memory_gb": 
7.721559524536133, "step_time_ms": 7601.285457611084, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:37] (step=0001133) Train Loss: 0.3629, Train Steps/Sec: 0.12, Epoch: 0.022017100660707344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1134, "loss": 0.2291404902935028, "memory_gb": 7.721559524536133, "step_time_ms": 7420.389175415039, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:45] (step=0001134) Train Loss: 0.2118, Train Steps/Sec: 0.13, Epoch: 0.022036533229692967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1135, "loss": 0.23702211678028107, "memory_gb": 7.721559524536133, "step_time_ms": 7589.965581893921, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:53] (step=0001135) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.022055965798678585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:27:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1136, "loss": 0.2783651351928711, "memory_gb": 7.721559524536133, "step_time_ms": 5175.650119781494, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:27:59] (step=0001136) Train Loss: 0.2669, Train Steps/Sec: 0.18, Epoch: 0.022075398367664204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:28:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1137, "loss": 0.1814710795879364, "memory_gb": 7.721559524536133, "step_time_ms": 7561.320066452026, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:28:07] (step=0001137) Train Loss: 0.2103, Train Steps/Sec: 0.12, Epoch: 0.022094830936649826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:28:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1138, "loss": 0.22374331951141357, "memory_gb": 7.721559524536133, "step_time_ms": 7465.128421783447, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:28:15] (step=0001138) Train Loss: 0.2306, Train 
Steps/Sec: 0.12, Epoch: 0.022114263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:28:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1139, "loss": 0.2093799114227295, "memory_gb": 7.721559524536133, "step_time_ms": 7462.513446807861, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:28:23] (step=0001139) Train Loss: 0.1906, Train Steps/Sec: 0.12, Epoch: 0.022133696074621064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:28:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1140, "loss": 0.2473229318857193, "memory_gb": 7.721559524536133, "step_time_ms": 7532.17339515686, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:28:31] (step=0001140) Train Loss: 0.2798, Train Steps/Sec: 0.12, Epoch: 0.022153128643606686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:28:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1141, "loss": 0.3293222486972809, "memory_gb": 7.721559524536133, "step_time_ms": 7530.236721038818, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:28:39] (step=0001141) Train Loss: 0.3337, Train Steps/Sec: 0.12, Epoch: 0.022172561212592305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:28:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1142, "loss": 0.30785617232322693, "memory_gb": 7.721559524536133, "step_time_ms": 7495.30553817749, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:28:47] (step=0001142) Train Loss: 0.2479, Train Steps/Sec: 0.12, Epoch: 0.022191993781577924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:28:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1143, "loss": 0.1726594716310501, "memory_gb": 7.721559524536133, "step_time_ms": 7570.9216594696045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:28:55] (step=0001143) Train Loss: 0.2300, Train Steps/Sec: 0.12, Epoch: 0.022211426350563543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1144, "loss": 
0.29935982823371887, "memory_gb": 7.721559524536133, "step_time_ms": 7450.466156005859, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:03] (step=0001144) Train Loss: 0.2986, Train Steps/Sec: 0.13, Epoch: 0.022230858919549165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1145, "loss": 0.19967855513095856, "memory_gb": 7.721559524536133, "step_time_ms": 7233.680486679077, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:11] (step=0001145) Train Loss: 0.2353, Train Steps/Sec: 0.13, Epoch: 0.022250291488534784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1146, "loss": 0.264291912317276, "memory_gb": 7.721559524536133, "step_time_ms": 7492.19536781311, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:19] (step=0001146) Train Loss: 0.2240, Train Steps/Sec: 0.12, Epoch: 0.022269724057520403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1147, "loss": 0.1470404416322708, "memory_gb": 7.721559524536133, "step_time_ms": 7474.485874176025, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:27] (step=0001147) Train Loss: 0.2357, Train Steps/Sec: 0.12, Epoch: 0.022289156626506025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1148, "loss": 0.26387321949005127, "memory_gb": 7.721559524536133, "step_time_ms": 7468.859434127808, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:35] (step=0001148) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.022308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1149, "loss": 0.23951224982738495, "memory_gb": 7.721559524536133, "step_time_ms": 7505.1984786987305, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:43] (step=0001149) 
Train Loss: 0.2951, Train Steps/Sec: 0.12, Epoch: 0.022328021764477263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1150, "loss": 0.15628233551979065, "memory_gb": 7.721559524536133, "step_time_ms": 7426.553726196289, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:51] (step=0001150) Train Loss: 0.1835, Train Steps/Sec: 0.12, Epoch: 0.022347454333462885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:29:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1151, "loss": 0.32709184288978577, "memory_gb": 7.721559524536133, "step_time_ms": 7438.557386398315, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:29:59] (step=0001151) Train Loss: 0.2894, Train Steps/Sec: 0.12, Epoch: 0.022366886902448504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:30:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1152, "loss": 0.2660117447376251, "memory_gb": 7.721559524536133, "step_time_ms": 7502.6116371154785, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:30:07] (step=0001152) Train Loss: 0.2363, Train Steps/Sec: 0.12, Epoch: 0.022386319471434123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:30:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1153, "loss": 0.2525542378425598, "memory_gb": 7.721559524536133, "step_time_ms": 7501.446008682251, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:30:15] (step=0001153) Train Loss: 0.2272, Train Steps/Sec: 0.12, Epoch: 0.022405752040419745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:30:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1154, "loss": 0.3239821791648865, "memory_gb": 7.715639114379883, "step_time_ms": 7411.199569702148, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:30:23] (step=0001154) Train Loss: 0.3233, Train Steps/Sec: 0.12, Epoch: 0.022425184609405364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:30:31] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 1155, "loss": 0.3427622318267822, "memory_gb": 7.721559524536133, "step_time_ms": 7513.530254364014, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:30:31] (step=0001155) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.022444617178390983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:30:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1156, "loss": 0.1721433699131012, "memory_gb": 7.721559524536133, "step_time_ms": 7436.20228767395, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:30:39] (step=0001156) Train Loss: 0.1968, Train Steps/Sec: 0.12, Epoch: 0.022464049747376605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1157, "loss": 0.27056393027305603, "memory_gb": 7.715639114379883, "step_time_ms": 7517.819404602051, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:30:47] (step=0001157) Train Loss: 0.2951, Train Steps/Sec: 0.13, Epoch: 0.022483482316362224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:30:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1158, "loss": 0.31598493456840515, "memory_gb": 7.721559524536133, "step_time_ms": 7485.43119430542, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:30:56] (step=0001158) Train Loss: 0.3133, Train Steps/Sec: 0.12, Epoch: 0.022502914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1159, "loss": 0.2747320532798767, "memory_gb": 7.721559524536133, "step_time_ms": 7438.254117965698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:31:04] (step=0001159) Train Loss: 0.3156, Train Steps/Sec: 0.12, Epoch: 0.02252234745433346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1160, "loss": 0.27413210272789, "memory_gb": 7.721559524536133, "step_time_ms": 7487.1625900268555, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
20:31:12] (step=0001160) Train Loss: 0.2050, Train Steps/Sec: 0.12, Epoch: 0.022541780023319084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1161, "loss": 0.25372377038002014, "memory_gb": 7.721559524536133, "step_time_ms": 7506.602764129639, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:31:20] (step=0001161) Train Loss: 0.2348, Train Steps/Sec: 0.12, Epoch: 0.022561212592304702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1162, "loss": 0.2560943365097046, "memory_gb": 7.721559524536133, "step_time_ms": 7484.3690395355225, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:31:28] (step=0001162) Train Loss: 0.2784, Train Steps/Sec: 0.12, Epoch: 0.02258064516129032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1163, "loss": 0.22854481637477875, "memory_gb": 7.721559524536133, "step_time_ms": 7315.669536590576, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:31:36] (step=0001163) Train Loss: 0.1924, Train Steps/Sec: 0.13, Epoch: 0.022600077730275944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1164, "loss": 0.1564059853553772, "memory_gb": 7.721559524536133, "step_time_ms": 7518.6872482299805, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:31:44] (step=0001164) Train Loss: 0.1935, Train Steps/Sec: 0.12, Epoch: 0.022619510299261562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1165, "loss": 0.16639642417430878, "memory_gb": 7.721559524536133, "step_time_ms": 5076.088666915894, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:31:50] (step=0001165) Train Loss: 0.2118, Train Steps/Sec: 0.17, Epoch: 0.02263894286824718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:31:58] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1166, "loss": 0.3122069537639618, "memory_gb": 7.721559524536133, "step_time_ms": 7543.407440185547, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:31:58] (step=0001166) Train Loss: 0.3017, Train Steps/Sec: 0.12, Epoch: 0.022658375437232803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:32:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1167, "loss": 0.1911807358264923, "memory_gb": 7.721559524536133, "step_time_ms": 7457.46111869812, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:32:06] (step=0001167) Train Loss: 0.2280, Train Steps/Sec: 0.13, Epoch: 0.022677808006218422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:32:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1168, "loss": 0.15597933530807495, "memory_gb": 7.721559524536133, "step_time_ms": 7473.022222518921, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:32:14] (step=0001168) Train Loss: 0.2004, Train Steps/Sec: 0.12, Epoch: 0.02269724057520404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:32:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1169, "loss": 0.2241898775100708, "memory_gb": 7.721559524536133, "step_time_ms": 7564.440488815308, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:32:22] (step=0001169) Train Loss: 0.2307, Train Steps/Sec: 0.12, Epoch: 0.022716673144189663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:32:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1170, "loss": 0.3226258158683777, "memory_gb": 7.721559524536133, "step_time_ms": 7498.162031173706, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:32:30] (step=0001170) Train Loss: 0.2362, Train Steps/Sec: 0.12, Epoch: 0.022736105713175282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:32:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1171, "loss": 0.2681969106197357, "memory_gb": 7.721559524536133, "step_time_ms": 7555.931329727173, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 20:32:38] (step=0001171) Train Loss: 0.2959, Train Steps/Sec: 0.12, Epoch: 0.0227555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:32:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1172, "loss": 0.1721685379743576, "memory_gb": 7.721559524536133, "step_time_ms": 7575.545787811279, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:32:46] (step=0001172) Train Loss: 0.2087, Train Steps/Sec: 0.12, Epoch: 0.02277497085114652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:32:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1173, "loss": 0.19257977604866028, "memory_gb": 7.721559524536133, "step_time_ms": 7470.193862915039, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:32:54] (step=0001173) Train Loss: 0.1862, Train Steps/Sec: 0.12, Epoch: 0.022794403420132142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:33:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1174, "loss": 0.2554313540458679, "memory_gb": 7.721559524536133, "step_time_ms": 7442.172288894653, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:02] (step=0001174) Train Loss: 0.3113, Train Steps/Sec: 0.12, Epoch: 0.02281383598911776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:33:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1175, "loss": 0.19714829325675964, "memory_gb": 7.721559524536133, "step_time_ms": 7522.884368896484, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:10] (step=0001175) Train Loss: 0.2315, Train Steps/Sec: 0.12, Epoch: 0.02283326855810338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:33:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1176, "loss": 0.20781931281089783, "memory_gb": 7.721559524536133, "step_time_ms": 7512.610197067261, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:18] (step=0001176) Train Loss: 0.2103, Train Steps/Sec: 0.12, Epoch: 0.022852701127089002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 20:33:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1177, "loss": 0.23863404989242554, "memory_gb": 7.721559524536133, "step_time_ms": 7490.232467651367, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:26] (step=0001177) Train Loss: 0.2176, Train Steps/Sec: 0.12, Epoch: 0.02287213369607462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:33:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1178, "loss": 0.1433354914188385, "memory_gb": 7.721559524536133, "step_time_ms": 7536.318778991699, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:34] (step=0001178) Train Loss: 0.2290, Train Steps/Sec: 0.12, Epoch: 0.02289156626506024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:33:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1179, "loss": 0.23232993483543396, "memory_gb": 7.721559524536133, "step_time_ms": 7510.869979858398, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:42] (step=0001179) Train Loss: 0.3219, Train Steps/Sec: 0.12, Epoch: 0.022910998834045862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:33:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1180, "loss": 0.17355971038341522, "memory_gb": 7.721559524536133, "step_time_ms": 7492.812395095825, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:50] (step=0001180) Train Loss: 0.2370, Train Steps/Sec: 0.12, Epoch: 0.02293043140303148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:33:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1181, "loss": 0.3816320300102234, "memory_gb": 7.721559524536133, "step_time_ms": 7589.218378067017, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:33:58] (step=0001181) Train Loss: 0.3122, Train Steps/Sec: 0.12, Epoch: 0.0229498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:34:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1182, "loss": 0.17750488221645355, "memory_gb": 7.721559524536133, "step_time_ms": 7566.416263580322, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 20:34:07] (step=0001182) Train Loss: 0.1793, Train Steps/Sec: 0.12, Epoch: 0.022969296541002722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:34:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1183, "loss": 0.2184390127658844, "memory_gb": 7.721559524536133, "step_time_ms": 7505.5201053619385, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:34:15] (step=0001183) Train Loss: 0.2589, Train Steps/Sec: 0.13, Epoch: 0.02298872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:34:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1184, "loss": 0.2928175926208496, "memory_gb": 7.721559524536133, "step_time_ms": 7598.20032119751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:34:23] (step=0001184) Train Loss: 0.2592, Train Steps/Sec: 0.12, Epoch: 0.02300816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:34:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1185, "loss": 0.08849795907735825, "memory_gb": 7.721559524536133, "step_time_ms": 7505.696773529053, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:34:30] (step=0001185) Train Loss: 0.2130, Train Steps/Sec: 0.13, Epoch: 0.023027594247959582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:34:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1186, "loss": 0.1994439661502838, "memory_gb": 7.721559524536133, "step_time_ms": 7494.733095169067, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:34:39] (step=0001186) Train Loss: 0.2205, Train Steps/Sec: 0.12, Epoch: 0.0230470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:34:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1187, "loss": 0.19545766711235046, "memory_gb": 7.721559524536133, "step_time_ms": 7537.7373695373535, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:34:47] (step=0001187) Train Loss: 0.2628, Train Steps/Sec: 0.12, Epoch: 0.02306645938593082, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 20:34:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1188, "loss": 0.2420332282781601, "memory_gb": 7.721559524536133, "step_time_ms": 7488.616228103638, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:34:55] (step=0001188) Train Loss: 0.2300, Train Steps/Sec: 0.12, Epoch: 0.02308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1189, "loss": 0.17337054014205933, "memory_gb": 7.721559524536133, "step_time_ms": 7515.014410018921, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:03] (step=0001189) Train Loss: 0.1797, Train Steps/Sec: 0.12, Epoch: 0.02310532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1190, "loss": 0.23193341493606567, "memory_gb": 7.721559524536133, "step_time_ms": 7631.979942321777, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:11] (step=0001190) Train Loss: 0.2585, Train Steps/Sec: 0.12, Epoch: 0.02312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1191, "loss": 0.184549480676651, "memory_gb": 7.721559524536133, "step_time_ms": 7483.967542648315, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:19] (step=0001191) Train Loss: 0.2419, Train Steps/Sec: 0.13, Epoch: 0.023144189661873298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1192, "loss": 0.2302432656288147, "memory_gb": 7.721559524536133, "step_time_ms": 7330.023527145386, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:27] (step=0001192) Train Loss: 0.2507, Train Steps/Sec: 0.13, Epoch: 0.02316362223085892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1193, "loss": 0.2564619779586792, "memory_gb": 7.721559524536133, "step_time_ms": 
7489.306211471558, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:35] (step=0001193) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.02318305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1194, "loss": 0.22492754459381104, "memory_gb": 7.721559524536133, "step_time_ms": 5038.452863693237, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:41] (step=0001194) Train Loss: 0.2492, Train Steps/Sec: 0.18, Epoch: 0.023202487368830158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1195, "loss": 0.22910502552986145, "memory_gb": 7.721559524536133, "step_time_ms": 7523.555755615234, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:49] (step=0001195) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.02322191993781578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:35:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1196, "loss": 0.23974481225013733, "memory_gb": 7.721559524536133, "step_time_ms": 7447.596311569214, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:35:57] (step=0001196) Train Loss: 0.2742, Train Steps/Sec: 0.12, Epoch: 0.0232413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:36:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1197, "loss": 0.20884627103805542, "memory_gb": 7.721559524536133, "step_time_ms": 7450.800895690918, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:36:05] (step=0001197) Train Loss: 0.2352, Train Steps/Sec: 0.12, Epoch: 0.023260785075787018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:36:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1198, "loss": 0.18642345070838928, "memory_gb": 7.721559524536133, "step_time_ms": 7467.7276611328125, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:36:13] (step=0001198) Train Loss: 0.2517, Train Steps/Sec: 0.12, Epoch: 
0.02328021764477264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:36:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1199, "loss": 0.23339098691940308, "memory_gb": 7.721559524536133, "step_time_ms": 7409.194707870483, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:36:21] (step=0001199) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.02329965021375826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:36:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1200, "loss": 0.11122030019760132, "memory_gb": 7.721559524536133, "step_time_ms": 7405.716896057129, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:36:29] (step=0001200) Train Loss: 0.1473, Train Steps/Sec: 0.12, Epoch: 0.023319082782743878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:36:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1201, "loss": 0.29483288526535034, "memory_gb": 7.721559524536133, "step_time_ms": 7492.501497268677, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:36:37] (step=0001201) Train Loss: 0.3332, Train Steps/Sec: 0.12, Epoch: 0.0233385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:36:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1202, "loss": 0.2897246479988098, "memory_gb": 7.721559524536133, "step_time_ms": 7463.576078414917, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:36:45] (step=0001202) Train Loss: 0.2756, Train Steps/Sec: 0.12, Epoch: 0.02335794792071512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:36:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1203, "loss": 0.22867394983768463, "memory_gb": 7.721559524536133, "step_time_ms": 7505.463123321533, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:36:53] (step=0001203) Train Loss: 0.2265, Train Steps/Sec: 0.12, Epoch: 0.023377380489700738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1204, "loss": 0.2566022574901581, 
"memory_gb": 7.721559524536133, "step_time_ms": 7572.52049446106, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:01] (step=0001204) Train Loss: 0.2316, Train Steps/Sec: 0.12, Epoch: 0.023396813058686357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1205, "loss": 0.21396377682685852, "memory_gb": 7.721559524536133, "step_time_ms": 7593.590021133423, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:09] (step=0001205) Train Loss: 0.2134, Train Steps/Sec: 0.13, Epoch: 0.02341624562767198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1206, "loss": 0.41665786504745483, "memory_gb": 7.721559524536133, "step_time_ms": 7428.531169891357, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:17] (step=0001206) Train Loss: 0.3312, Train Steps/Sec: 0.12, Epoch: 0.023435678196657598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1207, "loss": 0.24355119466781616, "memory_gb": 7.721559524536133, "step_time_ms": 7537.8124713897705, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:25] (step=0001207) Train Loss: 0.3005, Train Steps/Sec: 0.12, Epoch: 0.023455110765643217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1208, "loss": 0.25828999280929565, "memory_gb": 7.721559524536133, "step_time_ms": 7449.703693389893, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:33] (step=0001208) Train Loss: 0.2818, Train Steps/Sec: 0.12, Epoch: 0.02347454333462884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1209, "loss": 0.3045308291912079, "memory_gb": 7.721559524536133, "step_time_ms": 7464.40052986145, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:41] (step=0001209) Train Loss: 0.3163, 
Train Steps/Sec: 0.12, Epoch: 0.023493975903614458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1210, "loss": 0.19927167892456055, "memory_gb": 7.721559524536133, "step_time_ms": 7486.915826797485, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:49] (step=0001210) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.023513408472600077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:37:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1211, "loss": 0.2316630333662033, "memory_gb": 7.721559524536133, "step_time_ms": 7436.182737350464, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:37:58] (step=0001211) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.0235328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:38:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1212, "loss": 0.2846199870109558, "memory_gb": 7.721559524536133, "step_time_ms": 7464.897155761719, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:38:05] (step=0001212) Train Loss: 0.2707, Train Steps/Sec: 0.13, Epoch: 0.023552273610571318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:38:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1213, "loss": 0.21275335550308228, "memory_gb": 7.721559524536133, "step_time_ms": 7523.784875869751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:38:14] (step=0001213) Train Loss: 0.2170, Train Steps/Sec: 0.12, Epoch: 0.023571706179556937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:38:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1214, "loss": 0.2679358124732971, "memory_gb": 7.721559524536133, "step_time_ms": 7506.596088409424, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:38:22] (step=0001214) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.02359113874854256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:38:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1215, "loss": 
0.20341184735298157, "memory_gb": 7.721559524536133, "step_time_ms": 7439.813852310181, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:38:30] (step=0001215) Train Loss: 0.1780, Train Steps/Sec: 0.12, Epoch: 0.023610571317528178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:38:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1216, "loss": 0.29141873121261597, "memory_gb": 7.721559524536133, "step_time_ms": 7486.075401306152, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:38:38] (step=0001216) Train Loss: 0.2221, Train Steps/Sec: 0.12, Epoch: 0.023630003886513797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:38:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1217, "loss": 0.27229148149490356, "memory_gb": 7.721559524536133, "step_time_ms": 7396.302223205566, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:38:46] (step=0001217) Train Loss: 0.2539, Train Steps/Sec: 0.13, Epoch: 0.023649436455499415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:38:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1218, "loss": 0.18626192212104797, "memory_gb": 7.721559524536133, "step_time_ms": 7398.062229156494, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:38:54] (step=0001218) Train Loss: 0.2432, Train Steps/Sec: 0.12, Epoch: 0.023668869024485038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1219, "loss": 0.26338833570480347, "memory_gb": 7.721559524536133, "step_time_ms": 7464.9293422698975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:02] (step=0001219) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.023688301593470656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1220, "loss": 0.13014759123325348, "memory_gb": 7.721559524536133, "step_time_ms": 7387.79616355896, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:10] 
(step=0001220) Train Loss: 0.1472, Train Steps/Sec: 0.12, Epoch: 0.023707734162456275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1221, "loss": 0.2533224821090698, "memory_gb": 7.721559524536133, "step_time_ms": 7265.021800994873, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:18] (step=0001221) Train Loss: 0.2269, Train Steps/Sec: 0.13, Epoch: 0.023727166731441898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1222, "loss": 0.23971927165985107, "memory_gb": 7.721559524536133, "step_time_ms": 7491.808891296387, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:26] (step=0001222) Train Loss: 0.2456, Train Steps/Sec: 0.12, Epoch: 0.023746599300427516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1223, "loss": 0.3258764147758484, "memory_gb": 7.721559524536133, "step_time_ms": 4999.8674392700195, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:32] (step=0001223) Train Loss: 0.2848, Train Steps/Sec: 0.17, Epoch: 0.023766031869413135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1224, "loss": 0.30964457988739014, "memory_gb": 7.721559524536133, "step_time_ms": 7500.757217407227, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:40] (step=0001224) Train Loss: 0.3240, Train Steps/Sec: 0.12, Epoch: 0.023785464438398757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1225, "loss": 0.2279140055179596, "memory_gb": 7.721559524536133, "step_time_ms": 7493.20387840271, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:48] (step=0001225) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.023804897007384376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:39:56] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1226, "loss": 0.28756433725357056, "memory_gb": 7.721559524536133, "step_time_ms": 7444.576263427734, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:39:56] (step=0001226) Train Loss: 0.2378, Train Steps/Sec: 0.12, Epoch: 0.023824329576369995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:40:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1227, "loss": 0.26001328229904175, "memory_gb": 7.721559524536133, "step_time_ms": 7539.824724197388, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:40:04] (step=0001227) Train Loss: 0.2643, Train Steps/Sec: 0.12, Epoch: 0.023843762145355617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1228, "loss": 0.2238835096359253, "memory_gb": 7.721559524536133, "step_time_ms": 7420.886754989624, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:40:12] (step=0001228) Train Loss: 0.1974, Train Steps/Sec: 0.12, Epoch: 0.023863194714341236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:40:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1229, "loss": 0.3173316717147827, "memory_gb": 7.721559524536133, "step_time_ms": 7447.704076766968, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:40:20] (step=0001229) Train Loss: 0.3093, Train Steps/Sec: 0.12, Epoch: 0.023882627283326855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:40:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1230, "loss": 0.3247136175632477, "memory_gb": 7.721559524536133, "step_time_ms": 7467.110395431519, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:40:28] (step=0001230) Train Loss: 0.2808, Train Steps/Sec: 0.12, Epoch: 0.023902059852312477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:40:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1231, "loss": 0.23430848121643066, "memory_gb": 7.721559524536133, "step_time_ms": 7391.660690307617, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 20:40:36] (step=0001231) Train Loss: 0.2684, Train Steps/Sec: 0.13, Epoch: 0.023921492421298096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1232, "loss": 0.2735467851161957, "memory_gb": 7.721559524536133, "step_time_ms": 7389.616012573242, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:40:44] (step=0001232) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.023940924990283715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:40:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1233, "loss": 0.28375697135925293, "memory_gb": 7.721559524536133, "step_time_ms": 7487.00213432312, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:40:52] (step=0001233) Train Loss: 0.2198, Train Steps/Sec: 0.12, Epoch: 0.023960357559269334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:41:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1234, "loss": 0.20764702558517456, "memory_gb": 7.721559524536133, "step_time_ms": 7423.820495605469, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:00] (step=0001234) Train Loss: 0.1928, Train Steps/Sec: 0.12, Epoch: 0.023979790128254956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:41:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1235, "loss": 0.17583705484867096, "memory_gb": 7.721559524536133, "step_time_ms": 7489.178419113159, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:08] (step=0001235) Train Loss: 0.1995, Train Steps/Sec: 0.12, Epoch: 0.023999222697240575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:41:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1236, "loss": 0.25126945972442627, "memory_gb": 7.721559524536133, "step_time_ms": 7525.8073806762695, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:16] (step=0001236) Train Loss: 0.2329, Train Steps/Sec: 0.12, Epoch: 0.024018655266226194, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-28 20:41:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1237, "loss": 0.21688364446163177, "memory_gb": 7.721559524536133, "step_time_ms": 7482.682943344116, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:25] (step=0001237) Train Loss: 0.2028, Train Steps/Sec: 0.12, Epoch: 0.024038087835211816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:41:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1238, "loss": 0.1534416377544403, "memory_gb": 7.721559524536133, "step_time_ms": 7467.020034790039, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:33] (step=0001238) Train Loss: 0.1783, Train Steps/Sec: 0.12, Epoch: 0.024057520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:41:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1239, "loss": 0.26201754808425903, "memory_gb": 7.721559524536133, "step_time_ms": 7532.4866771698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:41] (step=0001239) Train Loss: 0.2671, Train Steps/Sec: 0.12, Epoch: 0.024076952973183054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:41:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1240, "loss": 0.33513566851615906, "memory_gb": 7.721559524536133, "step_time_ms": 7478.174448013306, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:49] (step=0001240) Train Loss: 0.2494, Train Steps/Sec: 0.13, Epoch: 0.024096385542168676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:41:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1241, "loss": 0.19776418805122375, "memory_gb": 7.721559524536133, "step_time_ms": 7442.938566207886, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:41:57] (step=0001241) Train Loss: 0.1874, Train Steps/Sec: 0.13, Epoch: 0.024115818111154295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:42:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1242, "loss": 0.2970690131187439, "memory_gb": 7.721559524536133, "step_time_ms": 
7526.291131973267, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:42:05] (step=0001242) Train Loss: 0.3165, Train Steps/Sec: 0.12, Epoch: 0.024135250680139914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:42:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1243, "loss": 0.1817784607410431, "memory_gb": 7.721559524536133, "step_time_ms": 7468.103647232056, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:42:13] (step=0001243) Train Loss: 0.1851, Train Steps/Sec: 0.13, Epoch: 0.024154683249125536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:42:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1244, "loss": 0.23742042481899261, "memory_gb": 7.721559524536133, "step_time_ms": 7441.25771522522, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:42:20] (step=0001244) Train Loss: 0.2694, Train Steps/Sec: 0.13, Epoch: 0.024174115818111155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:42:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1245, "loss": 0.2390678972005844, "memory_gb": 7.721559524536133, "step_time_ms": 7667.457342147827, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:42:29] (step=0001245) Train Loss: 0.2824, Train Steps/Sec: 0.12, Epoch: 0.024193548387096774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:42:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1246, "loss": 0.29590606689453125, "memory_gb": 7.721559524536133, "step_time_ms": 7525.195121765137, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:42:37] (step=0001246) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.024212980956082396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:42:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1247, "loss": 0.29771965742111206, "memory_gb": 7.721559524536133, "step_time_ms": 7520.528078079224, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:42:45] (step=0001247) Train Loss: 0.2503, Train Steps/Sec: 0.12, Epoch: 
0.024232413525068015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:42:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1248, "loss": 0.21694228053092957, "memory_gb": 7.721559524536133, "step_time_ms": 7546.307802200317, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:42:53] (step=0001248) Train Loss: 0.2411, Train Steps/Sec: 0.12, Epoch: 0.024251846094053633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1249, "loss": 0.2506580352783203, "memory_gb": 7.721559524536133, "step_time_ms": 7550.748109817505, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:01] (step=0001249) Train Loss: 0.2379, Train Steps/Sec: 0.12, Epoch: 0.024271278663039252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1250, "loss": 0.23637397587299347, "memory_gb": 7.721559524536133, "step_time_ms": 7360.219478607178, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:09] (step=0001250) Train Loss: 0.1738, Train Steps/Sec: 0.13, Epoch: 0.024290711232024875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1251, "loss": 0.19236540794372559, "memory_gb": 7.721559524536133, "step_time_ms": 7538.227081298828, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:17] (step=0001251) Train Loss: 0.1774, Train Steps/Sec: 0.12, Epoch: 0.024310143801010493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1252, "loss": 0.18092823028564453, "memory_gb": 7.721559524536133, "step_time_ms": 5380.648612976074, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:23] (step=0001252) Train Loss: 0.1605, Train Steps/Sec: 0.17, Epoch: 0.024329576369996112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1253, "loss": 0.16450688242912292, 
"memory_gb": 7.721559524536133, "step_time_ms": 7573.745012283325, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:31] (step=0001253) Train Loss: 0.1881, Train Steps/Sec: 0.12, Epoch: 0.024349008938981734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1254, "loss": 0.2481812834739685, "memory_gb": 7.721559524536133, "step_time_ms": 7525.511026382446, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:39] (step=0001254) Train Loss: 0.2680, Train Steps/Sec: 0.12, Epoch: 0.024368441507967353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1255, "loss": 0.2742244601249695, "memory_gb": 7.721559524536133, "step_time_ms": 7598.9344120025635, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:47] (step=0001255) Train Loss: 0.2430, Train Steps/Sec: 0.12, Epoch: 0.024387874076952972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:43:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1256, "loss": 0.33344656229019165, "memory_gb": 7.721559524536133, "step_time_ms": 7639.991760253906, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:43:55] (step=0001256) Train Loss: 0.2887, Train Steps/Sec: 0.12, Epoch: 0.024407306645938594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1257, "loss": 0.25814396142959595, "memory_gb": 7.721559524536133, "step_time_ms": 7566.089153289795, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:03] (step=0001257) Train Loss: 0.2492, Train Steps/Sec: 0.13, Epoch: 0.024426739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1258, "loss": 0.22639252245426178, "memory_gb": 7.721559524536133, "step_time_ms": 7505.480051040649, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:11] (step=0001258) Train Loss: 0.1917, 
Train Steps/Sec: 0.13, Epoch: 0.024446171783909832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1259, "loss": 0.30751341581344604, "memory_gb": 7.721559524536133, "step_time_ms": 7523.9577293396, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:19] (step=0001259) Train Loss: 0.2455, Train Steps/Sec: 0.12, Epoch: 0.024465604352895454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1260, "loss": 0.2533087432384491, "memory_gb": 7.721559524536133, "step_time_ms": 7521.406173706055, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:27] (step=0001260) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.024485036921881073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1261, "loss": 0.30666136741638184, "memory_gb": 7.721559524536133, "step_time_ms": 7487.788915634155, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:35] (step=0001261) Train Loss: 0.2651, Train Steps/Sec: 0.12, Epoch: 0.024504469490866692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1262, "loss": 0.21034976840019226, "memory_gb": 7.721559524536133, "step_time_ms": 7536.3922119140625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:43] (step=0001262) Train Loss: 0.2047, Train Steps/Sec: 0.12, Epoch: 0.02452390205985231, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1263, "loss": 0.28504887223243713, "memory_gb": 7.721559524536133, "step_time_ms": 7524.496555328369, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:51] (step=0001263) Train Loss: 0.2665, Train Steps/Sec: 0.12, Epoch: 0.024543334628837933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:44:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1264, 
"loss": 0.1818343997001648, "memory_gb": 7.721559524536133, "step_time_ms": 7479.830980300903, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:44:59] (step=0001264) Train Loss: 0.2283, Train Steps/Sec: 0.12, Epoch: 0.024562767197823552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:45:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1265, "loss": 0.21963495016098022, "memory_gb": 7.721559524536133, "step_time_ms": 7536.294460296631, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:45:07] (step=0001265) Train Loss: 0.2943, Train Steps/Sec: 0.12, Epoch: 0.02458219976680917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:45:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1266, "loss": 0.21415776014328003, "memory_gb": 7.721559524536133, "step_time_ms": 7506.561517715454, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:45:15] (step=0001266) Train Loss: 0.2209, Train Steps/Sec: 0.12, Epoch: 0.024601632335794793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:45:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1267, "loss": 0.2513936161994934, "memory_gb": 7.721559524536133, "step_time_ms": 7457.066059112549, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:45:23] (step=0001267) Train Loss: 0.3195, Train Steps/Sec: 0.12, Epoch: 0.024621064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:45:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1268, "loss": 0.18884387612342834, "memory_gb": 7.721559524536133, "step_time_ms": 7478.125810623169, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:45:31] (step=0001268) Train Loss: 0.1904, Train Steps/Sec: 0.12, Epoch: 0.02464049747376603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:45:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1269, "loss": 0.24462662637233734, "memory_gb": 7.721559524536133, "step_time_ms": 7571.550130844116, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:45:40] 
(step=0001269) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.024659930042751653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:45:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1270, "loss": 0.3355969488620758, "memory_gb": 7.721559524536133, "step_time_ms": 7459.761142730713, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:45:48] (step=0001270) Train Loss: 0.2944, Train Steps/Sec: 0.12, Epoch: 0.024679362611737272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:45:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1271, "loss": 0.17735812067985535, "memory_gb": 7.721559524536133, "step_time_ms": 7484.031438827515, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:45:56] (step=0001271) Train Loss: 0.1941, Train Steps/Sec: 0.12, Epoch: 0.02469879518072289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:46:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1272, "loss": 0.1839093565940857, "memory_gb": 7.721559524536133, "step_time_ms": 7480.278730392456, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:46:04] (step=0001272) Train Loss: 0.2036, Train Steps/Sec: 0.13, Epoch: 0.024718227749708513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:46:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1273, "loss": 0.30048370361328125, "memory_gb": 7.721559524536133, "step_time_ms": 7411.193370819092, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:46:12] (step=0001273) Train Loss: 0.3011, Train Steps/Sec: 0.12, Epoch: 0.02473766031869413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:46:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1274, "loss": 0.2610885500907898, "memory_gb": 7.721559524536133, "step_time_ms": 7516.968011856079, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:46:20] (step=0001274) Train Loss: 0.2867, Train Steps/Sec: 0.12, Epoch: 0.02475709288767975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:46:28] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 1275, "loss": 0.21644330024719238, "memory_gb": 7.721559524536133, "step_time_ms": 7465.284824371338, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:46:28] (step=0001275) Train Loss: 0.2373, Train Steps/Sec: 0.12, Epoch: 0.024776525456665373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:46:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1276, "loss": 0.2772296965122223, "memory_gb": 7.721559524536133, "step_time_ms": 7459.4244956970215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:46:36] (step=0001276) Train Loss: 0.2525, Train Steps/Sec: 0.12, Epoch: 0.02479595802565099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:46:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1277, "loss": 0.13502921164035797, "memory_gb": 7.721559524536133, "step_time_ms": 7505.776643753052, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:46:44] (step=0001277) Train Loss: 0.2409, Train Steps/Sec: 0.12, Epoch: 0.02481539059463661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:46:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1278, "loss": 0.2635124921798706, "memory_gb": 7.721559524536133, "step_time_ms": 7508.186101913452, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:46:52] (step=0001278) Train Loss: 0.3122, Train Steps/Sec: 0.12, Epoch: 0.02483482316362223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1279, "loss": 0.18597471714019775, "memory_gb": 7.721559524536133, "step_time_ms": 7238.5289669036865, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:47:00] (step=0001279) Train Loss: 0.1841, Train Steps/Sec: 0.13, Epoch: 0.02485425573260785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1280, "loss": 0.1916787028312683, "memory_gb": 7.721559524536133, "step_time_ms": 7486.218690872192, "trainable_params": 4718592, "method": "lora"} 
[2025-07-28 20:47:08] (step=0001280) Train Loss: 0.1906, Train Steps/Sec: 0.13, Epoch: 0.02487368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1281, "loss": 0.1670314371585846, "memory_gb": 7.721559524536133, "step_time_ms": 5297.811508178711, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:47:13] (step=0001281) Train Loss: 0.1691, Train Steps/Sec: 0.18, Epoch: 0.02489312087057909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1282, "loss": 0.1250748187303543, "memory_gb": 7.721559524536133, "step_time_ms": 7505.166053771973, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:47:21] (step=0001282) Train Loss: 0.1956, Train Steps/Sec: 0.12, Epoch: 0.02491255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1283, "loss": 0.2501370906829834, "memory_gb": 7.721559524536133, "step_time_ms": 7505.666494369507, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:47:29] (step=0001283) Train Loss: 0.2793, Train Steps/Sec: 0.12, Epoch: 0.02493198600855033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1284, "loss": 0.3469107747077942, "memory_gb": 7.721559524536133, "step_time_ms": 7457.383632659912, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:47:37] (step=0001284) Train Loss: 0.3504, Train Steps/Sec: 0.12, Epoch: 0.02495141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1285, "loss": 0.23942258954048157, "memory_gb": 7.721559524536133, "step_time_ms": 7516.576528549194, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:47:45] (step=0001285) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.02497085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:47:53] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1286, "loss": 0.17987146973609924, "memory_gb": 7.721559524536133, "step_time_ms": 7446.960926055908, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:47:53] (step=0001286) Train Loss: 0.2289, Train Steps/Sec: 0.12, Epoch: 0.02499028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1287, "loss": 0.2792355418205261, "memory_gb": 7.721559524536133, "step_time_ms": 7442.188739776611, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:48:01] (step=0001287) Train Loss: 0.2819, Train Steps/Sec: 0.12, Epoch: 0.02500971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1288, "loss": 0.24362754821777344, "memory_gb": 7.721559524536133, "step_time_ms": 7571.365594863892, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:48:10] (step=0001288) Train Loss: 0.2437, Train Steps/Sec: 0.12, Epoch: 0.02502914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1289, "loss": 0.3720594048500061, "memory_gb": 7.721559524536133, "step_time_ms": 7529.071092605591, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:48:18] (step=0001289) Train Loss: 0.3335, Train Steps/Sec: 0.12, Epoch: 0.02504858142246405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1290, "loss": 0.35247474908828735, "memory_gb": 7.721559524536133, "step_time_ms": 7472.3801612854, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:48:26] (step=0001290) Train Loss: 0.3277, Train Steps/Sec: 0.12, Epoch: 0.02506801399144967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1291, "loss": 0.2574489712715149, "memory_gb": 7.721559524536133, "step_time_ms": 7576.364278793335, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 20:48:34] (step=0001291) Train Loss: 0.2789, Train Steps/Sec: 0.12, Epoch: 0.025087446560435288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1292, "loss": 0.18627715110778809, "memory_gb": 7.721559524536133, "step_time_ms": 7614.603519439697, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:48:42] (step=0001292) Train Loss: 0.2228, Train Steps/Sec: 0.12, Epoch: 0.02510687912942091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1293, "loss": 0.22047516703605652, "memory_gb": 7.715639114379883, "step_time_ms": 7378.427028656006, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:48:50] (step=0001293) Train Loss: 0.2744, Train Steps/Sec: 0.12, Epoch: 0.02512631169840653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:48:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1294, "loss": 0.368343710899353, "memory_gb": 7.721559524536133, "step_time_ms": 7465.672731399536, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:48:58] (step=0001294) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 0.025145744267392148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:49:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1295, "loss": 0.21525289118289948, "memory_gb": 7.721559524536133, "step_time_ms": 7470.4554080963135, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:49:06] (step=0001295) Train Loss: 0.2184, Train Steps/Sec: 0.13, Epoch: 0.02516517683637777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:49:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1296, "loss": 0.20028093457221985, "memory_gb": 7.721559524536133, "step_time_ms": 7432.393550872803, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:49:14] (step=0001296) Train Loss: 0.2671, Train Steps/Sec: 0.13, Epoch: 0.02518460940536339, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 20:49:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1297, "loss": 0.13170984387397766, "memory_gb": 7.721559524536133, "step_time_ms": 7484.902143478394, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:49:22] (step=0001297) Train Loss: 0.1806, Train Steps/Sec: 0.12, Epoch: 0.025204041974349008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:49:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1298, "loss": 0.24267102777957916, "memory_gb": 7.721559524536133, "step_time_ms": 7491.13392829895, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:49:30] (step=0001298) Train Loss: 0.3210, Train Steps/Sec: 0.12, Epoch: 0.02522347454333463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:49:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1299, "loss": 0.17151612043380737, "memory_gb": 7.721559524536133, "step_time_ms": 7467.375993728638, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:49:38] (step=0001299) Train Loss: 0.2562, Train Steps/Sec: 0.12, Epoch: 0.02524290711232025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:49:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1300, "loss": 0.3363703489303589, "memory_gb": 7.721559524536133, "step_time_ms": 7494.486093521118, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:49:46] (step=0001300) Train Loss: 0.3292, Train Steps/Sec: 0.12, Epoch: 0.025262339681305868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1301, "loss": 0.27482154965400696, "memory_gb": 7.721559524536133, "step_time_ms": 7449.6169090271, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:49:54] (step=0001301) Train Loss: 0.2943, Train Steps/Sec: 0.13, Epoch: 0.02528177225029149, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1302, "loss": 0.22315546870231628, "memory_gb": 7.721559524536133, "step_time_ms": 7458.909273147583, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:02] (step=0001302) Train Loss: 0.1657, Train Steps/Sec: 0.12, Epoch: 0.02530120481927711, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1303, "loss": 0.20075714588165283, "memory_gb": 7.721559524536133, "step_time_ms": 7477.732181549072, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:10] (step=0001303) Train Loss: 0.2433, Train Steps/Sec: 0.12, Epoch: 0.025320637388262728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1304, "loss": 0.28793543577194214, "memory_gb": 7.721559524536133, "step_time_ms": 7512.903451919556, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:18] (step=0001304) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.02534006995724835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1305, "loss": 0.32575374841690063, "memory_gb": 7.721559524536133, "step_time_ms": 7396.470546722412, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:26] (step=0001305) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.02535950252623397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1306, "loss": 0.20333239436149597, "memory_gb": 7.721559524536133, "step_time_ms": 7448.582887649536, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:34] (step=0001306) Train Loss: 0.2306, Train Steps/Sec: 0.12, Epoch: 0.025378935095219587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1307, "loss": 0.24347716569900513, "memory_gb": 7.721559524536133, "step_time_ms": 7537.184476852417, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:42] (step=0001307) Train Loss: 0.2287, Train Steps/Sec: 0.12, Epoch: 0.025398367664205206, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1308, "loss": 0.2897713780403137, "memory_gb": 7.721559524536133, "step_time_ms": 7305.741786956787, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:50] (step=0001308) Train Loss: 0.2821, Train Steps/Sec: 0.13, Epoch: 0.02541780023319083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:50:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1309, "loss": 0.2795088291168213, "memory_gb": 7.721559524536133, "step_time_ms": 7582.810878753662, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:50:58] (step=0001309) Train Loss: 0.2681, Train Steps/Sec: 0.13, Epoch: 0.025437232802176447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:51:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1310, "loss": 0.2957313060760498, "memory_gb": 7.721559524536133, "step_time_ms": 5331.779956817627, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:51:04] (step=0001310) Train Loss: 0.2614, Train Steps/Sec: 0.17, Epoch: 0.025456665371162066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:51:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1311, "loss": 0.1416824758052826, "memory_gb": 7.721559524536133, "step_time_ms": 7586.247205734253, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:51:12] (step=0001311) Train Loss: 0.1614, Train Steps/Sec: 0.12, Epoch: 0.02547609794014769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:51:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1312, "loss": 0.2914065718650818, "memory_gb": 7.721559524536133, "step_time_ms": 7548.82550239563, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:51:20] (step=0001312) Train Loss: 0.2321, Train Steps/Sec: 0.12, Epoch: 0.025495530509133307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:51:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1313, "loss": 0.24787141382694244, "memory_gb": 7.715639114379883, 
"step_time_ms": 7451.944351196289, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:51:28] (step=0001313) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.025514963078118926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:51:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1314, "loss": 0.2430015504360199, "memory_gb": 7.721559524536133, "step_time_ms": 7574.878215789795, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:51:36] (step=0001314) Train Loss: 0.2216, Train Steps/Sec: 0.12, Epoch: 0.02553439564710455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:51:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1315, "loss": 0.22008126974105835, "memory_gb": 7.721559524536133, "step_time_ms": 7487.190008163452, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:51:44] (step=0001315) Train Loss: 0.2541, Train Steps/Sec: 0.12, Epoch: 0.025553828216090167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:51:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1316, "loss": 0.21014858782291412, "memory_gb": 7.721559524536133, "step_time_ms": 7557.013988494873, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:51:52] (step=0001316) Train Loss: 0.1823, Train Steps/Sec: 0.12, Epoch: 0.025573260785075786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1317, "loss": 0.248779758810997, "memory_gb": 7.721559524536133, "step_time_ms": 7553.454875946045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:00] (step=0001317) Train Loss: 0.2217, Train Steps/Sec: 0.12, Epoch: 0.02559269335406141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1318, "loss": 0.17417041957378387, "memory_gb": 7.721559524536133, "step_time_ms": 7557.689905166626, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:08] (step=0001318) Train Loss: 0.1858, Train Steps/Sec: 0.12, Epoch: 
0.025612125923047027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1319, "loss": 0.2712324261665344, "memory_gb": 7.721559524536133, "step_time_ms": 7538.509130477905, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:16] (step=0001319) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.025631558492032646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1320, "loss": 0.174758642911911, "memory_gb": 7.721559524536133, "step_time_ms": 7605.728387832642, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:24] (step=0001320) Train Loss: 0.2090, Train Steps/Sec: 0.12, Epoch: 0.02565099106101827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1321, "loss": 0.21259284019470215, "memory_gb": 7.721559524536133, "step_time_ms": 7581.782102584839, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:32] (step=0001321) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.025670423630003887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1322, "loss": 0.3315877914428711, "memory_gb": 7.721559524536133, "step_time_ms": 7500.954151153564, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:40] (step=0001322) Train Loss: 0.2356, Train Steps/Sec: 0.12, Epoch: 0.025689856198989506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1323, "loss": 0.23030634224414825, "memory_gb": 7.721559524536133, "step_time_ms": 7509.530544281006, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:49] (step=0001323) Train Loss: 0.2906, Train Steps/Sec: 0.12, Epoch: 0.025709288767975125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:52:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1324, "loss": 0.14559924602508545, 
"memory_gb": 7.721559524536133, "step_time_ms": 7514.297962188721, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:52:57] (step=0001324) Train Loss: 0.2052, Train Steps/Sec: 0.12, Epoch: 0.025728721336960747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:53:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1325, "loss": 0.23885130882263184, "memory_gb": 7.715639114379883, "step_time_ms": 7399.002313613892, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:53:05] (step=0001325) Train Loss: 0.2373, Train Steps/Sec: 0.12, Epoch: 0.025748153905946366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:53:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1326, "loss": 0.32007986307144165, "memory_gb": 7.721559524536133, "step_time_ms": 7473.309755325317, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:53:13] (step=0001326) Train Loss: 0.2725, Train Steps/Sec: 0.12, Epoch: 0.025767586474931985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:53:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1327, "loss": 0.17928200960159302, "memory_gb": 7.721559524536133, "step_time_ms": 7506.949424743652, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:53:21] (step=0001327) Train Loss: 0.1780, Train Steps/Sec: 0.12, Epoch: 0.025787019043917607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:53:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1328, "loss": 0.20100867748260498, "memory_gb": 7.721559524536133, "step_time_ms": 7449.891567230225, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:53:29] (step=0001328) Train Loss: 0.2011, Train Steps/Sec: 0.13, Epoch: 0.025806451612903226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:53:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1329, "loss": 0.12185204029083252, "memory_gb": 7.721559524536133, "step_time_ms": 7472.127199172974, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:53:37] (step=0001329) Train Loss: 
0.2136, Train Steps/Sec: 0.12, Epoch: 0.025825884181888845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:53:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1330, "loss": 0.2251080721616745, "memory_gb": 7.721559524536133, "step_time_ms": 7482.471227645874, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:53:45] (step=0001330) Train Loss: 0.2575, Train Steps/Sec: 0.12, Epoch: 0.025845316750874467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1331, "loss": 0.278441846370697, "memory_gb": 7.715639114379883, "step_time_ms": 7428.858757019043, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:53:53] (step=0001331) Train Loss: 0.2450, Train Steps/Sec: 0.13, Epoch: 0.025864749319860086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1332, "loss": 0.2039213478565216, "memory_gb": 7.721559524536133, "step_time_ms": 7476.786136627197, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:01] (step=0001332) Train Loss: 0.2487, Train Steps/Sec: 0.12, Epoch: 0.025884181888845705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1333, "loss": 0.2525518238544464, "memory_gb": 7.721559524536133, "step_time_ms": 7623.149633407593, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:09] (step=0001333) Train Loss: 0.2617, Train Steps/Sec: 0.12, Epoch: 0.025903614457831327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1334, "loss": 0.1401061713695526, "memory_gb": 7.721559524536133, "step_time_ms": 7394.022464752197, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:17] (step=0001334) Train Loss: 0.1405, Train Steps/Sec: 0.12, Epoch: 0.025923047026816946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1335, 
"loss": 0.3370687961578369, "memory_gb": 7.721559524536133, "step_time_ms": 7471.818447113037, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:25] (step=0001335) Train Loss: 0.2715, Train Steps/Sec: 0.12, Epoch: 0.025942479595802564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1336, "loss": 0.26640594005584717, "memory_gb": 7.721559524536133, "step_time_ms": 7528.2697677612305, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:33] (step=0001336) Train Loss: 0.3207, Train Steps/Sec: 0.12, Epoch: 0.025961912164788183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1337, "loss": 0.23122631013393402, "memory_gb": 7.721559524536133, "step_time_ms": 7310.497999191284, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:41] (step=0001337) Train Loss: 0.2265, Train Steps/Sec: 0.13, Epoch: 0.025981344733773806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1338, "loss": 0.22372683882713318, "memory_gb": 7.721559524536133, "step_time_ms": 7477.608919143677, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:49] (step=0001338) Train Loss: 0.2239, Train Steps/Sec: 0.13, Epoch: 0.026000777302759424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1339, "loss": 0.18207168579101562, "memory_gb": 7.721559524536133, "step_time_ms": 5515.463829040527, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:54:55] (step=0001339) Train Loss: 0.1954, Train Steps/Sec: 0.16, Epoch: 0.026020209871745043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1340, "loss": 0.32214847207069397, "memory_gb": 7.721559524536133, "step_time_ms": 7472.887277603149, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:03] 
(step=0001340) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 0.026039642440730666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1341, "loss": 0.35143566131591797, "memory_gb": 7.721559524536133, "step_time_ms": 7502.78115272522, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:11] (step=0001341) Train Loss: 0.2970, Train Steps/Sec: 0.12, Epoch: 0.026059075009716284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1342, "loss": 0.25669512152671814, "memory_gb": 7.721559524536133, "step_time_ms": 7427.81400680542, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:19] (step=0001342) Train Loss: 0.2738, Train Steps/Sec: 0.12, Epoch: 0.026078507578701903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1343, "loss": 0.22044017910957336, "memory_gb": 7.721559524536133, "step_time_ms": 7446.030378341675, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:27] (step=0001343) Train Loss: 0.2154, Train Steps/Sec: 0.12, Epoch: 0.026097940147687525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1344, "loss": 0.23760780692100525, "memory_gb": 7.721559524536133, "step_time_ms": 7501.046895980835, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:35] (step=0001344) Train Loss: 0.2191, Train Steps/Sec: 0.12, Epoch: 0.026117372716673144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1345, "loss": 0.29140666127204895, "memory_gb": 7.721559524536133, "step_time_ms": 7245.221376419067, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:43] (step=0001345) Train Loss: 0.2388, Train Steps/Sec: 0.13, Epoch: 0.026136805285658763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:51] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1346, "loss": 0.31675711274147034, "memory_gb": 7.721559524536133, "step_time_ms": 7501.404762268066, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:51] (step=0001346) Train Loss: 0.2971, Train Steps/Sec: 0.12, Epoch: 0.026156237854644385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:55:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1347, "loss": 0.3100729286670685, "memory_gb": 7.721559524536133, "step_time_ms": 7495.3203201293945, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:55:59] (step=0001347) Train Loss: 0.2768, Train Steps/Sec: 0.12, Epoch: 0.026175670423630004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:56:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1348, "loss": 0.23162207007408142, "memory_gb": 7.721559524536133, "step_time_ms": 7443.563461303711, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:56:07] (step=0001348) Train Loss: 0.2282, Train Steps/Sec: 0.13, Epoch: 0.026195102992615623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:56:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1349, "loss": 0.204970583319664, "memory_gb": 7.721559524536133, "step_time_ms": 7491.039991378784, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:56:15] (step=0001349) Train Loss: 0.2239, Train Steps/Sec: 0.12, Epoch: 0.026214535561601245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:56:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1350, "loss": 0.299816370010376, "memory_gb": 7.715639114379883, "step_time_ms": 7448.544025421143, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:56:23] (step=0001350) Train Loss: 0.2241, Train Steps/Sec: 0.12, Epoch: 0.026233968130586864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:56:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1351, "loss": 0.18915775418281555, "memory_gb": 7.721559524536133, "step_time_ms": 7398.818731307983, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 20:56:31] (step=0001351) Train Loss: 0.1770, Train Steps/Sec: 0.12, Epoch: 0.026253400699572483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:56:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1352, "loss": 0.35386788845062256, "memory_gb": 7.715639114379883, "step_time_ms": 7427.807092666626, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:56:39] (step=0001352) Train Loss: 0.2959, Train Steps/Sec: 0.12, Epoch: 0.026272833268558102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:56:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1353, "loss": 0.23553794622421265, "memory_gb": 7.721559524536133, "step_time_ms": 7474.10774230957, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:56:47] (step=0001353) Train Loss: 0.2467, Train Steps/Sec: 0.12, Epoch: 0.026292265837543724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:56:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1354, "loss": 0.26105380058288574, "memory_gb": 7.721559524536133, "step_time_ms": 7428.600311279297, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:56:55] (step=0001354) Train Loss: 0.3017, Train Steps/Sec: 0.12, Epoch: 0.026311698406529343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:57:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1355, "loss": 0.20583921670913696, "memory_gb": 7.715639114379883, "step_time_ms": 7431.70690536499, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:57:04] (step=0001355) Train Loss: 0.2748, Train Steps/Sec: 0.12, Epoch: 0.02633113097551496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1356, "loss": 0.2339136302471161, "memory_gb": 7.721559524536133, "step_time_ms": 7488.847017288208, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:57:12] (step=0001356) Train Loss: 0.2867, Train Steps/Sec: 0.12, Epoch: 0.026350563544500584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 20:57:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1357, "loss": 0.3721703886985779, "memory_gb": 7.721559524536133, "step_time_ms": 7392.580270767212, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:57:20] (step=0001357) Train Loss: 0.2919, Train Steps/Sec: 0.12, Epoch: 0.026369996113486203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:57:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1358, "loss": 0.15612944960594177, "memory_gb": 7.721559524536133, "step_time_ms": 7495.053768157959, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:57:28] (step=0001358) Train Loss: 0.2197, Train Steps/Sec: 0.12, Epoch: 0.02638942868247182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:57:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1359, "loss": 0.3650559186935425, "memory_gb": 7.721559524536133, "step_time_ms": 7483.476877212524, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:57:36] (step=0001359) Train Loss: 0.3345, Train Steps/Sec: 0.12, Epoch: 0.026408861251457444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:57:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1360, "loss": 0.21287885308265686, "memory_gb": 7.721559524536133, "step_time_ms": 7454.4031620025635, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:57:44] (step=0001360) Train Loss: 0.2578, Train Steps/Sec: 0.13, Epoch: 0.026428293820443063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:57:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1361, "loss": 0.1514693647623062, "memory_gb": 7.721559524536133, "step_time_ms": 7508.795738220215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:57:52] (step=0001361) Train Loss: 0.1452, Train Steps/Sec: 0.12, Epoch: 0.02644772638942868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:58:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1362, "loss": 0.2833424210548401, "memory_gb": 7.721559524536133, "step_time_ms": 7497.856378555298, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:00] (step=0001362) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 0.026467158958414304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:58:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1363, "loss": 0.21135762333869934, "memory_gb": 7.721559524536133, "step_time_ms": 7479.077577590942, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:08] (step=0001363) Train Loss: 0.2449, Train Steps/Sec: 0.12, Epoch: 0.026486591527399923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:58:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1364, "loss": 0.22820797562599182, "memory_gb": 7.721559524536133, "step_time_ms": 7461.196660995483, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:16] (step=0001364) Train Loss: 0.2247, Train Steps/Sec: 0.12, Epoch: 0.02650602409638554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:58:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1365, "loss": 0.22388407588005066, "memory_gb": 7.721559524536133, "step_time_ms": 7485.895872116089, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:24] (step=0001365) Train Loss: 0.2331, Train Steps/Sec: 0.12, Epoch: 0.026525456665371164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:58:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1366, "loss": 0.2923322916030884, "memory_gb": 7.721559524536133, "step_time_ms": 7293.336391448975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:32] (step=0001366) Train Loss: 0.3170, Train Steps/Sec: 0.13, Epoch: 0.026544889234356783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:58:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1367, "loss": 0.240995854139328, "memory_gb": 7.721559524536133, "step_time_ms": 7546.565771102905, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:40] (step=0001367) Train Loss: 0.2472, Train Steps/Sec: 0.13, Epoch: 0.0265643218033424, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 20:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1368, "loss": 0.2731030583381653, "memory_gb": 7.721559524536133, "step_time_ms": 5360.240936279297, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:46] (step=0001368) Train Loss: 0.3112, Train Steps/Sec: 0.17, Epoch: 0.02658375437232802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:58:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1369, "loss": 0.21318921446800232, "memory_gb": 7.721559524536133, "step_time_ms": 7538.315057754517, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:58:54] (step=0001369) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.026603186941313643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1370, "loss": 0.24436290562152863, "memory_gb": 7.721559524536133, "step_time_ms": 7522.759675979614, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:02] (step=0001370) Train Loss: 0.2298, Train Steps/Sec: 0.12, Epoch: 0.02662261951029926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1371, "loss": 0.3407147526741028, "memory_gb": 7.721559524536133, "step_time_ms": 7511.215686798096, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:10] (step=0001371) Train Loss: 0.2712, Train Steps/Sec: 0.12, Epoch: 0.02664205207928488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1372, "loss": 0.19600644707679749, "memory_gb": 7.721559524536133, "step_time_ms": 7526.134729385376, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:18] (step=0001372) Train Loss: 0.2355, Train Steps/Sec: 0.12, Epoch: 0.026661484648270502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1373, "loss": 0.25313472747802734, "memory_gb": 7.721559524536133, "step_time_ms": 
7561.419486999512, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:26] (step=0001373) Train Loss: 0.2470, Train Steps/Sec: 0.12, Epoch: 0.02668091721725612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1374, "loss": 0.3372527062892914, "memory_gb": 7.721559524536133, "step_time_ms": 7448.554277420044, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:34] (step=0001374) Train Loss: 0.3182, Train Steps/Sec: 0.13, Epoch: 0.02670034978624174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1375, "loss": 0.28955698013305664, "memory_gb": 7.721559524536133, "step_time_ms": 7567.618370056152, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:42] (step=0001375) Train Loss: 0.2971, Train Steps/Sec: 0.12, Epoch: 0.026719782355227362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1376, "loss": 0.2781195640563965, "memory_gb": 7.721559524536133, "step_time_ms": 7510.161876678467, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:50] (step=0001376) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.02673921492421298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 20:59:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1377, "loss": 0.2622416913509369, "memory_gb": 7.721559524536133, "step_time_ms": 7452.995777130127, "trainable_params": 4718592, "method": "lora"} [2025-07-28 20:59:58] (step=0001377) Train Loss: 0.2660, Train Steps/Sec: 0.12, Epoch: 0.0267586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:00:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1378, "loss": 0.1902998387813568, "memory_gb": 7.721559524536133, "step_time_ms": 7613.327264785767, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:00:06] (step=0001378) Train Loss: 0.2423, Train Steps/Sec: 0.12, Epoch: 0.026778080062184222, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:00:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1379, "loss": 0.21227340400218964, "memory_gb": 7.721559524536133, "step_time_ms": 7566.004276275635, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:00:14] (step=0001379) Train Loss: 0.2579, Train Steps/Sec: 0.13, Epoch: 0.02679751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1380, "loss": 0.33617788553237915, "memory_gb": 7.721559524536133, "step_time_ms": 7486.877679824829, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:00:22] (step=0001380) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.02681694520015546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:00:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1381, "loss": 0.2555820345878601, "memory_gb": 7.721559524536133, "step_time_ms": 7694.8816776275635, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:00:30] (step=0001381) Train Loss: 0.2714, Train Steps/Sec: 0.12, Epoch: 0.02683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:00:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1382, "loss": 0.20645077526569366, "memory_gb": 7.721559524536133, "step_time_ms": 7543.318510055542, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:00:39] (step=0001382) Train Loss: 0.2688, Train Steps/Sec: 0.12, Epoch: 0.0268558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1383, "loss": 0.3356740474700928, "memory_gb": 7.721559524536133, "step_time_ms": 7502.374649047852, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:00:47] (step=0001383) Train Loss: 0.2744, Train Steps/Sec: 0.12, Epoch: 0.02687524290711232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1384, "loss": 0.2010439932346344, "memory_gb": 7.721559524536133, 
"step_time_ms": 7535.094738006592, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:00:55] (step=0001384) Train Loss: 0.1758, Train Steps/Sec: 0.12, Epoch: 0.02689467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1385, "loss": 0.2765316665172577, "memory_gb": 7.721559524536133, "step_time_ms": 7550.14705657959, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:03] (step=0001385) Train Loss: 0.3210, Train Steps/Sec: 0.12, Epoch: 0.02691410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1386, "loss": 0.2470189332962036, "memory_gb": 7.721559524536133, "step_time_ms": 7420.824766159058, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:11] (step=0001386) Train Loss: 0.2586, Train Steps/Sec: 0.13, Epoch: 0.02693354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1387, "loss": 0.2891162633895874, "memory_gb": 7.715639114379883, "step_time_ms": 7477.483749389648, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:19] (step=0001387) Train Loss: 0.2962, Train Steps/Sec: 0.12, Epoch: 0.0269529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1388, "loss": 0.295729398727417, "memory_gb": 7.721559524536133, "step_time_ms": 7476.903676986694, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:27] (step=0001388) Train Loss: 0.3022, Train Steps/Sec: 0.12, Epoch: 0.02697240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1389, "loss": 0.14421525597572327, "memory_gb": 7.721559524536133, "step_time_ms": 7406.580686569214, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:35] (step=0001389) Train Loss: 0.2654, Train Steps/Sec: 0.12, Epoch: 
0.02699183832102604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1390, "loss": 0.30178704857826233, "memory_gb": 7.721559524536133, "step_time_ms": 7501.076221466064, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:43] (step=0001390) Train Loss: 0.2926, Train Steps/Sec: 0.12, Epoch: 0.02701127089001166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1391, "loss": 0.15597476065158844, "memory_gb": 7.721559524536133, "step_time_ms": 7500.463485717773, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:51] (step=0001391) Train Loss: 0.1801, Train Steps/Sec: 0.12, Epoch: 0.02703070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:01:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1392, "loss": 0.2206726372241974, "memory_gb": 7.721559524536133, "step_time_ms": 7455.759286880493, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:01:59] (step=0001392) Train Loss: 0.2441, Train Steps/Sec: 0.12, Epoch: 0.0270501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:02:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1393, "loss": 0.18935847282409668, "memory_gb": 7.721559524536133, "step_time_ms": 7565.995216369629, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:02:07] (step=0001393) Train Loss: 0.2092, Train Steps/Sec: 0.12, Epoch: 0.02706956859696852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:02:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1394, "loss": 0.16826161742210388, "memory_gb": 7.721559524536133, "step_time_ms": 7530.13801574707, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:02:15] (step=0001394) Train Loss: 0.1715, Train Steps/Sec: 0.12, Epoch: 0.02708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:02:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1395, "loss": 0.328828901052475, "memory_gb": 
7.721559524536133, "step_time_ms": 7360.071420669556, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:02:23] (step=0001395) Train Loss: 0.2503, Train Steps/Sec: 0.13, Epoch: 0.02710843373493976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:02:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1396, "loss": 0.22011859714984894, "memory_gb": 7.721559524536133, "step_time_ms": 7511.85154914856, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:02:31] (step=0001396) Train Loss: 0.2052, Train Steps/Sec: 0.13, Epoch: 0.02712786630392538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:02:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1397, "loss": 0.36533862352371216, "memory_gb": 7.715639114379883, "step_time_ms": 5310.012340545654, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:02:37] (step=0001397) Train Loss: 0.2586, Train Steps/Sec: 0.16, Epoch: 0.027147298872910997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1398, "loss": 0.26807427406311035, "memory_gb": 7.721559524536133, "step_time_ms": 7501.304864883423, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:02:45] (step=0001398) Train Loss: 0.2482, Train Steps/Sec: 0.12, Epoch: 0.02716673144189662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:02:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1399, "loss": 0.13764949142932892, "memory_gb": 7.721559524536133, "step_time_ms": 7438.482761383057, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:02:53] (step=0001399) Train Loss: 0.1615, Train Steps/Sec: 0.13, Epoch: 0.02718616401088224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1400, "loss": 0.33992213010787964, "memory_gb": 7.721559524536133, "step_time_ms": 7406.412124633789, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:01] (step=0001400) Train Loss: 0.2600, Train Steps/Sec: 
0.12, Epoch: 0.027205596579867857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1401, "loss": 0.24806548655033112, "memory_gb": 7.721559524536133, "step_time_ms": 7542.335510253906, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:09] (step=0001401) Train Loss: 0.2898, Train Steps/Sec: 0.12, Epoch: 0.02722502914885348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1402, "loss": 0.21972939372062683, "memory_gb": 7.721559524536133, "step_time_ms": 7499.162197113037, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:17] (step=0001402) Train Loss: 0.2709, Train Steps/Sec: 0.12, Epoch: 0.027244461717839098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1403, "loss": 0.17364917695522308, "memory_gb": 7.721559524536133, "step_time_ms": 7491.656064987183, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:25] (step=0001403) Train Loss: 0.2314, Train Steps/Sec: 0.12, Epoch: 0.027263894286824717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1404, "loss": 0.28682807087898254, "memory_gb": 7.721559524536133, "step_time_ms": 7522.985935211182, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:33] (step=0001404) Train Loss: 0.2730, Train Steps/Sec: 0.12, Epoch: 0.02728332685581034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1405, "loss": 0.1995939016342163, "memory_gb": 7.721559524536133, "step_time_ms": 7490.64564704895, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:41] (step=0001405) Train Loss: 0.1575, Train Steps/Sec: 0.12, Epoch: 0.027302759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1406, "loss": 
0.16433298587799072, "memory_gb": 7.721559524536133, "step_time_ms": 7400.9082317352295, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:50] (step=0001406) Train Loss: 0.1860, Train Steps/Sec: 0.12, Epoch: 0.027322191993781577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:03:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1407, "loss": 0.2490883469581604, "memory_gb": 7.721559524536133, "step_time_ms": 7532.633066177368, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:03:58] (step=0001407) Train Loss: 0.2411, Train Steps/Sec: 0.12, Epoch: 0.0273416245627672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:04:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1408, "loss": 0.3755520284175873, "memory_gb": 7.721559524536133, "step_time_ms": 7479.793310165405, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:04:06] (step=0001408) Train Loss: 0.2672, Train Steps/Sec: 0.12, Epoch: 0.027361057131752818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:04:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1409, "loss": 0.18645156919956207, "memory_gb": 7.721559524536133, "step_time_ms": 7432.888031005859, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:04:14] (step=0001409) Train Loss: 0.1929, Train Steps/Sec: 0.12, Epoch: 0.027380489700738437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:04:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1410, "loss": 0.252166211605072, "memory_gb": 7.721559524536133, "step_time_ms": 7511.685609817505, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:04:22] (step=0001410) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 0.02739992226972406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:04:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1411, "loss": 0.20754967629909515, "memory_gb": 7.721559524536133, "step_time_ms": 7494.266510009766, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:04:30] (step=0001411) 
Train Loss: 0.1817, Train Steps/Sec: 0.12, Epoch: 0.027419354838709678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:04:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1412, "loss": 0.1308159977197647, "memory_gb": 7.721559524536133, "step_time_ms": 7259.8161697387695, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:04:38] (step=0001412) Train Loss: 0.1922, Train Steps/Sec: 0.13, Epoch: 0.027438787407695297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1413, "loss": 0.21141427755355835, "memory_gb": 7.721559524536133, "step_time_ms": 7477.15425491333, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:04:46] (step=0001413) Train Loss: 0.2267, Train Steps/Sec: 0.12, Epoch: 0.027458219976680916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:04:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1414, "loss": 0.27363985776901245, "memory_gb": 7.715639114379883, "step_time_ms": 7446.304082870483, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:04:54] (step=0001414) Train Loss: 0.2810, Train Steps/Sec: 0.12, Epoch: 0.027477652545666538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:05:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1415, "loss": 0.2588953375816345, "memory_gb": 7.721559524536133, "step_time_ms": 7379.854917526245, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:05:02] (step=0001415) Train Loss: 0.2508, Train Steps/Sec: 0.12, Epoch: 0.027497085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:05:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1416, "loss": 0.29746291041374207, "memory_gb": 7.721559524536133, "step_time_ms": 7502.769708633423, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:05:10] (step=0001416) Train Loss: 0.2923, Train Steps/Sec: 0.12, Epoch: 0.027516517683637776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:05:18] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 1417, "loss": 0.272899329662323, "memory_gb": 7.721559524536133, "step_time_ms": 7481.055021286011, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:05:18] (step=0001417) Train Loss: 0.2572, Train Steps/Sec: 0.12, Epoch: 0.027535950252623398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1418, "loss": 0.20188622176647186, "memory_gb": 7.721559524536133, "step_time_ms": 7380.767583847046, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:05:26] (step=0001418) Train Loss: 0.2491, Train Steps/Sec: 0.12, Epoch: 0.027555382821609017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:05:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1419, "loss": 0.28202322125434875, "memory_gb": 7.721559524536133, "step_time_ms": 7474.672794342041, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:05:34] (step=0001419) Train Loss: 0.2029, Train Steps/Sec: 0.12, Epoch: 0.027574815390594636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:05:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1420, "loss": 0.2952306866645813, "memory_gb": 7.721559524536133, "step_time_ms": 7480.052709579468, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:05:42] (step=0001420) Train Loss: 0.2954, Train Steps/Sec: 0.12, Epoch: 0.027594247959580258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:05:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1421, "loss": 0.18100059032440186, "memory_gb": 7.721559524536133, "step_time_ms": 7582.126617431641, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:05:50] (step=0001421) Train Loss: 0.2295, Train Steps/Sec: 0.12, Epoch: 0.027613680528565877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1422, "loss": 0.24521708488464355, "memory_gb": 7.721559524536133, "step_time_ms": 7482.710838317871, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:05:58] (step=0001422) Train Loss: 0.2541, Train Steps/Sec: 0.12, Epoch: 0.027633113097551495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:06:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1423, "loss": 0.3264811635017395, "memory_gb": 7.721559524536133, "step_time_ms": 7488.461256027222, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:06:06] (step=0001423) Train Loss: 0.2839, Train Steps/Sec: 0.12, Epoch: 0.027652545666537118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:06:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1424, "loss": 0.3379301428794861, "memory_gb": 7.715639114379883, "step_time_ms": 7239.479541778564, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:06:14] (step=0001424) Train Loss: 0.2500, Train Steps/Sec: 0.13, Epoch: 0.027671978235522737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:06:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1425, "loss": 0.2730492353439331, "memory_gb": 7.721559524536133, "step_time_ms": 7529.632568359375, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:06:22] (step=0001425) Train Loss: 0.2878, Train Steps/Sec: 0.13, Epoch: 0.027691410804508355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:06:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1426, "loss": 0.2726285457611084, "memory_gb": 7.721559524536133, "step_time_ms": 5596.49395942688, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:06:28] (step=0001426) Train Loss: 0.2984, Train Steps/Sec: 0.17, Epoch: 0.027710843373493974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:06:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1427, "loss": 0.12666192650794983, "memory_gb": 7.721559524536133, "step_time_ms": 7553.573369979858, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:06:36] (step=0001427) Train Loss: 0.1712, Train Steps/Sec: 0.12, Epoch: 0.027730275942479597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:06:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1428, "loss": 0.20234635472297668, "memory_gb": 7.721559524536133, "step_time_ms": 7573.718070983887, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:06:44] (step=0001428) Train Loss: 0.2025, Train Steps/Sec: 0.12, Epoch: 0.027749708511465215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:06:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1429, "loss": 0.202745720744133, "memory_gb": 7.721559524536133, "step_time_ms": 7488.442420959473, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:06:52] (step=0001429) Train Loss: 0.2172, Train Steps/Sec: 0.13, Epoch: 0.027769141080450834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1430, "loss": 0.19648602604866028, "memory_gb": 7.721559524536133, "step_time_ms": 7581.271171569824, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:00] (step=0001430) Train Loss: 0.2025, Train Steps/Sec: 0.12, Epoch: 0.027788573649436456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1431, "loss": 0.16103239357471466, "memory_gb": 7.721559524536133, "step_time_ms": 7590.447664260864, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:08] (step=0001431) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.027808006218422075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1432, "loss": 0.31665605306625366, "memory_gb": 7.721559524536133, "step_time_ms": 7516.978025436401, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:16] (step=0001432) Train Loss: 0.2861, Train Steps/Sec: 0.12, Epoch: 0.027827438787407694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1433, "loss": 0.27712613344192505, "memory_gb": 7.721559524536133, "step_time_ms": 7572.70359992981, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:24] (step=0001433) Train Loss: 0.2416, Train Steps/Sec: 0.12, Epoch: 0.027846871356393316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1434, "loss": 0.25883322954177856, "memory_gb": 7.721559524536133, "step_time_ms": 7556.570529937744, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:32] (step=0001434) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.027866303925378935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1435, "loss": 0.24995513260364532, "memory_gb": 7.721559524536133, "step_time_ms": 7527.822256088257, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:40] (step=0001435) Train Loss: 0.2666, Train Steps/Sec: 0.12, Epoch: 0.027885736494364554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1436, "loss": 0.25685304403305054, "memory_gb": 7.721559524536133, "step_time_ms": 7660.412311553955, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:48] (step=0001436) Train Loss: 0.2402, Train Steps/Sec: 0.12, Epoch: 0.027905169063350176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1437, "loss": 0.2368076890707016, "memory_gb": 7.721559524536133, "step_time_ms": 7497.472047805786, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:07:56] (step=0001437) Train Loss: 0.2593, Train Steps/Sec: 0.13, Epoch: 0.027924601632335795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1438, "loss": 0.22381994128227234, "memory_gb": 7.721559524536133, "step_time_ms": 7471.43030166626, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:08:04] (step=0001438) Train Loss: 0.2419, Train Steps/Sec: 0.12, Epoch: 0.027944034201321414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:08:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1439, "loss": 0.23520588874816895, "memory_gb": 7.721559524536133, "step_time_ms": 7618.501663208008, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:08:13] (step=0001439) Train Loss: 0.2493, Train Steps/Sec: 0.12, Epoch: 0.027963466770307036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:08:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1440, "loss": 0.15783922374248505, "memory_gb": 7.721559524536133, "step_time_ms": 7527.905225753784, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:08:21] (step=0001440) Train Loss: 0.2162, Train Steps/Sec: 0.12, Epoch: 0.027982899339292655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1441, "loss": 0.1547241508960724, "memory_gb": 7.721559524536133, "step_time_ms": 7451.131343841553, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:08:29] (step=0001441) Train Loss: 0.1937, Train Steps/Sec: 0.12, Epoch: 0.028002331908278274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:08:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1442, "loss": 0.26849308609962463, "memory_gb": 7.721559524536133, "step_time_ms": 7580.029487609863, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:08:37] (step=0001442) Train Loss: 0.2601, Train Steps/Sec: 0.12, Epoch: 0.028021764477263893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:08:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1443, "loss": 0.4229837656021118, "memory_gb": 7.721559524536133, "step_time_ms": 7538.3620262146, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:08:45] (step=0001443) Train Loss: 0.3434, Train Steps/Sec: 0.13, Epoch: 0.028041197046249515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:08:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1444, "loss": 0.293049693107605, "memory_gb": 7.721559524536133, "step_time_ms": 7458.831787109375, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:08:53] (step=0001444) Train Loss: 0.2719, Train Steps/Sec: 0.13, Epoch: 0.028060629615235134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1445, "loss": 0.14176928997039795, "memory_gb": 7.721559524536133, "step_time_ms": 7517.809629440308, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:01] (step=0001445) Train Loss: 0.1575, Train Steps/Sec: 0.12, Epoch: 0.028080062184220753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1446, "loss": 0.35737714171409607, "memory_gb": 7.721559524536133, "step_time_ms": 7551.278114318848, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:09] (step=0001446) Train Loss: 0.2443, Train Steps/Sec: 0.12, Epoch: 0.028099494753206375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1447, "loss": 0.20638668537139893, "memory_gb": 7.721559524536133, "step_time_ms": 7485.679626464844, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:17] (step=0001447) Train Loss: 0.2757, Train Steps/Sec: 0.13, Epoch: 0.028118927322191994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1448, "loss": 0.21507097780704498, "memory_gb": 7.721559524536133, "step_time_ms": 7667.17791557312, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:25] (step=0001448) Train Loss: 0.2245, Train Steps/Sec: 0.12, Epoch: 0.028138359891177613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1449, "loss": 0.2001233994960785, "memory_gb": 7.721559524536133, "step_time_ms": 7547.053337097168, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:33] (step=0001449) Train Loss: 0.2092, Train Steps/Sec: 0.12, Epoch: 0.028157792460163235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1450, "loss": 0.17708317935466766, "memory_gb": 7.721559524536133, "step_time_ms": 7460.036516189575, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:41] (step=0001450) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.028177225029148854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1451, "loss": 0.20355293154716492, "memory_gb": 7.721559524536133, "step_time_ms": 7509.124994277954, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:49] (step=0001451) Train Loss: 0.2561, Train Steps/Sec: 0.12, Epoch: 0.028196657598134472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:09:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1452, "loss": 0.170872300863266, "memory_gb": 7.721559524536133, "step_time_ms": 7430.225372314453, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:09:57] (step=0001452) Train Loss: 0.2535, Train Steps/Sec: 0.13, Epoch: 0.028216090167120095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1453, "loss": 0.3643742501735687, "memory_gb": 7.721559524536133, "step_time_ms": 7271.608352661133, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:05] (step=0001453) Train Loss: 0.3300, Train Steps/Sec: 0.13, Epoch: 0.028235522736105714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1454, "loss": 0.190092533826828, "memory_gb": 7.721559524536133, "step_time_ms": 7539.653301239014, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:13] (step=0001454) Train Loss: 0.2088, Train Steps/Sec: 0.12, Epoch: 0.028254955305091332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1455, "loss": 0.34455591440200806, "memory_gb": 7.721559524536133, "step_time_ms": 5068.812608718872, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:19] (step=0001455) Train Loss: 0.3092, Train Steps/Sec: 0.17, Epoch: 0.028274387874076955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1456, "loss": 0.22962923347949982, "memory_gb": 7.721559524536133, "step_time_ms": 7538.999080657959, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:27] (step=0001456) Train Loss: 0.2433, Train Steps/Sec: 0.12, Epoch: 0.028293820443062574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1457, "loss": 0.21613235771656036, "memory_gb": 7.721559524536133, "step_time_ms": 7456.408977508545, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:35] (step=0001457) Train Loss: 0.2416, Train Steps/Sec: 0.13, Epoch: 0.028313253012048192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1458, "loss": 0.25018495321273804, "memory_gb": 7.721559524536133, "step_time_ms": 7441.0529136657715, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:43] (step=0001458) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.02833268558103381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1459, "loss": 0.2293931245803833, "memory_gb": 7.721559524536133, "step_time_ms": 7506.160020828247, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:51] (step=0001459) Train Loss: 0.2588, Train Steps/Sec: 0.12, Epoch: 0.028352118150019433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:10:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1460, "loss": 0.1718982756137848, "memory_gb": 7.721559524536133, "step_time_ms": 7506.187438964844, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:10:59] (step=0001460) Train Loss: 0.2046, Train Steps/Sec: 0.12, Epoch: 0.028371550719005052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:11:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1461, "loss": 0.13940048217773438, "memory_gb": 7.721559524536133, "step_time_ms": 7427.362442016602, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:11:07] (step=0001461) Train Loss: 0.2046, Train Steps/Sec: 0.12, Epoch: 0.02839098328799067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:11:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1462, "loss": 0.23900055885314941, "memory_gb": 7.721559524536133, "step_time_ms": 7528.9647579193115, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:11:15] (step=0001462) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.028410415856976293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:11:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1463, "loss": 0.20106005668640137, "memory_gb": 7.721559524536133, "step_time_ms": 7428.9610385894775, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:11:23] (step=0001463) Train Loss: 0.1910, Train Steps/Sec: 0.12, Epoch: 0.028429848425961912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:11:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1464, "loss": 0.13272862136363983, "memory_gb": 7.721559524536133, "step_time_ms": 7456.778764724731, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:11:32] (step=0001464) Train Loss: 0.2287, Train Steps/Sec: 0.12, Epoch: 0.02844928099494753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:11:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1465, "loss": 0.0931832417845726, "memory_gb": 7.721559524536133, "step_time_ms": 7505.706548690796, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:11:40] (step=0001465) Train Loss: 0.1895, Train Steps/Sec: 0.12, Epoch: 0.028468713563933153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:11:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1466, "loss": 0.290833979845047, "memory_gb": 7.721559524536133, "step_time_ms": 7455.454587936401, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:11:48] (step=0001466) Train Loss: 0.2866, Train Steps/Sec: 0.12, Epoch: 0.028488146132918772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:11:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1467, "loss": 0.23721492290496826, "memory_gb": 7.721559524536133, "step_time_ms": 7396.827936172485, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:11:56] (step=0001467) Train Loss: 0.2541, Train Steps/Sec: 0.13, Epoch: 0.02850757870190439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:12:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1468, "loss": 0.24733750522136688, "memory_gb": 7.721559524536133, "step_time_ms": 7593.019962310791, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:12:04] (step=0001468) Train Loss: 0.2508, Train Steps/Sec: 0.12, Epoch: 0.028527011270890013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:12:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1469, "loss": 0.2454160451889038, "memory_gb": 7.721559524536133, "step_time_ms": 7451.433420181274, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:12:12] (step=0001469) Train Loss: 0.2440, Train Steps/Sec: 0.12, Epoch: 0.028546443839875632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:12:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1470, "loss": 0.30828818678855896, "memory_gb": 7.721559524536133, "step_time_ms": 7414.634943008423, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:12:20] (step=0001470) Train Loss: 0.2777, Train Steps/Sec: 0.12, Epoch: 0.02856587640886125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:12:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1471, "loss": 0.28355374932289124, "memory_gb": 7.721559524536133, "step_time_ms": 7541.630983352661, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:12:28] (step=0001471) Train Loss: 0.2779, Train Steps/Sec: 0.12, Epoch: 0.02858530897784687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:12:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1472, "loss": 0.14216139912605286, "memory_gb": 7.721559524536133, "step_time_ms": 7507.747650146484, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:12:36] (step=0001472) Train Loss: 0.1884, Train Steps/Sec: 0.12, Epoch: 0.028604741546832492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:12:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1473, "loss": 0.13606834411621094, "memory_gb": 7.721559524536133, "step_time_ms": 7498.464107513428, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:12:44] (step=0001473) Train Loss: 0.2482, Train Steps/Sec: 0.12, Epoch: 0.02862417411581811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:12:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1474, "loss": 0.2515171766281128, "memory_gb": 7.721559524536133, "step_time_ms": 7545.150995254517, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:12:52] (step=0001474) Train Loss: 0.2742, Train Steps/Sec: 0.12, Epoch: 0.02864360668480373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1475, "loss": 0.2609594464302063, "memory_gb": 7.721559524536133, "step_time_ms": 7519.059181213379, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:00] (step=0001475) Train Loss: 0.2395, Train Steps/Sec: 0.12, Epoch: 0.028663039253789352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1476, "loss": 0.2735012471675873, "memory_gb": 7.721559524536133, "step_time_ms": 7463.031768798828, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:08] (step=0001476) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.02868247182277497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1477, "loss": 0.28648290038108826, "memory_gb": 7.721559524536133, "step_time_ms": 7550.343036651611, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:16] (step=0001477) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.02870190439176059, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1478, "loss": 0.18008951842784882, "memory_gb": 7.721559524536133, "step_time_ms": 7513.7598514556885, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:24] (step=0001478) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.028721336960746212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1479, "loss": 0.24422793090343475, "memory_gb": 7.721559524536133, "step_time_ms": 7487.143516540527, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:32] (step=0001479) Train Loss: 0.2813, Train Steps/Sec: 0.12, Epoch: 0.02874076952973183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1480, "loss": 0.18858692049980164, "memory_gb": 7.721559524536133, "step_time_ms": 7559.090852737427, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:40] (step=0001480) Train Loss: 0.2553, Train Steps/Sec: 0.12, Epoch: 0.02876020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1481, "loss": 0.16170556843280792, "memory_gb": 7.721559524536133, "step_time_ms": 7333.064556121826, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:48] (step=0001481) Train Loss: 0.1877, Train Steps/Sec: 0.13, Epoch: 0.028779634667703072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:13:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1482, "loss": 0.1807539016008377, "memory_gb": 7.721559524536133, "step_time_ms": 7350.473165512085, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:13:56] (step=0001482) Train Loss: 0.1979, Train Steps/Sec: 0.13, Epoch: 0.02879906723668869, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1483, "loss": 0.3029755651950836, "memory_gb": 7.721559524536133, "step_time_ms": 7582.958459854126, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:04] (step=0001483) Train Loss: 0.2683, Train Steps/Sec: 0.12, Epoch: 0.02881849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1484, "loss": 0.22778961062431335, "memory_gb": 7.721559524536133, "step_time_ms": 5225.526809692383, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:10] (step=0001484) Train Loss: 0.2038, Train Steps/Sec: 0.18, Epoch: 0.02883793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1485, "loss": 0.30163198709487915, "memory_gb": 7.721559524536133, "step_time_ms": 7670.611143112183, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:18] (step=0001485) Train Loss: 0.2637, Train Steps/Sec: 0.12, Epoch: 0.02885736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1486, "loss": 0.24413952231407166, "memory_gb": 7.721559524536133, "step_time_ms": 7556.491374969482, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:26] (step=0001486) Train Loss: 0.2498, Train Steps/Sec: 0.12, Epoch: 0.02887679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1487, "loss": 0.24520951509475708, "memory_gb": 7.721559524536133, "step_time_ms": 7517.311811447144, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:34] (step=0001487) Train Loss: 0.2301, Train Steps/Sec: 0.12, Epoch: 0.028896230081616788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1488, "loss": 0.20329852402210236, "memory_gb": 7.721559524536133, "step_time_ms": 7616.0948276519775, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:42] (step=0001488) Train Loss: 0.2038, Train Steps/Sec: 0.12, Epoch: 0.02891566265060241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1489, "loss": 0.1654442846775055, "memory_gb": 7.721559524536133, "step_time_ms": 7576.642036437988, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:50] (step=0001489) Train Loss: 0.2302, Train Steps/Sec: 0.12, Epoch: 0.02893509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:14:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1490, "loss": 0.28251510858535767, "memory_gb": 7.721559524536133, "step_time_ms": 7537.240505218506, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:14:58] (step=0001490) Train Loss: 0.2235, Train Steps/Sec: 0.12, Epoch: 0.028954527788573648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:15:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1491, "loss": 0.15966522693634033, "memory_gb": 7.721559524536133, "step_time_ms": 7613.023281097412, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:15:06] (step=0001491) Train Loss: 0.1737, Train Steps/Sec: 0.12, Epoch: 0.02897396035755927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:15:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1492, "loss": 0.16996057331562042, "memory_gb": 7.721559524536133, "step_time_ms": 7608.932256698608, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:15:14] (step=0001492) Train Loss: 0.2091, Train Steps/Sec: 0.12, Epoch: 0.02899339292654489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:15:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1493, "loss": 0.15194247663021088, "memory_gb": 7.721559524536133, "step_time_ms": 7525.187969207764, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:15:23] (step=0001493) Train Loss: 0.1768, Train Steps/Sec: 0.12, Epoch: 0.029012825495530508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:15:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1494, "loss": 0.2063826024532318, "memory_gb": 7.721559524536133, "step_time_ms": 7569.012880325317, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:15:31] (step=0001494) Train Loss: 0.1919, Train Steps/Sec: 0.12, Epoch: 0.02903225806451613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:15:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1495, "loss": 0.2749066948890686, "memory_gb": 7.721559524536133, "step_time_ms": 7625.745534896851, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:15:39] (step=0001495) Train Loss: 0.2807, Train Steps/Sec: 0.12, Epoch: 0.02905169063350175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1496, "loss": 0.2662739157676697, "memory_gb": 7.721559524536133, "step_time_ms": 7530.29203414917, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:15:47] (step=0001496) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.029071123202487368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:15:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1497, "loss": 0.3573540151119232, "memory_gb": 7.721559524536133, "step_time_ms": 7530.911684036255, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:15:55] (step=0001497) Train Loss: 0.3340, Train Steps/Sec: 0.12, Epoch: 0.02909055577147299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1498, "loss": 0.18593433499336243, "memory_gb": 7.721559524536133, "step_time_ms": 7492.636442184448, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:03] (step=0001498) Train Loss: 0.1595, Train Steps/Sec: 0.13, Epoch: 0.02910998834045861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1499, "loss": 0.3883060812950134, "memory_gb": 7.721559524536133, "step_time_ms": 7420.753002166748, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:11] (step=0001499) Train Loss: 0.3056, Train Steps/Sec: 0.12, Epoch: 0.029129420909444228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1500, "loss": 0.16603893041610718, "memory_gb": 7.721559524536133, "step_time_ms": 7548.379182815552, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:19] (step=0001500) Train Loss: 0.1764, Train Steps/Sec: 0.12, Epoch: 0.029148853478429847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1501, "loss": 0.30909278988838196, "memory_gb": 7.721559524536133, "step_time_ms": 7490.381717681885, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:27] (step=0001501) Train Loss: 0.2712, Train Steps/Sec: 0.12, Epoch: 0.02916828604741547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1502, "loss": 0.1955433338880539, "memory_gb": 7.721559524536133, "step_time_ms": 7376.856327056885, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:35] (step=0001502) Train Loss: 0.1966, Train Steps/Sec: 0.13, Epoch: 0.029187718616401088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1503, "loss": 0.16997990012168884, "memory_gb": 7.721559524536133, "step_time_ms": 7540.783643722534, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:43] (step=0001503) Train Loss: 0.1940, Train Steps/Sec: 0.12, Epoch: 0.029207151185386707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1504, "loss": 0.3822261095046997, "memory_gb": 7.721559524536133, "step_time_ms": 7508.069038391113, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:51] (step=0001504) Train Loss: 0.2848, Train Steps/Sec: 0.12, Epoch: 0.02922658375437233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:16:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1505, "loss": 0.29916197061538696, "memory_gb": 7.721559524536133, "step_time_ms": 7423.692464828491, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:16:59] (step=0001505) Train Loss: 0.2398, Train Steps/Sec: 0.13, Epoch: 0.029246016323357948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:17:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1506, "loss": 0.23889698088169098, "memory_gb": 7.721559524536133, "step_time_ms": 7524.634122848511, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:17:07] (step=0001506) Train Loss: 0.2415, Train Steps/Sec: 0.12, Epoch: 0.029265448892343567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:17:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1507, "loss": 0.28921687602996826, "memory_gb": 7.721559524536133, "step_time_ms": 7440.645694732666, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:17:15] (step=0001507) Train Loss: 0.2200, Train Steps/Sec: 0.13, Epoch: 0.02928488146132919, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:17:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1508, "loss": 0.1623641550540924, "memory_gb": 7.721559524536133, "step_time_ms": 7427.528142929077, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:17:23] (step=0001508) Train Loss: 0.2581, Train Steps/Sec: 0.13, Epoch: 0.029304314030314808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:17:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1509, "loss": 0.32774826884269714, "memory_gb": 7.721559524536133, "step_time_ms": 7633.834362030029, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:17:31] (step=0001509) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.029323746599300426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:17:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1510, "loss": 0.19113817811012268, "memory_gb": 7.721559524536133, "step_time_ms": 7540.527105331421, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:17:39] (step=0001510) Train Loss: 0.2282, Train Steps/Sec: 0.12, Epoch: 0.02934317916828605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:17:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1511, "loss": 0.28177499771118164, "memory_gb": 7.721559524536133, "step_time_ms": 7313.22979927063, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:17:47] (step=0001511) Train Loss: 0.2447, Train Steps/Sec: 0.13, Epoch: 0.029362611737271668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:17:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1512, "loss": 0.27296769618988037, "memory_gb": 7.721559524536133, "step_time_ms": 7449.096202850342, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:17:55] (step=0001512) Train Loss: 0.1921, Train Steps/Sec: 0.13, Epoch: 0.029382044306257286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1513, "loss": 0.18312019109725952, "memory_gb": 7.721559524536133, "step_time_ms": 5352.591514587402, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:00] (step=0001513) Train Loss: 0.1963, Train Steps/Sec: 0.17, Epoch: 0.02940147687524291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1514, "loss": 0.3222377598285675, "memory_gb": 7.721559524536133, "step_time_ms": 7554.4469356536865, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:09] (step=0001514) Train Loss: 0.2995, Train Steps/Sec: 0.12, Epoch: 0.029420909444228528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1515, "loss": 0.21335172653198242, "memory_gb": 7.721559524536133, "step_time_ms": 7491.854190826416, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:17] (step=0001515) Train Loss: 0.2775, Train Steps/Sec: 0.12, Epoch: 0.029440342013214146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1516, "loss": 0.20275932550430298, "memory_gb": 7.721559524536133, "step_time_ms": 7438.745975494385, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:25] (step=0001516) Train Loss: 0.2592, Train Steps/Sec: 0.12, Epoch: 0.029459774582199765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1517, "loss": 0.3677682876586914, "memory_gb": 7.721559524536133, "step_time_ms": 7548.640251159668, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:33] (step=0001517) Train Loss: 0.3021, Train Steps/Sec: 0.12, Epoch: 0.029479207151185387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1518, "loss": 0.2297372817993164, "memory_gb": 7.721559524536133, "step_time_ms": 7468.350648880005, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:41] (step=0001518) Train Loss: 0.2108, Train Steps/Sec: 0.12, Epoch: 0.029498639720171006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1519, "loss": 0.32985931634902954, "memory_gb": 7.721559524536133, "step_time_ms": 7424.584627151489, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:49] (step=0001519) Train Loss: 0.2487, Train Steps/Sec: 0.12, Epoch: 0.029518072289156625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1520, "loss": 0.21520906686782837, "memory_gb": 7.721559524536133, "step_time_ms": 7501.361846923828, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:18:57] (step=0001520) Train Loss: 0.2377, Train Steps/Sec: 0.12, Epoch: 0.029537504858142247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:19:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1521, "loss": 0.2548912763595581, "memory_gb": 7.721559524536133, "step_time_ms": 7467.530250549316, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:19:05] (step=0001521) Train Loss: 0.2678, Train Steps/Sec: 0.12, Epoch: 0.029556937427127866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:19:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1522, "loss": 0.20085600018501282, "memory_gb": 7.721559524536133, "step_time_ms": 7417.952537536621, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:19:13] (step=0001522) Train Loss: 0.1909, Train Steps/Sec: 0.12, Epoch: 0.029576369996113485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:19:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1523, "loss": 0.2902321219444275, "memory_gb": 7.721559524536133, "step_time_ms": 7469.514846801758, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:19:21] (step=0001523) Train Loss: 0.2942, Train Steps/Sec: 0.12, Epoch: 0.029595802565099107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:19:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1524, "loss": 0.2109019011259079, "memory_gb": 7.721559524536133, "step_time_ms": 7473.19769859314, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:19:29] (step=0001524) Train Loss: 0.2379, Train Steps/Sec: 0.12, Epoch: 0.029615235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:19:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1525, "loss": 0.19888603687286377, "memory_gb": 7.721559524536133, "step_time_ms": 7368.343353271484, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:19:37] (step=0001525) Train Loss: 0.1734, Train Steps/Sec: 0.12, Epoch: 0.029634667703070345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:19:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1526, "loss": 0.1726556122303009, "memory_gb": 7.721559524536133, "step_time_ms": 7351.483583450317, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:19:45] (step=0001526) Train Loss: 0.2039, Train Steps/Sec: 0.12, Epoch: 0.029654100272055967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:19:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1527, "loss": 0.22005648910999298, "memory_gb": 7.721559524536133, "step_time_ms": 7426.813125610352, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:19:53] (step=0001527) Train Loss: 0.1755, Train Steps/Sec: 0.12, Epoch: 0.029673532841041586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:20:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1528, "loss": 0.28208181262016296, "memory_gb": 7.721559524536133, "step_time_ms": 7433.208227157593, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:20:01] (step=0001528) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.029692965410027205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:20:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1529, "loss": 0.35311639308929443, "memory_gb": 7.721559524536133, "step_time_ms": 7542.070627212524, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:20:09] (step=0001529) Train Loss: 0.2964, Train Steps/Sec: 0.12, Epoch: 0.029712397979012827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:20:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1530, "loss": 0.3063543736934662, "memory_gb": 7.721559524536133, "step_time_ms": 7520.450592041016, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:20:18] (step=0001530) Train Loss: 0.3151, Train Steps/Sec: 0.12, Epoch: 0.029731830547998446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:20:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1531, "loss": 0.2905716001987457, "memory_gb": 7.721559524536133, "step_time_ms": 7448.052644729614, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:20:26] (step=0001531) Train Loss: 0.2937, Train Steps/Sec: 0.12, Epoch:
0.029751263116984065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:20:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1532, "loss": 0.27312353253364563, "memory_gb": 7.721559524536133, "step_time_ms": 7459.699392318726, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:20:34] (step=0001532) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.029770695685969684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:20:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1533, "loss": 0.2066981941461563, "memory_gb": 7.721559524536133, "step_time_ms": 7515.125036239624, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:20:42] (step=0001533) Train Loss: 0.2031, Train Steps/Sec: 0.12, Epoch: 0.029790128254955306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:20:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1534, "loss": 0.17746253311634064, "memory_gb": 7.721559524536133, "step_time_ms": 7441.1461353302, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:20:50] (step=0001534) Train Loss: 0.2573, Train Steps/Sec: 0.13, Epoch: 0.029809560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:20:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1535, "loss": 0.2167845368385315, "memory_gb": 7.721559524536133, "step_time_ms": 7472.860097885132, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:20:58] (step=0001535) Train Loss: 0.1989, Train Steps/Sec: 0.12, Epoch: 0.029828993392926544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:21:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1536, "loss": 0.2329188883304596, "memory_gb": 7.721559524536133, "step_time_ms": 7527.933120727539, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:21:06] (step=0001536) Train Loss: 0.2022, Train Steps/Sec: 0.12, Epoch: 0.029848425961912166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:21:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1537, "loss": 0.23170024156570435, 
"memory_gb": 7.721559524536133, "step_time_ms": 7490.814685821533, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:21:14] (step=0001537) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.029867858530897785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:21:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1538, "loss": 0.2377096712589264, "memory_gb": 7.721559524536133, "step_time_ms": 7516.081094741821, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:21:22] (step=0001538) Train Loss: 0.2062, Train Steps/Sec: 0.12, Epoch: 0.029887291099883403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:21:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1539, "loss": 0.18762145936489105, "memory_gb": 7.721559524536133, "step_time_ms": 7540.65203666687, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:21:30] (step=0001539) Train Loss: 0.2231, Train Steps/Sec: 0.12, Epoch: 0.029906723668869026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:21:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1540, "loss": 0.3166124224662781, "memory_gb": 7.721559524536133, "step_time_ms": 7381.588459014893, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:21:38] (step=0001540) Train Loss: 0.2821, Train Steps/Sec: 0.13, Epoch: 0.029926156237854645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:21:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1541, "loss": 0.28715717792510986, "memory_gb": 7.721559524536133, "step_time_ms": 7536.378860473633, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:21:46] (step=0001541) Train Loss: 0.2616, Train Steps/Sec: 0.13, Epoch: 0.029945588806840263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:21:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1542, "loss": 0.24942079186439514, "memory_gb": 7.721559524536133, "step_time_ms": 5340.445995330811, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:21:52] (step=0001542) Train Loss: 0.2071, 
Train Steps/Sec: 0.17, Epoch: 0.029965021375825886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1543, "loss": 0.309437096118927, "memory_gb": 7.721559524536133, "step_time_ms": 7497.761249542236, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:00] (step=0001543) Train Loss: 0.3256, Train Steps/Sec: 0.12, Epoch: 0.029984453944811505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1544, "loss": 0.31691575050354004, "memory_gb": 7.721559524536133, "step_time_ms": 7478.304386138916, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:08] (step=0001544) Train Loss: 0.2430, Train Steps/Sec: 0.12, Epoch: 0.030003886513797123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1545, "loss": 0.35532480478286743, "memory_gb": 7.721559524536133, "step_time_ms": 7495.862722396851, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:16] (step=0001545) Train Loss: 0.3059, Train Steps/Sec: 0.12, Epoch: 0.030023319082782742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1546, "loss": 0.277884840965271, "memory_gb": 7.721559524536133, "step_time_ms": 7552.421092987061, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:24] (step=0001546) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.030042751651768364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1547, "loss": 0.1656087338924408, "memory_gb": 7.721559524536133, "step_time_ms": 7345.854759216309, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:32] (step=0001547) Train Loss: 0.2092, Train Steps/Sec: 0.12, Epoch: 0.030062184220753983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1548, "loss": 
0.3087192475795746, "memory_gb": 7.721559524536133, "step_time_ms": 7479.360580444336, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:40] (step=0001548) Train Loss: 0.2758, Train Steps/Sec: 0.13, Epoch: 0.030081616789739602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1549, "loss": 0.22585821151733398, "memory_gb": 7.721559524536133, "step_time_ms": 7583.227872848511, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:48] (step=0001549) Train Loss: 0.2235, Train Steps/Sec: 0.12, Epoch: 0.030101049358725224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:22:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1550, "loss": 0.31783825159072876, "memory_gb": 7.721559524536133, "step_time_ms": 7532.310009002686, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:22:56] (step=0001550) Train Loss: 0.2901, Train Steps/Sec: 0.12, Epoch: 0.030120481927710843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:23:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1551, "loss": 0.2749325633049011, "memory_gb": 7.721559524536133, "step_time_ms": 7497.34902381897, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:23:04] (step=0001551) Train Loss: 0.2421, Train Steps/Sec: 0.12, Epoch: 0.030139914496696462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:23:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1552, "loss": 0.22793473303318024, "memory_gb": 7.721559524536133, "step_time_ms": 7598.994970321655, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:23:12] (step=0001552) Train Loss: 0.2229, Train Steps/Sec: 0.12, Epoch: 0.030159347065682084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:23:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1553, "loss": 0.3082171678543091, "memory_gb": 7.721559524536133, "step_time_ms": 7622.500658035278, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:23:20] (step=0001553) 
Train Loss: 0.2540, Train Steps/Sec: 0.12, Epoch: 0.030178779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:23:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1554, "loss": 0.244315043091774, "memory_gb": 7.721559524536133, "step_time_ms": 7582.963228225708, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:23:28] (step=0001554) Train Loss: 0.2749, Train Steps/Sec: 0.12, Epoch: 0.030198212203653322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:23:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1555, "loss": 0.31056660413742065, "memory_gb": 7.721559524536133, "step_time_ms": 7603.323221206665, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:23:36] (step=0001555) Train Loss: 0.2359, Train Steps/Sec: 0.12, Epoch: 0.030217644772638944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:23:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1556, "loss": 0.22638389468193054, "memory_gb": 7.721559524536133, "step_time_ms": 7555.131912231445, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:23:44] (step=0001556) Train Loss: 0.2806, Train Steps/Sec: 0.12, Epoch: 0.030237077341624563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:23:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1557, "loss": 0.28333723545074463, "memory_gb": 7.721559524536133, "step_time_ms": 7682.351589202881, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:23:52] (step=0001557) Train Loss: 0.2319, Train Steps/Sec: 0.12, Epoch: 0.030256509910610182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1558, "loss": 0.2665410041809082, "memory_gb": 7.721559524536133, "step_time_ms": 7617.356777191162, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:24:00] (step=0001558) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.030275942479595804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:09] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 1559, "loss": 0.2044876217842102, "memory_gb": 7.721559524536133, "step_time_ms": 7551.759243011475, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:24:09] (step=0001559) Train Loss: 0.2591, Train Steps/Sec: 0.12, Epoch: 0.030295375048581423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1560, "loss": 0.20534640550613403, "memory_gb": 7.721559524536133, "step_time_ms": 7507.112264633179, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:24:17] (step=0001560) Train Loss: 0.2256, Train Steps/Sec: 0.12, Epoch: 0.030314807617567042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1561, "loss": 0.3154484033584595, "memory_gb": 7.721559524536133, "step_time_ms": 7576.075792312622, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:24:25] (step=0001561) Train Loss: 0.2298, Train Steps/Sec: 0.12, Epoch: 0.03033424018655266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1562, "loss": 0.23014217615127563, "memory_gb": 7.721559524536133, "step_time_ms": 7545.474529266357, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:24:33] (step=0001562) Train Loss: 0.2809, Train Steps/Sec: 0.12, Epoch: 0.030353672755538283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1563, "loss": 0.28534796833992004, "memory_gb": 7.721559524536133, "step_time_ms": 7451.083660125732, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:24:41] (step=0001563) Train Loss: 0.2567, Train Steps/Sec: 0.12, Epoch: 0.030373105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1564, "loss": 0.27698343992233276, "memory_gb": 7.721559524536133, "step_time_ms": 7485.745429992676, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
21:24:49] (step=0001564) Train Loss: 0.2312, Train Steps/Sec: 0.12, Epoch: 0.03039253789350952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:24:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1565, "loss": 0.20551258325576782, "memory_gb": 7.721559524536133, "step_time_ms": 7514.75715637207, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:24:57] (step=0001565) Train Loss: 0.1774, Train Steps/Sec: 0.12, Epoch: 0.030411970462495143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1566, "loss": 0.3221730887889862, "memory_gb": 7.721559524536133, "step_time_ms": 7478.1341552734375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:05] (step=0001566) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 0.03043140303148076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1567, "loss": 0.2512979209423065, "memory_gb": 7.721559524536133, "step_time_ms": 7463.144302368164, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:13] (step=0001567) Train Loss: 0.2488, Train Steps/Sec: 0.12, Epoch: 0.03045083560046638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1568, "loss": 0.31100133061408997, "memory_gb": 7.721559524536133, "step_time_ms": 7510.013580322266, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:21] (step=0001568) Train Loss: 0.2459, Train Steps/Sec: 0.12, Epoch: 0.030470268169452003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1569, "loss": 0.22827976942062378, "memory_gb": 7.721559524536133, "step_time_ms": 7293.4088706970215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:29] (step=0001569) Train Loss: 0.2909, Train Steps/Sec: 0.13, Epoch: 0.03048970073843762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:37] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1570, "loss": 0.30325639247894287, "memory_gb": 7.721559524536133, "step_time_ms": 7407.848596572876, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:37] (step=0001570) Train Loss: 0.2815, Train Steps/Sec: 0.13, Epoch: 0.03050913330742324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1571, "loss": 0.24816928803920746, "memory_gb": 7.721559524536133, "step_time_ms": 5654.839038848877, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:43] (step=0001571) Train Loss: 0.2587, Train Steps/Sec: 0.16, Epoch: 0.030528565876408863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1572, "loss": 0.2449265867471695, "memory_gb": 7.721559524536133, "step_time_ms": 7484.646320343018, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:51] (step=0001572) Train Loss: 0.2470, Train Steps/Sec: 0.12, Epoch: 0.03054799844539448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:25:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1573, "loss": 0.1254080832004547, "memory_gb": 7.721559524536133, "step_time_ms": 7561.005353927612, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:25:59] (step=0001573) Train Loss: 0.1780, Train Steps/Sec: 0.12, Epoch: 0.0305674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:26:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1574, "loss": 0.1941121518611908, "memory_gb": 7.721559524536133, "step_time_ms": 7532.581806182861, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:26:07] (step=0001574) Train Loss: 0.2232, Train Steps/Sec: 0.12, Epoch: 0.030586863583365723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:26:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1575, "loss": 0.21810266375541687, "memory_gb": 7.721559524536133, "step_time_ms": 7480.677127838135, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 21:26:15] (step=0001575) Train Loss: 0.1844, Train Steps/Sec: 0.12, Epoch: 0.03060629615235134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1576, "loss": 0.23773598670959473, "memory_gb": 7.721559524536133, "step_time_ms": 7503.798246383667, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:26:23] (step=0001576) Train Loss: 0.2192, Train Steps/Sec: 0.12, Epoch: 0.03062572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:26:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1577, "loss": 0.2556725740432739, "memory_gb": 7.721559524536133, "step_time_ms": 7427.92534828186, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:26:31] (step=0001577) Train Loss: 0.2433, Train Steps/Sec: 0.13, Epoch: 0.03064516129032258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:26:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1578, "loss": 0.2459968775510788, "memory_gb": 7.721559524536133, "step_time_ms": 7482.868671417236, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:26:39] (step=0001578) Train Loss: 0.1843, Train Steps/Sec: 0.12, Epoch: 0.0306645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:26:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1579, "loss": 0.2944602370262146, "memory_gb": 7.721559524536133, "step_time_ms": 7460.8471393585205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:26:47] (step=0001579) Train Loss: 0.2770, Train Steps/Sec: 0.12, Epoch: 0.03068402642829382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:26:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1580, "loss": 0.1704910397529602, "memory_gb": 7.721559524536133, "step_time_ms": 7405.930280685425, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:26:55] (step=0001580) Train Loss: 0.1825, Train Steps/Sec: 0.12, Epoch: 0.03070345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 21:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1581, "loss": 0.28818702697753906, "memory_gb": 7.721559524536133, "step_time_ms": 7475.613117218018, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:27:03] (step=0001581) Train Loss: 0.2333, Train Steps/Sec: 0.12, Epoch: 0.03072289156626506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:27:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1582, "loss": 0.1407317817211151, "memory_gb": 7.721559524536133, "step_time_ms": 7477.043867111206, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:27:11] (step=0001582) Train Loss: 0.1714, Train Steps/Sec: 0.12, Epoch: 0.03074232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:27:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1583, "loss": 0.24316859245300293, "memory_gb": 7.721559524536133, "step_time_ms": 7406.391859054565, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:27:19] (step=0001583) Train Loss: 0.2678, Train Steps/Sec: 0.13, Epoch: 0.0307617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:27:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1584, "loss": 0.12640877068042755, "memory_gb": 7.721559524536133, "step_time_ms": 7461.1639976501465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:27:27] (step=0001584) Train Loss: 0.1787, Train Steps/Sec: 0.12, Epoch: 0.03078118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:27:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1585, "loss": 0.2949504256248474, "memory_gb": 7.721559524536133, "step_time_ms": 7460.2320194244385, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:27:36] (step=0001585) Train Loss: 0.2848, Train Steps/Sec: 0.12, Epoch: 0.03080062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:27:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1586, "loss": 0.3219158947467804, "memory_gb": 7.721559524536133, "step_time_ms": 7388.967990875244, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 21:27:44] (step=0001586) Train Loss: 0.3140, Train Steps/Sec: 0.13, Epoch: 0.03082005441119316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:27:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1587, "loss": 0.2714681625366211, "memory_gb": 7.721559524536133, "step_time_ms": 7488.502740859985, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:27:52] (step=0001587) Train Loss: 0.2503, Train Steps/Sec: 0.12, Epoch: 0.03083948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:28:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1588, "loss": 0.285028874874115, "memory_gb": 7.721559524536133, "step_time_ms": 7491.464614868164, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:00] (step=0001588) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.0308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:28:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1589, "loss": 0.21757762134075165, "memory_gb": 7.721559524536133, "step_time_ms": 7456.279754638672, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:08] (step=0001589) Train Loss: 0.1917, Train Steps/Sec: 0.12, Epoch: 0.03087835211815002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:28:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1590, "loss": 0.20133355259895325, "memory_gb": 7.721559524536133, "step_time_ms": 7454.552888870239, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:16] (step=0001590) Train Loss: 0.1897, Train Steps/Sec: 0.12, Epoch: 0.030897784687135638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:28:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1591, "loss": 0.1665641963481903, "memory_gb": 7.721559524536133, "step_time_ms": 7479.199171066284, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:24] (step=0001591) Train Loss: 0.1957, Train Steps/Sec: 0.12, Epoch: 0.03091721725612126, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 21:28:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1592, "loss": 0.16598834097385406, "memory_gb": 7.721559524536133, "step_time_ms": 7392.900466918945, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:32] (step=0001592) Train Loss: 0.2015, Train Steps/Sec: 0.12, Epoch: 0.03093664982510688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:28:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1593, "loss": 0.2643023729324341, "memory_gb": 7.721559524536133, "step_time_ms": 7445.9388256073, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:40] (step=0001593) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.030956082394092498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:28:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1594, "loss": 0.23179224133491516, "memory_gb": 7.721559524536133, "step_time_ms": 7485.129117965698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:48] (step=0001594) Train Loss: 0.2448, Train Steps/Sec: 0.12, Epoch: 0.03097551496307812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:28:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1595, "loss": 0.34521445631980896, "memory_gb": 7.721559524536133, "step_time_ms": 7451.223611831665, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:28:56] (step=0001595) Train Loss: 0.3497, Train Steps/Sec: 0.12, Epoch: 0.03099494753206374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:29:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1596, "loss": 0.14538615942001343, "memory_gb": 7.721559524536133, "step_time_ms": 7480.20601272583, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:29:04] (step=0001596) Train Loss: 0.2054, Train Steps/Sec: 0.12, Epoch: 0.031014380101049357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:29:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1597, "loss": 0.2890251874923706, "memory_gb": 7.721559524536133, "step_time_ms": 
7467.368841171265, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:29:12] (step=0001597) Train Loss: 0.2739, Train Steps/Sec: 0.12, Epoch: 0.03103381267003498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:29:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1598, "loss": 0.27385416626930237, "memory_gb": 7.721559524536133, "step_time_ms": 7260.111331939697, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:29:20] (step=0001598) Train Loss: 0.2582, Train Steps/Sec: 0.13, Epoch: 0.0310532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:29:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1599, "loss": 0.2690795660018921, "memory_gb": 7.721559524536133, "step_time_ms": 7367.346286773682, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:29:28] (step=0001599) Train Loss: 0.2914, Train Steps/Sec: 0.13, Epoch: 0.031072677808006217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:29:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1600, "loss": 0.2968534231185913, "memory_gb": 7.721559524536133, "step_time_ms": 5324.358224868774, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:29:34] (step=0001600) Train Loss: 0.2363, Train Steps/Sec: 0.16, Epoch: 0.03109211037699184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:29:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1601, "loss": 0.34396252036094666, "memory_gb": 7.721559524536133, "step_time_ms": 7509.909629821777, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:29:42] (step=0001601) Train Loss: 0.3215, Train Steps/Sec: 0.12, Epoch: 0.03111154294597746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:29:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1602, "loss": 0.22756327688694, "memory_gb": 7.721559524536133, "step_time_ms": 7536.54670715332, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:29:50] (step=0001602) Train Loss: 0.2893, Train Steps/Sec: 0.12, Epoch: 0.031130975514963077, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:29:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1603, "loss": 0.1852017343044281, "memory_gb": 7.721559524536133, "step_time_ms": 7440.686941146851, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:29:58] (step=0001603) Train Loss: 0.2297, Train Steps/Sec: 0.13, Epoch: 0.0311504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:30:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1604, "loss": 0.19638296961784363, "memory_gb": 7.721559524536133, "step_time_ms": 7484.795808792114, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:30:06] (step=0001604) Train Loss: 0.2075, Train Steps/Sec: 0.12, Epoch: 0.03116984065293432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:30:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1605, "loss": 0.22852368652820587, "memory_gb": 7.721559524536133, "step_time_ms": 7682.764768600464, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:30:14] (step=0001605) Train Loss: 0.2184, Train Steps/Sec: 0.12, Epoch: 0.031189273221919937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:30:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1606, "loss": 0.1612589955329895, "memory_gb": 7.721559524536133, "step_time_ms": 7476.610898971558, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:30:22] (step=0001606) Train Loss: 0.1670, Train Steps/Sec: 0.12, Epoch: 0.031208705790905556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:30:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1607, "loss": 0.21275770664215088, "memory_gb": 7.721559524536133, "step_time_ms": 7504.593849182129, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:30:31] (step=0001607) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.03122813835989118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:30:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1608, "loss": 0.2600531280040741, "memory_gb": 7.721559524536133, "step_time_ms": 7595.113039016724, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:30:39] (step=0001608) Train Loss: 0.2235, Train Steps/Sec: 0.12, Epoch: 0.031247570928876797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1609, "loss": 0.16976021230220795, "memory_gb": 7.721559524536133, "step_time_ms": 7507.262468338013, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:30:47] (step=0001609) Train Loss: 0.2337, Train Steps/Sec: 0.12, Epoch: 0.03126700349786242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:30:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1610, "loss": 0.13917842507362366, "memory_gb": 7.721559524536133, "step_time_ms": 7561.524391174316, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:30:55] (step=0001610) Train Loss: 0.1702, Train Steps/Sec: 0.12, Epoch: 0.031286436066848035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1611, "loss": 0.2261759340763092, "memory_gb": 7.721559524536133, "step_time_ms": 7632.310152053833, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:03] (step=0001611) Train Loss: 0.2629, Train Steps/Sec: 0.12, Epoch: 0.03130586863583366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1612, "loss": 0.3018719553947449, "memory_gb": 7.721559524536133, "step_time_ms": 7327.2130489349365, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:11] (step=0001612) Train Loss: 0.3264, Train Steps/Sec: 0.13, Epoch: 0.03132530120481928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1613, "loss": 0.23708947002887726, "memory_gb": 7.721559524536133, "step_time_ms": 7631.687879562378, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:19] (step=0001613) Train Loss: 0.2432, Train Steps/Sec: 0.12, Epoch: 0.031344733773804895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1614, "loss": 0.24173474311828613, "memory_gb": 7.721559524536133, "step_time_ms": 7595.393419265747, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:27] (step=0001614) Train Loss: 0.2475, Train Steps/Sec: 0.13, Epoch: 0.03136416634279052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1615, "loss": 0.2538262903690338, "memory_gb": 7.721559524536133, "step_time_ms": 7525.674343109131, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:35] (step=0001615) Train Loss: 0.2611, Train Steps/Sec: 0.12, Epoch: 0.03138359891177614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1616, "loss": 0.32747241854667664, "memory_gb": 7.721559524536133, "step_time_ms": 7562.649488449097, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:43] (step=0001616) Train Loss: 0.2772, Train Steps/Sec: 0.12, Epoch: 0.031403031480761755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1617, "loss": 0.217698872089386, "memory_gb": 7.721559524536133, "step_time_ms": 7643.036603927612, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:51] (step=0001617) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.03142246404974738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:31:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1618, "loss": 0.27562010288238525, "memory_gb": 7.721559524536133, "step_time_ms": 7489.936351776123, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:31:59] (step=0001618) Train Loss: 0.2719, Train Steps/Sec: 0.12, Epoch: 0.031441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:32:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1619, "loss": 0.23915547132492065, "memory_gb": 7.721559524536133, "step_time_ms": 7635.5345249176025, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:32:07] (step=0001619) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.031461329187718615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:32:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1620, "loss": 0.299179345369339, "memory_gb": 7.721559524536133, "step_time_ms": 7630.71608543396, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:32:15] (step=0001620) Train Loss: 0.2788, Train Steps/Sec: 0.12, Epoch: 0.03148076175670424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:32:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1621, "loss": 0.28283703327178955, "memory_gb": 7.721559524536133, "step_time_ms": 7537.086725234985, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:32:23] (step=0001621) Train Loss: 0.2270, Train Steps/Sec: 0.12, Epoch: 0.03150019432568986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:32:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1622, "loss": 0.2016579508781433, "memory_gb": 7.721559524536133, "step_time_ms": 7603.443384170532, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:32:31] (step=0001622) Train Loss: 0.1842, Train Steps/Sec: 0.12, Epoch: 0.031519626894675475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:32:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1623, "loss": 0.21365825831890106, "memory_gb": 7.721559524536133, "step_time_ms": 7644.419193267822, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:32:39] (step=0001623) Train Loss: 0.1886, Train Steps/Sec: 0.12, Epoch: 0.0315390594636611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:32:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1624, "loss": 0.20468129217624664, "memory_gb": 7.721559524536133, "step_time_ms": 7539.468050003052, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:32:47] (step=0001624) Train Loss: 0.1932, Train Steps/Sec: 0.12, Epoch: 0.03155849203264672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:32:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1625, "loss": 0.20308345556259155, "memory_gb": 7.721559524536133, "step_time_ms": 7572.610378265381, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:32:55] (step=0001625) Train Loss: 0.2154, Train Steps/Sec: 0.12, Epoch: 0.031577924601632335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1626, "loss": 0.20203644037246704, "memory_gb": 7.721559524536133, "step_time_ms": 7598.3686447143555, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:03] (step=0001626) Train Loss: 0.2468, Train Steps/Sec: 0.12, Epoch: 0.03159735717061796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1627, "loss": 0.22524936497211456, "memory_gb": 7.721559524536133, "step_time_ms": 7325.974702835083, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:11] (step=0001627) Train Loss: 0.2127, Train Steps/Sec: 0.13, Epoch: 0.03161678973960357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1628, "loss": 0.32960134744644165, "memory_gb": 7.721559524536133, "step_time_ms": 7351.20701789856, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:19] (step=0001628) Train Loss: 0.2713, Train Steps/Sec: 0.13, Epoch: 0.031636222308589194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1629, "loss": 0.31482136249542236, "memory_gb": 7.721559524536133, "step_time_ms": 5919.235467910767, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:25] (step=0001629) Train Loss: 0.2640, Train Steps/Sec: 0.16, Epoch: 0.03165565487757482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1630, "loss": 0.17235761880874634, "memory_gb": 7.721559524536133, "step_time_ms": 7504.801273345947, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:33] (step=0001630) Train Loss: 0.2013, Train Steps/Sec: 0.12, Epoch: 0.03167508744656043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1631, "loss": 0.27724018692970276, "memory_gb": 7.721559524536133, "step_time_ms": 7527.395486831665, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:41] (step=0001631) Train Loss: 0.2894, Train Steps/Sec: 0.12, Epoch: 0.031694520015546054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1632, "loss": 0.21739259362220764, "memory_gb": 7.721559524536133, "step_time_ms": 7455.39116859436, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:49] (step=0001632) Train Loss: 0.2701, Train Steps/Sec: 0.12, Epoch: 0.03171395258453168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1633, "loss": 0.18416748940944672, "memory_gb": 7.721559524536133, "step_time_ms": 7524.699687957764, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:33:57] (step=0001633) Train Loss: 0.1934, Train Steps/Sec: 0.12, Epoch: 0.03173338515351729, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1634, "loss": 0.1790943443775177, "memory_gb": 7.721559524536133, "step_time_ms": 7521.19779586792, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:34:05] (step=0001634) Train Loss: 0.2465, Train Steps/Sec: 0.12, Epoch: 0.031752817722502914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1635, "loss": 0.18470346927642822, "memory_gb": 7.721559524536133, "step_time_ms": 7488.558292388916, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:34:13] (step=0001635) Train Loss: 0.1971, Train Steps/Sec: 0.13, Epoch: 0.03177225029148854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:34:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1636, "loss": 0.24905744194984436, "memory_gb": 7.721559524536133, "step_time_ms": 7509.295463562012, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:34:22] (step=0001636) Train Loss: 0.2327, Train Steps/Sec: 0.12, Epoch: 0.03179168286047415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:34:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1637, "loss": 0.16588446497917175, "memory_gb": 7.721559524536133, "step_time_ms": 7486.401319503784, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:34:30] (step=0001637) Train Loss: 0.2081, Train Steps/Sec: 0.13, Epoch: 0.031811115429459774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:34:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1638, "loss": 0.2462502419948578, "memory_gb": 7.721559524536133, "step_time_ms": 7506.511449813843, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:34:38] (step=0001638) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.031830547998445397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:34:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1639, "loss": 0.26005446910858154, "memory_gb": 7.721559524536133, "step_time_ms": 7502.005815505981, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:34:46] (step=0001639) Train Loss: 0.2450, Train Steps/Sec: 0.12, Epoch: 0.03184998056743101, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:34:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1640, "loss": 0.09150391817092896, "memory_gb": 7.721559524536133, "step_time_ms": 7520.55811882019, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:34:54] (step=0001640) Train Loss: 0.1541, Train Steps/Sec: 0.12, Epoch: 0.031869413136416634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1641, "loss": 0.2713788151741028, "memory_gb": 7.721559524536133, "step_time_ms": 7485.050201416016, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:02] (step=0001641) Train Loss: 0.2801, Train Steps/Sec: 0.12, Epoch: 0.031888845705402256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1642, "loss": 0.25640130043029785, "memory_gb": 7.721559524536133, "step_time_ms": 7492.500066757202, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:10] (step=0001642) Train Loss: 0.2758, Train Steps/Sec: 0.12, Epoch: 0.03190827827438787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1643, "loss": 0.31099316477775574, "memory_gb": 7.721559524536133, "step_time_ms": 7557.928562164307, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:18] (step=0001643) Train Loss: 0.2988, Train Steps/Sec: 0.12, Epoch: 0.031927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1644, "loss": 0.17123328149318695, "memory_gb": 7.721559524536133, "step_time_ms": 7468.363046646118, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:26] (step=0001644) Train Loss: 0.1982, Train Steps/Sec: 0.12, Epoch: 0.031947143412359116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1645, "loss": 0.2724442481994629, "memory_gb": 7.721559524536133, "step_time_ms": 7605.474948883057, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:34] (step=0001645) Train Loss: 0.2529, Train Steps/Sec: 0.12, Epoch: 0.03196657598134473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1646, "loss": 0.26939842104911804, "memory_gb": 7.721559524536133, "step_time_ms": 7529.4153690338135, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:42] (step=0001646) Train Loss: 0.2896, Train Steps/Sec: 0.12, Epoch: 0.031986008550330354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1647, "loss": 0.2506827414035797, "memory_gb": 7.721559524536133, "step_time_ms": 7427.151918411255, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:50] (step=0001647) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.032005441119315976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1648, "loss": 0.28489312529563904, "memory_gb": 7.721559524536133, "step_time_ms": 7455.135345458984, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:35:58] (step=0001648) Train Loss: 0.2516, Train Steps/Sec: 0.12, Epoch: 0.03202487368830159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:36:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1649, "loss": 0.19235941767692566, "memory_gb": 7.721559524536133, "step_time_ms": 7504.938364028931, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:36:06] (step=0001649) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.032044306257287214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:36:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1650, "loss": 0.16117951273918152, "memory_gb": 7.721559524536133, "step_time_ms": 7449.940204620361, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:36:14] (step=0001650) Train Loss: 0.1972, Train Steps/Sec: 0.12, Epoch: 0.032063738826272836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:36:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1651, "loss": 0.22826063632965088, "memory_gb": 7.721559524536133, "step_time_ms": 7494.284629821777, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:36:23] (step=0001651) Train Loss: 0.2271, Train Steps/Sec: 0.12, Epoch: 0.03208317139525845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:36:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1652, "loss": 0.24033434689044952, "memory_gb": 7.721559524536133, "step_time_ms": 7500.485897064209, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:36:31] (step=0001652) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.032102603964244074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:36:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1653, "loss": 0.2802242934703827, "memory_gb": 7.721559524536133, "step_time_ms": 7438.092947006226, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:36:39] (step=0001653) Train Loss: 0.2598, Train Steps/Sec: 0.13, Epoch: 0.032122036533229696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:36:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1654, "loss": 0.207128643989563, "memory_gb": 7.721559524536133, "step_time_ms": 7440.374374389648, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:36:47] (step=0001654) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.03214146910221531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:36:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1655, "loss": 0.3534206748008728, "memory_gb": 7.721559524536133, "step_time_ms": 7478.61909866333, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:36:55] (step=0001655) Train Loss: 0.3022, Train Steps/Sec: 0.12, Epoch: 0.032160901671200934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1656, "loss": 0.2713336944580078, "memory_gb": 7.721559524536133, "step_time_ms": 7291.351795196533, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:03] (step=0001656) Train Loss: 0.2333, Train Steps/Sec: 0.13, Epoch: 0.032180334240186556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1657, "loss": 0.15493039786815643, "memory_gb": 7.721559524536133, "step_time_ms": 6982.375860214233, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:10] (step=0001657) Train Loss: 0.1598, Train Steps/Sec: 0.14, Epoch: 0.03219976680917217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1658, "loss": 0.1374199092388153, "memory_gb": 7.721559524536133, "step_time_ms": 5806.725025177002, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:17] (step=0001658) Train Loss: 0.1968, Train Steps/Sec: 0.15, Epoch: 0.032219199378157794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1659, "loss": 0.2795260548591614, "memory_gb": 7.721559524536133, "step_time_ms": 7429.487228393555, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:25] (step=0001659) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.03223863194714341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1660, "loss": 0.23754465579986572, "memory_gb": 7.721559524536133, "step_time_ms": 7506.645441055298, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:33] (step=0001660) Train Loss: 0.2430, Train Steps/Sec: 0.12, Epoch: 0.03225806451612903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1661, "loss": 0.28553757071495056, "memory_gb": 7.721559524536133, "step_time_ms": 7396.247148513794, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:41] (step=0001661) Train Loss: 0.3031, Train Steps/Sec: 0.13, Epoch: 0.032277497085114654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1662, "loss": 0.16870614886283875, "memory_gb": 7.721559524536133, "step_time_ms": 7454.550266265869, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:49] (step=0001662) Train Loss: 0.1658, Train Steps/Sec: 0.12, Epoch: 0.03229692965410027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:37:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1663, "loss": 0.2652198374271393, "memory_gb": 7.721559524536133, "step_time_ms": 7522.363901138306, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:37:57] (step=0001663) Train Loss: 0.2934, Train Steps/Sec: 0.12, Epoch: 0.03231636222308589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:38:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1664, "loss": 0.2446109801530838, "memory_gb": 7.721559524536133, "step_time_ms": 7430.165767669678, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:38:05] (step=0001664) Train Loss: 0.1941, Train Steps/Sec: 0.12, Epoch: 0.032335794792071514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:38:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1665, "loss": 0.3325205445289612, "memory_gb": 7.721559524536133, "step_time_ms": 7417.830944061279, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:38:13] (step=0001665) Train Loss: 0.3023, Train Steps/Sec: 0.12, Epoch: 0.03235522736105713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:38:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1666, "loss": 0.2912861704826355, "memory_gb": 7.721559524536133, "step_time_ms": 7469.2206382751465, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:38:21] (step=0001666) Train Loss: 0.2379, Train Steps/Sec: 0.12, Epoch: 0.03237465993004275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:38:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1667, "loss": 0.3507186770439148, "memory_gb": 7.721559524536133, "step_time_ms": 7431.339263916016, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:38:29] (step=0001667) Train Loss: 0.3138, Train Steps/Sec: 0.12, Epoch: 0.032394092499028374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:38:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1668, "loss": 0.21874329447746277, "memory_gb": 7.721559524536133, "step_time_ms": 7446.705102920532, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:38:37] (step=0001668) Train Loss: 0.2052, Train Steps/Sec: 0.12, Epoch: 0.03241352506801399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:38:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1669, "loss": 0.2598907947540283, "memory_gb": 7.721559524536133, "step_time_ms": 7476.348161697388, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:38:45] (step=0001669) Train Loss: 0.2941, Train Steps/Sec: 0.12, Epoch: 0.03243295763699961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:38:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1670, "loss": 0.25746405124664307, "memory_gb": 7.721559524536133, "step_time_ms": 7393.4948444366455, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:38:53] (step=0001670) Train Loss: 0.2885, Train Steps/Sec: 0.12, Epoch: 0.03245239020598523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1671, "loss": 0.3689197599887848, "memory_gb": 7.721559524536133, "step_time_ms": 7448.719263076782, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:01] (step=0001671) Train Loss: 0.2779, Train Steps/Sec: 0.12, Epoch: 0.03247182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1672, "loss": 0.2578184902667999, "memory_gb": 7.721559524536133, "step_time_ms": 7454.939126968384, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:09] (step=0001672) Train Loss: 0.2235, Train Steps/Sec: 0.12, Epoch: 0.03249125534395647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1673, "loss": 0.22561174631118774, "memory_gb": 7.721559524536133, "step_time_ms": 7395.334720611572, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:17] (step=0001673) Train Loss: 0.2174, Train Steps/Sec: 0.13, Epoch: 0.03251068791294209, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1674, "loss": 0.2538849115371704, "memory_gb": 7.721559524536133, "step_time_ms": 7402.021169662476, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:25] (step=0001674) Train Loss: 0.2330, Train Steps/Sec: 0.13, Epoch: 0.03253012048192771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1675, "loss": 0.2544311285018921, "memory_gb": 7.721559524536133, "step_time_ms": 7492.34676361084, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:33] (step=0001675) Train Loss: 0.2311, Train Steps/Sec: 0.12, Epoch: 0.03254955305091333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1676, "loss": 0.16708078980445862, "memory_gb": 7.721559524536133, "step_time_ms": 7237.376689910889, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:41] (step=0001676) Train Loss: 0.2459, Train Steps/Sec: 0.13, Epoch: 0.03256898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1677, "loss": 0.22059266269207, "memory_gb": 7.721559524536133, "step_time_ms": 7487.077951431274, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:49] (step=0001677) Train Loss: 0.2811, Train Steps/Sec: 0.12, Epoch: 0.03258841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1678, "loss": 0.2075527161359787, "memory_gb": 7.721559524536133, "step_time_ms": 7501.685857772827, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:39:57] (step=0001678) Train Loss: 0.2127, Train Steps/Sec: 0.12, Epoch: 0.03260785075787019, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:40:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1679, "loss": 0.307257741689682, "memory_gb": 7.721559524536133, "step_time_ms": 7409.587860107422, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:40:05] (step=0001679) Train Loss: 0.2635, Train Steps/Sec: 0.12, Epoch: 0.03262728332685581, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:40:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1680, "loss": 0.27856919169425964, "memory_gb": 7.721559524536133, "step_time_ms": 7243.879079818726, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:40:13] (step=0001680) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.03264671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:40:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1681, "loss": 0.26998040080070496, "memory_gb": 7.721559524536133, "step_time_ms": 7527.575731277466, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:40:21] (step=0001681) Train Loss: 0.2379, Train Steps/Sec: 0.12, Epoch: 0.03266614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:40:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1682, "loss": 0.22557586431503296, "memory_gb": 7.721559524536133, "step_time_ms": 7447.259426116943, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:40:29] (step=0001682) Train Loss: 0.1995, Train Steps/Sec: 0.13, Epoch: 0.03268558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:40:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1683, "loss": 0.19625964760780334, "memory_gb": 7.721559524536133, "step_time_ms": 7496.450901031494, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:40:37] (step=0001683) Train Loss: 0.1893, Train Steps/Sec: 0.12, Epoch: 0.03270501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:40:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1684, "loss": 0.327319473028183, "memory_gb": 7.721559524536133, "step_time_ms": 7592.4317836761475, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:40:46] (step=0001684) Train Loss: 0.2908, Train Steps/Sec: 0.12, Epoch: 0.03272444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:40:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1685, "loss": 0.24558180570602417, "memory_gb": 7.721559524536133, "step_time_ms": 7360.063791275024, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:40:53] (step=0001685) Train Loss: 0.2389, Train Steps/Sec: 0.13, Epoch: 0.03274387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1686, "loss": 0.23260360956192017, "memory_gb": 7.721559524536133, "step_time_ms": 7133.093357086182, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:01] (step=0001686) Train Loss: 0.2525, Train Steps/Sec: 0.14, Epoch: 0.03276331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1687, "loss": 0.23190933465957642, "memory_gb": 7.721559524536133, "step_time_ms": 5845.901727676392, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:07] (step=0001687) Train Loss: 0.1902, Train Steps/Sec: 0.16, Epoch: 0.03278274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1688, "loss": 0.28430819511413574, "memory_gb": 7.721559524536133, "step_time_ms": 7534.024953842163, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:15] (step=0001688) Train Loss: 0.3204, Train Steps/Sec: 0.12, Epoch: 0.032802176447726386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1689, "loss": 0.1730315387248993, "memory_gb": 7.721559524536133, "step_time_ms": 7562.107801437378, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:23] (step=0001689) Train Loss: 0.2190, Train Steps/Sec: 0.12, Epoch: 0.03282160901671201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1690, "loss": 0.25535959005355835, "memory_gb": 7.721559524536133, "step_time_ms": 7583.051919937134, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:31] (step=0001690) Train Loss: 0.2513, Train Steps/Sec: 0.13, Epoch: 0.03284104158569763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1691, "loss": 0.3090333342552185, "memory_gb": 7.721559524536133, "step_time_ms": 7590.944290161133, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:39] (step=0001691) Train Loss: 0.2525, Train Steps/Sec: 0.12, Epoch: 0.032860474154683246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1692, "loss": 0.28990256786346436, "memory_gb": 7.721559524536133, "step_time_ms": 7725.745916366577, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:47] (step=0001692) Train Loss: 0.2882, Train Steps/Sec: 0.13, Epoch: 0.03287990672366887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:41:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1693, "loss": 0.24842488765716553, "memory_gb": 7.721559524536133, "step_time_ms": 7541.558027267456, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:41:55] (step=0001693) Train Loss: 0.2131, Train Steps/Sec: 0.12, Epoch: 0.03289933929265449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:42:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1694, "loss": 0.23580920696258545, "memory_gb": 7.721559524536133, "step_time_ms": 7618.818521499634, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:42:03] (step=0001694) Train Loss: 0.2580, Train Steps/Sec: 0.12, Epoch: 0.032918771861640106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:42:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1695, "loss": 0.2065463811159134, "memory_gb": 7.721559524536133, "step_time_ms": 7674.281597137451, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:42:11] (step=0001695) Train Loss: 0.2256, Train Steps/Sec: 0.12, Epoch: 0.03293820443062573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:42:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1696, "loss": 0.19882413744926453, "memory_gb": 7.721559524536133, "step_time_ms": 7570.567607879639, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:42:19] (step=0001696) Train Loss: 0.2184, Train Steps/Sec: 0.12, Epoch: 0.03295763699961135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:42:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1697, "loss": 0.25974202156066895, "memory_gb": 7.721559524536133, "step_time_ms": 7610.802173614502, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:42:27] (step=0001697) Train Loss: 0.3126, Train Steps/Sec: 0.12, Epoch: 0.032977069568596966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:42:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1698, "loss": 0.35987311601638794, "memory_gb": 7.721559524536133, "step_time_ms": 7668.097257614136, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:42:35] (step=0001698) Train Loss: 0.2866, Train Steps/Sec: 0.12, Epoch: 0.03299650213758259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:42:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1699, "loss": 0.29452621936798096, "memory_gb": 7.721559524536133, "step_time_ms": 7606.797933578491, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:42:43] (step=0001699) Train Loss: 0.2873, Train Steps/Sec: 0.12, Epoch: 0.03301593470656821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:42:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1700, "loss": 0.36665940284729004, "memory_gb": 7.721559524536133, "step_time_ms": 7635.844945907593, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:42:52] (step=0001700) Train Loss: 0.3080, Train Steps/Sec: 0.12, Epoch: 0.033035367275553826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1701, "loss": 0.21802018582820892, "memory_gb": 7.721559524536133, "step_time_ms": 7611.525535583496, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:00] (step=0001701) Train Loss: 0.2344, Train Steps/Sec: 0.12, Epoch: 0.03305479984453945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1702, "loss": 0.3200523853302002, "memory_gb": 7.721559524536133, "step_time_ms": 7576.519012451172, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:08] (step=0001702) Train Loss: 0.2843, Train Steps/Sec: 0.12, Epoch: 0.03307423241352507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1703, "loss": 0.26197290420532227, "memory_gb": 7.721559524536133, "step_time_ms": 7620.398998260498, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:16] (step=0001703) Train Loss: 0.2612, Train Steps/Sec: 0.12, Epoch: 0.033093664982510686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1704, "loss": 0.19714349508285522, "memory_gb": 7.721559524536133, "step_time_ms": 7582.522869110107, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:24] (step=0001704) Train Loss: 0.2018, Train Steps/Sec: 0.12, Epoch: 0.03311309755149631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1705, "loss": 0.3781047463417053, "memory_gb": 7.721559524536133, "step_time_ms": 7535.186290740967, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:32] (step=0001705) Train Loss: 0.2797, Train Steps/Sec: 0.12, Epoch: 0.03313253012048193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1706, "loss": 0.21452128887176514, "memory_gb": 7.721559524536133, "step_time_ms": 7599.744081497192, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:40] (step=0001706) Train Loss: 0.2606, Train Steps/Sec: 0.12, Epoch: 0.033151962689467546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1707, "loss": 0.31194645166397095, "memory_gb": 7.721559524536133, "step_time_ms": 7589.676141738892, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:48] (step=0001707) Train Loss: 0.2373, Train Steps/Sec: 0.12, Epoch: 0.03317139525845317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:43:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1708, "loss": 0.20866450667381287, "memory_gb": 7.721559524536133, "step_time_ms": 7426.020622253418, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:43:56] (step=0001708) Train Loss: 0.2335, Train Steps/Sec: 0.13, Epoch: 0.03319082782743879, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:44:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1709, "loss": 0.23678193986415863, "memory_gb": 7.721559524536133, "step_time_ms": 7470.5328941345215, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:44:04] (step=0001709) Train Loss: 0.2819, Train Steps/Sec: 0.12, Epoch: 0.033210260396424406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:44:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1710, "loss": 0.2714965045452118, "memory_gb": 7.721559524536133, "step_time_ms": 7541.727066040039, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:44:12] (step=0001710) Train Loss: 0.2301, Train Steps/Sec: 0.12, Epoch: 0.03322969296541003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:44:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1711, "loss": 0.22815904021263123, "memory_gb": 7.721559524536133, "step_time_ms": 7454.743385314941, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:44:20] (step=0001711) Train Loss: 0.2570, Train Steps/Sec: 0.13, Epoch: 0.03324912553439565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:44:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1712, "loss":
0.26550015807151794, "memory_gb": 7.721559524536133, "step_time_ms": 7497.768402099609, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:44:28] (step=0001712) Train Loss: 0.2362, Train Steps/Sec: 0.12, Epoch: 0.033268558103381266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:44:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1713, "loss": 0.27037590742111206, "memory_gb": 7.721559524536133, "step_time_ms": 7479.638576507568, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:44:36] (step=0001713) Train Loss: 0.1925, Train Steps/Sec: 0.12, Epoch: 0.03328799067236689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:44:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1714, "loss": 0.2646821439266205, "memory_gb": 7.721559524536133, "step_time_ms": 7296.014785766602, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:44:44] (step=0001714) Train Loss: 0.2970, Train Steps/Sec: 0.13, Epoch: 0.03330742324135251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:44:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1715, "loss": 0.33440011739730835, "memory_gb": 7.721559524536133, "step_time_ms": 7534.802198410034, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:44:52] (step=0001715) Train Loss: 0.3226, Train Steps/Sec: 0.13, Epoch: 0.033326855810338125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:44:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1716, "loss": 0.2020551860332489, "memory_gb": 7.721559524536133, "step_time_ms": 5762.102842330933, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:44:58] (step=0001716) Train Loss: 0.2427, Train Steps/Sec: 0.17, Epoch: 0.03334628837932375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:45:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1717, "loss": 0.22505682706832886, "memory_gb": 7.721559524536133, "step_time_ms": 7615.225791931152, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:45:06] (step=0001717) 
Train Loss: 0.2119, Train Steps/Sec: 0.12, Epoch: 0.03336572094830936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:45:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1718, "loss": 0.2580251395702362, "memory_gb": 7.721559524536133, "step_time_ms": 7522.787094116211, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:45:14] (step=0001718) Train Loss: 0.2348, Train Steps/Sec: 0.12, Epoch: 0.033385153517294985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:45:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1719, "loss": 0.22891956567764282, "memory_gb": 7.721559524536133, "step_time_ms": 7445.2996253967285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:45:22] (step=0001719) Train Loss: 0.2133, Train Steps/Sec: 0.13, Epoch: 0.03340458608628061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:45:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1720, "loss": 0.22873398661613464, "memory_gb": 7.721559524536133, "step_time_ms": 7508.48126411438, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:45:30] (step=0001720) Train Loss: 0.1993, Train Steps/Sec: 0.12, Epoch: 0.03342401865526622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:45:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1721, "loss": 0.22129949927330017, "memory_gb": 7.721559524536133, "step_time_ms": 7456.834316253662, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:45:38] (step=0001721) Train Loss: 0.1940, Train Steps/Sec: 0.13, Epoch: 0.033443451224251845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:45:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1722, "loss": 0.2218935191631317, "memory_gb": 7.721559524536133, "step_time_ms": 7497.208833694458, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:45:46] (step=0001722) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.03346288379323747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:45:54] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 1723, "loss": 0.3015071153640747, "memory_gb": 7.715639114379883, "step_time_ms": 7550.265073776245, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:45:54] (step=0001723) Train Loss: 0.2978, Train Steps/Sec: 0.12, Epoch: 0.03348231636222308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1724, "loss": 0.29021525382995605, "memory_gb": 7.721559524536133, "step_time_ms": 7522.199392318726, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:46:03] (step=0001724) Train Loss: 0.2842, Train Steps/Sec: 0.12, Epoch: 0.033501748931208705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1725, "loss": 0.30814284086227417, "memory_gb": 7.721559524536133, "step_time_ms": 7443.2783126831055, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:46:11] (step=0001725) Train Loss: 0.2543, Train Steps/Sec: 0.12, Epoch: 0.03352118150019433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1726, "loss": 0.27856913208961487, "memory_gb": 7.721559524536133, "step_time_ms": 7483.862400054932, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:46:19] (step=0001726) Train Loss: 0.2665, Train Steps/Sec: 0.12, Epoch: 0.03354061406917994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1727, "loss": 0.26858577132225037, "memory_gb": 7.721559524536133, "step_time_ms": 7471.617221832275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:46:27] (step=0001727) Train Loss: 0.2581, Train Steps/Sec: 0.13, Epoch: 0.033560046638165565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1728, "loss": 0.22498878836631775, "memory_gb": 7.721559524536133, "step_time_ms": 7412.534713745117, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
21:46:35] (step=0001728) Train Loss: 0.2179, Train Steps/Sec: 0.12, Epoch: 0.03357947920715119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1729, "loss": 0.22626474499702454, "memory_gb": 7.721559524536133, "step_time_ms": 7511.169910430908, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:46:43] (step=0001729) Train Loss: 0.2608, Train Steps/Sec: 0.12, Epoch: 0.0335989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1730, "loss": 0.3010459542274475, "memory_gb": 7.721559524536133, "step_time_ms": 7469.380855560303, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:46:51] (step=0001730) Train Loss: 0.2949, Train Steps/Sec: 0.12, Epoch: 0.033618344345122425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1731, "loss": 0.3391087055206299, "memory_gb": 7.721559524536133, "step_time_ms": 7407.693386077881, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:46:59] (step=0001731) Train Loss: 0.2879, Train Steps/Sec: 0.13, Epoch: 0.03363777691410805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:47:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1732, "loss": 0.17867782711982727, "memory_gb": 7.721559524536133, "step_time_ms": 7472.716331481934, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:47:07] (step=0001732) Train Loss: 0.2499, Train Steps/Sec: 0.12, Epoch: 0.03365720948309366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:47:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1733, "loss": 0.32352885603904724, "memory_gb": 7.721559524536133, "step_time_ms": 7615.712881088257, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:47:15] (step=0001733) Train Loss: 0.2664, Train Steps/Sec: 0.12, Epoch: 0.033676642052079285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:47:23] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1734, "loss": 0.3020099997520447, "memory_gb": 7.721559524536133, "step_time_ms": 7404.406547546387, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:47:23] (step=0001734) Train Loss: 0.2553, Train Steps/Sec: 0.12, Epoch: 0.03369607462106491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1735, "loss": 0.329704612493515, "memory_gb": 7.721559524536133, "step_time_ms": 7515.886306762695, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:47:31] (step=0001735) Train Loss: 0.2567, Train Steps/Sec: 0.12, Epoch: 0.03371550719005052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:47:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1736, "loss": 0.2555360198020935, "memory_gb": 7.721559524536133, "step_time_ms": 7499.70555305481, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:47:39] (step=0001736) Train Loss: 0.2518, Train Steps/Sec: 0.12, Epoch: 0.033734939759036145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:47:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1737, "loss": 0.28360632061958313, "memory_gb": 7.721559524536133, "step_time_ms": 7425.603628158569, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:47:47] (step=0001737) Train Loss: 0.3242, Train Steps/Sec: 0.12, Epoch: 0.03375437232802177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:47:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1738, "loss": 0.2489573359489441, "memory_gb": 7.721559524536133, "step_time_ms": 7514.507055282593, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:47:55] (step=0001738) Train Loss: 0.2573, Train Steps/Sec: 0.12, Epoch: 0.03377380489700738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:48:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1739, "loss": 0.15599969029426575, "memory_gb": 7.721559524536133, "step_time_ms": 7491.543769836426, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 21:48:03] (step=0001739) Train Loss: 0.2181, Train Steps/Sec: 0.12, Epoch: 0.033793237465993005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:48:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1740, "loss": 0.34669315814971924, "memory_gb": 7.721559524536133, "step_time_ms": 7476.18842124939, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:48:11] (step=0001740) Train Loss: 0.2926, Train Steps/Sec: 0.12, Epoch: 0.03381267003497863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:48:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1741, "loss": 0.1686115264892578, "memory_gb": 7.721559524536133, "step_time_ms": 7483.73556137085, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:48:19] (step=0001741) Train Loss: 0.2133, Train Steps/Sec: 0.12, Epoch: 0.03383210260396424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1742, "loss": 0.18956515192985535, "memory_gb": 7.721559524536133, "step_time_ms": 7416.701555252075, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:48:28] (step=0001742) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.033851535172949865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1743, "loss": 0.27454522252082825, "memory_gb": 7.721559524536133, "step_time_ms": 7085.490942001343, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:48:35] (step=0001743) Train Loss: 0.2631, Train Steps/Sec: 0.13, Epoch: 0.03387096774193549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:48:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1744, "loss": 0.2762770652770996, "memory_gb": 7.721559524536133, "step_time_ms": 7418.764352798462, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:48:43] (step=0001744) Train Loss: 0.2306, Train Steps/Sec: 0.13, Epoch: 0.0338904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 21:48:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1745, "loss": 0.24046437442302704, "memory_gb": 7.721559524536133, "step_time_ms": 4838.865518569946, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:48:48] (step=0001745) Train Loss: 0.2419, Train Steps/Sec: 0.20, Epoch: 0.033909832879906725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:48:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1746, "loss": 0.1945486068725586, "memory_gb": 7.721559524536133, "step_time_ms": 7362.854242324829, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:48:56] (step=0001746) Train Loss: 0.2510, Train Steps/Sec: 0.13, Epoch: 0.03392926544889235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:49:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1747, "loss": 0.27565068006515503, "memory_gb": 7.721559524536133, "step_time_ms": 7400.367736816406, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:49:05] (step=0001747) Train Loss: 0.2869, Train Steps/Sec: 0.12, Epoch: 0.03394869801787796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:49:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1748, "loss": 0.2143469899892807, "memory_gb": 7.721559524536133, "step_time_ms": 7429.717063903809, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:49:13] (step=0001748) Train Loss: 0.2385, Train Steps/Sec: 0.12, Epoch: 0.033968130586863585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:49:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1749, "loss": 0.11748944222927094, "memory_gb": 7.721559524536133, "step_time_ms": 7473.618745803833, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:49:21] (step=0001749) Train Loss: 0.2017, Train Steps/Sec: 0.12, Epoch: 0.0339875631558492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1750, "loss": 0.27726948261260986, "memory_gb": 7.721559524536133, "step_time_ms": 7418.115139007568, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:49:29] (step=0001750) Train Loss: 0.2770, Train Steps/Sec: 0.12, Epoch: 0.03400699572483482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:49:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1751, "loss": 0.25079190731048584, "memory_gb": 7.721559524536133, "step_time_ms": 7441.393613815308, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:49:37] (step=0001751) Train Loss: 0.3011, Train Steps/Sec: 0.12, Epoch: 0.034026428293820445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:49:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1752, "loss": 0.1792546510696411, "memory_gb": 7.721559524536133, "step_time_ms": 7500.866174697876, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:49:45] (step=0001752) Train Loss: 0.2162, Train Steps/Sec: 0.12, Epoch: 0.03404586086280606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:49:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1753, "loss": 0.12830542027950287, "memory_gb": 7.721559524536133, "step_time_ms": 7431.103229522705, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:49:53] (step=0001753) Train Loss: 0.2401, Train Steps/Sec: 0.12, Epoch: 0.03406529343179168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1754, "loss": 0.2473803460597992, "memory_gb": 7.721559524536133, "step_time_ms": 7444.813013076782, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:01] (step=0001754) Train Loss: 0.2300, Train Steps/Sec: 0.12, Epoch: 0.034084726000777305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1755, "loss": 0.17364482581615448, "memory_gb": 7.721559524536133, "step_time_ms": 7514.254570007324, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:09] (step=0001755) Train Loss: 0.1718, Train Steps/Sec: 0.12, Epoch: 0.03410415856976292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1756, "loss": 0.3711552619934082, "memory_gb": 7.715639114379883, "step_time_ms": 7449.098825454712, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:17] (step=0001756) Train Loss: 0.3368, Train Steps/Sec: 0.12, Epoch: 0.03412359113874854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1757, "loss": 0.1790013313293457, "memory_gb": 7.721559524536133, "step_time_ms": 7424.382209777832, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:25] (step=0001757) Train Loss: 0.2227, Train Steps/Sec: 0.12, Epoch: 0.034143023707734164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1758, "loss": 0.2874464988708496, "memory_gb": 7.721559524536133, "step_time_ms": 7505.698680877686, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:33] (step=0001758) Train Loss: 0.2560, Train Steps/Sec: 0.12, Epoch: 0.03416245627671978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1759, "loss": 0.2878015637397766, "memory_gb": 7.721559524536133, "step_time_ms": 7459.484338760376, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:41] (step=0001759) Train Loss: 0.2370, Train Steps/Sec: 0.12, Epoch: 0.0341818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1760, "loss": 0.2838844060897827, "memory_gb": 7.721559524536133, "step_time_ms": 7464.8802280426025, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:49] (step=0001760) Train Loss: 0.2539, Train Steps/Sec: 0.12, Epoch: 0.034201321414691024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:50:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1761, "loss": 0.28872430324554443, "memory_gb": 7.721559524536133, "step_time_ms": 7601.617813110352, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:50:57] (step=0001761) Train Loss: 0.2705, Train Steps/Sec: 0.12, Epoch: 0.03422075398367664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1762, "loss": 0.2463422417640686, "memory_gb": 7.721559524536133, "step_time_ms": 7547.7330684661865, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:51:05] (step=0001762) Train Loss: 0.2448, Train Steps/Sec: 0.13, Epoch: 0.03424018655266226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:51:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1763, "loss": 0.3176964521408081, "memory_gb": 7.721559524536133, "step_time_ms": 7524.219036102295, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:51:13] (step=0001763) Train Loss: 0.2854, Train Steps/Sec: 0.12, Epoch: 0.034259619121647884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:51:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1764, "loss": 0.2665672302246094, "memory_gb": 7.721559524536133, "step_time_ms": 7595.708131790161, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:51:22] (step=0001764) Train Loss: 0.2870, Train Steps/Sec: 0.12, Epoch: 0.0342790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:51:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1765, "loss": 0.1507064551115036, "memory_gb": 7.721559524536133, "step_time_ms": 7534.541845321655, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:51:30] (step=0001765) Train Loss: 0.1651, Train Steps/Sec: 0.12, Epoch: 0.03429848425961912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:51:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1766, "loss": 0.19991937279701233, "memory_gb": 7.721559524536133, "step_time_ms": 7551.56683921814, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:51:38] (step=0001766) Train Loss: 0.1838, Train Steps/Sec: 0.12, Epoch: 0.034317916828604744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:51:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1767, "loss": 0.2159479707479477, "memory_gb": 7.721559524536133, "step_time_ms": 7661.764144897461, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:51:46] (step=0001767) Train Loss: 0.2898, Train Steps/Sec: 0.12, Epoch: 0.03433734939759036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:51:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1768, "loss": 0.23222458362579346, "memory_gb": 7.721559524536133, "step_time_ms": 7504.11319732666, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:51:54] (step=0001768) Train Loss: 0.2375, Train Steps/Sec: 0.13, Epoch: 0.03435678196657598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1769, "loss": 0.24112391471862793, "memory_gb": 7.721559524536133, "step_time_ms": 7548.384666442871, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:02] (step=0001769) Train Loss: 0.2542, Train Steps/Sec: 0.12, Epoch: 0.034376214535561604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1770, "loss": 0.15571561455726624, "memory_gb": 7.721559524536133, "step_time_ms": 7667.133569717407, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:10] (step=0001770) Train Loss: 0.2162, Train Steps/Sec: 0.12, Epoch: 0.03439564710454722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1771, "loss": 0.2793002426624298, "memory_gb": 7.721559524536133, "step_time_ms": 7537.158489227295, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:18] (step=0001771) Train Loss: 0.2590, Train Steps/Sec: 0.13, Epoch: 0.03441507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1772, "loss": 0.3288872539997101, "memory_gb": 7.721559524536133, "step_time_ms": 7417.955636978149, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:26] (step=0001772) Train Loss: 0.3050, Train Steps/Sec: 0.13, Epoch: 0.034434512242518464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1773, "loss": 0.37295064330101013, "memory_gb": 7.715639114379883, "step_time_ms": 7656.456470489502, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:34] (step=0001773) Train Loss: 0.3302, Train Steps/Sec: 0.12, Epoch: 0.03445394481150408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1774, "loss": 0.20939961075782776, "memory_gb": 7.721559524536133, "step_time_ms": 5246.553182601929, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:39] (step=0001774) Train Loss: 0.2099, Train Steps/Sec: 0.18, Epoch: 0.0344733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1775, "loss": 0.21206840872764587, "memory_gb": 7.721559524536133, "step_time_ms": 7656.3098430633545, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:47] (step=0001775) Train Loss: 0.2574, Train Steps/Sec: 0.12, Epoch: 0.034492809949475324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:52:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1776, "loss": 0.3518635034561157, "memory_gb": 7.721559524536133, "step_time_ms": 7566.006660461426, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:52:55] (step=0001776) Train Loss: 0.2944, Train Steps/Sec: 0.12, Epoch: 0.03451224251846094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:53:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1777, "loss": 0.22491775453090668, "memory_gb": 7.721559524536133, "step_time_ms": 7614.258766174316, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:53:04] (step=0001777) Train Loss: 0.2311, Train Steps/Sec: 0.12, Epoch: 0.03453167508744656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:53:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1778, "loss": 0.268214613199234, "memory_gb": 7.721559524536133, "step_time_ms": 7672.246694564819, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:53:12] (step=0001778) Train Loss: 0.2124, Train Steps/Sec: 0.12, Epoch: 0.03455110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:53:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1779, "loss": 0.24959081411361694, "memory_gb": 7.721559524536133, "step_time_ms": 7604.597091674805, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:53:20] (step=0001779) Train Loss: 0.1976, Train Steps/Sec: 0.12, Epoch: 0.0345705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:53:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1780, "loss": 0.23055337369441986, "memory_gb": 7.721559524536133, "step_time_ms": 7616.697311401367, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:53:28] (step=0001780) Train Loss: 0.2740, Train Steps/Sec: 0.12, Epoch: 0.03458997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:53:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1781, "loss": 0.1723804771900177, "memory_gb": 7.721559524536133, "step_time_ms": 7822.184085845947, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:53:36] (step=0001781) Train Loss: 0.2043, Train Steps/Sec: 0.12, Epoch: 0.03460940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:53:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1782, "loss": 0.22237500548362732, "memory_gb": 7.721559524536133, "step_time_ms": 7557.197332382202, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:53:44] (step=0001782) Train Loss: 0.2665, Train Steps/Sec: 0.12, Epoch: 0.03462883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:53:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1783, "loss": 0.3758440911769867, "memory_gb": 7.721559524536133, "step_time_ms": 7529.525518417358, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:53:52] (step=0001783) Train Loss: 0.3470, Train Steps/Sec: 0.13, Epoch: 0.03464827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1784, "loss": 0.2738679349422455, "memory_gb": 7.721559524536133, "step_time_ms": 7618.329763412476, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:00] (step=0001784) Train Loss: 0.2637, Train Steps/Sec: 0.12, Epoch: 0.0346677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1785, "loss": 0.13695800304412842, "memory_gb": 7.721559524536133, "step_time_ms": 7514.042854309082, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:08] (step=0001785) Train Loss: 0.1799, Train Steps/Sec: 0.12, Epoch: 0.03468713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1786, "loss": 0.2625929117202759, "memory_gb": 7.721559524536133, "step_time_ms": 7535.457611083984, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:16] (step=0001786) Train Loss: 0.2884, Train Steps/Sec: 0.12, Epoch: 0.03470656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1787, "loss": 0.2524430751800537, "memory_gb": 7.721559524536133, "step_time_ms": 7612.154006958008, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:24] (step=0001787) Train Loss: 0.2370, Train Steps/Sec: 0.12, Epoch: 0.03472600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1788, "loss": 0.18568819761276245, "memory_gb": 7.721559524536133, "step_time_ms": 7543.8337326049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:32] (step=0001788) Train Loss: 0.2235, Train Steps/Sec: 0.12, Epoch: 0.03474543334628838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1789, "loss": 0.2964057922363281, "memory_gb": 7.721559524536133, "step_time_ms": 7456.938982009888, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:40] (step=0001789) Train Loss: 0.2652, Train Steps/Sec: 0.13, Epoch: 0.034764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1790, "loss": 0.2870745360851288, "memory_gb": 7.721559524536133, "step_time_ms": 7553.797006607056, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:48] (step=0001790) Train Loss: 0.2821, Train Steps/Sec: 0.12, Epoch: 0.03478429848425962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:54:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1791, "loss": 0.2400326281785965, "memory_gb": 7.721559524536133, "step_time_ms": 7494.401454925537, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:54:56] (step=0001791) Train Loss: 0.2033, Train Steps/Sec: 0.12, Epoch: 0.03480373105324524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:55:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1792, "loss": 0.26097559928894043, "memory_gb": 7.721559524536133, "step_time_ms": 7460.740566253662, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:55:04] (step=0001792) Train Loss: 0.2922, Train Steps/Sec: 0.12, Epoch: 0.03482316362223086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:55:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1793, "loss": 0.2961375117301941, "memory_gb": 7.715639114379883, "step_time_ms": 7507.065773010254, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:55:12] (step=0001793) Train Loss: 0.2448, Train Steps/Sec: 0.12, Epoch: 0.03484259619121648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1794, "loss": 0.21793335676193237, "memory_gb": 7.721559524536133, "step_time_ms": 7457.2203159332275, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:55:20] (step=0001794) Train Loss: 0.2823, Train Steps/Sec: 0.12, Epoch: 0.0348620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:55:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1795, "loss": 0.2373332977294922, "memory_gb": 7.721559524536133, "step_time_ms": 7504.141092300415, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:55:29] (step=0001795) Train Loss: 0.3032, Train Steps/Sec: 0.12, Epoch: 0.03488146132918772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:55:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1796, "loss": 0.19815626740455627, "memory_gb": 7.721559524536133, "step_time_ms": 7580.705404281616, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:55:37] (step=0001796) Train Loss: 0.1799, Train Steps/Sec: 0.12, Epoch: 0.03490089389817334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:55:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1797, "loss": 0.16123045980930328, "memory_gb": 7.721559524536133, "step_time_ms": 7457.793951034546, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:55:45] (step=0001797) Train Loss: 0.2302, Train Steps/Sec: 0.12, Epoch: 0.03492032646715896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:55:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1798, "loss": 0.2713809609413147, "memory_gb": 7.721559524536133, "step_time_ms": 7446.8183517456055, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:55:53] (step=0001798) Train Loss: 0.2654, Train Steps/Sec: 0.12, Epoch: 0.03493975903614458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1799, "loss": 0.21150413155555725, "memory_gb": 7.721559524536133, "step_time_ms": 7498.456954956055, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:01] (step=0001799) Train Loss: 0.2085, Train Steps/Sec: 0.12, Epoch: 0.034959191605130197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1800, "loss": 0.29286134243011475, "memory_gb": 7.721559524536133, "step_time_ms": 7463.275671005249, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:09] (step=0001800) Train Loss: 0.2473, Train Steps/Sec: 0.13, Epoch: 0.03497862417411582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1801, "loss": 0.3423539400100708, "memory_gb": 7.721559524536133, "step_time_ms": 7295.693874359131, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:17] (step=0001801) Train Loss: 0.2656, Train Steps/Sec: 0.13, Epoch: 0.03499805674310144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1802, "loss": 0.17925980687141418, "memory_gb": 7.721559524536133, "step_time_ms": 7523.577928543091, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:25] (step=0001802) Train Loss: 0.1781, Train Steps/Sec: 0.12, Epoch: 0.035017489312087056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1803, "loss": 0.28466925024986267, "memory_gb": 7.721559524536133, "step_time_ms": 5183.157682418823, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:30] (step=0001803) Train Loss: 0.2512, Train Steps/Sec: 0.18, Epoch: 0.03503692188107268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1804, "loss": 0.18879270553588867, "memory_gb": 7.721559524536133, "step_time_ms": 7568.00651550293, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:38] (step=0001804) Train Loss: 0.2259, Train Steps/Sec: 0.12, Epoch: 0.0350563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1805, "loss": 0.33739525079727173, "memory_gb": 7.721559524536133, "step_time_ms": 7444.4990158081055, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:46] (step=0001805) Train Loss: 0.3022, Train Steps/Sec: 0.12, Epoch: 0.035075787019043916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:56:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1806, "loss": 0.17658507823944092, "memory_gb": 7.721559524536133, "step_time_ms": 7476.515054702759, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:56:54] (step=0001806) Train Loss: 0.2489, Train Steps/Sec: 0.12, Epoch: 0.03509521958802954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:57:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1807, "loss": 0.2720588445663452, "memory_gb": 7.721559524536133, "step_time_ms": 7515.260457992554, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:57:03] (step=0001807) Train Loss: 0.2485, Train Steps/Sec: 0.12, Epoch: 0.035114652157015154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:57:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1808, "loss": 0.2331165224313736, "memory_gb": 7.721559524536133, "step_time_ms": 7452.496767044067, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:57:11] (step=0001808) Train Loss: 0.2044, Train Steps/Sec: 0.12, Epoch: 0.035134084726000776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:57:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1809, "loss": 0.28004345297813416, "memory_gb": 7.721559524536133, "step_time_ms": 7408.776521682739, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:57:19] (step=0001809) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.0351535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 21:57:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1810, "loss": 0.2522674798965454, "memory_gb": 7.721559524536133, "step_time_ms": 7492.406368255615, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 21:57:27] (step=0001810)
Train Loss: 0.2184, Train Steps/Sec: 0.12, Epoch: 0.035172949863972014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:57:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1811, "loss": 0.2187812626361847, "memory_gb": 7.721559524536133, "step_time_ms": 7425.182819366455, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:57:35] (step=0001811) Train Loss: 0.2720, Train Steps/Sec: 0.12, Epoch: 0.035192382432957636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:57:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1812, "loss": 0.17528623342514038, "memory_gb": 7.721559524536133, "step_time_ms": 7456.732273101807, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:57:43] (step=0001812) Train Loss: 0.1632, Train Steps/Sec: 0.12, Epoch: 0.03521181500194326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:57:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1813, "loss": 0.2572169899940491, "memory_gb": 7.721559524536133, "step_time_ms": 7491.930246353149, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:57:51] (step=0001813) Train Loss: 0.1949, Train Steps/Sec: 0.12, Epoch: 0.035231247570928874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:57:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1814, "loss": 0.21580249071121216, "memory_gb": 7.721559524536133, "step_time_ms": 7406.02707862854, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:57:59] (step=0001814) Train Loss: 0.1948, Train Steps/Sec: 0.12, Epoch: 0.035250680139914496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:58:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1815, "loss": 0.17664211988449097, "memory_gb": 7.721559524536133, "step_time_ms": 7475.834846496582, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:58:07] (step=0001815) Train Loss: 0.1708, Train Steps/Sec: 0.12, Epoch: 0.03527011270890012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:58:15] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 1816, "loss": 0.2685236930847168, "memory_gb": 7.721559524536133, "step_time_ms": 7388.910293579102, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:58:15] (step=0001816) Train Loss: 0.2369, Train Steps/Sec: 0.13, Epoch: 0.035289545277885734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:58:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1817, "loss": 0.16989468038082123, "memory_gb": 7.721559524536133, "step_time_ms": 7414.742231369019, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:58:23] (step=0001817) Train Loss: 0.2190, Train Steps/Sec: 0.12, Epoch: 0.035308977846871356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:58:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1818, "loss": 0.34522080421447754, "memory_gb": 7.721559524536133, "step_time_ms": 7429.560899734497, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:58:31] (step=0001818) Train Loss: 0.2733, Train Steps/Sec: 0.12, Epoch: 0.03532841041585698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:58:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1819, "loss": 0.21789675951004028, "memory_gb": 7.721559524536133, "step_time_ms": 7460.758686065674, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:58:39] (step=0001819) Train Loss: 0.2747, Train Steps/Sec: 0.12, Epoch: 0.035347842984842594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:58:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1820, "loss": 0.25240957736968994, "memory_gb": 7.721559524536133, "step_time_ms": 7410.762310028076, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:58:47] (step=0001820) Train Loss: 0.2181, Train Steps/Sec: 0.13, Epoch: 0.035367275553828216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:58:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1821, "loss": 0.20131388306617737, "memory_gb": 7.721559524536133, "step_time_ms": 7498.72088432312, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
21:58:55] (step=0001821) Train Loss: 0.1862, Train Steps/Sec: 0.12, Epoch: 0.03538670812281384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1822, "loss": 0.33692774176597595, "memory_gb": 7.721559524536133, "step_time_ms": 7501.202344894409, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:03] (step=0001822) Train Loss: 0.2846, Train Steps/Sec: 0.12, Epoch: 0.035406140691799454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1823, "loss": 0.15188775956630707, "memory_gb": 7.721559524536133, "step_time_ms": 7444.699287414551, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:11] (step=0001823) Train Loss: 0.2364, Train Steps/Sec: 0.12, Epoch: 0.035425573260785076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1824, "loss": 0.25470733642578125, "memory_gb": 7.721559524536133, "step_time_ms": 7437.6540184021, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:19] (step=0001824) Train Loss: 0.2794, Train Steps/Sec: 0.13, Epoch: 0.0354450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1825, "loss": 0.28939640522003174, "memory_gb": 7.721559524536133, "step_time_ms": 7516.011714935303, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:27] (step=0001825) Train Loss: 0.2390, Train Steps/Sec: 0.12, Epoch: 0.035464438398756314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1826, "loss": 0.2266467809677124, "memory_gb": 7.721559524536133, "step_time_ms": 7406.399488449097, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:35] (step=0001826) Train Loss: 0.2246, Train Steps/Sec: 0.12, Epoch: 0.035483870967741936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:43] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 1827, "loss": 0.2847478985786438, "memory_gb": 7.721559524536133, "step_time_ms": 7414.356708526611, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:43] (step=0001827) Train Loss: 0.2935, Train Steps/Sec: 0.12, Epoch: 0.03550330353672756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1828, "loss": 0.221917986869812, "memory_gb": 7.721559524536133, "step_time_ms": 7486.453056335449, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:51] (step=0001828) Train Loss: 0.2477, Train Steps/Sec: 0.12, Epoch: 0.035522736105713174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 21:59:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1829, "loss": 0.2844254970550537, "memory_gb": 7.721559524536133, "step_time_ms": 7402.13680267334, "trainable_params": 4718592, "method": "lora"} [2025-07-28 21:59:59] (step=0001829) Train Loss: 0.2940, Train Steps/Sec: 0.12, Epoch: 0.035542168674698796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:00:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1830, "loss": 0.24665409326553345, "memory_gb": 7.721559524536133, "step_time_ms": 7266.326427459717, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:00:07] (step=0001830) Train Loss: 0.2175, Train Steps/Sec: 0.13, Epoch: 0.03556160124368442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1831, "loss": 0.24928925931453705, "memory_gb": 7.721559524536133, "step_time_ms": 7500.606060028076, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:00:15] (step=0001831) Train Loss: 0.2336, Train Steps/Sec: 0.12, Epoch: 0.03558103381267003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:00:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1832, "loss": 0.22906620800495148, "memory_gb": 7.721559524536133, "step_time_ms": 4852.273225784302, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 22:00:21] (step=0001832) Train Loss: 0.1889, Train Steps/Sec: 0.18, Epoch: 0.035600466381655656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:00:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1833, "loss": 0.24890649318695068, "memory_gb": 7.721559524536133, "step_time_ms": 7510.845422744751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:00:29] (step=0001833) Train Loss: 0.2354, Train Steps/Sec: 0.12, Epoch: 0.03561989895064128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:00:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1834, "loss": 0.15315963327884674, "memory_gb": 7.721559524536133, "step_time_ms": 7418.7750816345215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:00:37] (step=0001834) Train Loss: 0.1958, Train Steps/Sec: 0.12, Epoch: 0.03563933151962689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:00:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1835, "loss": 0.18528075516223907, "memory_gb": 7.721559524536133, "step_time_ms": 7486.079216003418, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:00:45] (step=0001835) Train Loss: 0.2155, Train Steps/Sec: 0.12, Epoch: 0.035658764088612516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:00:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1836, "loss": 0.1840730905532837, "memory_gb": 7.721559524536133, "step_time_ms": 7476.352691650391, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:00:53] (step=0001836) Train Loss: 0.1975, Train Steps/Sec: 0.12, Epoch: 0.03567819665759813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:01:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1837, "loss": 0.2532203495502472, "memory_gb": 7.721559524536133, "step_time_ms": 7406.244993209839, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:01] (step=0001837) Train Loss: 0.1986, Train Steps/Sec: 0.12, Epoch: 0.03569762922658375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 22:01:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1838, "loss": 0.1533205509185791, "memory_gb": 7.721559524536133, "step_time_ms": 7453.975677490234, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:09] (step=0001838) Train Loss: 0.1767, Train Steps/Sec: 0.12, Epoch: 0.035717061795569376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:01:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1839, "loss": 0.23310700058937073, "memory_gb": 7.721559524536133, "step_time_ms": 7485.389947891235, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:17] (step=0001839) Train Loss: 0.2143, Train Steps/Sec: 0.12, Epoch: 0.03573649436455499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:01:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1840, "loss": 0.32080626487731934, "memory_gb": 7.721559524536133, "step_time_ms": 7425.062656402588, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:25] (step=0001840) Train Loss: 0.2813, Train Steps/Sec: 0.12, Epoch: 0.03575592693354061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:01:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1841, "loss": 0.35386401414871216, "memory_gb": 7.715639114379883, "step_time_ms": 7403.613090515137, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:33] (step=0001841) Train Loss: 0.2580, Train Steps/Sec: 0.13, Epoch: 0.035775359502526236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:01:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1842, "loss": 0.3406897187232971, "memory_gb": 7.721559524536133, "step_time_ms": 7473.232269287109, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:41] (step=0001842) Train Loss: 0.2942, Train Steps/Sec: 0.12, Epoch: 0.03579479207151185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:01:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1843, "loss": 0.19969317317008972, "memory_gb": 7.721559524536133, "step_time_ms": 7392.357349395752, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:49] (step=0001843) Train Loss: 0.2087, Train Steps/Sec: 0.12, Epoch: 0.03581422464049747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:01:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1844, "loss": 0.3161233067512512, "memory_gb": 7.721559524536133, "step_time_ms": 7408.287286758423, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:01:57] (step=0001844) Train Loss: 0.2850, Train Steps/Sec: 0.12, Epoch: 0.035833657209483095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:02:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1845, "loss": 0.21596691012382507, "memory_gb": 7.721559524536133, "step_time_ms": 7491.950988769531, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:02:05] (step=0001845) Train Loss: 0.2233, Train Steps/Sec: 0.12, Epoch: 0.03585308977846871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:02:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1846, "loss": 0.339616596698761, "memory_gb": 7.721559524536133, "step_time_ms": 7422.402620315552, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:02:13] (step=0001846) Train Loss: 0.3187, Train Steps/Sec: 0.12, Epoch: 0.03587252234745433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:02:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1847, "loss": 0.2500914931297302, "memory_gb": 7.721559524536133, "step_time_ms": 7422.671318054199, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:02:21] (step=0001847) Train Loss: 0.1940, Train Steps/Sec: 0.12, Epoch: 0.035891954916439955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:02:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1848, "loss": 0.2340909242630005, "memory_gb": 7.721559524536133, "step_time_ms": 7536.107778549194, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:02:29] (step=0001848) Train Loss: 0.2291, Train Steps/Sec: 0.12, Epoch: 0.03591138748542557, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 22:02:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1849, "loss": 0.37692728638648987, "memory_gb": 7.721559524536133, "step_time_ms": 7468.3287143707275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:02:37] (step=0001849) Train Loss: 0.3259, Train Steps/Sec: 0.12, Epoch: 0.03593082005441119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1850, "loss": 0.2417948693037033, "memory_gb": 7.721559524536133, "step_time_ms": 7451.30467414856, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:02:45] (step=0001850) Train Loss: 0.3316, Train Steps/Sec: 0.12, Epoch: 0.035950252623396815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:02:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1851, "loss": 0.26593294739723206, "memory_gb": 7.721559524536133, "step_time_ms": 7533.189535140991, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:02:54] (step=0001851) Train Loss: 0.2575, Train Steps/Sec: 0.12, Epoch: 0.03596968519238243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1852, "loss": 0.1930513232946396, "memory_gb": 7.721559524536133, "step_time_ms": 7466.325759887695, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:02] (step=0001852) Train Loss: 0.1694, Train Steps/Sec: 0.12, Epoch: 0.03598911776136805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1853, "loss": 0.3022967576980591, "memory_gb": 7.721559524536133, "step_time_ms": 7426.743268966675, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:10] (step=0001853) Train Loss: 0.2656, Train Steps/Sec: 0.12, Epoch: 0.036008550330353675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1854, "loss": 0.24948234856128693, "memory_gb": 7.721559524536133, "step_time_ms": 
7563.136100769043, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:18] (step=0001854) Train Loss: 0.3208, Train Steps/Sec: 0.12, Epoch: 0.03602798289933929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1855, "loss": 0.2948511242866516, "memory_gb": 7.721559524536133, "step_time_ms": 7428.1816482543945, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:26] (step=0001855) Train Loss: 0.2421, Train Steps/Sec: 0.12, Epoch: 0.03604741546832491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1856, "loss": 0.23829086124897003, "memory_gb": 7.721559524536133, "step_time_ms": 7439.655065536499, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:34] (step=0001856) Train Loss: 0.2737, Train Steps/Sec: 0.12, Epoch: 0.036066848037310535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1857, "loss": 0.1232401505112648, "memory_gb": 7.721559524536133, "step_time_ms": 7499.4871616363525, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:42] (step=0001857) Train Loss: 0.1585, Train Steps/Sec: 0.12, Epoch: 0.03608628060629615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1858, "loss": 0.23442691564559937, "memory_gb": 7.721559524536133, "step_time_ms": 7470.452547073364, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:50] (step=0001858) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.03610571317528177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:03:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1859, "loss": 0.2770261764526367, "memory_gb": 7.721559524536133, "step_time_ms": 7383.724689483643, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:03:58] (step=0001859) Train Loss: 0.2519, Train Steps/Sec: 0.13, Epoch: 0.036125145744267395, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:04:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1860, "loss": 0.3364744782447815, "memory_gb": 7.721559524536133, "step_time_ms": 7534.606456756592, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:04:06] (step=0001860) Train Loss: 0.2775, Train Steps/Sec: 0.12, Epoch: 0.03614457831325301, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:04:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1861, "loss": 0.192588672041893, "memory_gb": 7.721559524536133, "step_time_ms": 5344.22492980957, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:04:12] (step=0001861) Train Loss: 0.2390, Train Steps/Sec: 0.17, Epoch: 0.03616401088223863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:04:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1862, "loss": 0.28531354665756226, "memory_gb": 7.715639114379883, "step_time_ms": 7488.239526748657, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:04:20] (step=0001862) Train Loss: 0.2563, Train Steps/Sec: 0.12, Epoch: 0.036183443451224255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:04:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1863, "loss": 0.23015138506889343, "memory_gb": 7.715639114379883, "step_time_ms": 7448.798418045044, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:04:28] (step=0001863) Train Loss: 0.2289, Train Steps/Sec: 0.12, Epoch: 0.03620287602020987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:04:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1864, "loss": 0.20239251852035522, "memory_gb": 7.721559524536133, "step_time_ms": 7466.362953186035, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:04:36] (step=0001864) Train Loss: 0.2228, Train Steps/Sec: 0.12, Epoch: 0.03622230858919549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:04:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1865, "loss": 0.28094613552093506, "memory_gb": 7.721559524536133, 
"step_time_ms": 7546.290397644043, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:04:44] (step=0001865) Train Loss: 0.2833, Train Steps/Sec: 0.12, Epoch: 0.036241741158181115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:04:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1866, "loss": 0.27627015113830566, "memory_gb": 7.721559524536133, "step_time_ms": 7479.119539260864, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:04:52] (step=0001866) Train Loss: 0.2736, Train Steps/Sec: 0.12, Epoch: 0.03626117372716673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1867, "loss": 0.1790894716978073, "memory_gb": 7.721559524536133, "step_time_ms": 7468.819379806519, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:00] (step=0001867) Train Loss: 0.1789, Train Steps/Sec: 0.13, Epoch: 0.03628060629615235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1868, "loss": 0.34145933389663696, "memory_gb": 7.721559524536133, "step_time_ms": 7678.460121154785, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:08] (step=0001868) Train Loss: 0.2712, Train Steps/Sec: 0.12, Epoch: 0.03630003886513797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1869, "loss": 0.2342003881931305, "memory_gb": 7.721559524536133, "step_time_ms": 7472.056865692139, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:16] (step=0001869) Train Loss: 0.2630, Train Steps/Sec: 0.12, Epoch: 0.03631947143412359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1870, "loss": 0.31558334827423096, "memory_gb": 7.721559524536133, "step_time_ms": 7479.207992553711, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:24] (step=0001870) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 
0.03633890400310921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1871, "loss": 0.17145279049873352, "memory_gb": 7.721559524536133, "step_time_ms": 7565.207481384277, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:32] (step=0001871) Train Loss: 0.2265, Train Steps/Sec: 0.12, Epoch: 0.03635833657209483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1872, "loss": 0.14253969490528107, "memory_gb": 7.721559524536133, "step_time_ms": 7583.769083023071, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:41] (step=0001872) Train Loss: 0.2079, Train Steps/Sec: 0.12, Epoch: 0.03637776914108045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1873, "loss": 0.2976311445236206, "memory_gb": 7.721559524536133, "step_time_ms": 7544.251918792725, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:49] (step=0001873) Train Loss: 0.2809, Train Steps/Sec: 0.12, Epoch: 0.03639720171006607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:05:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1874, "loss": 0.13811573386192322, "memory_gb": 7.721559524536133, "step_time_ms": 7556.113958358765, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:05:57] (step=0001874) Train Loss: 0.1854, Train Steps/Sec: 0.12, Epoch: 0.03641663427905169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1875, "loss": 0.179692342877388, "memory_gb": 7.721559524536133, "step_time_ms": 7494.220018386841, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:06:05] (step=0001875) Train Loss: 0.3005, Train Steps/Sec: 0.12, Epoch: 0.03643606684803731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:06:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1876, "loss": 0.19740363955497742, "memory_gb": 
7.721559524536133, "step_time_ms": 7465.805530548096, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:06:13] (step=0001876) Train Loss: 0.1838, Train Steps/Sec: 0.12, Epoch: 0.03645549941702293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:06:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1877, "loss": 0.2980116605758667, "memory_gb": 7.721559524536133, "step_time_ms": 7545.7923412323, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:06:21] (step=0001877) Train Loss: 0.2389, Train Steps/Sec: 0.12, Epoch: 0.03647493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:06:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1878, "loss": 0.3194468319416046, "memory_gb": 7.721559524536133, "step_time_ms": 7506.327152252197, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:06:29] (step=0001878) Train Loss: 0.2899, Train Steps/Sec: 0.13, Epoch: 0.03649436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1879, "loss": 0.3214491307735443, "memory_gb": 7.721559524536133, "step_time_ms": 7508.517503738403, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:06:37] (step=0001879) Train Loss: 0.2819, Train Steps/Sec: 0.12, Epoch: 0.03651379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:06:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1880, "loss": 0.21493583917617798, "memory_gb": 7.721559524536133, "step_time_ms": 7546.163320541382, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:06:45] (step=0001880) Train Loss: 0.2391, Train Steps/Sec: 0.12, Epoch: 0.03653322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:06:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1881, "loss": 0.32309362292289734, "memory_gb": 7.721559524536133, "step_time_ms": 7515.85578918457, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:06:53] (step=0001881) Train Loss: 0.2337, Train Steps/Sec: 0.12, 
Epoch: 0.03655266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1882, "loss": 0.1935507357120514, "memory_gb": 7.721559524536133, "step_time_ms": 7376.860618591309, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:01] (step=0001882) Train Loss: 0.2060, Train Steps/Sec: 0.13, Epoch: 0.03657209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1883, "loss": 0.28132128715515137, "memory_gb": 7.721559524536133, "step_time_ms": 7632.763862609863, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:09] (step=0001883) Train Loss: 0.2837, Train Steps/Sec: 0.12, Epoch: 0.03659152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1884, "loss": 0.22337612509727478, "memory_gb": 7.721559524536133, "step_time_ms": 7509.711265563965, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:17] (step=0001884) Train Loss: 0.2333, Train Steps/Sec: 0.12, Epoch: 0.03661095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1885, "loss": 0.19757027924060822, "memory_gb": 7.721559524536133, "step_time_ms": 7499.215364456177, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:25] (step=0001885) Train Loss: 0.1967, Train Steps/Sec: 0.12, Epoch: 0.03663039253789351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1886, "loss": 0.22447581589221954, "memory_gb": 7.721559524536133, "step_time_ms": 7518.864631652832, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:33] (step=0001886) Train Loss: 0.2066, Train Steps/Sec: 0.12, Epoch: 0.03664982510687913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1887, "loss": 0.2842530608177185, "memory_gb": 7.721559524536133, "step_time_ms": 7490.005254745483, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:41] (step=0001887) Train Loss: 0.2137, Train Steps/Sec: 0.12, Epoch: 0.03666925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1888, "loss": 0.3240276873111725, "memory_gb": 7.721559524536133, "step_time_ms": 7319.580554962158, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:49] (step=0001888) Train Loss: 0.2546, Train Steps/Sec: 0.13, Epoch: 0.03668869024485037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:07:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1889, "loss": 0.287580668926239, "memory_gb": 7.721559524536133, "step_time_ms": 7515.12598991394, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:07:57] (step=0001889) Train Loss: 0.2810, Train Steps/Sec: 0.12, Epoch: 0.03670812281383599, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1890, "loss": 0.2406015843153, "memory_gb": 7.721559524536133, "step_time_ms": 4719.708681106567, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:03] (step=0001890) Train Loss: 0.2684, Train Steps/Sec: 0.18, Epoch: 0.03672755538282161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1891, "loss": 0.23659564554691315, "memory_gb": 7.721559524536133, "step_time_ms": 7507.292032241821, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:11] (step=0001891) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.03674698795180723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1892, "loss": 0.20394307374954224, "memory_gb": 7.721559524536133, "step_time_ms": 7516.379117965698, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:19] (step=0001892) Train Loss: 0.1956, Train Steps/Sec: 0.12, Epoch: 0.03676642052079285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1893, "loss": 0.25541365146636963, "memory_gb": 7.721559524536133, "step_time_ms": 7455.380916595459, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:27] (step=0001893) Train Loss: 0.2576, Train Steps/Sec: 0.12, Epoch: 0.03678585308977847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1894, "loss": 0.18624316155910492, "memory_gb": 7.721559524536133, "step_time_ms": 7535.215616226196, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:35] (step=0001894) Train Loss: 0.2227, Train Steps/Sec: 0.12, Epoch: 0.03680528565876409, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1895, "loss": 0.1897730827331543, "memory_gb": 7.721559524536133, "step_time_ms": 7471.103191375732, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:43] (step=0001895) Train Loss: 0.2275, Train Steps/Sec: 0.13, Epoch: 0.03682471822774971, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1896, "loss": 0.1424480527639389, "memory_gb": 7.721559524536133, "step_time_ms": 7478.29270362854, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:51] (step=0001896) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.03684415079673533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:08:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1897, "loss": 0.14932295680046082, "memory_gb": 7.721559524536133, "step_time_ms": 7530.163049697876, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:08:59] (step=0001897) Train Loss: 0.1925, Train Steps/Sec: 0.12, Epoch: 0.036863583365720945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:09:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1898, "loss": 0.31646063923835754, "memory_gb": 7.721559524536133, "step_time_ms": 7473.904848098755, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:09:07] (step=0001898) Train Loss: 0.3159, Train Steps/Sec: 0.12, Epoch: 0.03688301593470657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:09:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1899, "loss": 0.3132963478565216, "memory_gb": 7.721559524536133, "step_time_ms": 7453.0134201049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:09:15] (step=0001899) Train Loss: 0.2721, Train Steps/Sec: 0.12, Epoch: 0.03690244850369219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1900, "loss": 0.258939266204834, "memory_gb": 7.721559524536133, "step_time_ms": 7542.846918106079, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:09:23] (step=0001900) Train Loss: 0.2693, Train Steps/Sec: 0.12, Epoch: 0.036921881072677805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:09:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1901, "loss": 0.15207502245903015, "memory_gb": 7.721559524536133, "step_time_ms": 7479.47883605957, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:09:31] (step=0001901) Train Loss: 0.1635, Train Steps/Sec: 0.12, Epoch: 0.03694131364166343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:09:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1902, "loss": 0.23134754598140717, "memory_gb": 7.715639114379883, "step_time_ms": 7417.06919670105, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:09:39] (step=0001902) Train Loss: 0.2293, Train Steps/Sec: 0.12, Epoch: 0.03696074621064905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:09:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1903, "loss": 0.17485038936138153, "memory_gb": 7.721559524536133, "step_time_ms": 7475.620269775391, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:09:47] (step=0001903) Train Loss: 0.2190, Train Steps/Sec: 0.12, Epoch: 0.036980178779634665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:09:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 1904, "loss": 0.2138383984565735, "memory_gb": 7.721559524536133, "step_time_ms": 7430.080413818359, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:09:55] (step=0001904) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.03699961134862029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 1905, "loss": 0.2828935384750366, "memory_gb": 7.721559524536133, "step_time_ms": 7428.115367889404, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:03] (step=0001905) Train Loss: 0.3002, Train Steps/Sec: 0.12, Epoch: 0.03701904391760591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 1906, "loss": 0.25322097539901733, "memory_gb": 7.721559524536133, "step_time_ms": 7502.710819244385, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:11] (step=0001906) Train Loss: 0.2913, Train Steps/Sec: 0.12, Epoch: 0.037038476486591525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 1907, "loss": 0.21727241575717926, "memory_gb": 7.721559524536133, "step_time_ms": 7457.846879959106, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:19] (step=0001907) Train Loss: 0.2268, Train Steps/Sec: 0.13, Epoch: 0.03705790905557715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 1908, "loss": 0.2526809573173523, "memory_gb": 7.721559524536133, "step_time_ms": 7429.145812988281, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:27] (step=0001908) Train Loss: 0.2251, Train Steps/Sec: 0.13, Epoch: 0.03707734162456277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 1909, "loss": 0.19357943534851074, "memory_gb": 7.721559524536133, "step_time_ms": 7661.556005477905, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:35] (step=0001909) Train Loss: 0.2539, Train Steps/Sec: 0.12, Epoch: 0.037096774193548385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 1910, "loss": 0.31260183453559875, "memory_gb": 7.721559524536133, "step_time_ms": 7454.885244369507, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:43] (step=0001910) Train Loss: 0.2864, Train Steps/Sec: 0.12, Epoch: 0.03711620676253401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1911, "loss": 0.26122358441352844, "memory_gb": 7.721559524536133, "step_time_ms": 7441.376209259033, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:51] (step=0001911) Train Loss: 0.2183, Train Steps/Sec: 0.12, Epoch: 0.03713563933151963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:10:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1912, "loss": 0.2868690490722656, "memory_gb": 7.721559524536133, "step_time_ms": 7512.107610702515, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:10:59] (step=0001912) Train Loss: 0.2584, Train Steps/Sec: 0.12, Epoch: 0.037155071900505245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:11:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1913, "loss": 0.23924291133880615, "memory_gb": 7.715639114379883, "step_time_ms": 7401.989221572876, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:11:07] (step=0001913) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.03717450446949087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:11:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1914, "loss": 0.34536683559417725, "memory_gb": 7.721559524536133, "step_time_ms": 7469.070672988892, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:11:15] (step=0001914) Train Loss: 0.3455, Train Steps/Sec: 0.12, Epoch: 0.03719393703847649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:11:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1915, "loss": 0.25438791513442993, "memory_gb": 7.721559524536133, "step_time_ms": 7491.364479064941, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:11:23] (step=0001915) Train Loss: 0.2233, Train Steps/Sec: 0.12, Epoch: 0.037213369607462105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:11:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1916, "loss": 0.15879005193710327, "memory_gb": 7.721559524536133, "step_time_ms": 7450.284004211426, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:11:32] (step=0001916) Train Loss: 0.2149, Train Steps/Sec: 0.12, Epoch: 0.03723280217644773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:11:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1917, "loss": 0.2656906545162201, "memory_gb": 7.721559524536133, "step_time_ms": 7324.622631072998, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:11:39] (step=0001917) Train Loss: 0.2410, Train Steps/Sec: 0.13, Epoch: 0.03725223474543335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:11:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 1918, "loss": 0.2297339141368866, "memory_gb": 7.721559524536133, "step_time_ms": 7502.631902694702, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:11:48] (step=0001918) Train Loss: 0.2488, Train Steps/Sec: 0.12, Epoch: 0.037271667314418964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:11:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1919, "loss": 0.3345590829849243, "memory_gb": 7.721559524536133, "step_time_ms": 5503.068208694458, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:11:54] (step=0001919) Train Loss: 0.3530, Train Steps/Sec: 0.17, Epoch: 0.03729109988340459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1920, "loss": 0.16270777583122253, "memory_gb": 7.721559524536133, "step_time_ms": 7242.704153060913, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:02] (step=0001920) Train Loss: 0.1760, Train Steps/Sec: 0.12, Epoch: 0.03731053245239021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1921, "loss": 0.2773398160934448, "memory_gb": 7.721559524536133, "step_time_ms": 7389.902591705322, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:10] (step=0001921) Train Loss: 0.3045, Train Steps/Sec: 0.12, Epoch: 0.037329965021375824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1922, "loss": 0.24165767431259155, "memory_gb": 7.721559524536133, "step_time_ms": 7387.2950077056885, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:18] (step=0001922) Train Loss: 0.2161, Train Steps/Sec: 0.12, Epoch: 0.03734939759036145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1923, "loss": 0.3093457818031311, "memory_gb": 7.721559524536133, "step_time_ms": 7460.456371307373, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:26] (step=0001923) Train Loss: 0.3008, Train Steps/Sec: 0.12, Epoch: 0.03736883015934707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1924, "loss": 0.2717267870903015, "memory_gb": 7.721559524536133, "step_time_ms": 7438.637018203735, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:34] (step=0001924) Train Loss: 0.2686, Train Steps/Sec: 0.12, Epoch: 0.037388262728332684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1925, "loss": 0.2878190875053406, "memory_gb": 7.721559524536133, "step_time_ms": 7417.507648468018, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:42] (step=0001925) Train Loss: 0.2073, Train Steps/Sec: 0.12, Epoch: 0.03740769529731831, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 1926, "loss": 0.27270257472991943, "memory_gb": 7.721559524536133, "step_time_ms": 7492.199897766113, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:50] (step=0001926) Train Loss: 0.2412, Train Steps/Sec: 0.12, Epoch: 0.03742712786630392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:12:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 1927, "loss": 0.3507918417453766, "memory_gb": 7.721559524536133, "step_time_ms": 7491.023540496826, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:12:58] (step=0001927) Train Loss: 0.2829, Train Steps/Sec: 0.12, Epoch: 0.037446560435289544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:13:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1928, "loss": 0.15947969257831573, "memory_gb": 7.721559524536133, "step_time_ms": 7482.606887817383, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:13:06] (step=0001928) Train Loss: 0.2493, Train Steps/Sec: 0.12, Epoch: 0.03746599300427517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:13:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1929, "loss": 0.20791172981262207, "memory_gb": 7.721559524536133, "step_time_ms": 7541.769504547119, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:13:14] (step=0001929) Train Loss: 0.2754, Train Steps/Sec: 0.12, Epoch: 0.03748542557326078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:13:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 1930, "loss": 0.2020864188671112, "memory_gb": 7.721559524536133, "step_time_ms": 7481.903553009033, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:13:22] (step=0001930) Train Loss: 0.2344, Train Steps/Sec: 0.12, Epoch: 0.037504858142246404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:13:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1931, "loss": 0.2253023087978363, "memory_gb": 7.721559524536133, "step_time_ms": 7443.50790977478, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:13:30] (step=0001931) Train Loss: 0.2574, Train Steps/Sec: 0.12, Epoch: 0.037524290711232026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:13:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 1932, "loss": 0.2386399805545807, "memory_gb": 7.721559524536133, "step_time_ms": 7536.20457649231, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:13:38] (step=0001932) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.03754372328021764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:13:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 1933, "loss": 0.10848082602024078, "memory_gb": 7.721559524536133, "step_time_ms": 7493.544101715088, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:13:46] (step=0001933) Train Loss: 0.1669, Train Steps/Sec: 0.12, Epoch: 0.037563155849203264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:13:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 1934, "loss": 0.34467726945877075, "memory_gb": 7.721559524536133, "step_time_ms": 7439.016819000244, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:13:54] (step=0001934) Train Loss: 0.2734, Train Steps/Sec: 0.12, Epoch: 0.037582588418188886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 1935, "loss": 0.20006975531578064, "memory_gb": 7.721559524536133, "step_time_ms": 7554.400205612183, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:02] (step=0001935) Train Loss: 0.1905, Train Steps/Sec: 0.12, Epoch: 0.0376020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 1936, "loss": 0.23370623588562012, "memory_gb": 7.721559524536133, "step_time_ms": 7594.956636428833, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:10] (step=0001936) Train Loss: 0.2892, Train Steps/Sec: 0.12, Epoch: 0.037621453556160124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 1937, "loss": 0.18647238612174988, "memory_gb": 7.721559524536133, "step_time_ms": 7526.232957839966, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:18] (step=0001937) Train Loss: 0.2385, Train Steps/Sec: 0.12, Epoch: 0.037640886125145746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 1938, "loss": 0.32673758268356323, "memory_gb": 7.721559524536133, "step_time_ms": 7593.906402587891, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:26] (step=0001938) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 0.03766031869413136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 1939, "loss": 0.16782434284687042, "memory_gb": 7.721559524536133, "step_time_ms": 7554.610729217529, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:34] (step=0001939) Train Loss: 0.1954, Train Steps/Sec: 0.12, Epoch: 0.037679751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 1940, "loss": 0.3802308440208435, "memory_gb": 7.721559524536133, "step_time_ms": 7466.712713241577, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:42] (step=0001940) Train Loss: 0.2982, Train Steps/Sec: 0.12, Epoch: 0.037699183832102606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 1941, "loss": 0.32761383056640625, "memory_gb": 7.721559524536133, "step_time_ms": 7561.388254165649, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:51] (step=0001941) Train Loss: 0.3249, Train Steps/Sec: 0.12, Epoch: 0.03771861640108822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:14:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 1942, "loss": 0.2746081054210663, "memory_gb": 7.721559524536133, "step_time_ms": 7547.159910202026, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:14:59] (step=0001942) Train Loss: 0.2494, Train Steps/Sec: 0.13, Epoch: 0.037738048970073844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:15:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 1943, "loss": 0.24124935269355774, "memory_gb": 7.721559524536133, "step_time_ms": 7525.074481964111, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:15:07] (step=0001943) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.037757481539059466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:15:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 1944, "loss": 0.26432254910469055, "memory_gb": 7.721559524536133, "step_time_ms": 7672.402381896973, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:15:15] (step=0001944) Train Loss: 0.2196, Train Steps/Sec: 0.12, Epoch: 0.03777691410804508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:15:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 1945, "loss": 0.3069997727870941, "memory_gb": 7.721559524536133, "step_time_ms": 7587.355613708496, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:15:23] (step=0001945) Train Loss: 0.2565, Train Steps/Sec: 0.13, Epoch: 0.037796346677030704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:15:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 1946, "loss": 0.25445181131362915, "memory_gb": 7.721559524536133, "step_time_ms": 7378.001928329468, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:15:31] (step=0001946) Train Loss: 0.2885, Train Steps/Sec: 0.13, Epoch: 0.037815779246016326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:15:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 1947, "loss": 0.27930378913879395, "memory_gb": 7.721559524536133, "step_time_ms": 7647.8705406188965, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:15:39] (step=0001947) Train Loss: 0.2129, Train Steps/Sec: 0.12, Epoch: 0.03783521181500194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:15:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1948, "loss": 0.2487504929304123, "memory_gb": 7.721559524536133, "step_time_ms": 5189.223766326904, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:15:44] (step=0001948) Train Loss: 0.2538, Train Steps/Sec: 0.18, Epoch: 0.037854644383987564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:15:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1949, "loss": 0.26473096013069153, "memory_gb": 7.721559524536133, "step_time_ms": 7662.113428115845, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:15:53] (step=0001949) Train Loss: 0.2239, Train Steps/Sec: 0.12, Epoch: 0.037874076952973186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1950, "loss": 0.2761158347129822, "memory_gb": 7.721559524536133, "step_time_ms": 7324.20802116394, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:01] (step=0001950) Train Loss: 0.2185, Train Steps/Sec: 0.12, Epoch: 0.0378935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1951, "loss": 0.22775231301784515, "memory_gb": 7.721559524536133, "step_time_ms": 7525.829792022705, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:09] (step=0001951) Train Loss: 0.2591, Train Steps/Sec: 0.12, Epoch: 0.037912942090944424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1952, "loss": 0.23255980014801025, "memory_gb": 7.721559524536133, "step_time_ms": 7586.231231689453, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:17] (step=0001952) Train Loss: 0.2447, Train Steps/Sec: 0.12, Epoch: 0.037932374659930046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1953, "loss": 0.22941634058952332, "memory_gb": 7.721559524536133, "step_time_ms": 7550.152540206909, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:25] (step=0001953) Train Loss: 0.2509, Train Steps/Sec: 0.13, Epoch: 0.03795180722891566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1954, "loss": 0.24808341264724731, "memory_gb": 7.721559524536133, "step_time_ms": 7459.181070327759, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:33] (step=0001954) Train Loss: 0.2713, Train Steps/Sec: 0.13, Epoch: 0.037971239797901284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1955, "loss": 0.23338156938552856, "memory_gb": 7.721559524536133, "step_time_ms": 7566.800117492676, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:41] (step=0001955) Train Loss: 0.2315, Train Steps/Sec: 0.12, Epoch: 0.037990672366886906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1956, "loss": 0.31122124195098877, "memory_gb": 7.721559524536133, "step_time_ms": 7494.765996932983, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:49] (step=0001956) Train Loss: 0.2786, Train Steps/Sec: 0.13, Epoch: 0.03801010493587252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:16:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1957, "loss": 0.195621520280838, "memory_gb": 7.721559524536133, "step_time_ms": 7616.532564163208, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:16:57] (step=0001957) Train Loss: 0.2276, Train Steps/Sec: 0.12, Epoch: 0.038029537504858144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:17:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 1958, "loss": 0.133454829454422, "memory_gb": 7.721559524536133, "step_time_ms": 7581.735134124756, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:17:05] (step=0001958) Train Loss: 0.1962, Train Steps/Sec: 0.12, Epoch: 0.03804897007384376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 1959, "loss": 0.26668989658355713, "memory_gb": 7.721559524536133, "step_time_ms": 7550.274848937988, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:17:13] (step=0001959) Train Loss: 0.2321, Train Steps/Sec: 0.13, Epoch: 0.03806840264282938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:17:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1960, "loss": 0.32550734281539917, "memory_gb": 7.721559524536133, "step_time_ms": 7517.77720451355, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:17:21] (step=0001960) Train Loss: 0.2948, Train Steps/Sec: 0.12, Epoch: 0.038087835211815003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:17:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 1961, "loss": 0.2906743884086609, "memory_gb": 7.721559524536133, "step_time_ms": 7584.049224853516, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:17:29] (step=0001961) Train Loss: 0.3253, Train Steps/Sec: 0.12, Epoch: 0.03810726778080062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:17:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 1962, "loss": 0.25293266773223877, "memory_gb": 7.721559524536133, "step_time_ms": 7526.601552963257, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:17:37] (step=0001962) Train Loss: 0.2883, Train Steps/Sec: 0.12, Epoch: 0.03812670034978624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1963, "loss": 0.2951011657714844, "memory_gb": 7.721559524536133, "step_time_ms": 7489.070892333984, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:17:45] (step=0001963) Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.03814613291877186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:17:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1964, "loss": 0.24854639172554016, "memory_gb": 7.721559524536133, "step_time_ms": 7519.510984420776, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:17:53] (step=0001964) Train Loss: 0.2281, Train Steps/Sec: 0.12, Epoch: 0.03816556548775748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1965, "loss": 0.21971307694911957, "memory_gb": 7.721559524536133, "step_time_ms": 7436.899900436401, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:01] (step=0001965) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.0381849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1966, "loss": 0.15114793181419373, "memory_gb": 7.721559524536133, "step_time_ms": 7427.6323318481445, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:09] (step=0001966) Train Loss: 0.1910, Train Steps/Sec: 0.13, Epoch: 0.03820443062572872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1967, "loss": 0.37746891379356384, "memory_gb": 7.721559524536133, "step_time_ms": 7552.16646194458, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:17] (step=0001967) Train Loss: 0.3016, Train Steps/Sec: 0.12, Epoch: 0.03822386319471434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1968, "loss": 0.25371837615966797, "memory_gb": 7.721559524536133, "step_time_ms": 7485.447883605957, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:25] (step=0001968) Train Loss: 0.2498, Train Steps/Sec: 0.12, Epoch: 0.03824329576369996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1969, "loss": 0.31943362951278687, "memory_gb": 7.721559524536133, "step_time_ms": 7460.067272186279, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:33] (step=0001969) Train Loss: 0.2472, Train Steps/Sec: 0.13, Epoch: 0.03826272833268558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 1970, "loss": 0.18774691224098206, "memory_gb": 7.721559524536133, "step_time_ms": 7521.816968917847, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:41] (step=0001970) Train Loss: 0.2240, Train Steps/Sec: 0.12, Epoch: 0.0382821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 1971, "loss": 0.17103564739227295, "memory_gb": 7.721559524536133, "step_time_ms": 7490.106105804443, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:49] (step=0001971) Train Loss: 0.2022, Train Steps/Sec: 0.12, Epoch: 0.03830159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 1972, "loss": 0.24575680494308472, "memory_gb": 7.721559524536133, "step_time_ms": 7436.427354812622, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:18:58] (step=0001972) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.03832102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:19:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 1973, "loss": 0.31184619665145874, "memory_gb": 7.721559524536133, "step_time_ms": 7519.944906234741, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:19:06] (step=0001973) Train Loss: 0.2799, Train Steps/Sec: 0.12, Epoch: 0.03834045860862806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:19:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 1974, "loss": 0.17910516262054443, "memory_gb": 7.721559524536133, "step_time_ms": 7487.232685089111, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:19:14] (step=0001974) Train Loss: 0.2190, Train Steps/Sec: 0.12, Epoch: 0.03835989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:19:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 1975, "loss": 0.28762149810791016, "memory_gb": 7.721559524536133, "step_time_ms": 7333.471298217773, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:19:21] (step=0001975) Train Loss: 0.2716, Train Steps/Sec: 0.13, Epoch: 0.0383793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:19:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 1976, "loss": 0.17458085715770721, "memory_gb": 7.721559524536133, "step_time_ms": 7565.223455429077, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:19:30] (step=0001976) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.03839875631558492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:19:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1977, "loss": 0.28448906540870667, "memory_gb": 7.721559524536133, "step_time_ms": 4798.7165451049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:19:36] (step=0001977) Train Loss: 0.2811, Train Steps/Sec: 0.17, Epoch: 0.03841818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:19:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 1978, "loss": 0.25229382514953613, "memory_gb": 7.721559524536133, "step_time_ms": 7483.253955841064, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:19:44] (step=0001978) Train Loss: 0.3033, Train Steps/Sec: 0.12, Epoch: 0.03843762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:19:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 1979, "loss": 0.1422252207994461, "memory_gb": 7.721559524536133, "step_time_ms": 7449.4805335998535, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:19:52] (step=0001979) Train Loss: 0.1756, Train Steps/Sec: 0.12, Epoch: 0.03845705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 1980, "loss": 0.2511594593524933, "memory_gb": 7.721559524536133, "step_time_ms": 7509.257793426514, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:00] (step=0001980) Train Loss: 0.2231, Train Steps/Sec: 0.12, Epoch: 0.0384764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 1981, "loss": 0.26560842990875244, "memory_gb": 7.721559524536133, "step_time_ms": 7541.314363479614, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:08] (step=0001981) Train Loss: 0.3146, Train Steps/Sec: 0.12, Epoch: 0.03849591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 1982, "loss": 0.16635306179523468, "memory_gb": 7.721559524536133, "step_time_ms": 7509.493350982666, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:16] (step=0001982) Train Loss: 0.1926, Train Steps/Sec: 0.12, Epoch: 0.03851535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 1983, "loss": 0.2670372426509857, "memory_gb": 7.721559524536133, "step_time_ms": 7442.525386810303, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:24] (step=0001983) Train Loss: 0.2834, Train Steps/Sec: 0.12, Epoch: 0.03853478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 1984, "loss": 0.21299275755882263, "memory_gb": 7.721559524536133, "step_time_ms": 7522.9315757751465, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:32] (step=0001984) Train Loss: 0.2176, Train Steps/Sec: 0.12, Epoch: 0.03855421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 1985, "loss": 0.2663743197917938, "memory_gb": 7.721559524536133, "step_time_ms": 7442.586183547974, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:40] (step=0001985) Train Loss: 0.2825, Train Steps/Sec: 0.12, Epoch: 0.0385736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 1986, "loss": 0.20450669527053833, "memory_gb": 7.721559524536133, "step_time_ms": 7410.171031951904, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:48] (step=0001986) Train Loss: 0.2335, Train Steps/Sec: 0.12, Epoch: 0.03859308200544112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:20:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 1987, "loss": 0.1258191168308258, "memory_gb": 7.721559524536133, "step_time_ms": 7494.915962219238, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:20:56] (step=0001987) Train Loss: 0.1938, Train Steps/Sec: 0.12, Epoch: 0.038612514574426736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:21:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 1988, "loss": 0.17378632724285126, "memory_gb": 7.721559524536133, "step_time_ms": 7444.369554519653, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:21:04] (step=0001988) Train Loss: 0.2316, Train Steps/Sec: 0.12, Epoch: 0.03863194714341236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:21:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 1989, "loss": 0.16289794445037842, "memory_gb": 7.721559524536133, "step_time_ms": 7482.417345046997, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:21:12] (step=0001989) Train Loss: 0.2063, Train Steps/Sec: 0.12, Epoch: 0.03865137971239798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:21:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 1990, "loss": 0.24751734733581543, "memory_gb": 7.721559524536133, "step_time_ms": 7494.530916213989, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:21:20] (step=0001990) Train Loss: 0.2806, Train Steps/Sec: 0.12, Epoch: 0.038670812281383596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:21:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 1991, "loss": 0.2473548948764801, "memory_gb": 7.721559524536133, "step_time_ms": 7419.095754623413, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:21:28] (step=0001991) Train Loss: 0.2593, Train Steps/Sec: 0.12, Epoch: 0.03869024485036922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:21:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 1992, "loss": 0.2646121382713318, "memory_gb": 7.721559524536133, "step_time_ms": 7437.813520431519, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:21:36] (step=0001992) Train Loss: 0.2676, Train Steps/Sec: 0.13, Epoch: 0.03870967741935484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:21:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 1993, "loss": 0.1819068342447281, "memory_gb": 7.721559524536133, "step_time_ms": 7521.147727966309, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:21:45] (step=0001993) Train Loss: 0.2390, Train Steps/Sec: 0.12, Epoch: 0.038729109988340456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:21:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 1994, "loss": 0.2303701639175415, "memory_gb": 7.715639114379883, "step_time_ms": 7394.764423370361, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:21:53] (step=0001994) Train Loss: 0.1895, Train Steps/Sec: 0.12, Epoch: 0.03874854255732608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:22:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 1995, "loss": 0.14796920120716095, "memory_gb": 7.721559524536133, "step_time_ms": 7426.640033721924, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:22:01] (step=0001995) Train Loss: 0.1636, Train Steps/Sec: 0.12, Epoch: 0.0387679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 22:22:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 1996, "loss": 0.2609724998474121, "memory_gb": 7.721559524536133, "step_time_ms": 7495.106220245361, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 22:22:09] (step=0001996) Train
Loss: 0.2237, Train Steps/Sec: 0.12, Epoch: 0.038787407695297316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:22:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 1997, "loss": 0.23041865229606628, "memory_gb": 7.721559524536133, "step_time_ms": 7549.25274848938, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:22:17] (step=0001997) Train Loss: 0.2320, Train Steps/Sec: 0.12, Epoch: 0.03880684026428294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:22:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 1998, "loss": 0.2311479151248932, "memory_gb": 7.721559524536133, "step_time_ms": 7422.6508140563965, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:22:25] (step=0001998) Train Loss: 0.2055, Train Steps/Sec: 0.12, Epoch: 0.03882627283326856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:22:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 1999, "loss": 0.28830528259277344, "memory_gb": 7.721559524536133, "step_time_ms": 7503.731966018677, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:22:33] (step=0001999) Train Loss: 0.2397, Train Steps/Sec: 0.12, Epoch: 0.038845705402254176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:22:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2000, "loss": 0.1725420504808426, "memory_gb": 7.721559524536133, "step_time_ms": 7473.704814910889, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:22:41] (step=0002000) Train Loss: 0.1781, Train Steps/Sec: 0.12, Epoch: 0.0388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:22:41] Saved checkpoint to /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0002000/ [2025-07-28 22:22:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2001, "loss": 0.3208978474140167, "memory_gb": 7.721559524536133, "step_time_ms": 7427.437543869019, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:22:49] (step=0002001) Train Loss: 0.2769, Train Steps/Sec: 0.12, Epoch: 
0.03888457054022542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:22:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2002, "loss": 0.3240136504173279, "memory_gb": 7.721559524536133, "step_time_ms": 7524.424314498901, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:22:57] (step=0002002) Train Loss: 0.2450, Train Steps/Sec: 0.12, Epoch: 0.038904003109211036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2003, "loss": 0.1875268816947937, "memory_gb": 7.721559524536133, "step_time_ms": 7555.109739303589, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:05] (step=0002003) Train Loss: 0.2436, Train Steps/Sec: 0.12, Epoch: 0.03892343567819666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2004, "loss": 0.3098490834236145, "memory_gb": 7.721559524536133, "step_time_ms": 7358.248710632324, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:13] (step=0002004) Train Loss: 0.2856, Train Steps/Sec: 0.13, Epoch: 0.03894286824718228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2005, "loss": 0.2785906195640564, "memory_gb": 7.721559524536133, "step_time_ms": 7551.97548866272, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:21] (step=0002005) Train Loss: 0.2313, Train Steps/Sec: 0.12, Epoch: 0.038962300816167895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2006, "loss": 0.1951664537191391, "memory_gb": 7.721559524536133, "step_time_ms": 4990.354776382446, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:27] (step=0002006) Train Loss: 0.1762, Train Steps/Sec: 0.17, Epoch: 0.03898173338515352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2007, "loss": 0.18938997387886047, "memory_gb": 
7.721559524536133, "step_time_ms": 7532.35387802124, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:35] (step=0002007) Train Loss: 0.2058, Train Steps/Sec: 0.12, Epoch: 0.03900116595413914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2008, "loss": 0.24058067798614502, "memory_gb": 7.721559524536133, "step_time_ms": 7516.756296157837, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:43] (step=0002008) Train Loss: 0.1897, Train Steps/Sec: 0.12, Epoch: 0.039020598523124755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2009, "loss": 0.2989647388458252, "memory_gb": 7.721559524536133, "step_time_ms": 7481.833934783936, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:51] (step=0002009) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.03904003109211038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:23:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2010, "loss": 0.2560836672782898, "memory_gb": 7.721559524536133, "step_time_ms": 7561.873912811279, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:23:59] (step=0002010) Train Loss: 0.2869, Train Steps/Sec: 0.12, Epoch: 0.039059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:24:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2011, "loss": 0.2780873775482178, "memory_gb": 7.721559524536133, "step_time_ms": 7589.183330535889, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:24:07] (step=0002011) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.039078896230081615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:24:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2012, "loss": 0.28654158115386963, "memory_gb": 7.721559524536133, "step_time_ms": 7556.317567825317, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:24:15] (step=0002012) Train Loss: 0.2252, Train Steps/Sec: 
0.12, Epoch: 0.03909832879906724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:24:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2013, "loss": 0.2569428086280823, "memory_gb": 7.721559524536133, "step_time_ms": 7615.41485786438, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:24:23] (step=0002013) Train Loss: 0.2218, Train Steps/Sec: 0.12, Epoch: 0.03911776136805286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:24:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2014, "loss": 0.1548214554786682, "memory_gb": 7.721559524536133, "step_time_ms": 7563.463449478149, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:24:31] (step=0002014) Train Loss: 0.2063, Train Steps/Sec: 0.13, Epoch: 0.039137193937038475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:24:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2015, "loss": 0.2100386917591095, "memory_gb": 7.721559524536133, "step_time_ms": 7476.961135864258, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:24:39] (step=0002015) Train Loss: 0.1914, Train Steps/Sec: 0.13, Epoch: 0.0391566265060241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:24:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2016, "loss": 0.267280638217926, "memory_gb": 7.721559524536133, "step_time_ms": 7579.2810916900635, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:24:47] (step=0002016) Train Loss: 0.2000, Train Steps/Sec: 0.12, Epoch: 0.03917605907500971, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:24:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2017, "loss": 0.1964733600616455, "memory_gb": 7.721559524536133, "step_time_ms": 7302.164316177368, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:24:55] (step=0002017) Train Loss: 0.2404, Train Steps/Sec: 0.13, Epoch: 0.039195491643995335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2018, "loss": 0.19804149866104126, 
"memory_gb": 7.721559524536133, "step_time_ms": 7546.2965965271, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:03] (step=0002018) Train Loss: 0.2815, Train Steps/Sec: 0.13, Epoch: 0.03921492421298096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2019, "loss": 0.17573875188827515, "memory_gb": 7.721559524536133, "step_time_ms": 7629.610300064087, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:11] (step=0002019) Train Loss: 0.2019, Train Steps/Sec: 0.12, Epoch: 0.03923435678196657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2020, "loss": 0.2720974087715149, "memory_gb": 7.721559524536133, "step_time_ms": 7537.456750869751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:19] (step=0002020) Train Loss: 0.2522, Train Steps/Sec: 0.12, Epoch: 0.039253789350952195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2021, "loss": 0.3040201663970947, "memory_gb": 7.721559524536133, "step_time_ms": 7538.800954818726, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:27] (step=0002021) Train Loss: 0.2284, Train Steps/Sec: 0.12, Epoch: 0.03927322191993782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2022, "loss": 0.27007871866226196, "memory_gb": 7.721559524536133, "step_time_ms": 7587.566614151001, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:35] (step=0002022) Train Loss: 0.2545, Train Steps/Sec: 0.12, Epoch: 0.03929265448892343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2023, "loss": 0.10716770589351654, "memory_gb": 7.721559524536133, "step_time_ms": 7478.043794631958, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:43] (step=0002023) Train Loss: 0.1508, Train 
Steps/Sec: 0.12, Epoch: 0.039312087057909055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2024, "loss": 0.16473516821861267, "memory_gb": 7.721559524536133, "step_time_ms": 7507.40909576416, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:51] (step=0002024) Train Loss: 0.2375, Train Steps/Sec: 0.12, Epoch: 0.03933151962689468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:25:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2025, "loss": 0.3406091332435608, "memory_gb": 7.721559524536133, "step_time_ms": 7613.826274871826, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:25:59] (step=0002025) Train Loss: 0.3391, Train Steps/Sec: 0.12, Epoch: 0.03935095219588029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:26:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2026, "loss": 0.3015506863594055, "memory_gb": 7.721559524536133, "step_time_ms": 7538.933753967285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:26:07] (step=0002026) Train Loss: 0.2325, Train Steps/Sec: 0.13, Epoch: 0.039370384764865915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:26:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2027, "loss": 0.19761569797992706, "memory_gb": 7.721559524536133, "step_time_ms": 7535.514116287231, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:26:15] (step=0002027) Train Loss: 0.2517, Train Steps/Sec: 0.12, Epoch: 0.03938981733385154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2028, "loss": 0.12944237887859344, "memory_gb": 7.721559524536133, "step_time_ms": 7581.626415252686, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:26:23] (step=0002028) Train Loss: 0.1890, Train Steps/Sec: 0.12, Epoch: 0.03940924990283715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:26:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2029, "loss": 
0.22610758244991302, "memory_gb": 7.721559524536133, "step_time_ms": 7501.251459121704, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:26:31] (step=0002029) Train Loss: 0.2475, Train Steps/Sec: 0.12, Epoch: 0.039428682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:26:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2030, "loss": 0.16074411571025848, "memory_gb": 7.721559524536133, "step_time_ms": 7543.781042098999, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:26:40] (step=0002030) Train Loss: 0.1627, Train Steps/Sec: 0.12, Epoch: 0.0394481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2031, "loss": 0.24542562663555145, "memory_gb": 7.721559524536133, "step_time_ms": 7573.92692565918, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:26:48] (step=0002031) Train Loss: 0.2413, Train Steps/Sec: 0.12, Epoch: 0.03946754760979401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:26:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2032, "loss": 0.27518630027770996, "memory_gb": 7.721559524536133, "step_time_ms": 7539.728403091431, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:26:56] (step=0002032) Train Loss: 0.2327, Train Steps/Sec: 0.12, Epoch: 0.039486980178779635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2033, "loss": 0.18825939297676086, "memory_gb": 7.721559524536133, "step_time_ms": 7370.847463607788, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:04] (step=0002033) Train Loss: 0.2166, Train Steps/Sec: 0.13, Epoch: 0.03950641274776526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2034, "loss": 0.27874940633773804, "memory_gb": 7.721559524536133, "step_time_ms": 7531.787872314453, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:12] (step=0002034) 
Train Loss: 0.2585, Train Steps/Sec: 0.12, Epoch: 0.03952584531675087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2035, "loss": 0.32693102955818176, "memory_gb": 7.721559524536133, "step_time_ms": 4824.876308441162, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:18] (step=0002035) Train Loss: 0.2897, Train Steps/Sec: 0.17, Epoch: 0.039545277885736495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2036, "loss": 0.2946258783340454, "memory_gb": 7.721559524536133, "step_time_ms": 7604.635953903198, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:26] (step=0002036) Train Loss: 0.2350, Train Steps/Sec: 0.12, Epoch: 0.03956471045472212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2037, "loss": 0.16200301051139832, "memory_gb": 7.721559524536133, "step_time_ms": 7483.359098434448, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:34] (step=0002037) Train Loss: 0.1822, Train Steps/Sec: 0.12, Epoch: 0.03958414302370773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2038, "loss": 0.22923815250396729, "memory_gb": 7.721559524536133, "step_time_ms": 7545.806407928467, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:42] (step=0002038) Train Loss: 0.2638, Train Steps/Sec: 0.12, Epoch: 0.039603575592693355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2039, "loss": 0.23615328967571259, "memory_gb": 7.721559524536133, "step_time_ms": 7758.733034133911, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:50] (step=0002039) Train Loss: 0.2454, Train Steps/Sec: 0.12, Epoch: 0.03962300816167898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:27:58] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 2040, "loss": 0.2978628873825073, "memory_gb": 7.721559524536133, "step_time_ms": 7462.525844573975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:27:58] (step=0002040) Train Loss: 0.2796, Train Steps/Sec: 0.13, Epoch: 0.03964244073066459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:28:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2041, "loss": 0.2810710370540619, "memory_gb": 7.721559524536133, "step_time_ms": 7436.643362045288, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:28:06] (step=0002041) Train Loss: 0.2436, Train Steps/Sec: 0.12, Epoch: 0.039661873299650215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:28:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2042, "loss": 0.14854998886585236, "memory_gb": 7.721559524536133, "step_time_ms": 7534.538269042969, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:28:14] (step=0002042) Train Loss: 0.2072, Train Steps/Sec: 0.12, Epoch: 0.03968130586863584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:28:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2043, "loss": 0.2306351214647293, "memory_gb": 7.721559524536133, "step_time_ms": 7456.12096786499, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:28:22] (step=0002043) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.03970073843762145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:28:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2044, "loss": 0.25586941838264465, "memory_gb": 7.721559524536133, "step_time_ms": 7483.251094818115, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:28:30] (step=0002044) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.039720171006607075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:28:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2045, "loss": 0.16371455788612366, "memory_gb": 7.721559524536133, "step_time_ms": 7561.607122421265, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
22:28:38] (step=0002045) Train Loss: 0.1583, Train Steps/Sec: 0.12, Epoch: 0.03973960357559269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:28:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2046, "loss": 0.2801464796066284, "memory_gb": 7.715639114379883, "step_time_ms": 7434.357166290283, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:28:46] (step=0002046) Train Loss: 0.3106, Train Steps/Sec: 0.12, Epoch: 0.03975903614457831, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:28:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2047, "loss": 0.31072884798049927, "memory_gb": 7.721559524536133, "step_time_ms": 7489.731311798096, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:28:54] (step=0002047) Train Loss: 0.3014, Train Steps/Sec: 0.12, Epoch: 0.039778468713563934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2048, "loss": 0.15592601895332336, "memory_gb": 7.721559524536133, "step_time_ms": 7524.807929992676, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:02] (step=0002048) Train Loss: 0.2206, Train Steps/Sec: 0.12, Epoch: 0.03979790128254955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2049, "loss": 0.2808244228363037, "memory_gb": 7.721559524536133, "step_time_ms": 7508.672475814819, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:10] (step=0002049) Train Loss: 0.2835, Train Steps/Sec: 0.12, Epoch: 0.03981733385153517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2050, "loss": 0.30072924494743347, "memory_gb": 7.721559524536133, "step_time_ms": 7472.551107406616, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:18] (step=0002050) Train Loss: 0.2730, Train Steps/Sec: 0.13, Epoch: 0.039836766420520794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:26] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2051, "loss": 0.1980382651090622, "memory_gb": 7.721559524536133, "step_time_ms": 7518.0487632751465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:26] (step=0002051) Train Loss: 0.2409, Train Steps/Sec: 0.12, Epoch: 0.03985619898950641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2052, "loss": 0.26080092787742615, "memory_gb": 7.721559524536133, "step_time_ms": 7455.719470977783, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:34] (step=0002052) Train Loss: 0.2619, Train Steps/Sec: 0.12, Epoch: 0.03987563155849203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2053, "loss": 0.1302611529827118, "memory_gb": 7.721559524536133, "step_time_ms": 7440.40584564209, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:42] (step=0002053) Train Loss: 0.1615, Train Steps/Sec: 0.13, Epoch: 0.039895064127477654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2054, "loss": 0.24613851308822632, "memory_gb": 7.721559524536133, "step_time_ms": 7537.97459602356, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:51] (step=0002054) Train Loss: 0.2797, Train Steps/Sec: 0.12, Epoch: 0.03991449669646327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:29:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2055, "loss": 0.2175811529159546, "memory_gb": 7.721559524536133, "step_time_ms": 7452.412843704224, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:29:59] (step=0002055) Train Loss: 0.2878, Train Steps/Sec: 0.12, Epoch: 0.03993392926544889, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:30:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2056, "loss": 0.340450257062912, "memory_gb": 7.721559524536133, "step_time_ms": 7453.176736831665, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 22:30:07] (step=0002056) Train Loss: 0.2978, Train Steps/Sec: 0.12, Epoch: 0.039953361834434514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:30:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2057, "loss": 0.13365134596824646, "memory_gb": 7.721559524536133, "step_time_ms": 7541.885137557983, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:30:15] (step=0002057) Train Loss: 0.1928, Train Steps/Sec: 0.12, Epoch: 0.03997279440342013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:30:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2058, "loss": 0.1867460012435913, "memory_gb": 7.721559524536133, "step_time_ms": 7496.192455291748, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:30:23] (step=0002058) Train Loss: 0.1524, Train Steps/Sec: 0.12, Epoch: 0.03999222697240575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:30:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2059, "loss": 0.2692527770996094, "memory_gb": 7.721559524536133, "step_time_ms": 7466.320037841797, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:30:31] (step=0002059) Train Loss: 0.2450, Train Steps/Sec: 0.12, Epoch: 0.040011659541391374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:30:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2060, "loss": 0.2622179687023163, "memory_gb": 7.721559524536133, "step_time_ms": 7523.066759109497, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:30:39] (step=0002060) Train Loss: 0.2781, Train Steps/Sec: 0.12, Epoch: 0.04003109211037699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2061, "loss": 0.20699048042297363, "memory_gb": 7.721559524536133, "step_time_ms": 7436.349868774414, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:30:47] (step=0002061) Train Loss: 0.2557, Train Steps/Sec: 0.12, Epoch: 0.04005052467936261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 22:30:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2062, "loss": 0.3297055661678314, "memory_gb": 7.721559524536133, "step_time_ms": 7311.426401138306, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:30:55] (step=0002062) Train Loss: 0.2737, Train Steps/Sec: 0.13, Epoch: 0.040069957248348234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2063, "loss": 0.21285679936408997, "memory_gb": 7.721559524536133, "step_time_ms": 7497.025012969971, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:03] (step=0002063) Train Loss: 0.1859, Train Steps/Sec: 0.12, Epoch: 0.04008938981733385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2064, "loss": 0.267971396446228, "memory_gb": 7.715639114379883, "step_time_ms": 5262.953996658325, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:08] (step=0002064) Train Loss: 0.3034, Train Steps/Sec: 0.18, Epoch: 0.04010882238631947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2065, "loss": 0.32884567975997925, "memory_gb": 7.721559524536133, "step_time_ms": 7469.587087631226, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:16] (step=0002065) Train Loss: 0.2639, Train Steps/Sec: 0.12, Epoch: 0.040128254955305094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2066, "loss": 0.18700169026851654, "memory_gb": 7.721559524536133, "step_time_ms": 7417.666435241699, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:24] (step=0002066) Train Loss: 0.1849, Train Steps/Sec: 0.13, Epoch: 0.04014768752429071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2067, "loss": 0.3246496617794037, "memory_gb": 7.721559524536133, "step_time_ms": 7406.835556030273, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:33] (step=0002067) Train Loss: 0.2806, Train Steps/Sec: 0.12, Epoch: 0.04016712009327633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2068, "loss": 0.3115423023700714, "memory_gb": 7.721559524536133, "step_time_ms": 7461.057186126709, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:41] (step=0002068) Train Loss: 0.2829, Train Steps/Sec: 0.12, Epoch: 0.040186552662261954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2069, "loss": 0.2196582853794098, "memory_gb": 7.721559524536133, "step_time_ms": 7417.972564697266, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:49] (step=0002069) Train Loss: 0.2579, Train Steps/Sec: 0.12, Epoch: 0.04020598523124757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:31:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2070, "loss": 0.17524342238903046, "memory_gb": 7.721559524536133, "step_time_ms": 7447.856426239014, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:31:57] (step=0002070) Train Loss: 0.2168, Train Steps/Sec: 0.12, Epoch: 0.04022541780023319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:32:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2071, "loss": 0.30899953842163086, "memory_gb": 7.721559524536133, "step_time_ms": 7528.216361999512, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:32:05] (step=0002071) Train Loss: 0.2658, Train Steps/Sec: 0.12, Epoch: 0.040244850369218814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:32:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2072, "loss": 0.2875719964504242, "memory_gb": 7.721559524536133, "step_time_ms": 7459.790468215942, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:32:13] (step=0002072) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.04026428293820443, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 22:32:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2073, "loss": 0.2028941661119461, "memory_gb": 7.721559524536133, "step_time_ms": 7449.559450149536, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:32:21] (step=0002073) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.04028371550719005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:32:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2074, "loss": 0.25296780467033386, "memory_gb": 7.721559524536133, "step_time_ms": 7507.83371925354, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:32:29] (step=0002074) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.040303148076175674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:32:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2075, "loss": 0.24956455826759338, "memory_gb": 7.721559524536133, "step_time_ms": 7402.318000793457, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:32:37] (step=0002075) Train Loss: 0.2135, Train Steps/Sec: 0.12, Epoch: 0.04032258064516129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:32:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2076, "loss": 0.1919107884168625, "memory_gb": 7.721559524536133, "step_time_ms": 7461.615085601807, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:32:45] (step=0002076) Train Loss: 0.2040, Train Steps/Sec: 0.12, Epoch: 0.04034201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:32:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2077, "loss": 0.22304663062095642, "memory_gb": 7.721559524536133, "step_time_ms": 7557.565450668335, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:32:53] (step=0002077) Train Loss: 0.2778, Train Steps/Sec: 0.12, Epoch: 0.04036144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2078, "loss": 0.17782160639762878, "memory_gb": 7.721559524536133, "step_time_ms": 
7418.063640594482, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:01] (step=0002078) Train Loss: 0.1740, Train Steps/Sec: 0.12, Epoch: 0.04038087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2079, "loss": 0.31218740344047546, "memory_gb": 7.721559524536133, "step_time_ms": 7422.00493812561, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:09] (step=0002079) Train Loss: 0.2765, Train Steps/Sec: 0.12, Epoch: 0.04040031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2080, "loss": 0.16086378693580627, "memory_gb": 7.721559524536133, "step_time_ms": 7485.964298248291, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:17] (step=0002080) Train Loss: 0.1859, Train Steps/Sec: 0.12, Epoch: 0.04041974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2081, "loss": 0.30227169394493103, "memory_gb": 7.721559524536133, "step_time_ms": 7416.821479797363, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:25] (step=0002081) Train Loss: 0.3310, Train Steps/Sec: 0.13, Epoch: 0.04043917605907501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2082, "loss": 0.26758626103401184, "memory_gb": 7.721559524536133, "step_time_ms": 7406.9504737854, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:33] (step=0002082) Train Loss: 0.2163, Train Steps/Sec: 0.13, Epoch: 0.04045860862806063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2083, "loss": 0.29298606514930725, "memory_gb": 7.721559524536133, "step_time_ms": 7490.29016494751, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:41] (step=0002083) Train Loss: 0.2659, Train Steps/Sec: 0.12, Epoch: 0.04047804119704625, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2084, "loss": 0.2892442047595978, "memory_gb": 7.721559524536133, "step_time_ms": 7411.4789962768555, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:49] (step=0002084) Train Loss: 0.2757, Train Steps/Sec: 0.12, Epoch: 0.04049747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2085, "loss": 0.25886404514312744, "memory_gb": 7.721559524536133, "step_time_ms": 7352.90265083313, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:33:57] (step=0002085) Train Loss: 0.2360, Train Steps/Sec: 0.13, Epoch: 0.04051690633501749, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2086, "loss": 0.1554349660873413, "memory_gb": 7.721559524536133, "step_time_ms": 7499.716281890869, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:05] (step=0002086) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.04053633890400311, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2087, "loss": 0.2547835111618042, "memory_gb": 7.721559524536133, "step_time_ms": 7552.750825881958, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:13] (step=0002087) Train Loss: 0.2275, Train Steps/Sec: 0.13, Epoch: 0.04055577147298873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2088, "loss": 0.2887364625930786, "memory_gb": 7.721559524536133, "step_time_ms": 7472.197532653809, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:21] (step=0002088) Train Loss: 0.2432, Train Steps/Sec: 0.12, Epoch: 0.04057520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2089, "loss": 0.3270196318626404, "memory_gb": 7.721559524536133, 
"step_time_ms": 7564.28599357605, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:29] (step=0002089) Train Loss: 0.3183, Train Steps/Sec: 0.12, Epoch: 0.04059463661095997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2090, "loss": 0.42774879932403564, "memory_gb": 7.721559524536133, "step_time_ms": 7477.008104324341, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:37] (step=0002090) Train Loss: 0.3573, Train Steps/Sec: 0.12, Epoch: 0.04061406917994559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2091, "loss": 0.2485974133014679, "memory_gb": 7.721559524536133, "step_time_ms": 7323.572874069214, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:45] (step=0002091) Train Loss: 0.2615, Train Steps/Sec: 0.13, Epoch: 0.04063350174893121, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2092, "loss": 0.2820502519607544, "memory_gb": 7.721559524536133, "step_time_ms": 7557.625770568848, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:53] (step=0002092) Train Loss: 0.2985, Train Steps/Sec: 0.12, Epoch: 0.040652934317916826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:34:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2093, "loss": 0.24512720108032227, "memory_gb": 7.721559524536133, "step_time_ms": 4978.450536727905, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:34:59] (step=0002093) Train Loss: 0.2104, Train Steps/Sec: 0.18, Epoch: 0.04067236688690245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:35:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2094, "loss": 0.2844330370426178, "memory_gb": 7.721559524536133, "step_time_ms": 7569.344282150269, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:35:07] (step=0002094) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 
0.04069179945588807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:35:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2095, "loss": 0.3649997115135193, "memory_gb": 7.721559524536133, "step_time_ms": 7485.304832458496, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:35:15] (step=0002095) Train Loss: 0.2921, Train Steps/Sec: 0.12, Epoch: 0.040711232024873686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:35:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2096, "loss": 0.317425400018692, "memory_gb": 7.721559524536133, "step_time_ms": 7494.660377502441, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:35:23] (step=0002096) Train Loss: 0.3103, Train Steps/Sec: 0.12, Epoch: 0.04073066459385931, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:35:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2097, "loss": 0.20060868561267853, "memory_gb": 7.721559524536133, "step_time_ms": 7588.947534561157, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:35:31] (step=0002097) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.04075009716284493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:35:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2098, "loss": 0.24796231091022491, "memory_gb": 7.721559524536133, "step_time_ms": 7476.010322570801, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:35:39] (step=0002098) Train Loss: 0.1879, Train Steps/Sec: 0.12, Epoch: 0.040769529731830546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2099, "loss": 0.34705132246017456, "memory_gb": 7.721559524536133, "step_time_ms": 7498.279809951782, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:35:47] (step=0002099) Train Loss: 0.2592, Train Steps/Sec: 0.13, Epoch: 0.04078896230081617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:35:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2100, "loss": 0.3325303792953491, "memory_gb": 
7.721559524536133, "step_time_ms": 7610.264778137207, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:35:55] (step=0002100) Train Loss: 0.3078, Train Steps/Sec: 0.12, Epoch: 0.04080839486980179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:36:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2101, "loss": 0.2487490028142929, "memory_gb": 7.721559524536133, "step_time_ms": 7516.838073730469, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:36:03] (step=0002101) Train Loss: 0.2469, Train Steps/Sec: 0.12, Epoch: 0.040827827438787406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:36:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2102, "loss": 0.2100464403629303, "memory_gb": 7.721559524536133, "step_time_ms": 7525.302886962891, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:36:11] (step=0002102) Train Loss: 0.2041, Train Steps/Sec: 0.12, Epoch: 0.04084726000777303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:36:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2103, "loss": 0.21774178743362427, "memory_gb": 7.721559524536133, "step_time_ms": 7592.552423477173, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:36:20] (step=0002103) Train Loss: 0.2048, Train Steps/Sec: 0.12, Epoch: 0.04086669257675865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:36:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2104, "loss": 0.3194241523742676, "memory_gb": 7.721559524536133, "step_time_ms": 7588.804006576538, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:36:27] (step=0002104) Train Loss: 0.3549, Train Steps/Sec: 0.13, Epoch: 0.040886125145744266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:36:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2105, "loss": 0.2500942349433899, "memory_gb": 7.721559524536133, "step_time_ms": 7598.4601974487305, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:36:36] (step=0002105) Train Loss: 0.2385, Train Steps/Sec: 
0.12, Epoch: 0.04090555771472989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:36:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2106, "loss": 0.2287868708372116, "memory_gb": 7.715639114379883, "step_time_ms": 7567.71445274353, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:36:44] (step=0002106) Train Loss: 0.2333, Train Steps/Sec: 0.12, Epoch: 0.040924990283715504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:36:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2107, "loss": 0.19504490494728088, "memory_gb": 7.721559524536133, "step_time_ms": 7552.953243255615, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:36:52] (step=0002107) Train Loss: 0.2134, Train Steps/Sec: 0.12, Epoch: 0.040944422852701126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2108, "loss": 0.1735447198152542, "memory_gb": 7.721559524536133, "step_time_ms": 7574.812650680542, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:00] (step=0002108) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.04096385542168675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2109, "loss": 0.27178019285202026, "memory_gb": 7.721559524536133, "step_time_ms": 7670.106410980225, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:08] (step=0002109) Train Loss: 0.2836, Train Steps/Sec: 0.12, Epoch: 0.040983287990672364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2110, "loss": 0.22665682435035706, "memory_gb": 7.721559524536133, "step_time_ms": 7545.3479290008545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:16] (step=0002110) Train Loss: 0.2669, Train Steps/Sec: 0.12, Epoch: 0.041002720559657986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2111, "loss": 
0.259240984916687, "memory_gb": 7.721559524536133, "step_time_ms": 7620.240926742554, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:24] (step=0002111) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.04102215312864361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2112, "loss": 0.3732595145702362, "memory_gb": 7.721559524536133, "step_time_ms": 7603.526830673218, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:32] (step=0002112) Train Loss: 0.2935, Train Steps/Sec: 0.12, Epoch: 0.041041585697629224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2113, "loss": 0.2578277885913849, "memory_gb": 7.721559524536133, "step_time_ms": 7541.69225692749, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:40] (step=0002113) Train Loss: 0.2649, Train Steps/Sec: 0.13, Epoch: 0.041061018266614846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2114, "loss": 0.24822206795215607, "memory_gb": 7.721559524536133, "step_time_ms": 7510.48469543457, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:48] (step=0002114) Train Loss: 0.2318, Train Steps/Sec: 0.12, Epoch: 0.04108045083560047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:37:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2115, "loss": 0.20217499136924744, "memory_gb": 7.721559524536133, "step_time_ms": 7564.005136489868, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:37:56] (step=0002115) Train Loss: 0.2335, Train Steps/Sec: 0.12, Epoch: 0.041099883404586084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2116, "loss": 0.2298422008752823, "memory_gb": 7.721559524536133, "step_time_ms": 7421.808958053589, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:04] (step=0002116) Train 
Loss: 0.2395, Train Steps/Sec: 0.12, Epoch: 0.041119315973571706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2117, "loss": 0.13267314434051514, "memory_gb": 7.721559524536133, "step_time_ms": 7488.0406856536865, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:12] (step=0002117) Train Loss: 0.1760, Train Steps/Sec: 0.12, Epoch: 0.04113874854255733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2118, "loss": 0.26162534952163696, "memory_gb": 7.721559524536133, "step_time_ms": 7496.737003326416, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:20] (step=0002118) Train Loss: 0.1995, Train Steps/Sec: 0.12, Epoch: 0.041158181111542944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2119, "loss": 0.2032943069934845, "memory_gb": 7.721559524536133, "step_time_ms": 7477.29754447937, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:28] (step=0002119) Train Loss: 0.1759, Train Steps/Sec: 0.12, Epoch: 0.041177613680528566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2120, "loss": 0.18247312307357788, "memory_gb": 7.721559524536133, "step_time_ms": 7343.076705932617, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:36] (step=0002120) Train Loss: 0.2214, Train Steps/Sec: 0.13, Epoch: 0.04119704624951419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2121, "loss": 0.2158198207616806, "memory_gb": 7.721559524536133, "step_time_ms": 7538.606882095337, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:44] (step=0002121) Train Loss: 0.2357, Train Steps/Sec: 0.12, Epoch: 0.041216478818499803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 
2122, "loss": 0.2068505585193634, "memory_gb": 7.721559524536133, "step_time_ms": 5327.626705169678, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:50] (step=0002122) Train Loss: 0.2624, Train Steps/Sec: 0.18, Epoch: 0.041235911387485426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:38:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2123, "loss": 0.30610108375549316, "memory_gb": 7.721559524536133, "step_time_ms": 7461.276531219482, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:38:58] (step=0002123) Train Loss: 0.2935, Train Steps/Sec: 0.12, Epoch: 0.04125534395647105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:39:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2124, "loss": 0.2915703058242798, "memory_gb": 7.721559524536133, "step_time_ms": 7449.225425720215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:39:06] (step=0002124) Train Loss: 0.2125, Train Steps/Sec: 0.12, Epoch: 0.04127477652545666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:39:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2125, "loss": 0.16471914947032928, "memory_gb": 7.721559524536133, "step_time_ms": 7461.593866348267, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:39:14] (step=0002125) Train Loss: 0.1991, Train Steps/Sec: 0.12, Epoch: 0.041294209094442286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:39:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2126, "loss": 0.24704816937446594, "memory_gb": 7.721559524536133, "step_time_ms": 7528.343439102173, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:39:22] (step=0002126) Train Loss: 0.2635, Train Steps/Sec: 0.12, Epoch: 0.04131364166342791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:39:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2127, "loss": 0.29526203870773315, "memory_gb": 7.721559524536133, "step_time_ms": 7584.46741104126, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:39:30] 
(step=0002127) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.04133307423241352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:39:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2128, "loss": 0.2499966323375702, "memory_gb": 7.721559524536133, "step_time_ms": 7429.154872894287, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:39:38] (step=0002128) Train Loss: 0.2673, Train Steps/Sec: 0.12, Epoch: 0.041352506801399146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:39:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2129, "loss": 0.34586918354034424, "memory_gb": 7.721559524536133, "step_time_ms": 7516.911029815674, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:39:46] (step=0002129) Train Loss: 0.3090, Train Steps/Sec: 0.12, Epoch: 0.04137193937038477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:39:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2130, "loss": 0.24175697565078735, "memory_gb": 7.721559524536133, "step_time_ms": 7490.369558334351, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:39:54] (step=0002130) Train Loss: 0.2197, Train Steps/Sec: 0.12, Epoch: 0.04139137193937038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2131, "loss": 0.16926635801792145, "memory_gb": 7.721559524536133, "step_time_ms": 7463.289737701416, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:40:02] (step=0002131) Train Loss: 0.1552, Train Steps/Sec: 0.12, Epoch: 0.041410804508356006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2132, "loss": 0.21324151754379272, "memory_gb": 7.721559524536133, "step_time_ms": 7517.733812332153, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:40:10] (step=0002132) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.04143023707734163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:19] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 2133, "loss": 0.23436331748962402, "memory_gb": 7.721559524536133, "step_time_ms": 7471.807479858398, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:40:19] (step=0002133) Train Loss: 0.2001, Train Steps/Sec: 0.12, Epoch: 0.04144966964632724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2134, "loss": 0.24677108228206635, "memory_gb": 7.721559524536133, "step_time_ms": 7478.258371353149, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:40:27] (step=0002134) Train Loss: 0.2167, Train Steps/Sec: 0.12, Epoch: 0.041469102215312866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2135, "loss": 0.30996713042259216, "memory_gb": 7.721559524536133, "step_time_ms": 7543.303489685059, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:40:35] (step=0002135) Train Loss: 0.2935, Train Steps/Sec: 0.12, Epoch: 0.04148853478429848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2136, "loss": 0.2691488265991211, "memory_gb": 7.721559524536133, "step_time_ms": 7436.50484085083, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:40:43] (step=0002136) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.0415079673532841, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2137, "loss": 0.2717891335487366, "memory_gb": 7.721559524536133, "step_time_ms": 7477.761507034302, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:40:51] (step=0002137) Train Loss: 0.2142, Train Steps/Sec: 0.12, Epoch: 0.041527399922269725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:40:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2138, "loss": 0.25298285484313965, "memory_gb": 7.721559524536133, "step_time_ms": 7534.2888832092285, "trainable_params": 4718592, "method": "lora"} 
[2025-07-28 22:40:59] (step=0002138) Train Loss: 0.2652, Train Steps/Sec: 0.12, Epoch: 0.04154683249125534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:41:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2139, "loss": 0.25816458463668823, "memory_gb": 7.721559524536133, "step_time_ms": 7490.810871124268, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:41:07] (step=0002139) Train Loss: 0.2846, Train Steps/Sec: 0.12, Epoch: 0.04156626506024096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:41:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2140, "loss": 0.16583845019340515, "memory_gb": 7.721559524536133, "step_time_ms": 7489.831209182739, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:41:15] (step=0002140) Train Loss: 0.1943, Train Steps/Sec: 0.12, Epoch: 0.041585697629226585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:41:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2141, "loss": 0.20981281995773315, "memory_gb": 7.721559524536133, "step_time_ms": 7550.928592681885, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:41:23] (step=0002141) Train Loss: 0.2369, Train Steps/Sec: 0.12, Epoch: 0.0416051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:41:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2142, "loss": 0.20180052518844604, "memory_gb": 7.721559524536133, "step_time_ms": 7450.931549072266, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:41:31] (step=0002142) Train Loss: 0.2322, Train Steps/Sec: 0.12, Epoch: 0.04162456276719782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:41:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2143, "loss": 0.2805529236793518, "memory_gb": 7.721559524536133, "step_time_ms": 7480.626344680786, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:41:39] (step=0002143) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.041643995336183445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:41:47] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2144, "loss": 0.2707202434539795, "memory_gb": 7.721559524536133, "step_time_ms": 7480.7257652282715, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:41:47] (step=0002144) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.04166342790516906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:41:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2145, "loss": 0.34233853220939636, "memory_gb": 7.721559524536133, "step_time_ms": 7432.801961898804, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:41:55] (step=0002145) Train Loss: 0.2571, Train Steps/Sec: 0.12, Epoch: 0.04168286047415468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2146, "loss": 0.22043952345848083, "memory_gb": 7.721559524536133, "step_time_ms": 7492.0814037323, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:42:03] (step=0002146) Train Loss: 0.2310, Train Steps/Sec: 0.12, Epoch: 0.041702293043140305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2147, "loss": 0.21523961424827576, "memory_gb": 7.721559524536133, "step_time_ms": 7545.452117919922, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:42:11] (step=0002147) Train Loss: 0.2425, Train Steps/Sec: 0.12, Epoch: 0.04172172561212592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2148, "loss": 0.2526928782463074, "memory_gb": 7.721559524536133, "step_time_ms": 7302.137613296509, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:42:19] (step=0002148) Train Loss: 0.2578, Train Steps/Sec: 0.12, Epoch: 0.04174115818111154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2149, "loss": 0.3036205768585205, "memory_gb": 7.721559524536133, "step_time_ms": 7362.648487091064, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 22:42:27] (step=0002149) Train Loss: 0.2848, Train Steps/Sec: 0.13, Epoch: 0.041760590750097165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2150, "loss": 0.24897028505802155, "memory_gb": 7.721559524536133, "step_time_ms": 7447.957515716553, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:42:35] (step=0002150) Train Loss: 0.1822, Train Steps/Sec: 0.12, Epoch: 0.04178002331908278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2151, "loss": 0.16879203915596008, "memory_gb": 7.721559524536133, "step_time_ms": 5200.392723083496, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:42:41] (step=0002151) Train Loss: 0.2270, Train Steps/Sec: 0.17, Epoch: 0.0417994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2152, "loss": 0.23862771689891815, "memory_gb": 7.721559524536133, "step_time_ms": 7473.804473876953, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:42:49] (step=0002152) Train Loss: 0.1974, Train Steps/Sec: 0.12, Epoch: 0.041818888457054025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:42:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2153, "loss": 0.2983737885951996, "memory_gb": 7.721559524536133, "step_time_ms": 7415.590763092041, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:42:57] (step=0002153) Train Loss: 0.2323, Train Steps/Sec: 0.13, Epoch: 0.04183832102603964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:43:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2154, "loss": 0.24291761219501495, "memory_gb": 7.721559524536133, "step_time_ms": 7423.744916915894, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:43:05] (step=0002154) Train Loss: 0.2291, Train Steps/Sec: 0.12, Epoch: 0.04185775359502526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 22:43:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2155, "loss": 0.23735058307647705, "memory_gb": 7.721559524536133, "step_time_ms": 7521.738529205322, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:43:13] (step=0002155) Train Loss: 0.2706, Train Steps/Sec: 0.12, Epoch: 0.041877186164010885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:43:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2156, "loss": 0.23948433995246887, "memory_gb": 7.721559524536133, "step_time_ms": 7431.748390197754, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:43:21] (step=0002156) Train Loss: 0.2509, Train Steps/Sec: 0.12, Epoch: 0.0418966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:43:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2157, "loss": 0.2015046775341034, "memory_gb": 7.721559524536133, "step_time_ms": 7496.697425842285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:43:29] (step=0002157) Train Loss: 0.2791, Train Steps/Sec: 0.12, Epoch: 0.04191605130198212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:43:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2158, "loss": 0.20631183683872223, "memory_gb": 7.721559524536133, "step_time_ms": 7526.950359344482, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:43:38] (step=0002158) Train Loss: 0.2295, Train Steps/Sec: 0.12, Epoch: 0.041935483870967745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:43:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2159, "loss": 0.31553927063941956, "memory_gb": 7.721559524536133, "step_time_ms": 7456.9995403289795, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:43:46] (step=0002159) Train Loss: 0.2948, Train Steps/Sec: 0.12, Epoch: 0.04195491643995336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:43:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2160, "loss": 0.2544788718223572, "memory_gb": 7.721559524536133, "step_time_ms": 7477.794408798218, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 22:43:54] (step=0002160) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.04197434900893898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:44:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2161, "loss": 0.3687272071838379, "memory_gb": 7.721559524536133, "step_time_ms": 7560.418605804443, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:02] (step=0002161) Train Loss: 0.3117, Train Steps/Sec: 0.12, Epoch: 0.041993781577924605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:44:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2162, "loss": 0.19920295476913452, "memory_gb": 7.721559524536133, "step_time_ms": 7527.897596359253, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:10] (step=0002162) Train Loss: 0.2175, Train Steps/Sec: 0.12, Epoch: 0.04201321414691022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:44:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2163, "loss": 0.2705845236778259, "memory_gb": 7.721559524536133, "step_time_ms": 7519.738435745239, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:18] (step=0002163) Train Loss: 0.2465, Train Steps/Sec: 0.12, Epoch: 0.04203264671589584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:44:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2164, "loss": 0.20404501259326935, "memory_gb": 7.721559524536133, "step_time_ms": 7567.771673202515, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:26] (step=0002164) Train Loss: 0.2335, Train Steps/Sec: 0.12, Epoch: 0.042052079284881465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:44:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2165, "loss": 0.18678155541419983, "memory_gb": 7.721559524536133, "step_time_ms": 7461.150407791138, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:34] (step=0002165) Train Loss: 0.2370, Train Steps/Sec: 0.12, Epoch: 0.04207151185386708, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 22:44:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2166, "loss": 0.30977436900138855, "memory_gb": 7.721559524536133, "step_time_ms": 7522.822618484497, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:42] (step=0002166) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.0420909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:44:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2167, "loss": 0.21387065947055817, "memory_gb": 7.721559524536133, "step_time_ms": 7569.19002532959, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:50] (step=0002167) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.04211037699183832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:44:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2168, "loss": 0.1638157069683075, "memory_gb": 7.721559524536133, "step_time_ms": 7491.063356399536, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:44:58] (step=0002168) Train Loss: 0.2234, Train Steps/Sec: 0.12, Epoch: 0.04212980956082394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:45:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2169, "loss": 0.24962344765663147, "memory_gb": 7.721559524536133, "step_time_ms": 7454.66947555542, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:45:06] (step=0002169) Train Loss: 0.2373, Train Steps/Sec: 0.13, Epoch: 0.04214924212980956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:45:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2170, "loss": 0.19573228061199188, "memory_gb": 7.721559524536133, "step_time_ms": 7563.842058181763, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:45:14] (step=0002170) Train Loss: 0.2106, Train Steps/Sec: 0.12, Epoch: 0.04216867469879518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:45:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2171, "loss": 0.25739115476608276, "memory_gb": 7.721559524536133, "step_time_ms": 
7547.153949737549, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:45:22] (step=0002171) Train Loss: 0.2592, Train Steps/Sec: 0.12, Epoch: 0.0421881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:45:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2172, "loss": 0.23067010939121246, "memory_gb": 7.721559524536133, "step_time_ms": 7520.087242126465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:45:31] (step=0002172) Train Loss: 0.2759, Train Steps/Sec: 0.12, Epoch: 0.04220753983676642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:45:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2173, "loss": 0.21558302640914917, "memory_gb": 7.721559524536133, "step_time_ms": 7607.123613357544, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:45:39] (step=0002173) Train Loss: 0.2090, Train Steps/Sec: 0.12, Epoch: 0.04222697240575204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:45:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2174, "loss": 0.29274657368659973, "memory_gb": 7.721559524536133, "step_time_ms": 7684.299468994141, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:45:47] (step=0002174) Train Loss: 0.3178, Train Steps/Sec: 0.12, Epoch: 0.04224640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:45:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2175, "loss": 0.24887782335281372, "memory_gb": 7.721559524536133, "step_time_ms": 7603.616237640381, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:45:55] (step=0002175) Train Loss: 0.2330, Train Steps/Sec: 0.12, Epoch: 0.04226583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2176, "loss": 0.1424226313829422, "memory_gb": 7.721559524536133, "step_time_ms": 7628.6773681640625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:03] (step=0002176) Train Loss: 0.1866, Train Steps/Sec: 0.12, Epoch: 0.0422852701127089, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2177, "loss": 0.29870736598968506, "memory_gb": 7.721559524536133, "step_time_ms": 7502.698183059692, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:11] (step=0002177) Train Loss: 0.2629, Train Steps/Sec: 0.12, Epoch: 0.04230470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2178, "loss": 0.23826342821121216, "memory_gb": 7.721559524536133, "step_time_ms": 7423.920631408691, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:19] (step=0002178) Train Loss: 0.2374, Train Steps/Sec: 0.13, Epoch: 0.04232413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2179, "loss": 0.16578525304794312, "memory_gb": 7.721559524536133, "step_time_ms": 7628.997087478638, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:27] (step=0002179) Train Loss: 0.1781, Train Steps/Sec: 0.12, Epoch: 0.04234356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2180, "loss": 0.2652776539325714, "memory_gb": 7.721559524536133, "step_time_ms": 5329.879999160767, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:33] (step=0002180) Train Loss: 0.2701, Train Steps/Sec: 0.17, Epoch: 0.04236300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2181, "loss": 0.2394656538963318, "memory_gb": 7.721559524536133, "step_time_ms": 7570.795774459839, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:41] (step=0002181) Train Loss: 0.2007, Train Steps/Sec: 0.12, Epoch: 0.042382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2182, "loss": 0.26172497868537903, "memory_gb": 7.721559524536133, 
"step_time_ms": 7506.499767303467, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:49] (step=0002182) Train Loss: 0.2037, Train Steps/Sec: 0.12, Epoch: 0.04240186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:46:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2183, "loss": 0.22301730513572693, "memory_gb": 7.721559524536133, "step_time_ms": 7554.063558578491, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:46:57] (step=0002183) Train Loss: 0.2422, Train Steps/Sec: 0.13, Epoch: 0.04242129809560824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:47:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2184, "loss": 0.1954478621482849, "memory_gb": 7.721559524536133, "step_time_ms": 7630.604267120361, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:47:05] (step=0002184) Train Loss: 0.2058, Train Steps/Sec: 0.12, Epoch: 0.04244073066459386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2185, "loss": 0.19414502382278442, "memory_gb": 7.721559524536133, "step_time_ms": 7530.608654022217, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:47:13] (step=0002185) Train Loss: 0.1842, Train Steps/Sec: 0.13, Epoch: 0.04246016323357948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:47:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2186, "loss": 0.37213319540023804, "memory_gb": 7.721559524536133, "step_time_ms": 7554.685831069946, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:47:21] (step=0002186) Train Loss: 0.3073, Train Steps/Sec: 0.13, Epoch: 0.0424795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:47:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2187, "loss": 0.23735056817531586, "memory_gb": 7.721559524536133, "step_time_ms": 7662.6136302948, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:47:29] (step=0002187) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 
0.04249902837155072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:47:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2188, "loss": 0.24749085307121277, "memory_gb": 7.721559524536133, "step_time_ms": 7488.4161949157715, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:47:37] (step=0002188) Train Loss: 0.2749, Train Steps/Sec: 0.12, Epoch: 0.04251846094053634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:47:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2189, "loss": 0.18191832304000854, "memory_gb": 7.721559524536133, "step_time_ms": 7511.267423629761, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:47:45] (step=0002189) Train Loss: 0.1997, Train Steps/Sec: 0.12, Epoch: 0.04253789350952196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:47:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2190, "loss": 0.28484559059143066, "memory_gb": 7.721559524536133, "step_time_ms": 7518.525123596191, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:47:53] (step=0002190) Train Loss: 0.2514, Train Steps/Sec: 0.12, Epoch: 0.04255732607850758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2191, "loss": 0.2630269527435303, "memory_gb": 7.721559524536133, "step_time_ms": 7430.351734161377, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:01] (step=0002191) Train Loss: 0.1809, Train Steps/Sec: 0.12, Epoch: 0.0425767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2192, "loss": 0.1626587063074112, "memory_gb": 7.721559524536133, "step_time_ms": 7474.331855773926, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:09] (step=0002192) Train Loss: 0.2190, Train Steps/Sec: 0.12, Epoch: 0.04259619121647882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2193, "loss": 0.14966131746768951, "memory_gb": 
7.721559524536133, "step_time_ms": 7525.22087097168, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:17] (step=0002193) Train Loss: 0.2276, Train Steps/Sec: 0.12, Epoch: 0.04261562378546444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2194, "loss": 0.15116706490516663, "memory_gb": 7.721559524536133, "step_time_ms": 7469.700574874878, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:25] (step=0002194) Train Loss: 0.1738, Train Steps/Sec: 0.12, Epoch: 0.04263505635445006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2195, "loss": 0.16462020576000214, "memory_gb": 7.721559524536133, "step_time_ms": 7536.836624145508, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:33] (step=0002195) Train Loss: 0.1968, Train Steps/Sec: 0.12, Epoch: 0.04265448892343568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2196, "loss": 0.23104256391525269, "memory_gb": 7.721559524536133, "step_time_ms": 7544.357061386108, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:41] (step=0002196) Train Loss: 0.2344, Train Steps/Sec: 0.12, Epoch: 0.042673921492421295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2197, "loss": 0.25456300377845764, "memory_gb": 7.721559524536133, "step_time_ms": 7536.044120788574, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:49] (step=0002197) Train Loss: 0.2335, Train Steps/Sec: 0.12, Epoch: 0.04269335406140692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:48:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2198, "loss": 0.273612380027771, "memory_gb": 7.721559524536133, "step_time_ms": 7500.894784927368, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:48:57] (step=0002198) Train Loss: 0.2687, Train Steps/Sec: 
0.12, Epoch: 0.04271278663039254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:49:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2199, "loss": 0.2828803062438965, "memory_gb": 7.721559524536133, "step_time_ms": 7590.810537338257, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:49:05] (step=0002199) Train Loss: 0.2667, Train Steps/Sec: 0.12, Epoch: 0.042732219199378155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:49:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2200, "loss": 0.26095396280288696, "memory_gb": 7.721559524536133, "step_time_ms": 7553.903579711914, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:49:14] (step=0002200) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.04275165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:49:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2201, "loss": 0.30588704347610474, "memory_gb": 7.721559524536133, "step_time_ms": 7517.879247665405, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:49:22] (step=0002201) Train Loss: 0.3024, Train Steps/Sec: 0.12, Epoch: 0.0427710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:49:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2202, "loss": 0.33847320079803467, "memory_gb": 7.721559524536133, "step_time_ms": 7550.148010253906, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:49:30] (step=0002202) Train Loss: 0.2986, Train Steps/Sec: 0.12, Epoch: 0.042790516906335015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:49:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2203, "loss": 0.2557796835899353, "memory_gb": 7.721559524536133, "step_time_ms": 7435.908079147339, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:49:38] (step=0002203) Train Loss: 0.2247, Train Steps/Sec: 0.13, Epoch: 0.04280994947532064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:49:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2204, "loss": 0.2099987268447876, 
"memory_gb": 7.721559524536133, "step_time_ms": 7477.949380874634, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:49:46] (step=0002204) Train Loss: 0.2565, Train Steps/Sec: 0.12, Epoch: 0.04282938204430626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2205, "loss": 0.2514023780822754, "memory_gb": 7.721559524536133, "step_time_ms": 7482.294082641602, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:49:54] (step=0002205) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.042848814613291875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2206, "loss": 0.21823686361312866, "memory_gb": 7.721559524536133, "step_time_ms": 7463.143348693848, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:02] (step=0002206) Train Loss: 0.1823, Train Steps/Sec: 0.13, Epoch: 0.0428682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2207, "loss": 0.18573921918869019, "memory_gb": 7.721559524536133, "step_time_ms": 7334.160327911377, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:10] (step=0002207) Train Loss: 0.2130, Train Steps/Sec: 0.13, Epoch: 0.04288767975126312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2208, "loss": 0.3129039406776428, "memory_gb": 7.721559524536133, "step_time_ms": 7523.748874664307, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:18] (step=0002208) Train Loss: 0.2612, Train Steps/Sec: 0.12, Epoch: 0.042907112320248735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2209, "loss": 0.28149062395095825, "memory_gb": 7.721559524536133, "step_time_ms": 4944.483518600464, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:24] (step=0002209) Train Loss: 0.3132, 
Train Steps/Sec: 0.16, Epoch: 0.04292654488923436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2210, "loss": 0.3313513994216919, "memory_gb": 7.721559524536133, "step_time_ms": 7551.144361495972, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:32] (step=0002210) Train Loss: 0.2536, Train Steps/Sec: 0.12, Epoch: 0.04294597745821998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2211, "loss": 0.15251165628433228, "memory_gb": 7.721559524536133, "step_time_ms": 7491.079330444336, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:40] (step=0002211) Train Loss: 0.2387, Train Steps/Sec: 0.13, Epoch: 0.042965410027205594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2212, "loss": 0.23798179626464844, "memory_gb": 7.721559524536133, "step_time_ms": 7444.866418838501, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:48] (step=0002212) Train Loss: 0.2176, Train Steps/Sec: 0.12, Epoch: 0.04298484259619122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:50:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2213, "loss": 0.18859055638313293, "memory_gb": 7.721559524536133, "step_time_ms": 7575.352430343628, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:50:56] (step=0002213) Train Loss: 0.2428, Train Steps/Sec: 0.12, Epoch: 0.04300427516517684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:51:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2214, "loss": 0.134303018450737, "memory_gb": 7.721559524536133, "step_time_ms": 7457.789182662964, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:51:04] (step=0002214) Train Loss: 0.2470, Train Steps/Sec: 0.12, Epoch: 0.043023707734162454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:51:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2215, "loss": 
0.21446117758750916, "memory_gb": 7.721559524536133, "step_time_ms": 7594.903945922852, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:51:12] (step=0002215) Train Loss: 0.2272, Train Steps/Sec: 0.13, Epoch: 0.04304314030314808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:51:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2216, "loss": 0.27034705877304077, "memory_gb": 7.721559524536133, "step_time_ms": 7570.340633392334, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:51:20] (step=0002216) Train Loss: 0.2109, Train Steps/Sec: 0.12, Epoch: 0.0430625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:51:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2217, "loss": 0.2801613509654999, "memory_gb": 7.721559524536133, "step_time_ms": 7562.5176429748535, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:51:28] (step=0002217) Train Loss: 0.3084, Train Steps/Sec: 0.12, Epoch: 0.043082005441119314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:51:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2218, "loss": 0.33676794171333313, "memory_gb": 7.721559524536133, "step_time_ms": 7507.640361785889, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:51:36] (step=0002218) Train Loss: 0.3307, Train Steps/Sec: 0.13, Epoch: 0.04310143801010494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:51:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2219, "loss": 0.20975764095783234, "memory_gb": 7.721559524536133, "step_time_ms": 7528.972387313843, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:51:44] (step=0002219) Train Loss: 0.2466, Train Steps/Sec: 0.12, Epoch: 0.04312087057909056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:51:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2220, "loss": 0.2812145948410034, "memory_gb": 7.721559524536133, "step_time_ms": 7314.6350383758545, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:51:53] (step=0002220) 
Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.043140303148076174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2221, "loss": 0.17641738057136536, "memory_gb": 7.721559524536133, "step_time_ms": 7519.772529602051, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:01] (step=0002221) Train Loss: 0.1722, Train Steps/Sec: 0.12, Epoch: 0.043159735717061797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2222, "loss": 0.22876355051994324, "memory_gb": 7.721559524536133, "step_time_ms": 7596.823930740356, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:09] (step=0002222) Train Loss: 0.2422, Train Steps/Sec: 0.12, Epoch: 0.04317916828604742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2223, "loss": 0.21616196632385254, "memory_gb": 7.721559524536133, "step_time_ms": 7581.886291503906, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:17] (step=0002223) Train Loss: 0.2623, Train Steps/Sec: 0.12, Epoch: 0.043198600855033034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2224, "loss": 0.36588746309280396, "memory_gb": 7.721559524536133, "step_time_ms": 7544.8150634765625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:25] (step=0002224) Train Loss: 0.3284, Train Steps/Sec: 0.13, Epoch: 0.043218033424018656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2225, "loss": 0.3171412944793701, "memory_gb": 7.721559524536133, "step_time_ms": 7590.64507484436, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:33] (step=0002225) Train Loss: 0.2231, Train Steps/Sec: 0.12, Epoch: 0.04323746599300427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:41] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 2226, "loss": 0.28519368171691895, "memory_gb": 7.721559524536133, "step_time_ms": 7576.996564865112, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:41] (step=0002226) Train Loss: 0.2938, Train Steps/Sec: 0.12, Epoch: 0.043256898561989894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2227, "loss": 0.2402500957250595, "memory_gb": 7.721559524536133, "step_time_ms": 7606.510400772095, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:49] (step=0002227) Train Loss: 0.2571, Train Steps/Sec: 0.12, Epoch: 0.043276331130975516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:52:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2228, "loss": 0.2669978737831116, "memory_gb": 7.721559524536133, "step_time_ms": 7620.914697647095, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:52:57] (step=0002228) Train Loss: 0.1910, Train Steps/Sec: 0.12, Epoch: 0.04329576369996113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:53:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2229, "loss": 0.3305596709251404, "memory_gb": 7.721559524536133, "step_time_ms": 7533.1127643585205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:53:05] (step=0002229) Train Loss: 0.2690, Train Steps/Sec: 0.13, Epoch: 0.043315196268946754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:53:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2230, "loss": 0.26337748765945435, "memory_gb": 7.721559524536133, "step_time_ms": 7502.241611480713, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:53:13] (step=0002230) Train Loss: 0.2653, Train Steps/Sec: 0.12, Epoch: 0.043334628837932376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:53:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2231, "loss": 0.2254190593957901, "memory_gb": 7.721559524536133, "step_time_ms": 7655.64227104187, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
22:53:21] (step=0002231) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.04335406140691799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:53:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2232, "loss": 0.28219175338745117, "memory_gb": 7.721559524536133, "step_time_ms": 7565.562963485718, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:53:29] (step=0002232) Train Loss: 0.2797, Train Steps/Sec: 0.13, Epoch: 0.043373493975903614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:53:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2233, "loss": 0.36201727390289307, "memory_gb": 7.721559524536133, "step_time_ms": 7527.661085128784, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:53:37] (step=0002233) Train Loss: 0.2777, Train Steps/Sec: 0.12, Epoch: 0.043392926544889236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:53:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2234, "loss": 0.1101674884557724, "memory_gb": 7.721559524536133, "step_time_ms": 7642.753124237061, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:53:45] (step=0002234) Train Loss: 0.1887, Train Steps/Sec: 0.12, Epoch: 0.04341235911387485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2235, "loss": 0.22324217855930328, "memory_gb": 7.721559524536133, "step_time_ms": 7584.313631057739, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:53:53] (step=0002235) Train Loss: 0.2963, Train Steps/Sec: 0.12, Epoch: 0.043431791682860474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2236, "loss": 0.35470765829086304, "memory_gb": 7.721559524536133, "step_time_ms": 7449.527025222778, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:54:01] (step=0002236) Train Loss: 0.2561, Train Steps/Sec: 0.13, Epoch: 0.043451224251846096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:09] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2237, "loss": 0.15447208285331726, "memory_gb": 7.721559524536133, "step_time_ms": 7605.929136276245, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:54:09] (step=0002237) Train Loss: 0.1927, Train Steps/Sec: 0.12, Epoch: 0.04347065682083171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2238, "loss": 0.16370148956775665, "memory_gb": 7.721559524536133, "step_time_ms": 5203.228712081909, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:54:16] (step=0002238) Train Loss: 0.1867, Train Steps/Sec: 0.15, Epoch: 0.043490089389817334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2239, "loss": 0.21648739278316498, "memory_gb": 7.721559524536133, "step_time_ms": 7629.342794418335, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:54:24] (step=0002239) Train Loss: 0.2125, Train Steps/Sec: 0.12, Epoch: 0.043509521958802956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2240, "loss": 0.24027885496616364, "memory_gb": 7.721559524536133, "step_time_ms": 7528.209924697876, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:54:32] (step=0002240) Train Loss: 0.2678, Train Steps/Sec: 0.13, Epoch: 0.04352895452778857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2241, "loss": 0.2252776324748993, "memory_gb": 7.721559524536133, "step_time_ms": 7544.9419021606445, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:54:40] (step=0002241) Train Loss: 0.2061, Train Steps/Sec: 0.13, Epoch: 0.043548387096774194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2242, "loss": 0.18666338920593262, "memory_gb": 7.721559524536133, "step_time_ms": 7599.006414413452, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 22:54:48] (step=0002242) Train Loss: 0.1569, Train Steps/Sec: 0.12, Epoch: 0.043567819665759816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:54:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2243, "loss": 0.16190136969089508, "memory_gb": 7.721559524536133, "step_time_ms": 7507.205486297607, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:54:56] (step=0002243) Train Loss: 0.2023, Train Steps/Sec: 0.12, Epoch: 0.04358725223474543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:55:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2244, "loss": 0.2325594127178192, "memory_gb": 7.721559524536133, "step_time_ms": 7536.585807800293, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:55:04] (step=0002244) Train Loss: 0.2083, Train Steps/Sec: 0.12, Epoch: 0.043606684803731054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:55:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2245, "loss": 0.20116813480854034, "memory_gb": 7.721559524536133, "step_time_ms": 7596.897840499878, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:55:12] (step=0002245) Train Loss: 0.2174, Train Steps/Sec: 0.12, Epoch: 0.043626117372716676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2246, "loss": 0.16880610585212708, "memory_gb": 7.721559524536133, "step_time_ms": 7563.450336456299, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:55:20] (step=0002246) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.04364554994170229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:55:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2247, "loss": 0.1383415162563324, "memory_gb": 7.721559524536133, "step_time_ms": 7479.829549789429, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:55:28] (step=0002247) Train Loss: 0.1737, Train Steps/Sec: 0.12, Epoch: 0.043664982510687914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 22:55:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2248, "loss": 0.24658606946468353, "memory_gb": 7.721559524536133, "step_time_ms": 7545.879364013672, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:55:36] (step=0002248) Train Loss: 0.2355, Train Steps/Sec: 0.12, Epoch: 0.043684415079673536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:55:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2249, "loss": 0.2106863558292389, "memory_gb": 7.721559524536133, "step_time_ms": 7483.3738803863525, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:55:44] (step=0002249) Train Loss: 0.2893, Train Steps/Sec: 0.12, Epoch: 0.04370384764865915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:55:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2250, "loss": 0.2871527373790741, "memory_gb": 7.721559524536133, "step_time_ms": 7527.027606964111, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:55:52] (step=0002250) Train Loss: 0.2619, Train Steps/Sec: 0.12, Epoch: 0.043723280217644774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2251, "loss": 0.26903337240219116, "memory_gb": 7.721559524536133, "step_time_ms": 7587.024450302124, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:00] (step=0002251) Train Loss: 0.3190, Train Steps/Sec: 0.12, Epoch: 0.043742712786630396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2252, "loss": 0.2260613590478897, "memory_gb": 7.721559524536133, "step_time_ms": 7488.80934715271, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:08] (step=0002252) Train Loss: 0.1864, Train Steps/Sec: 0.12, Epoch: 0.04376214535561601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2253, "loss": 0.343758225440979, "memory_gb": 7.715639114379883, "step_time_ms": 7491.441011428833, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:17] (step=0002253) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.04378157792460163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2254, "loss": 0.18462327122688293, "memory_gb": 7.721559524536133, "step_time_ms": 7563.431978225708, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:25] (step=0002254) Train Loss: 0.2674, Train Steps/Sec: 0.12, Epoch: 0.04380101049358725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2255, "loss": 0.3157604932785034, "memory_gb": 7.721559524536133, "step_time_ms": 7507.888078689575, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:33] (step=0002255) Train Loss: 0.2694, Train Steps/Sec: 0.12, Epoch: 0.04382044306257287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2256, "loss": 0.21873722970485687, "memory_gb": 7.721559524536133, "step_time_ms": 7498.15034866333, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:41] (step=0002256) Train Loss: 0.2489, Train Steps/Sec: 0.13, Epoch: 0.04383987563155849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2257, "loss": 0.20028358697891235, "memory_gb": 7.721559524536133, "step_time_ms": 7576.607227325439, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:49] (step=0002257) Train Loss: 0.2423, Train Steps/Sec: 0.12, Epoch: 0.04385930820054411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:56:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2258, "loss": 0.1393422931432724, "memory_gb": 7.721559524536133, "step_time_ms": 7500.559568405151, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:56:57] (step=0002258) Train Loss: 0.1761, Train Steps/Sec: 0.13, Epoch: 0.04387874076952973, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 22:57:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2259, "loss": 0.21978451311588287, "memory_gb": 7.721559524536133, "step_time_ms": 7452.057600021362, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:57:05] (step=0002259) Train Loss: 0.2602, Train Steps/Sec: 0.12, Epoch: 0.04389817333851535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:57:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2260, "loss": 0.22585107386112213, "memory_gb": 7.721559524536133, "step_time_ms": 7513.219594955444, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:57:13] (step=0002260) Train Loss: 0.2475, Train Steps/Sec: 0.12, Epoch: 0.04391760590750097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:57:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2261, "loss": 0.2385912984609604, "memory_gb": 7.721559524536133, "step_time_ms": 7458.842515945435, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:57:21] (step=0002261) Train Loss: 0.2094, Train Steps/Sec: 0.12, Epoch: 0.04393703847648659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:57:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2262, "loss": 0.40100911259651184, "memory_gb": 7.721559524536133, "step_time_ms": 7421.065807342529, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:57:29] (step=0002262) Train Loss: 0.2699, Train Steps/Sec: 0.13, Epoch: 0.04395647104547221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2263, "loss": 0.172947496175766, "memory_gb": 7.721559524536133, "step_time_ms": 7651.421785354614, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:57:37] (step=0002263) Train Loss: 0.2203, Train Steps/Sec: 0.12, Epoch: 0.04397590361445783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:57:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2264, "loss": 0.32796192169189453, "memory_gb": 7.715639114379883, "step_time_ms": 
7401.858329772949, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:57:45] (step=0002264) Train Loss: 0.2671, Train Steps/Sec: 0.12, Epoch: 0.04399533618344345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:57:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2265, "loss": 0.237323597073555, "memory_gb": 7.721559524536133, "step_time_ms": 7296.708583831787, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:57:53] (step=0002265) Train Loss: 0.3102, Train Steps/Sec: 0.13, Epoch: 0.04401476875242907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2266, "loss": 0.2390557825565338, "memory_gb": 7.721559524536133, "step_time_ms": 7487.483501434326, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:01] (step=0002266) Train Loss: 0.2246, Train Steps/Sec: 0.12, Epoch: 0.04403420132141469, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2267, "loss": 0.13698795437812805, "memory_gb": 7.721559524536133, "step_time_ms": 5414.867162704468, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:07] (step=0002267) Train Loss: 0.1419, Train Steps/Sec: 0.17, Epoch: 0.04405363389040031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2268, "loss": 0.268473744392395, "memory_gb": 7.721559524536133, "step_time_ms": 7555.27400970459, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:15] (step=0002268) Train Loss: 0.2566, Train Steps/Sec: 0.12, Epoch: 0.04407306645938593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2269, "loss": 0.2991076111793518, "memory_gb": 7.721559524536133, "step_time_ms": 7432.4963092803955, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:23] (step=0002269) Train Loss: 0.2829, Train Steps/Sec: 0.12, Epoch: 0.04409249902837155, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2270, "loss": 0.1703479290008545, "memory_gb": 7.721559524536133, "step_time_ms": 7489.228248596191, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:31] (step=0002270) Train Loss: 0.1713, Train Steps/Sec: 0.12, Epoch: 0.04411193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2271, "loss": 0.2864692807197571, "memory_gb": 7.721559524536133, "step_time_ms": 7494.345903396606, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:39] (step=0002271) Train Loss: 0.2790, Train Steps/Sec: 0.12, Epoch: 0.04413136416634279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2272, "loss": 0.2689089775085449, "memory_gb": 7.721559524536133, "step_time_ms": 7435.6653690338135, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:47] (step=0002272) Train Loss: 0.2488, Train Steps/Sec: 0.12, Epoch: 0.04415079673532841, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:58:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2273, "loss": 0.21849094331264496, "memory_gb": 7.721559524536133, "step_time_ms": 7452.313423156738, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:58:55] (step=0002273) Train Loss: 0.2767, Train Steps/Sec: 0.12, Epoch: 0.04417022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2274, "loss": 0.17921984195709229, "memory_gb": 7.721559524536133, "step_time_ms": 7544.5873737335205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:59:03] (step=0002274) Train Loss: 0.2142, Train Steps/Sec: 0.12, Epoch: 0.04418966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:59:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2275, "loss": 0.1460021734237671, "memory_gb": 7.721559524536133, 
"step_time_ms": 7474.03621673584, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:59:11] (step=0002275) Train Loss: 0.1700, Train Steps/Sec: 0.12, Epoch: 0.04420909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:59:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2276, "loss": 0.3065693974494934, "memory_gb": 7.715639114379883, "step_time_ms": 7241.553068161011, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:59:19] (step=0002276) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.04422852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:59:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2277, "loss": 0.32854077219963074, "memory_gb": 7.721559524536133, "step_time_ms": 7479.4602394104, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:59:28] (step=0002277) Train Loss: 0.2568, Train Steps/Sec: 0.12, Epoch: 0.04424795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:59:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2278, "loss": 0.18219146132469177, "memory_gb": 7.721559524536133, "step_time_ms": 7450.326442718506, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:59:36] (step=0002278) Train Loss: 0.2273, Train Steps/Sec: 0.12, Epoch: 0.04426739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:59:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2279, "loss": 0.20116977393627167, "memory_gb": 7.721559524536133, "step_time_ms": 7474.894285202026, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:59:44] (step=0002279) Train Loss: 0.2257, Train Steps/Sec: 0.12, Epoch: 0.04428682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 22:59:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2280, "loss": 0.36013466119766235, "memory_gb": 7.721559524536133, "step_time_ms": 7564.427614212036, "trainable_params": 4718592, "method": "lora"} [2025-07-28 22:59:52] (step=0002280) Train Loss: 0.3509, Train Steps/Sec: 0.12, Epoch: 
0.04430625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2281, "loss": 0.3028396666049957, "memory_gb": 7.721559524536133, "step_time_ms": 7529.558420181274, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:00] (step=0002281) Train Loss: 0.2478, Train Steps/Sec: 0.13, Epoch: 0.04432568985619899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2282, "loss": 0.2298390120267868, "memory_gb": 7.721559524536133, "step_time_ms": 7540.750741958618, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:08] (step=0002282) Train Loss: 0.2397, Train Steps/Sec: 0.12, Epoch: 0.04434512242518461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2283, "loss": 0.19361776113510132, "memory_gb": 7.721559524536133, "step_time_ms": 7575.942277908325, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:16] (step=0002283) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.04436455499417023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2284, "loss": 0.1617206633090973, "memory_gb": 7.721559524536133, "step_time_ms": 7486.480712890625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:24] (step=0002284) Train Loss: 0.1592, Train Steps/Sec: 0.13, Epoch: 0.04438398756315585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2285, "loss": 0.2729923725128174, "memory_gb": 7.721559524536133, "step_time_ms": 7435.518264770508, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:32] (step=0002285) Train Loss: 0.2741, Train Steps/Sec: 0.12, Epoch: 0.04440342013214147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2286, "loss": 0.21076329052448273, "memory_gb": 
7.721559524536133, "step_time_ms": 7569.7643756866455, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:40] (step=0002286) Train Loss: 0.2545, Train Steps/Sec: 0.12, Epoch: 0.044422852701127086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2287, "loss": 0.18901535868644714, "memory_gb": 7.721559524536133, "step_time_ms": 7549.804449081421, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:48] (step=0002287) Train Loss: 0.2258, Train Steps/Sec: 0.12, Epoch: 0.04444228527011271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:00:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2288, "loss": 0.26492905616760254, "memory_gb": 7.721559524536133, "step_time_ms": 7545.506000518799, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:00:56] (step=0002288) Train Loss: 0.3034, Train Steps/Sec: 0.12, Epoch: 0.04446171783909833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2289, "loss": 0.2994515895843506, "memory_gb": 7.721559524536133, "step_time_ms": 7587.530136108398, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:04] (step=0002289) Train Loss: 0.2906, Train Steps/Sec: 0.12, Epoch: 0.044481150408083946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2290, "loss": 0.20087461173534393, "memory_gb": 7.721559524536133, "step_time_ms": 7556.4117431640625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:12] (step=0002290) Train Loss: 0.1943, Train Steps/Sec: 0.12, Epoch: 0.04450058297706957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2291, "loss": 0.2608410120010376, "memory_gb": 7.721559524536133, "step_time_ms": 7551.630020141602, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:20] (step=0002291) Train Loss: 0.2628, Train 
Steps/Sec: 0.12, Epoch: 0.04452001554605519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2292, "loss": 0.14083251357078552, "memory_gb": 7.721559524536133, "step_time_ms": 7631.018161773682, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:28] (step=0002292) Train Loss: 0.1869, Train Steps/Sec: 0.12, Epoch: 0.044539448115040806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2293, "loss": 0.268487811088562, "memory_gb": 7.721559524536133, "step_time_ms": 7566.127777099609, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:36] (step=0002293) Train Loss: 0.2952, Train Steps/Sec: 0.13, Epoch: 0.04455888068402643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2294, "loss": 0.10856732726097107, "memory_gb": 7.721559524536133, "step_time_ms": 7438.671588897705, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:44] (step=0002294) Train Loss: 0.1487, Train Steps/Sec: 0.13, Epoch: 0.04457831325301205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2295, "loss": 0.19460979104042053, "memory_gb": 7.721559524536133, "step_time_ms": 7638.673782348633, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:52] (step=0002295) Train Loss: 0.2186, Train Steps/Sec: 0.12, Epoch: 0.044597745821997666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:01:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2296, "loss": 0.3284724950790405, "memory_gb": 7.721559524536133, "step_time_ms": 5479.973316192627, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:01:58] (step=0002296) Train Loss: 0.2818, Train Steps/Sec: 0.18, Epoch: 0.04461717839098329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:02:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2297, "loss": 
0.31034135818481445, "memory_gb": 7.721559524536133, "step_time_ms": 7624.823570251465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:02:06] (step=0002297) Train Loss: 0.2242, Train Steps/Sec: 0.12, Epoch: 0.04463661095996891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:02:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2298, "loss": 0.2655622363090515, "memory_gb": 7.721559524536133, "step_time_ms": 7573.498487472534, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:02:14] (step=0002298) Train Loss: 0.3066, Train Steps/Sec: 0.12, Epoch: 0.044656043528954525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:02:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2299, "loss": 0.21041801571846008, "memory_gb": 7.721559524536133, "step_time_ms": 7529.533386230469, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:02:22] (step=0002299) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.04467547609794015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:02:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2300, "loss": 0.14110663533210754, "memory_gb": 7.721559524536133, "step_time_ms": 7537.739038467407, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:02:30] (step=0002300) Train Loss: 0.1774, Train Steps/Sec: 0.12, Epoch: 0.04469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:02:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2301, "loss": 0.309260755777359, "memory_gb": 7.721559524536133, "step_time_ms": 7489.773273468018, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:02:38] (step=0002301) Train Loss: 0.2398, Train Steps/Sec: 0.12, Epoch: 0.044714341235911385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:02:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2302, "loss": 0.1680777221918106, "memory_gb": 7.721559524536133, "step_time_ms": 7492.208003997803, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:02:46] (step=0002302) 
Train Loss: 0.2043, Train Steps/Sec: 0.12, Epoch: 0.04473377380489701, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:02:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2303, "loss": 0.30567502975463867, "memory_gb": 7.721559524536133, "step_time_ms": 7769.206285476685, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:02:54] (step=0002303) Train Loss: 0.2950, Train Steps/Sec: 0.12, Epoch: 0.04475320637388263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2304, "loss": 0.22594711184501648, "memory_gb": 7.721559524536133, "step_time_ms": 7501.838445663452, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:02] (step=0002304) Train Loss: 0.2754, Train Steps/Sec: 0.12, Epoch: 0.044772638942868245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2305, "loss": 0.1687602698802948, "memory_gb": 7.721559524536133, "step_time_ms": 7521.536827087402, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:10] (step=0002305) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.04479207151185387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2306, "loss": 0.2231665998697281, "memory_gb": 7.721559524536133, "step_time_ms": 7571.119785308838, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:18] (step=0002306) Train Loss: 0.2333, Train Steps/Sec: 0.12, Epoch: 0.04481150408083949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2307, "loss": 0.2654270529747009, "memory_gb": 7.721559524536133, "step_time_ms": 7524.634122848511, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:26] (step=0002307) Train Loss: 0.2352, Train Steps/Sec: 0.12, Epoch: 0.044830936649825105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:34] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 2308, "loss": 0.28443261981010437, "memory_gb": 7.721559524536133, "step_time_ms": 7456.2811851501465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:34] (step=0002308) Train Loss: 0.3149, Train Steps/Sec: 0.13, Epoch: 0.04485036921881073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2309, "loss": 0.220867320895195, "memory_gb": 7.721559524536133, "step_time_ms": 7542.160987854004, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:42] (step=0002309) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.04486980178779635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2310, "loss": 0.24339714646339417, "memory_gb": 7.721559524536133, "step_time_ms": 7447.083473205566, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:50] (step=0002310) Train Loss: 0.2408, Train Steps/Sec: 0.12, Epoch: 0.044889234356781965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:03:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2311, "loss": 0.2734109163284302, "memory_gb": 7.721559524536133, "step_time_ms": 7502.4659633636475, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:03:59] (step=0002311) Train Loss: 0.2654, Train Steps/Sec: 0.12, Epoch: 0.04490866692576759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2312, "loss": 0.21209058165550232, "memory_gb": 7.721559524536133, "step_time_ms": 7592.628002166748, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:04:07] (step=0002312) Train Loss: 0.2686, Train Steps/Sec: 0.12, Epoch: 0.04492809949475321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:04:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2313, "loss": 0.32943493127822876, "memory_gb": 7.721559524536133, "step_time_ms": 7495.457649230957, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
23:04:15] (step=0002313) Train Loss: 0.3318, Train Steps/Sec: 0.12, Epoch: 0.044947532063738825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:04:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2314, "loss": 0.31064116954803467, "memory_gb": 7.721559524536133, "step_time_ms": 7493.093967437744, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:04:23] (step=0002314) Train Loss: 0.3082, Train Steps/Sec: 0.13, Epoch: 0.04496696463272445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:04:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2315, "loss": 0.2441207468509674, "memory_gb": 7.721559524536133, "step_time_ms": 7659.011363983154, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:04:31] (step=0002315) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.04498639720171006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2316, "loss": 0.2178390920162201, "memory_gb": 7.721559524536133, "step_time_ms": 7505.911827087402, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:04:39] (step=0002316) Train Loss: 0.2495, Train Steps/Sec: 0.12, Epoch: 0.045005829770695685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:04:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2317, "loss": 0.18938791751861572, "memory_gb": 7.721559524536133, "step_time_ms": 7495.68247795105, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:04:47] (step=0002317) Train Loss: 0.2599, Train Steps/Sec: 0.12, Epoch: 0.04502526233968131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:04:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2318, "loss": 0.18794862926006317, "memory_gb": 7.721559524536133, "step_time_ms": 7550.408840179443, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:04:55] (step=0002318) Train Loss: 0.2415, Train Steps/Sec: 0.12, Epoch: 0.04504469490866692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:03] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2319, "loss": 0.19307836890220642, "memory_gb": 7.721559524536133, "step_time_ms": 7549.964427947998, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:05:03] (step=0002319) Train Loss: 0.2112, Train Steps/Sec: 0.12, Epoch: 0.045064127477652545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2320, "loss": 0.29901403188705444, "memory_gb": 7.721559524536133, "step_time_ms": 7486.4044189453125, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:05:11] (step=0002320) Train Loss: 0.2865, Train Steps/Sec: 0.13, Epoch: 0.04508356004663817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2321, "loss": 0.19745638966560364, "memory_gb": 7.721559524536133, "step_time_ms": 7561.573505401611, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:05:19] (step=0002321) Train Loss: 0.2006, Train Steps/Sec: 0.12, Epoch: 0.04510299261562378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2322, "loss": 0.3425751328468323, "memory_gb": 7.721559524536133, "step_time_ms": 7430.337429046631, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:05:27] (step=0002322) Train Loss: 0.2661, Train Steps/Sec: 0.12, Epoch: 0.045122425184609405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2323, "loss": 0.22462084889411926, "memory_gb": 7.721559524536133, "step_time_ms": 7264.005899429321, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:05:35] (step=0002323) Train Loss: 0.1872, Train Steps/Sec: 0.13, Epoch: 0.04514185775359503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2324, "loss": 0.26804953813552856, "memory_gb": 7.721559524536133, "step_time_ms": 7497.735977172852, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 23:05:43] (step=0002324) Train Loss: 0.2977, Train Steps/Sec: 0.12, Epoch: 0.04516129032258064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2325, "loss": 0.27338796854019165, "memory_gb": 7.721559524536133, "step_time_ms": 5007.567882537842, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:05:49] (step=0002325) Train Loss: 0.3078, Train Steps/Sec: 0.17, Epoch: 0.045180722891566265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:05:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2326, "loss": 0.3029574155807495, "memory_gb": 7.721559524536133, "step_time_ms": 7494.1725730896, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:05:57] (step=0002326) Train Loss: 0.2564, Train Steps/Sec: 0.12, Epoch: 0.04520015546055189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2327, "loss": 0.2872992157936096, "memory_gb": 7.721559524536133, "step_time_ms": 7405.226230621338, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:06:05] (step=0002327) Train Loss: 0.3143, Train Steps/Sec: 0.12, Epoch: 0.0452195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:06:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2328, "loss": 0.246678426861763, "memory_gb": 7.721559524536133, "step_time_ms": 7423.223257064819, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:06:13] (step=0002328) Train Loss: 0.2056, Train Steps/Sec: 0.12, Epoch: 0.045239020598523125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:06:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2329, "loss": 0.2026391476392746, "memory_gb": 7.721559524536133, "step_time_ms": 7523.882150650024, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:06:21] (step=0002329) Train Loss: 0.2145, Train Steps/Sec: 0.12, Epoch: 0.04525845316750875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 23:06:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2330, "loss": 0.18824082612991333, "memory_gb": 7.721559524536133, "step_time_ms": 7447.283983230591, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:06:29] (step=0002330) Train Loss: 0.2163, Train Steps/Sec: 0.12, Epoch: 0.04527788573649436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2331, "loss": 0.2845962941646576, "memory_gb": 7.721559524536133, "step_time_ms": 7464.020490646362, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:06:37] (step=0002331) Train Loss: 0.2610, Train Steps/Sec: 0.12, Epoch: 0.045297318305479985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:06:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2332, "loss": 0.30113083124160767, "memory_gb": 7.721559524536133, "step_time_ms": 7506.3276290893555, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:06:45] (step=0002332) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.04531675087446561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:06:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2333, "loss": 0.2460358440876007, "memory_gb": 7.721559524536133, "step_time_ms": 7484.634160995483, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:06:53] (step=0002333) Train Loss: 0.2127, Train Steps/Sec: 0.12, Epoch: 0.04533618344345122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:07:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2334, "loss": 0.22686386108398438, "memory_gb": 7.721559524536133, "step_time_ms": 7463.167190551758, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:01] (step=0002334) Train Loss: 0.2262, Train Steps/Sec: 0.12, Epoch: 0.045355616012436845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:07:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2335, "loss": 0.1543380469083786, "memory_gb": 7.721559524536133, "step_time_ms": 7509.359359741211, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:09] (step=0002335) Train Loss: 0.1628, Train Steps/Sec: 0.12, Epoch: 0.04537504858142247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:07:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2336, "loss": 0.34604835510253906, "memory_gb": 7.721559524536133, "step_time_ms": 7497.832298278809, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:18] (step=0002336) Train Loss: 0.3050, Train Steps/Sec: 0.12, Epoch: 0.04539448115040808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:07:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2337, "loss": 0.2233295887708664, "memory_gb": 7.721559524536133, "step_time_ms": 7456.97546005249, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:26] (step=0002337) Train Loss: 0.2398, Train Steps/Sec: 0.13, Epoch: 0.045413913719393705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:07:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2338, "loss": 0.21741005778312683, "memory_gb": 7.721559524536133, "step_time_ms": 7536.745548248291, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:34] (step=0002338) Train Loss: 0.2864, Train Steps/Sec: 0.12, Epoch: 0.04543334628837933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:07:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2339, "loss": 0.19604304432868958, "memory_gb": 7.721559524536133, "step_time_ms": 7440.031290054321, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:42] (step=0002339) Train Loss: 0.2274, Train Steps/Sec: 0.13, Epoch: 0.04545277885736494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:07:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2340, "loss": 0.19200776517391205, "memory_gb": 7.721559524536133, "step_time_ms": 7458.346605300903, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:50] (step=0002340) Train Loss: 0.2621, Train Steps/Sec: 0.12, Epoch: 0.045472211426350564, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 23:07:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2341, "loss": 0.15910643339157104, "memory_gb": 7.721559524536133, "step_time_ms": 7524.39284324646, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:07:58] (step=0002341) Train Loss: 0.1969, Train Steps/Sec: 0.12, Epoch: 0.04549164399533619, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:08:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2342, "loss": 0.2288571149110794, "memory_gb": 7.721559524536133, "step_time_ms": 7475.544214248657, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:08:06] (step=0002342) Train Loss: 0.2353, Train Steps/Sec: 0.13, Epoch: 0.0455110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2343, "loss": 0.2715056538581848, "memory_gb": 7.721559524536133, "step_time_ms": 7470.6971645355225, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:08:14] (step=0002343) Train Loss: 0.2568, Train Steps/Sec: 0.12, Epoch: 0.045530509133307424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:08:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2344, "loss": 0.24464596807956696, "memory_gb": 7.721559524536133, "step_time_ms": 7559.990406036377, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:08:22] (step=0002344) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.04554994170229304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:08:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2345, "loss": 0.3090812861919403, "memory_gb": 7.721559524536133, "step_time_ms": 7514.746904373169, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:08:30] (step=0002345) Train Loss: 0.2812, Train Steps/Sec: 0.12, Epoch: 0.04556937427127866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:08:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2346, "loss": 0.22871699929237366, "memory_gb": 7.721559524536133, "step_time_ms": 
7451.190233230591, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:08:38] (step=0002346) Train Loss: 0.2182, Train Steps/Sec: 0.12, Epoch: 0.045588806840264284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:08:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2347, "loss": 0.15698741376399994, "memory_gb": 7.721559524536133, "step_time_ms": 7575.802564620972, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:08:46] (step=0002347) Train Loss: 0.2108, Train Steps/Sec: 0.12, Epoch: 0.0456082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2348, "loss": 0.20271515846252441, "memory_gb": 7.721559524536133, "step_time_ms": 7615.024089813232, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:08:54] (step=0002348) Train Loss: 0.2178, Train Steps/Sec: 0.13, Epoch: 0.04562767197823552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2349, "loss": 0.19090144336223602, "memory_gb": 7.721559524536133, "step_time_ms": 7515.2013301849365, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:02] (step=0002349) Train Loss: 0.2336, Train Steps/Sec: 0.12, Epoch: 0.045647104547221144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2350, "loss": 0.296023428440094, "memory_gb": 7.721559524536133, "step_time_ms": 7753.281593322754, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:10] (step=0002350) Train Loss: 0.2694, Train Steps/Sec: 0.12, Epoch: 0.04566653711620676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2351, "loss": 0.23026321828365326, "memory_gb": 7.721559524536133, "step_time_ms": 7359.335422515869, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:18] (step=0002351) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.04568596968519238, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2352, "loss": 0.3499947190284729, "memory_gb": 7.721559524536133, "step_time_ms": 7376.656770706177, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:26] (step=0002352) Train Loss: 0.2465, Train Steps/Sec: 0.13, Epoch: 0.045705402254178004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2353, "loss": 0.2873479127883911, "memory_gb": 7.721559524536133, "step_time_ms": 7588.336706161499, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:34] (step=0002353) Train Loss: 0.3080, Train Steps/Sec: 0.12, Epoch: 0.04572483482316362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2354, "loss": 0.2092823088169098, "memory_gb": 7.721559524536133, "step_time_ms": 4960.314512252808, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:40] (step=0002354) Train Loss: 0.2509, Train Steps/Sec: 0.18, Epoch: 0.04574426739214924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2355, "loss": 0.24931550025939941, "memory_gb": 7.721559524536133, "step_time_ms": 7527.238368988037, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:48] (step=0002355) Train Loss: 0.2667, Train Steps/Sec: 0.12, Epoch: 0.045763699961134864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:09:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2356, "loss": 0.1359751969575882, "memory_gb": 7.721559524536133, "step_time_ms": 7448.590993881226, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:09:56] (step=0002356) Train Loss: 0.1958, Train Steps/Sec: 0.13, Epoch: 0.04578313253012048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:10:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2357, "loss": 0.23771294951438904, "memory_gb": 7.721559524536133, 
"step_time_ms": 7473.56915473938, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:10:04] (step=0002357) Train Loss: 0.2627, Train Steps/Sec: 0.12, Epoch: 0.0458025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:10:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2358, "loss": 0.1232331246137619, "memory_gb": 7.721559524536133, "step_time_ms": 7527.7416706085205, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:10:12] (step=0002358) Train Loss: 0.1954, Train Steps/Sec: 0.12, Epoch: 0.045821997668091724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2359, "loss": 0.2808837890625, "memory_gb": 7.721559524536133, "step_time_ms": 7506.356716156006, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:10:20] (step=0002359) Train Loss: 0.2462, Train Steps/Sec: 0.12, Epoch: 0.04584143023707734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:10:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2360, "loss": 0.2256178855895996, "memory_gb": 7.721559524536133, "step_time_ms": 7515.954494476318, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:10:28] (step=0002360) Train Loss: 0.1828, Train Steps/Sec: 0.12, Epoch: 0.04586086280606296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:10:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2361, "loss": 0.2220270186662674, "memory_gb": 7.721559524536133, "step_time_ms": 7612.659692764282, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:10:36] (step=0002361) Train Loss: 0.2615, Train Steps/Sec: 0.12, Epoch: 0.045880295375048584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:10:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2362, "loss": 0.23419886827468872, "memory_gb": 7.721559524536133, "step_time_ms": 7503.11803817749, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:10:44] (step=0002362) Train Loss: 0.2365, Train Steps/Sec: 0.12, Epoch: 
0.0458997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:10:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2363, "loss": 0.2862929701805115, "memory_gb": 7.721559524536133, "step_time_ms": 7552.875995635986, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:10:52] (step=0002363) Train Loss: 0.2735, Train Steps/Sec: 0.12, Epoch: 0.04591916051301982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2364, "loss": 0.2804213762283325, "memory_gb": 7.721559524536133, "step_time_ms": 7696.717262268066, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:00] (step=0002364) Train Loss: 0.2695, Train Steps/Sec: 0.12, Epoch: 0.045938593082005444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2365, "loss": 0.21629944443702698, "memory_gb": 7.721559524536133, "step_time_ms": 7532.394170761108, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:08] (step=0002365) Train Loss: 0.2180, Train Steps/Sec: 0.12, Epoch: 0.04595802565099106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2366, "loss": 0.2673321068286896, "memory_gb": 7.721559524536133, "step_time_ms": 7529.058933258057, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:16] (step=0002366) Train Loss: 0.2429, Train Steps/Sec: 0.12, Epoch: 0.04597745821997668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2367, "loss": 0.2423427700996399, "memory_gb": 7.721559524536133, "step_time_ms": 7639.41764831543, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:24] (step=0002367) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.045996890788962304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2368, "loss": 0.22521935403347015, "memory_gb": 
7.721559524536133, "step_time_ms": 7467.029571533203, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:32] (step=0002368) Train Loss: 0.1964, Train Steps/Sec: 0.12, Epoch: 0.04601632335794792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2369, "loss": 0.19324789941310883, "memory_gb": 7.721559524536133, "step_time_ms": 7486.467361450195, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:40] (step=0002369) Train Loss: 0.2052, Train Steps/Sec: 0.12, Epoch: 0.04603575592693354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2370, "loss": 0.18908804655075073, "memory_gb": 7.721559524536133, "step_time_ms": 7532.896995544434, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:48] (step=0002370) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.046055188495919164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:11:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2371, "loss": 0.31183695793151855, "memory_gb": 7.721559524536133, "step_time_ms": 7448.794603347778, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:11:56] (step=0002371) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.04607462106490478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:12:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2372, "loss": 0.17398719489574432, "memory_gb": 7.721559524536133, "step_time_ms": 7518.939733505249, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:12:05] (step=0002372) Train Loss: 0.2032, Train Steps/Sec: 0.12, Epoch: 0.0460940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:12:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2373, "loss": 0.29684334993362427, "memory_gb": 7.721559524536133, "step_time_ms": 7512.792110443115, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:12:13] (step=0002373) Train Loss: 0.2673, Train Steps/Sec: 
0.12, Epoch: 0.046113486202876024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:12:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2374, "loss": 0.281803160905838, "memory_gb": 7.721559524536133, "step_time_ms": 7501.6608238220215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:12:21] (step=0002374) Train Loss: 0.2401, Train Steps/Sec: 0.12, Epoch: 0.04613291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:12:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2375, "loss": 0.18759721517562866, "memory_gb": 7.721559524536133, "step_time_ms": 7425.884962081909, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:12:29] (step=0002375) Train Loss: 0.1637, Train Steps/Sec: 0.12, Epoch: 0.04615235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:12:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2376, "loss": 0.27620360255241394, "memory_gb": 7.721559524536133, "step_time_ms": 7555.108547210693, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:12:37] (step=0002376) Train Loss: 0.2626, Train Steps/Sec: 0.12, Epoch: 0.04617178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:12:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2377, "loss": 0.23420575261116028, "memory_gb": 7.721559524536133, "step_time_ms": 7472.8217124938965, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:12:45] (step=0002377) Train Loss: 0.2715, Train Steps/Sec: 0.12, Epoch: 0.0461912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:12:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2378, "loss": 0.24192366003990173, "memory_gb": 7.721559524536133, "step_time_ms": 7459.480285644531, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:12:53] (step=0002378) Train Loss: 0.2060, Train Steps/Sec: 0.12, Epoch: 0.04621064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2379, "loss": 
0.11325893551111221, "memory_gb": 7.721559524536133, "step_time_ms": 7517.463207244873, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:01] (step=0002379) Train Loss: 0.2318, Train Steps/Sec: 0.12, Epoch: 0.04623008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2380, "loss": 0.16858600080013275, "memory_gb": 7.721559524536133, "step_time_ms": 7458.893299102783, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:09] (step=0002380) Train Loss: 0.2034, Train Steps/Sec: 0.13, Epoch: 0.04624951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2381, "loss": 0.32003694772720337, "memory_gb": 7.721559524536133, "step_time_ms": 7347.395420074463, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:17] (step=0002381) Train Loss: 0.2809, Train Steps/Sec: 0.13, Epoch: 0.04626894675476098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2382, "loss": 0.20006750524044037, "memory_gb": 7.721559524536133, "step_time_ms": 7508.445024490356, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:25] (step=0002382) Train Loss: 0.2084, Train Steps/Sec: 0.12, Epoch: 0.046288379323746597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2383, "loss": 0.25834527611732483, "memory_gb": 7.721559524536133, "step_time_ms": 5400.701522827148, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:30] (step=0002383) Train Loss: 0.2099, Train Steps/Sec: 0.18, Epoch: 0.04630781189273222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2384, "loss": 0.1688186228275299, "memory_gb": 7.721559524536133, "step_time_ms": 7509.654760360718, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:39] (step=0002384) 
Train Loss: 0.2227, Train Steps/Sec: 0.12, Epoch: 0.04632724446171784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2385, "loss": 0.23107194900512695, "memory_gb": 7.721559524536133, "step_time_ms": 7450.201511383057, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:47] (step=0002385) Train Loss: 0.2374, Train Steps/Sec: 0.12, Epoch: 0.046346677030703456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:13:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2386, "loss": 0.28264081478118896, "memory_gb": 7.721559524536133, "step_time_ms": 7445.544004440308, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:13:55] (step=0002386) Train Loss: 0.2273, Train Steps/Sec: 0.12, Epoch: 0.04636610959968908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2387, "loss": 0.19146503508090973, "memory_gb": 7.721559524536133, "step_time_ms": 7486.3903522491455, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:03] (step=0002387) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.0463855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2388, "loss": 0.16369542479515076, "memory_gb": 7.721559524536133, "step_time_ms": 7465.32940864563, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:11] (step=0002388) Train Loss: 0.2251, Train Steps/Sec: 0.12, Epoch: 0.046404974737660316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2389, "loss": 0.2349487841129303, "memory_gb": 7.721559524536133, "step_time_ms": 7491.326570510864, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:19] (step=0002389) Train Loss: 0.2137, Train Steps/Sec: 0.12, Epoch: 0.04642440730664594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:27] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 2390, "loss": 0.2686479389667511, "memory_gb": 7.721559524536133, "step_time_ms": 7531.108379364014, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:27] (step=0002390) Train Loss: 0.2376, Train Steps/Sec: 0.12, Epoch: 0.04644383987563156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2391, "loss": 0.18999788165092468, "memory_gb": 7.721559524536133, "step_time_ms": 7578.441619873047, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:35] (step=0002391) Train Loss: 0.1897, Train Steps/Sec: 0.12, Epoch: 0.046463272444617176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2392, "loss": 0.33015844225883484, "memory_gb": 7.721559524536133, "step_time_ms": 7486.042499542236, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:43] (step=0002392) Train Loss: 0.2977, Train Steps/Sec: 0.12, Epoch: 0.0464827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2393, "loss": 0.23314493894577026, "memory_gb": 7.721559524536133, "step_time_ms": 7505.584955215454, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:51] (step=0002393) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.04650213758258842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:14:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2394, "loss": 0.22487777471542358, "memory_gb": 7.721559524536133, "step_time_ms": 7481.523036956787, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:14:59] (step=0002394) Train Loss: 0.2556, Train Steps/Sec: 0.12, Epoch: 0.046521570151574036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:15:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2395, "loss": 0.2901875376701355, "memory_gb": 7.721559524536133, "step_time_ms": 7435.200929641724, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
23:15:07] (step=0002395) Train Loss: 0.2626, Train Steps/Sec: 0.12, Epoch: 0.04654100272055966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:15:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2396, "loss": 0.2502148747444153, "memory_gb": 7.721559524536133, "step_time_ms": 7509.412527084351, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:15:15] (step=0002396) Train Loss: 0.1968, Train Steps/Sec: 0.12, Epoch: 0.04656043528954528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:15:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2397, "loss": 0.2396211475133896, "memory_gb": 7.721559524536133, "step_time_ms": 7444.5250034332275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:15:23] (step=0002397) Train Loss: 0.2554, Train Steps/Sec: 0.12, Epoch: 0.046579867858530896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:15:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2398, "loss": 0.21260246634483337, "memory_gb": 7.721559524536133, "step_time_ms": 7426.359176635742, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:15:31] (step=0002398) Train Loss: 0.2343, Train Steps/Sec: 0.12, Epoch: 0.04659930042751652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:15:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2399, "loss": 0.23166489601135254, "memory_gb": 7.721559524536133, "step_time_ms": 7527.529954910278, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:15:39] (step=0002399) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.04661873299650214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2400, "loss": 0.2706086337566376, "memory_gb": 7.721559524536133, "step_time_ms": 7560.823917388916, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:15:47] (step=0002400) Train Loss: 0.2758, Train Steps/Sec: 0.12, Epoch: 0.046638165565487756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:15:55] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2401, "loss": 0.3519376218318939, "memory_gb": 7.721559524536133, "step_time_ms": 7498.519659042358, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:15:55] (step=0002401) Train Loss: 0.2589, Train Steps/Sec: 0.12, Epoch: 0.04665759813447338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:16:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2402, "loss": 0.2918429672718048, "memory_gb": 7.721559524536133, "step_time_ms": 7545.79496383667, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:16:03] (step=0002402) Train Loss: 0.3131, Train Steps/Sec: 0.12, Epoch: 0.046677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:16:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2403, "loss": 0.22301286458969116, "memory_gb": 7.721559524536133, "step_time_ms": 7481.804847717285, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:16:11] (step=0002403) Train Loss: 0.2787, Train Steps/Sec: 0.13, Epoch: 0.046696463272444616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:16:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2404, "loss": 0.33014339208602905, "memory_gb": 7.721559524536133, "step_time_ms": 7467.210054397583, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:16:20] (step=0002404) Train Loss: 0.3247, Train Steps/Sec: 0.12, Epoch: 0.04671589584143024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:16:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2405, "loss": 0.305420458316803, "memory_gb": 7.721559524536133, "step_time_ms": 7573.878765106201, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:16:28] (step=0002405) Train Loss: 0.2984, Train Steps/Sec: 0.12, Epoch: 0.046735328410415854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:16:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2406, "loss": 0.24989688396453857, "memory_gb": 7.721559524536133, "step_time_ms": 7585.3447914123535, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 23:16:36] (step=0002406) Train Loss: 0.2832, Train Steps/Sec: 0.12, Epoch: 0.046754760979401476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:16:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2407, "loss": 0.20145183801651, "memory_gb": 7.721559524536133, "step_time_ms": 7534.345626831055, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:16:44] (step=0002407) Train Loss: 0.1911, Train Steps/Sec: 0.13, Epoch: 0.0467741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:16:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2408, "loss": 0.22136004269123077, "memory_gb": 7.721559524536133, "step_time_ms": 7679.217100143433, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:16:52] (step=0002408) Train Loss: 0.2283, Train Steps/Sec: 0.12, Epoch: 0.046793626117372714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:17:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2409, "loss": 0.24755755066871643, "memory_gb": 7.721559524536133, "step_time_ms": 7611.281156539917, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:00] (step=0002409) Train Loss: 0.2870, Train Steps/Sec: 0.12, Epoch: 0.046813058686358336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:17:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2410, "loss": 0.21543535590171814, "memory_gb": 7.721559524536133, "step_time_ms": 7401.611804962158, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:08] (step=0002410) Train Loss: 0.2410, Train Steps/Sec: 0.13, Epoch: 0.04683249125534396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:17:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2411, "loss": 0.2565360963344574, "memory_gb": 7.721559524536133, "step_time_ms": 7586.0278606414795, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:16] (step=0002411) Train Loss: 0.2630, Train Steps/Sec: 0.12, Epoch: 0.046851923824329574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 23:17:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2412, "loss": 0.24247652292251587, "memory_gb": 7.721559524536133, "step_time_ms": 4918.225049972534, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:22] (step=0002412) Train Loss: 0.2385, Train Steps/Sec: 0.17, Epoch: 0.046871356393315196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:17:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2413, "loss": 0.23577609658241272, "memory_gb": 7.715639114379883, "step_time_ms": 7393.710136413574, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:30] (step=0002413) Train Loss: 0.2031, Train Steps/Sec: 0.13, Epoch: 0.04689078896230082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2414, "loss": 0.21733306348323822, "memory_gb": 7.721559524536133, "step_time_ms": 7553.305864334106, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:38] (step=0002414) Train Loss: 0.2184, Train Steps/Sec: 0.13, Epoch: 0.04691022153128643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:17:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2415, "loss": 0.1898702085018158, "memory_gb": 7.721559524536133, "step_time_ms": 7514.258861541748, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:46] (step=0002415) Train Loss: 0.2043, Train Steps/Sec: 0.12, Epoch: 0.046929654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:17:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2416, "loss": 0.17807501554489136, "memory_gb": 7.721559524536133, "step_time_ms": 7573.319673538208, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:17:54] (step=0002416) Train Loss: 0.1636, Train Steps/Sec: 0.12, Epoch: 0.04694908666925768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:18:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2417, "loss": 0.1892319768667221, "memory_gb": 7.721559524536133, "step_time_ms": 7367.752552032471, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:02] (step=0002417) Train Loss: 0.2433, Train Steps/Sec: 0.12, Epoch: 0.04696851923824329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:18:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2418, "loss": 0.23306620121002197, "memory_gb": 7.721559524536133, "step_time_ms": 7527.611017227173, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:10] (step=0002418) Train Loss: 0.2266, Train Steps/Sec: 0.12, Epoch: 0.046987951807228916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:18:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2419, "loss": 0.15429145097732544, "memory_gb": 7.721559524536133, "step_time_ms": 7623.297691345215, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:18] (step=0002419) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.04700738437621454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:18:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2420, "loss": 0.3191280961036682, "memory_gb": 7.721559524536133, "step_time_ms": 7540.935516357422, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:26] (step=0002420) Train Loss: 0.2578, Train Steps/Sec: 0.12, Epoch: 0.04702681694520015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:18:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2421, "loss": 0.1596301645040512, "memory_gb": 7.721559524536133, "step_time_ms": 7535.081148147583, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:34] (step=0002421) Train Loss: 0.2158, Train Steps/Sec: 0.12, Epoch: 0.047046249514185776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:18:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2422, "loss": 0.3496342599391937, "memory_gb": 7.721559524536133, "step_time_ms": 7577.983617782593, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:42] (step=0002422) Train Loss: 0.2884, Train Steps/Sec: 0.12, Epoch: 0.0470656820831714, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 23:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2423, "loss": 0.15026730298995972, "memory_gb": 7.721559524536133, "step_time_ms": 7533.161163330078, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:50] (step=0002423) Train Loss: 0.2203, Train Steps/Sec: 0.12, Epoch: 0.04708511465215701, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:18:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2424, "loss": 0.20282834768295288, "memory_gb": 7.721559524536133, "step_time_ms": 7456.444025039673, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:18:58] (step=0002424) Train Loss: 0.2032, Train Steps/Sec: 0.13, Epoch: 0.047104547221142636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:19:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2425, "loss": 0.2465885877609253, "memory_gb": 7.721559524536133, "step_time_ms": 7529.146671295166, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:19:06] (step=0002425) Train Loss: 0.2972, Train Steps/Sec: 0.12, Epoch: 0.04712397979012826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:19:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2426, "loss": 0.35446634888648987, "memory_gb": 7.721559524536133, "step_time_ms": 7469.735622406006, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:19:14] (step=0002426) Train Loss: 0.3026, Train Steps/Sec: 0.12, Epoch: 0.04714341235911387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:19:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2427, "loss": 0.26890113949775696, "memory_gb": 7.721559524536133, "step_time_ms": 7494.539499282837, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:19:23] (step=0002427) Train Loss: 0.2821, Train Steps/Sec: 0.12, Epoch: 0.047162844928099495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:19:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2428, "loss": 0.2509322762489319, "memory_gb": 7.721559524536133, "step_time_ms": 
7489.746809005737, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:19:31] (step=0002428) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.04718227749708512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:19:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2429, "loss": 0.29410791397094727, "memory_gb": 7.721559524536133, "step_time_ms": 7492.965221405029, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:19:39] (step=0002429) Train Loss: 0.2325, Train Steps/Sec: 0.12, Epoch: 0.04720171006607073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:19:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2430, "loss": 0.25424641370773315, "memory_gb": 7.721559524536133, "step_time_ms": 7430.242538452148, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:19:47] (step=0002430) Train Loss: 0.3026, Train Steps/Sec: 0.12, Epoch: 0.047221142635056355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:19:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2431, "loss": 0.18838682770729065, "memory_gb": 7.721559524536133, "step_time_ms": 7465.358018875122, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:19:55] (step=0002431) Train Loss: 0.1896, Train Steps/Sec: 0.12, Epoch: 0.04724057520404198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2432, "loss": 0.30731722712516785, "memory_gb": 7.721559524536133, "step_time_ms": 7480.966806411743, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:03] (step=0002432) Train Loss: 0.2896, Train Steps/Sec: 0.12, Epoch: 0.04726000777302759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2433, "loss": 0.27524155378341675, "memory_gb": 7.721559524536133, "step_time_ms": 7436.268091201782, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:11] (step=0002433) Train Loss: 0.2329, Train Steps/Sec: 0.12, Epoch: 
0.047279440342013215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2434, "loss": 0.1860543042421341, "memory_gb": 7.721559524536133, "step_time_ms": 7505.961656570435, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:19] (step=0002434) Train Loss: 0.2166, Train Steps/Sec: 0.12, Epoch: 0.04729887291099883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2435, "loss": 0.20428785681724548, "memory_gb": 7.721559524536133, "step_time_ms": 7504.838943481445, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:27] (step=0002435) Train Loss: 0.2160, Train Steps/Sec: 0.12, Epoch: 0.04731830547998445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2436, "loss": 0.31820324063301086, "memory_gb": 7.715639114379883, "step_time_ms": 7416.479110717773, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:35] (step=0002436) Train Loss: 0.2365, Train Steps/Sec: 0.12, Epoch: 0.047337738048970075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2437, "loss": 0.28980404138565063, "memory_gb": 7.715639114379883, "step_time_ms": 7473.583698272705, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:43] (step=0002437) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.04735717061795569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2438, "loss": 0.2753971517086029, "memory_gb": 7.721559524536133, "step_time_ms": 7401.947975158691, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:51] (step=0002438) Train Loss: 0.2276, Train Steps/Sec: 0.13, Epoch: 0.04737660318694131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:20:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2439, "loss": 0.26018112897872925, 
"memory_gb": 7.721559524536133, "step_time_ms": 7442.003488540649, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:20:59] (step=0002439) Train Loss: 0.2277, Train Steps/Sec: 0.13, Epoch: 0.047396035755926935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:21:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2440, "loss": 0.1397474706172943, "memory_gb": 7.721559524536133, "step_time_ms": 7478.092193603516, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:21:07] (step=0002440) Train Loss: 0.1937, Train Steps/Sec: 0.13, Epoch: 0.04741546832491255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:21:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2441, "loss": 0.3789149522781372, "memory_gb": 7.721559524536133, "step_time_ms": 5191.402912139893, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:21:13] (step=0002441) Train Loss: 0.3203, Train Steps/Sec: 0.17, Epoch: 0.04743490089389817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:21:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2442, "loss": 0.305573046207428, "memory_gb": 7.721559524536133, "step_time_ms": 7543.995141983032, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:21:21] (step=0002442) Train Loss: 0.3078, Train Steps/Sec: 0.12, Epoch: 0.047454333462883795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:21:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2443, "loss": 0.21557733416557312, "memory_gb": 7.721559524536133, "step_time_ms": 7440.4590129852295, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:21:29] (step=0002443) Train Loss: 0.2212, Train Steps/Sec: 0.12, Epoch: 0.04747376603186941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:21:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2444, "loss": 0.27948522567749023, "memory_gb": 7.721559524536133, "step_time_ms": 7430.510520935059, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:21:37] (step=0002444) Train Loss: 0.2789, 
Train Steps/Sec: 0.12, Epoch: 0.04749319860085503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:21:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2445, "loss": 0.17703276872634888, "memory_gb": 7.721559524536133, "step_time_ms": 7538.386583328247, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:21:45] (step=0002445) Train Loss: 0.2429, Train Steps/Sec: 0.12, Epoch: 0.047512631169840655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:21:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2446, "loss": 0.23272165656089783, "memory_gb": 7.721559524536133, "step_time_ms": 7446.920871734619, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:21:53] (step=0002446) Train Loss: 0.2005, Train Steps/Sec: 0.12, Epoch: 0.04753206373882627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2447, "loss": 0.145869642496109, "memory_gb": 7.721559524536133, "step_time_ms": 7398.077964782715, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:01] (step=0002447) Train Loss: 0.1877, Train Steps/Sec: 0.12, Epoch: 0.04755149630781189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2448, "loss": 0.31003183126449585, "memory_gb": 7.721559524536133, "step_time_ms": 7465.896368026733, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:09] (step=0002448) Train Loss: 0.2925, Train Steps/Sec: 0.13, Epoch: 0.047570928876797515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2449, "loss": 0.23770259320735931, "memory_gb": 7.721559524536133, "step_time_ms": 7446.7453956604, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:17] (step=0002449) Train Loss: 0.2283, Train Steps/Sec: 0.12, Epoch: 0.04759036144578313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2450, "loss": 
0.15469703078269958, "memory_gb": 7.721559524536133, "step_time_ms": 7410.748481750488, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:25] (step=0002450) Train Loss: 0.1544, Train Steps/Sec: 0.12, Epoch: 0.04760979401476875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2451, "loss": 0.22323289513587952, "memory_gb": 7.721559524536133, "step_time_ms": 7494.735956192017, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:33] (step=0002451) Train Loss: 0.2272, Train Steps/Sec: 0.12, Epoch: 0.047629226583754375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2452, "loss": 0.3241744041442871, "memory_gb": 7.721559524536133, "step_time_ms": 7472.732305526733, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:41] (step=0002452) Train Loss: 0.2612, Train Steps/Sec: 0.12, Epoch: 0.04764865915273999, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2453, "loss": 0.3107318878173828, "memory_gb": 7.721559524536133, "step_time_ms": 7450.868129730225, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:49] (step=0002453) Train Loss: 0.3365, Train Steps/Sec: 0.13, Epoch: 0.04766809172172561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:22:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2454, "loss": 0.18335723876953125, "memory_gb": 7.721559524536133, "step_time_ms": 7482.830286026001, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:22:57] (step=0002454) Train Loss: 0.2518, Train Steps/Sec: 0.13, Epoch: 0.047687524290711235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:23:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2455, "loss": 0.22170016169548035, "memory_gb": 7.721559524536133, "step_time_ms": 7574.5484828948975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:23:05] (step=0002455) 
Train Loss: 0.2206, Train Steps/Sec: 0.12, Epoch: 0.04770695685969685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:23:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2456, "loss": 0.2393341362476349, "memory_gb": 7.721559524536133, "step_time_ms": 7465.352535247803, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:23:13] (step=0002456) Train Loss: 0.2308, Train Steps/Sec: 0.13, Epoch: 0.04772638942868247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:23:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2457, "loss": 0.21449360251426697, "memory_gb": 7.721559524536133, "step_time_ms": 7514.087915420532, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:23:21] (step=0002457) Train Loss: 0.2197, Train Steps/Sec: 0.12, Epoch: 0.047745821997668095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:23:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2458, "loss": 0.38303542137145996, "memory_gb": 7.721559524536133, "step_time_ms": 7535.441637039185, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:23:29] (step=0002458) Train Loss: 0.3392, Train Steps/Sec: 0.12, Epoch: 0.04776525456665371, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:23:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2459, "loss": 0.15601769089698792, "memory_gb": 7.721559524536133, "step_time_ms": 7394.898891448975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:23:37] (step=0002459) Train Loss: 0.1974, Train Steps/Sec: 0.13, Epoch: 0.04778468713563933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:23:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2460, "loss": 0.1799541860818863, "memory_gb": 7.721559524536133, "step_time_ms": 7393.6426639556885, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:23:45] (step=0002460) Train Loss: 0.1960, Train Steps/Sec: 0.12, Epoch: 0.047804119704624955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:23:53] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 2461, "loss": 0.17455054819583893, "memory_gb": 7.721559524536133, "step_time_ms": 7459.449052810669, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:23:53] (step=0002461) Train Loss: 0.2077, Train Steps/Sec: 0.12, Epoch: 0.04782355227361057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2462, "loss": 0.23307140171527863, "memory_gb": 7.721559524536133, "step_time_ms": 7425.129413604736, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:24:01] (step=0002462) Train Loss: 0.2287, Train Steps/Sec: 0.13, Epoch: 0.04784298484259619, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2463, "loss": 0.21160608530044556, "memory_gb": 7.721559524536133, "step_time_ms": 7491.884469985962, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:24:09] (step=0002463) Train Loss: 0.2380, Train Steps/Sec: 0.12, Epoch: 0.04786241741158181, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2464, "loss": 0.19216981530189514, "memory_gb": 7.721559524536133, "step_time_ms": 7485.373258590698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:24:17] (step=0002464) Train Loss: 0.2357, Train Steps/Sec: 0.12, Epoch: 0.04788184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2465, "loss": 0.34029945731163025, "memory_gb": 7.721559524536133, "step_time_ms": 7526.816844940186, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:24:25] (step=0002465) Train Loss: 0.2799, Train Steps/Sec: 0.12, Epoch: 0.04790128254955305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2466, "loss": 0.2315012365579605, "memory_gb": 7.721559524536133, "step_time_ms": 7535.6605052948, "trainable_params": 4718592, "method": "lora"} [2025-07-28 
23:24:33] (step=0002466) Train Loss: 0.2567, Train Steps/Sec: 0.12, Epoch: 0.04792071511853867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2467, "loss": 0.13224276900291443, "memory_gb": 7.721559524536133, "step_time_ms": 7580.273151397705, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:24:41] (step=0002467) Train Loss: 0.2144, Train Steps/Sec: 0.12, Epoch: 0.04794014768752429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2468, "loss": 0.2969295084476471, "memory_gb": 7.721559524536133, "step_time_ms": 7377.779960632324, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:24:49] (step=0002468) Train Loss: 0.2646, Train Steps/Sec: 0.13, Epoch: 0.04795958025650991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:24:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2469, "loss": 0.20013436675071716, "memory_gb": 7.721559524536133, "step_time_ms": 7526.533603668213, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:24:57] (step=0002469) Train Loss: 0.2034, Train Steps/Sec: 0.13, Epoch: 0.04797901282549553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2470, "loss": 0.2079114466905594, "memory_gb": 7.721559524536133, "step_time_ms": 4976.839065551758, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:25:03] (step=0002470) Train Loss: 0.2407, Train Steps/Sec: 0.18, Epoch: 0.04799844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2471, "loss": 0.24645854532718658, "memory_gb": 7.721559524536133, "step_time_ms": 7608.831167221069, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:25:11] (step=0002471) Train Loss: 0.2200, Train Steps/Sec: 0.12, Epoch: 0.04801787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:19] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2472, "loss": 0.2281387746334076, "memory_gb": 7.721559524536133, "step_time_ms": 7586.742639541626, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:25:19] (step=0002472) Train Loss: 0.2413, Train Steps/Sec: 0.13, Epoch: 0.04803731053245239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2473, "loss": 0.2448643445968628, "memory_gb": 7.721559524536133, "step_time_ms": 7565.6797885894775, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:25:27] (step=0002473) Train Loss: 0.2287, Train Steps/Sec: 0.12, Epoch: 0.04805674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2474, "loss": 0.2186761200428009, "memory_gb": 7.721559524536133, "step_time_ms": 7620.083808898926, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:25:35] (step=0002474) Train Loss: 0.2147, Train Steps/Sec: 0.12, Epoch: 0.04807617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2475, "loss": 0.2826076149940491, "memory_gb": 7.721559524536133, "step_time_ms": 7578.188419342041, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:25:43] (step=0002475) Train Loss: 0.2288, Train Steps/Sec: 0.12, Epoch: 0.04809560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2476, "loss": 0.22221529483795166, "memory_gb": 7.721559524536133, "step_time_ms": 7540.053367614746, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:25:51] (step=0002476) Train Loss: 0.2648, Train Steps/Sec: 0.12, Epoch: 0.04811504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:25:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2477, "loss": 0.31216657161712646, "memory_gb": 7.721559524536133, "step_time_ms": 7590.677976608276, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 23:25:59] (step=0002477) Train Loss: 0.3052, Train Steps/Sec: 0.12, Epoch: 0.04813447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:26:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2478, "loss": 0.2424689531326294, "memory_gb": 7.721559524536133, "step_time_ms": 7453.22322845459, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:26:07] (step=0002478) Train Loss: 0.2500, Train Steps/Sec: 0.13, Epoch: 0.04815390594636611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:26:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2479, "loss": 0.32409849762916565, "memory_gb": 7.721559524536133, "step_time_ms": 7491.299867630005, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:26:15] (step=0002479) Train Loss: 0.2374, Train Steps/Sec: 0.12, Epoch: 0.04817333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2480, "loss": 0.22255608439445496, "memory_gb": 7.721559524536133, "step_time_ms": 7581.674814224243, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:26:23] (step=0002480) Train Loss: 0.2591, Train Steps/Sec: 0.12, Epoch: 0.04819277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:26:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2481, "loss": 0.2135511338710785, "memory_gb": 7.721559524536133, "step_time_ms": 7337.197303771973, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:26:31] (step=0002481) Train Loss: 0.1742, Train Steps/Sec: 0.13, Epoch: 0.04821220365332297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:26:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2482, "loss": 0.18462854623794556, "memory_gb": 7.721559524536133, "step_time_ms": 7469.371557235718, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:26:39] (step=0002482) Train Loss: 0.2663, Train Steps/Sec: 0.12, Epoch: 0.04823163622230859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 23:26:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2483, "loss": 0.3009163737297058, "memory_gb": 7.721559524536133, "step_time_ms": 7558.432817459106, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:26:47] (step=0002483) Train Loss: 0.2429, Train Steps/Sec: 0.12, Epoch: 0.04825106879129421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:26:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2484, "loss": 0.24229256808757782, "memory_gb": 7.721559524536133, "step_time_ms": 7463.413238525391, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:26:55] (step=0002484) Train Loss: 0.2430, Train Steps/Sec: 0.12, Epoch: 0.04827050136027983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2485, "loss": 0.28587499260902405, "memory_gb": 7.721559524536133, "step_time_ms": 7469.8216915130615, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:27:03] (step=0002485) Train Loss: 0.2212, Train Steps/Sec: 0.13, Epoch: 0.04828993392926545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:27:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2486, "loss": 0.23881813883781433, "memory_gb": 7.721559524536133, "step_time_ms": 7564.781188964844, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:27:11] (step=0002486) Train Loss: 0.2846, Train Steps/Sec: 0.12, Epoch: 0.04830936649825107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:27:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2487, "loss": 0.3704516887664795, "memory_gb": 7.721559524536133, "step_time_ms": 7599.977254867554, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:27:19] (step=0002487) Train Loss: 0.3049, Train Steps/Sec: 0.12, Epoch: 0.04832879906723669, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:27:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2488, "loss": 0.28029200434684753, "memory_gb": 7.721559524536133, "step_time_ms": 7485.370874404907, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:27:27] (step=0002488) Train Loss: 0.2883, Train Steps/Sec: 0.12, Epoch: 0.04834823163622231, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:27:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2489, "loss": 0.21677690744400024, "memory_gb": 7.721559524536133, "step_time_ms": 7513.300895690918, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:27:35] (step=0002489) Train Loss: 0.1942, Train Steps/Sec: 0.12, Epoch: 0.04836766420520793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:27:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2490, "loss": 0.30389514565467834, "memory_gb": 7.721559524536133, "step_time_ms": 7453.158617019653, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:27:43] (step=0002490) Train Loss: 0.3241, Train Steps/Sec: 0.12, Epoch: 0.04838709677419355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:27:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2491, "loss": 0.22590209543704987, "memory_gb": 7.721559524536133, "step_time_ms": 7485.764980316162, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:27:52] (step=0002491) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.04840652934317917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2492, "loss": 0.19083011150360107, "memory_gb": 7.721559524536133, "step_time_ms": 7522.680759429932, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:00] (step=0002492) Train Loss: 0.2149, Train Steps/Sec: 0.12, Epoch: 0.04842596191216479, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2493, "loss": 0.19175107777118683, "memory_gb": 7.721559524536133, "step_time_ms": 7481.094121932983, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:08] (step=0002493) Train Loss: 0.2220, Train Steps/Sec: 0.12, Epoch: 0.04844539448115041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2494, "loss": 0.29020464420318604, "memory_gb": 7.721559524536133, "step_time_ms": 7401.85809135437, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:16] (step=0002494) Train Loss: 0.2289, Train Steps/Sec: 0.12, Epoch: 0.04846482705013603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2495, "loss": 0.2878413200378418, "memory_gb": 7.721559524536133, "step_time_ms": 7518.269062042236, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:24] (step=0002495) Train Loss: 0.2747, Train Steps/Sec: 0.12, Epoch: 0.048484259619121645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2496, "loss": 0.2123776078224182, "memory_gb": 7.721559524536133, "step_time_ms": 7444.732427597046, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:32] (step=0002496) Train Loss: 0.2142, Train Steps/Sec: 0.12, Epoch: 0.04850369218810727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2497, "loss": 0.2380649745464325, "memory_gb": 7.721559524536133, "step_time_ms": 7330.87158203125, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:40] (step=0002497) Train Loss: 0.2422, Train Steps/Sec: 0.13, Epoch: 0.04852312475709289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2498, "loss": 0.26726871728897095, "memory_gb": 7.721559524536133, "step_time_ms": 7497.5879192352295, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:48] (step=0002498) Train Loss: 0.2592, Train Steps/Sec: 0.12, Epoch: 0.048542557326078505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:28:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2499, "loss": 0.3608831763267517, "memory_gb": 7.721559524536133, "step_time_ms": 5247.828006744385, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:28:54] (step=0002499) Train Loss: 0.2617, Train Steps/Sec: 0.17, Epoch: 0.04856198989506413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2500, "loss": 0.2238565981388092, "memory_gb": 7.721559524536133, "step_time_ms": 7517.0488357543945, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:02] (step=0002500) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.04858142246404975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2501, "loss": 0.2664407789707184, "memory_gb": 7.721559524536133, "step_time_ms": 7387.326240539551, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:10] (step=0002501) Train Loss: 0.2803, Train Steps/Sec: 0.13, Epoch: 0.048600855033035364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2502, "loss": 0.2947356104850769, "memory_gb": 7.721559524536133, "step_time_ms": 7392.010450363159, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:18] (step=0002502) Train Loss: 0.3008, Train Steps/Sec: 0.12, Epoch: 0.04862028760202099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2503, "loss": 0.16565027832984924, "memory_gb": 7.721559524536133, "step_time_ms": 7534.446001052856, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:26] (step=0002503) Train Loss: 0.1836, Train Steps/Sec: 0.12, Epoch: 0.04863972017100661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2504, "loss": 0.2807750701904297, "memory_gb": 7.721559524536133, "step_time_ms": 7463.709592819214, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:34] (step=0002504) Train Loss: 0.2375, Train Steps/Sec: 0.12, Epoch: 0.048659152739992224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2505, "loss": 0.24307037889957428, "memory_gb": 7.721559524536133, "step_time_ms": 7459.745645523071, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:42] (step=0002505) Train Loss: 0.2668, Train Steps/Sec: 0.12, Epoch: 0.04867858530897785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2506, "loss": 0.16210363805294037, "memory_gb": 7.721559524536133, "step_time_ms": 7563.271522521973, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:50] (step=0002506) Train Loss: 0.2183, Train Steps/Sec: 0.12, Epoch: 0.04869801787796347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:29:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2507, "loss": 0.2702067792415619, "memory_gb": 7.721559524536133, "step_time_ms": 7479.69913482666, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:29:58] (step=0002507) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.048717450446949084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:30:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2508, "loss": 0.27238768339157104, "memory_gb": 7.721559524536133, "step_time_ms": 7476.377487182617, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:30:06] (step=0002508) Train Loss: 0.2299, Train Steps/Sec: 0.12, Epoch: 0.04873688301593471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:30:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2509, "loss": 0.32276666164398193, "memory_gb": 7.721559524536133, "step_time_ms": 7491.178512573242, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:30:14] (step=0002509) Train Loss: 0.2758, Train Steps/Sec: 0.12, Epoch: 0.04875631558492033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:30:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2510, "loss": 0.23672141134738922, "memory_gb": 7.721559524536133, "step_time_ms": 7462.361812591553, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:30:22] (step=0002510) Train Loss: 0.3052, Train Steps/Sec: 0.12, Epoch: 0.048775748153905944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:30:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2511, "loss": 0.2559047341346741, "memory_gb": 7.721559524536133, "step_time_ms": 7515.258550643921, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:30:30] (step=0002511) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.04879518072289157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:30:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2512, "loss": 0.13838474452495575, "memory_gb": 7.721559524536133, "step_time_ms": 7584.27095413208, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:30:38] (step=0002512) Train Loss: 0.1463, Train Steps/Sec: 0.12, Epoch: 0.04881461329187719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:30:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2513, "loss": 0.27995866537094116, "memory_gb": 7.721559524536133, "step_time_ms": 7477.081060409546, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:30:46] (step=0002513) Train Loss: 0.3089, Train Steps/Sec: 0.12, Epoch: 0.048834045860862804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:30:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2514, "loss": 0.259692907333374, "memory_gb": 7.721559524536133, "step_time_ms": 7431.611776351929, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:30:54] (step=0002514) Train Loss: 0.3135, Train Steps/Sec: 0.13, Epoch: 0.048853478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2515, "loss": 0.2750501036643982, "memory_gb": 7.721559524536133, "step_time_ms": 7535.459756851196, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:02] (step=0002515) Train Loss: 0.2846, Train Steps/Sec: 0.12, Epoch: 0.04887291099883405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2516, "loss": 0.2883603870868683, "memory_gb": 7.721559524536133, "step_time_ms": 7535.992622375488, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:10] (step=0002516) Train Loss: 0.2726, Train Steps/Sec: 0.12, Epoch: 0.048892343567819664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2517, "loss": 0.1575966477394104, "memory_gb": 7.721559524536133, "step_time_ms": 7447.681903839111, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:18] (step=0002517) Train Loss: 0.2074, Train Steps/Sec: 0.12, Epoch: 0.048911776136805286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2518, "loss": 0.1825053095817566, "memory_gb": 7.721559524536133, "step_time_ms": 7551.341533660889, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:26] (step=0002518) Train Loss: 0.2078, Train Steps/Sec: 0.12, Epoch: 0.04893120870579091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2519, "loss": 0.2832193076610565, "memory_gb": 7.721559524536133, "step_time_ms": 7593.6150550842285, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:35] (step=0002519) Train Loss: 0.3222, Train Steps/Sec: 0.12, Epoch: 0.048950641274776524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2520, "loss": 0.1459694802761078, "memory_gb": 7.721559524536133, "step_time_ms": 7498.503684997559, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:43] (step=0002520) Train Loss: 0.1901, Train Steps/Sec: 0.12, Epoch: 0.048970073843762146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2521, "loss": 0.18457692861557007, "memory_gb": 7.721559524536133, "step_time_ms": 7611.876726150513, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:51] (step=0002521) Train Loss: 0.1854, Train Steps/Sec: 0.12, Epoch: 0.04898950641274777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:31:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2522, "loss": 0.25745004415512085, "memory_gb": 7.721559524536133, "step_time_ms": 7630.232572555542, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:31:59] (step=0002522) Train Loss: 0.2620, Train Steps/Sec: 0.12, Epoch: 0.049008938981733384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:32:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2523, "loss": 0.2124033123254776, "memory_gb": 7.721559524536133, "step_time_ms": 7628.355979919434, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:32:07] (step=0002523) Train Loss: 0.1930, Train Steps/Sec: 0.12, Epoch: 0.049028371550719006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:32:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2524, "loss": 0.2099839299917221, "memory_gb": 7.721559524536133, "step_time_ms": 7641.741514205933, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:32:15] (step=0002524) Train Loss: 0.2380, Train Steps/Sec: 0.12, Epoch: 0.04904780411970462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:32:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2525, "loss": 0.182289719581604, "memory_gb": 7.721559524536133, "step_time_ms": 7576.075553894043, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:32:23] (step=0002525) Train Loss: 0.1833, Train Steps/Sec: 0.12, Epoch: 0.049067236688690244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:32:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2526, "loss": 0.344210684299469, "memory_gb": 7.721559524536133, "step_time_ms": 7494.396924972534, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:32:31] (step=0002526) Train Loss: 0.2625, Train Steps/Sec: 0.13, Epoch: 0.049086669257675866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:32:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2527, "loss": 0.21716120839118958, "memory_gb": 7.721559524536133, "step_time_ms": 7155.169725418091, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:32:38] (step=0002527) Train Loss: 0.2043, Train Steps/Sec: 0.14, Epoch: 0.04910610182666148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:32:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2528, "loss": 0.20191913843154907, "memory_gb": 7.721559524536133, "step_time_ms": 5740.012884140015, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:32:45] (step=0002528) Train Loss: 0.2145, Train Steps/Sec: 0.16, Epoch: 0.049125534395647104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:32:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2529, "loss": 0.25200924277305603, "memory_gb": 7.721559524536133, "step_time_ms": 7577.97646522522, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:32:53] (step=0002529) Train Loss: 0.1938, Train Steps/Sec: 0.12, Epoch: 0.049144966964632726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2530, "loss": 0.16760489344596863, "memory_gb": 7.721559524536133, "step_time_ms": 7646.920919418335, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:01] (step=0002530) Train Loss: 0.2472, Train Steps/Sec: 0.12, Epoch: 0.04916439953361834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2531, "loss": 0.30148419737815857, "memory_gb": 7.721559524536133, "step_time_ms": 7523.612499237061, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:09] (step=0002531) Train Loss: 0.2822, Train Steps/Sec: 0.12, Epoch: 0.049183832102603964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2532, "loss": 0.3201477527618408, "memory_gb": 7.721559524536133, "step_time_ms": 7520.480632781982, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:17] (step=0002532) Train Loss: 0.2523, Train Steps/Sec: 0.12, Epoch: 0.049203264671589586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2533, "loss": 0.2958236336708069, "memory_gb": 7.721559524536133, "step_time_ms": 7515.687942504883, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:25] (step=0002533) Train Loss: 0.2535, Train Steps/Sec: 0.12, Epoch: 0.0492226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2534, "loss": 0.247381329536438, "memory_gb": 7.721559524536133, "step_time_ms": 7487.486124038696, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:33] (step=0002534) Train Loss: 0.2443, Train Steps/Sec: 0.12, Epoch: 0.049242129809560824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2535, "loss": 0.22952400147914886, "memory_gb": 7.721559524536133, "step_time_ms": 7477.227210998535, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:41] (step=0002535) Train Loss: 0.2588, Train Steps/Sec: 0.12, Epoch: 0.049261562378546446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2536, "loss": 0.21945181488990784, "memory_gb": 7.721559524536133, "step_time_ms": 7493.834018707275, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:49] (step=0002536) Train Loss: 0.2231, Train Steps/Sec: 0.12, Epoch: 0.04928099494753206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2537, "loss": 0.32804325222969055, "memory_gb": 7.721559524536133, "step_time_ms": 7492.881059646606, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:33:57] (step=0002537) Train Loss: 0.2630, Train Steps/Sec: 0.12, Epoch: 0.049300427516517684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2538, "loss": 0.25331246852874756, "memory_gb": 7.721559524536133, "step_time_ms": 7522.209882736206, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:34:05] (step=0002538) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.049319860085503306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2539, "loss": 0.18716999888420105, "memory_gb": 7.721559524536133, "step_time_ms": 7528.787136077881, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:34:13] (step=0002539) Train Loss: 0.2139, Train Steps/Sec: 0.13, Epoch: 0.04933929265448892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:34:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2540, "loss": 0.3027128577232361, "memory_gb": 7.721559524536133, "step_time_ms": 7465.959548950195, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:34:21] (step=0002540) Train Loss: 0.2981, Train Steps/Sec: 0.13, Epoch: 0.049358725223474544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:34:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2541, "loss": 0.19658911228179932, "memory_gb": 7.721559524536133, "step_time_ms": 7434.587001800537, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:34:29] (step=0002541) Train Loss: 0.2371, Train Steps/Sec: 0.12, Epoch: 0.049378157792460166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:34:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2542, "loss": 0.2884139120578766, "memory_gb": 7.721559524536133, "step_time_ms": 7496.1998462677, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:34:37] (step=0002542) Train Loss: 0.2918, Train Steps/Sec: 0.12, Epoch: 0.04939759036144578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2543, "loss": 0.29736337065696716, "memory_gb": 7.721559524536133, "step_time_ms": 7484.327793121338, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:34:45] (step=0002543) Train Loss: 0.2766, Train Steps/Sec: 0.12, Epoch: 0.049417022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:34:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2544, "loss": 0.27658820152282715, "memory_gb": 7.721559524536133, "step_time_ms": 7437.964200973511, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:34:53] (step=0002544) Train Loss: 0.2558, Train Steps/Sec: 0.12, Epoch: 0.049436455499417026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2545, "loss": 0.2969762682914734, "memory_gb": 7.721559524536133, "step_time_ms": 7283.7584018707275, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:01] (step=0002545) Train Loss: 0.2999, Train Steps/Sec: 0.12, Epoch: 0.04945588806840264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2546, "loss": 0.33109039068222046, "memory_gb": 7.721559524536133, "step_time_ms": 7501.158237457275, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:10] (step=0002546) Train Loss: 0.2185, Train Steps/Sec: 0.12, Epoch: 0.04947532063738826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2547, "loss": 0.20708122849464417, "memory_gb": 7.721559524536133, "step_time_ms": 7487.039566040039, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:18] (step=0002547) Train Loss: 0.2192, Train Steps/Sec: 0.12, Epoch: 0.049494753206373886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2548, "loss": 0.3120036721229553, "memory_gb": 7.721559524536133, "step_time_ms": 7514.884471893311, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:26] (step=0002548) Train Loss: 0.3302, Train Steps/Sec: 0.12, Epoch: 0.0495141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2549, "loss": 0.20488034188747406, "memory_gb": 7.721559524536133, "step_time_ms": 7489.62140083313, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:33] (step=0002549) Train Loss: 0.2649, Train Steps/Sec: 0.13, Epoch: 0.04953361834434512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2550, "loss": 0.3187454640865326, "memory_gb": 7.721559524536133, "step_time_ms": 7496.457099914551, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:42] (step=0002550) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.049553050913330746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2551, "loss": 0.29417985677719116, "memory_gb": 7.721559524536133, "step_time_ms": 7538.999319076538, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:50] (step=0002551) Train Loss: 0.3291, Train Steps/Sec: 0.12, Epoch: 0.04957248348231636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2552, "loss": 0.2375185191631317, "memory_gb": 7.721559524536133, "step_time_ms": 7480.30161857605, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:35:58] (step=0002552) Train Loss: 0.2738, Train Steps/Sec: 0.12, Epoch: 0.04959191605130198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:36:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2553, "loss": 0.14933398365974426, "memory_gb": 7.721559524536133, "step_time_ms": 7464.429616928101, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:36:06] (step=0002553) Train Loss: 0.1542, Train Steps/Sec: 0.12, Epoch: 0.0496113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:36:14]
EFFICIENCY_METRICS: {"epoch": 0, "step": 2554, "loss": 0.3651588559150696, "memory_gb": 7.721559524536133, "step_time_ms": 7491.89829826355, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:36:14] (step=0002554) Train Loss: 0.2912, Train Steps/Sec: 0.12, Epoch: 0.04963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:36:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2555, "loss": 0.33235159516334534, "memory_gb": 7.721559524536133, "step_time_ms": 7276.95369720459, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:36:22] (step=0002555) Train Loss: 0.2956, Train Steps/Sec: 0.13, Epoch: 0.04965021375825884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:36:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2556, "loss": 0.21743208169937134, "memory_gb": 7.721559524536133, "step_time_ms": 7007.267475128174, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:36:29] (step=0002556) Train Loss: 0.2374, Train Steps/Sec: 0.14, Epoch: 0.04966964632724446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:36:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2557, "loss": 0.19268426299095154, "memory_gb": 7.721559524536133, "step_time_ms": 5874.806880950928, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:36:35] (step=0002557) Train Loss: 0.2195, Train Steps/Sec: 0.16, Epoch: 0.04968907889623008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:36:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2558, "loss": 0.1943536102771759, "memory_gb": 7.721559524536133, "step_time_ms": 7440.605878829956, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:36:43] (step=0002558) Train Loss: 0.2131, Train Steps/Sec: 0.12, Epoch: 0.0497085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:36:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2559, "loss": 0.2851649522781372, "memory_gb": 7.715639114379883, "step_time_ms": 7463.88053894043, "trainable_params": 4718592, "method": 
"lora"} [2025-07-28 23:36:51] (step=0002559) Train Loss: 0.2424, Train Steps/Sec: 0.12, Epoch: 0.04972794403420132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:36:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2560, "loss": 0.2360682487487793, "memory_gb": 7.721559524536133, "step_time_ms": 7407.623291015625, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:36:59] (step=0002560) Train Loss: 0.2263, Train Steps/Sec: 0.13, Epoch: 0.04974737660318694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:37:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2561, "loss": 0.2632501721382141, "memory_gb": 7.721559524536133, "step_time_ms": 7454.123258590698, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:37:07] (step=0002561) Train Loss: 0.2096, Train Steps/Sec: 0.12, Epoch: 0.04976680917217256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:37:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2562, "loss": 0.21093067526817322, "memory_gb": 7.721559524536133, "step_time_ms": 7557.056188583374, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:37:16] (step=0002562) Train Loss: 0.1944, Train Steps/Sec: 0.12, Epoch: 0.04978624174115818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:37:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2563, "loss": 0.14740502834320068, "memory_gb": 7.721559524536133, "step_time_ms": 7498.274087905884, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:37:24] (step=0002563) Train Loss: 0.1636, Train Steps/Sec: 0.12, Epoch: 0.0498056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:37:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2564, "loss": 0.2084897756576538, "memory_gb": 7.721559524536133, "step_time_ms": 7502.528190612793, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:37:32] (step=0002564) Train Loss: 0.2455, Train Steps/Sec: 0.12, Epoch: 0.04982510687912942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 
23:37:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2565, "loss": 0.3558921217918396, "memory_gb": 7.721559524536133, "step_time_ms": 7508.777856826782, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:37:40] (step=0002565) Train Loss: 0.2995, Train Steps/Sec: 0.12, Epoch: 0.04984453944811504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:37:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2566, "loss": 0.18502146005630493, "memory_gb": 7.721559524536133, "step_time_ms": 7428.167104721069, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:37:48] (step=0002566) Train Loss: 0.1845, Train Steps/Sec: 0.12, Epoch: 0.04986397201710066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:37:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2567, "loss": 0.23754477500915527, "memory_gb": 7.721559524536133, "step_time_ms": 7508.246421813965, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:37:56] (step=0002567) Train Loss: 0.2179, Train Steps/Sec: 0.13, Epoch: 0.04988340458608628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:38:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2568, "loss": 0.31187304854393005, "memory_gb": 7.721559524536133, "step_time_ms": 7566.803455352783, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:38:04] (step=0002568) Train Loss: 0.2413, Train Steps/Sec: 0.12, Epoch: 0.0499028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:38:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2569, "loss": 0.26051416993141174, "memory_gb": 7.721559524536133, "step_time_ms": 7494.687080383301, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:38:12] (step=0002569) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.04992226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:38:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2570, "loss": 0.18776682019233704, "memory_gb": 7.721559524536133, "step_time_ms": 7528.35488319397, "trainable_params": 
4718592, "method": "lora"} [2025-07-28 23:38:20] (step=0002570) Train Loss: 0.2272, Train Steps/Sec: 0.12, Epoch: 0.04994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:38:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2571, "loss": 0.22155748307704926, "memory_gb": 7.721559524536133, "step_time_ms": 7539.016246795654, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:38:28] (step=0002571) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.04996113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:38:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2572, "loss": 0.2643219232559204, "memory_gb": 7.721559524536133, "step_time_ms": 7521.986484527588, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:38:36] (step=0002572) Train Loss: 0.2766, Train Steps/Sec: 0.13, Epoch: 0.04998056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:38:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2573, "loss": 0.27970924973487854, "memory_gb": 7.715639114379883, "step_time_ms": 7543.018341064453, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:38:44] (step=0002573) Train Loss: 0.2675, Train Steps/Sec: 0.12, Epoch: 0.05, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:38:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2574, "loss": 0.13727307319641113, "memory_gb": 7.721559524536133, "step_time_ms": 7683.009624481201, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:38:52] (step=0002574) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.05001943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:39:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2575, "loss": 0.26368361711502075, "memory_gb": 7.721559524536133, "step_time_ms": 7481.7750453948975, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:39:00] (step=0002575) Train Loss: 0.2208, Train Steps/Sec: 0.12, Epoch: 0.05003886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
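NOTE: the EFFICIENCY_METRICS records in this log are line-embedded JSON and can be post-processed directly. A minimal sketch (an assumption of this analysis, not part of the training script) that extracts the records and reads out per-step fields:

```python
import json
import re

# Each EFFICIENCY_METRICS record is a flat JSON object embedded after a
# timestamp; this regex captures the payload (format inferred from the log).
PATTERN = re.compile(r"EFFICIENCY_METRICS: (\{.*?\})")

def parse_metrics(log_text):
    """Yield one dict per EFFICIENCY_METRICS record found in the log text."""
    for match in PATTERN.finditer(log_text):
        yield json.loads(match.group(1))

# Hypothetical sample line copied from the log above.
sample = (
    '[2025-07-28 23:33:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2533, '
    '"loss": 0.2958236336708069, "memory_gb": 7.721559524536133, '
    '"step_time_ms": 7515.687942504883, "trainable_params": 4718592, '
    '"method": "lora"}'
)

records = list(parse_metrics(sample))
print(records[0]["step"], round(records[0]["step_time_ms"] / 1000, 2))  # → 2533 7.52
```

Aggregating `step_time_ms` and `loss` across records recovers the steps/sec and smoothed-loss figures reported by the status lines.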
[2025-07-28 23:39:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2576, "loss": 0.2719554901123047, "memory_gb": 7.715639114379883, "step_time_ms": 7439.9254322052, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:39:08] (step=0002576) Train Loss: 0.2196, Train Steps/Sec: 0.13, Epoch: 0.05005829770695686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:39:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2577, "loss": 0.3232184052467346, "memory_gb": 7.721559524536133, "step_time_ms": 7576.9617557525635, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:39:16] (step=0002577) Train Loss: 0.3314, Train Steps/Sec: 0.12, Epoch: 0.05007773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:39:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2578, "loss": 0.31261634826660156, "memory_gb": 7.721559524536133, "step_time_ms": 7511.558055877686, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:39:24] (step=0002578) Train Loss: 0.3090, Train Steps/Sec: 0.12, Epoch: 0.0500971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2579, "loss": 0.2805858850479126, "memory_gb": 7.721559524536133, "step_time_ms": 7567.553758621216, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:39:32] (step=0002579) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.05011659541391372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:39:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2580, "loss": 0.3568022549152374, "memory_gb": 7.721559524536133, "step_time_ms": 7604.46834564209, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:39:40] (step=0002580) Train Loss: 0.2682, Train Steps/Sec: 0.12, Epoch: 0.05013602798289934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:39:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2581, "loss": 0.32722994685173035, "memory_gb": 7.721559524536133, "step_time_ms": 7530.559539794922, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:39:48] (step=0002581) Train Loss: 0.2903, Train Steps/Sec: 0.12, Epoch: 0.05015546055188496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:39:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2582, "loss": 0.1879490166902542, "memory_gb": 7.721559524536133, "step_time_ms": 7584.82551574707, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:39:56] (step=0002582) Train Loss: 0.2121, Train Steps/Sec: 0.12, Epoch: 0.050174893120870576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2583, "loss": 0.2590360641479492, "memory_gb": 7.721559524536133, "step_time_ms": 7672.529697418213, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:04] (step=0002583) Train Loss: 0.2500, Train Steps/Sec: 0.12, Epoch: 0.0501943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2584, "loss": 0.36471331119537354, "memory_gb": 7.721559524536133, "step_time_ms": 7426.928281784058, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:12] (step=0002584) Train Loss: 0.3450, Train Steps/Sec: 0.13, Epoch: 0.05021375825884182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2585, "loss": 0.332528293132782, "memory_gb": 7.715639114379883, "step_time_ms": 7048.952102661133, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:19] (step=0002585) Train Loss: 0.3072, Train Steps/Sec: 0.14, Epoch: 0.050233190827827436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2586, "loss": 0.2829987406730652, "memory_gb": 7.721559524536133, "step_time_ms": 5709.166049957275, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:26] (step=0002586) Train Loss: 0.2408, Train Steps/Sec: 0.15, Epoch: 0.05025262339681306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2587, "loss": 0.2631843388080597, "memory_gb": 7.721559524536133, "step_time_ms": 7536.697626113892, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:34] (step=0002587) Train Loss: 0.3023, Train Steps/Sec: 0.12, Epoch: 0.05027205596579868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2588, "loss": 0.22631214559078217, "memory_gb": 7.721559524536133, "step_time_ms": 7578.366756439209, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:42] (step=0002588) Train Loss: 0.2117, Train Steps/Sec: 0.12, Epoch: 0.050291488534784295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2589, "loss": 0.20965614914894104, "memory_gb": 7.721559524536133, "step_time_ms": 7498.707056045532, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:50] (step=0002589) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.05031092110376992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:40:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2590, "loss": 0.21840515732765198, "memory_gb": 7.721559524536133, "step_time_ms": 7522.608995437622, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:40:58] (step=0002590) Train Loss: 0.2344, Train Steps/Sec: 0.12, Epoch: 0.05033035367275554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:41:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2591, "loss": 0.2146899700164795, "memory_gb": 7.721559524536133, "step_time_ms": 7548.084259033203, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:41:06] (step=0002591) Train Loss: 0.2130, Train Steps/Sec: 0.12, Epoch: 0.050349786241741155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:41:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2592, "loss": 0.2996915876865387, "memory_gb": 7.721559524536133, "step_time_ms": 7481.603145599365, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:41:14] (step=0002592) Train Loss: 0.2779, Train Steps/Sec: 0.12, Epoch: 0.05036921881072678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:41:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2593, "loss": 0.28413668274879456, "memory_gb": 7.721559524536133, "step_time_ms": 7512.403249740601, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:41:22] (step=0002593) Train Loss: 0.3163, Train Steps/Sec: 0.12, Epoch: 0.0503886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:41:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2594, "loss": 0.2439059019088745, "memory_gb": 7.721559524536133, "step_time_ms": 7499.36318397522, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:41:30] (step=0002594) Train Loss: 0.2437, Train Steps/Sec: 0.12, Epoch: 0.050408083948698015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:41:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2595, "loss": 0.30939340591430664, "memory_gb": 7.721559524536133, "step_time_ms": 7457.188844680786, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:41:38] (step=0002595) Train Loss: 0.2530, Train Steps/Sec: 0.12, Epoch: 0.05042751651768364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:41:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2596, "loss": 0.25728797912597656, "memory_gb": 7.721559524536133, "step_time_ms": 7518.70322227478, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:41:46] (step=0002596) Train Loss: 0.2954, Train Steps/Sec: 0.12, Epoch: 0.05044694908666926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:41:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2597, "loss": 0.22090119123458862, "memory_gb": 7.721559524536133, "step_time_ms": 7490.829944610596, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:41:54] (step=0002597) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.050466381655654875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2598, "loss": 0.26302140951156616, "memory_gb": 7.721559524536133, "step_time_ms": 7468.517065048218, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:03] (step=0002598) Train Loss: 0.2143, Train Steps/Sec: 0.12, Epoch: 0.0504858142246405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2599, "loss": 0.26551905274391174, "memory_gb": 7.721559524536133, "step_time_ms": 7546.585559844971, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:11] (step=0002599) Train Loss: 0.2735, Train Steps/Sec: 0.12, Epoch: 0.05050524679362612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2600, "loss": 0.23471711575984955, "memory_gb": 7.721559524536133, "step_time_ms": 7531.129360198975, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:19] (step=0002600) Train Loss: 0.2685, Train Steps/Sec: 0.12, Epoch: 0.050524679362611735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2601, "loss": 0.13128839433193207, "memory_gb": 7.721559524536133, "step_time_ms": 7414.814233779907, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:27] (step=0002601) Train Loss: 0.1757, Train Steps/Sec: 0.12, Epoch: 0.05054411193159736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2602, "loss": 0.2750532031059265, "memory_gb": 7.721559524536133, "step_time_ms": 7432.364702224731, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:35] (step=0002602) Train Loss: 0.2892, Train Steps/Sec: 0.12, Epoch: 0.05056354450058298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2603, "loss": 0.21086983382701874, "memory_gb": 7.721559524536133, "step_time_ms": 7494.332551956177, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:43] (step=0002603) Train Loss: 0.1918, Train Steps/Sec: 0.12, Epoch: 0.050582977069568595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2604, "loss": 0.14908340573310852, "memory_gb": 7.721559524536133, "step_time_ms": 7420.2916622161865, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:51] (step=0002604) Train Loss: 0.1827, Train Steps/Sec: 0.12, Epoch: 0.05060240963855422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:42:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2605, "loss": 0.22532346844673157, "memory_gb": 7.721559524536133, "step_time_ms": 7463.516712188721, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:42:59] (step=0002605) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.05062184220753984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:43:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2606, "loss": 0.10041195154190063, "memory_gb": 7.721559524536133, "step_time_ms": 7499.517440795898, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:43:07] (step=0002606) Train Loss: 0.1435, Train Steps/Sec: 0.12, Epoch: 0.050641274776525455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:43:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2607, "loss": 0.24682240188121796, "memory_gb": 7.721559524536133, "step_time_ms": 7429.927587509155, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:43:15] (step=0002607) Train Loss: 0.2525, Train Steps/Sec: 0.13, Epoch: 0.05066070734551108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2608, "loss": 0.25041916966438293, "memory_gb": 7.721559524536133, "step_time_ms": 7478.279590606689, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:43:23] (step=0002608) Train Loss: 0.2501, Train Steps/Sec: 0.13, Epoch: 0.0506801399144967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:43:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2609, "loss": 0.1904890090227127, "memory_gb": 7.721559524536133, "step_time_ms": 7484.585285186768, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:43:31] (step=0002609) Train Loss: 0.2435, Train Steps/Sec: 0.13, Epoch: 0.050699572483482315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:43:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2610, "loss": 0.24454852938652039, "memory_gb": 7.721559524536133, "step_time_ms": 7493.234395980835, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:43:39] (step=0002610) Train Loss: 0.2836, Train Steps/Sec: 0.13, Epoch: 0.05071900505246794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:43:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2611, "loss": 0.258633553981781, "memory_gb": 7.721559524536133, "step_time_ms": 7472.058534622192, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:43:47] (step=0002611) Train Loss: 0.2156, Train Steps/Sec: 0.13, Epoch: 0.05073843762145356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:43:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2612, "loss": 0.25180572271347046, "memory_gb": 7.721559524536133, "step_time_ms": 7508.631706237793, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:43:55] (step=0002612) Train Loss: 0.2534, Train Steps/Sec: 0.12, Epoch: 0.050757870190439175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2613, "loss": 0.26305967569351196, "memory_gb": 7.721559524536133, "step_time_ms": 7281.5868854522705, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:03] (step=0002613) Train Loss: 0.2701, Train Steps/Sec: 0.13, Epoch: 0.0507773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2614, "loss": 0.18069711327552795, "memory_gb": 7.721559524536133, "step_time_ms": 6833.9784145355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:10] (step=0002614) Train Loss: 0.2382, Train Steps/Sec: 0.14, Epoch: 0.05079673532841041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2615, "loss": 0.16932441294193268, "memory_gb": 7.721559524536133, "step_time_ms": 6462.554931640625, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:17] (step=0002615) Train Loss: 0.2475, Train Steps/Sec: 0.15, Epoch: 0.050816167897396035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2616, "loss": 0.25032559037208557, "memory_gb": 7.721559524536133, "step_time_ms": 7484.133005142212, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:25] (step=0002616) Train Loss: 0.3005, Train Steps/Sec: 0.12, Epoch: 0.05083560046638166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2617, "loss": 0.23456993699073792, "memory_gb": 7.721559524536133, "step_time_ms": 7541.834592819214, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:33] (step=0002617) Train Loss: 0.2535, Train Steps/Sec: 0.13, Epoch: 0.05085503303536727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2618, "loss": 0.17944088578224182, "memory_gb": 7.721559524536133, "step_time_ms": 7410.4673862457275, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:41] (step=0002618) Train Loss: 0.1980, Train Steps/Sec: 0.12, Epoch: 0.050874465604352895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2619, "loss": 0.15123474597930908, "memory_gb": 7.721559524536133, "step_time_ms": 7454.385757446289, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:49] (step=0002619) Train Loss: 0.2095, Train Steps/Sec: 0.12, Epoch: 0.05089389817333852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:44:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2620, "loss": 0.12539944052696228, "memory_gb": 7.721559524536133, "step_time_ms": 7479.720592498779, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:44:57] (step=0002620) Train Loss: 0.2241, Train Steps/Sec: 0.12, Epoch: 0.05091333074232413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:45:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2621, "loss": 0.20560654997825623, "memory_gb": 7.721559524536133, "step_time_ms": 7450.199365615845, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:45:05] (step=0002621) Train Loss: 0.2410, Train Steps/Sec: 0.12, Epoch: 0.050932763311309755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:45:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2622, "loss": 0.11141913384199142, "memory_gb": 7.721559524536133, "step_time_ms": 7475.151538848877, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:45:13] (step=0002622) Train Loss: 0.2199, Train Steps/Sec: 0.12, Epoch: 0.05095219588029538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:45:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2623, "loss": 0.35732877254486084, "memory_gb": 7.721559524536133, "step_time_ms": 7538.935422897339, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:45:21] (step=0002623) Train Loss: 0.3265, Train Steps/Sec: 0.12, Epoch: 0.05097162844928099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:45:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2624, "loss": 0.2393827736377716, "memory_gb": 7.721559524536133, "step_time_ms": 7424.119472503662, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:45:29] (step=0002624) Train Loss: 0.2647, Train Steps/Sec: 0.12, Epoch: 0.050991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:45:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2625, "loss": 0.3191031515598297, "memory_gb": 7.721559524536133, "step_time_ms": 7465.826988220215, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:45:37] (step=0002625) Train Loss: 0.2867, Train Steps/Sec: 0.12, Epoch: 0.05101049358725224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:45:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2626, "loss": 0.2613198161125183, "memory_gb": 7.721559524536133, "step_time_ms": 7517.004728317261, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:45:45] (step=0002626) Train Loss: 0.2401, Train Steps/Sec: 0.12, Epoch: 0.05102992615623785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:45:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2627, "loss": 0.24084022641181946, "memory_gb": 7.721559524536133, "step_time_ms": 7438.938617706299, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:45:53] (step=0002627) Train Loss: 0.2094, Train Steps/Sec: 0.12, Epoch: 0.051049358725223475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2628, "loss": 0.249204620718956, "memory_gb": 7.721559524536133, "step_time_ms": 7472.227573394775, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:01] (step=0002628) Train Loss: 0.2599, Train Steps/Sec: 0.12, Epoch: 0.0510687912942091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2629, "loss": 0.24321727454662323, "memory_gb": 7.721559524536133, "step_time_ms": 7520.2319622039795, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:09] (step=0002629) Train Loss: 0.2468, Train Steps/Sec: 0.12, Epoch: 0.05108822386319471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2630, "loss": 0.29066896438598633, "memory_gb": 7.721559524536133, "step_time_ms": 7465.767621994019, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:17] (step=0002630) Train Loss: 0.2582, Train Steps/Sec: 0.12, Epoch: 0.051107656432180334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2631, "loss": 0.24261167645454407, "memory_gb": 7.715639114379883, "step_time_ms": 7503.867387771606, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:25] (step=0002631) Train Loss: 0.2624, Train Steps/Sec: 0.12, Epoch: 0.05112708900116596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2632, "loss": 0.16688546538352966, "memory_gb": 7.721559524536133, "step_time_ms": 7566.771507263184, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:33] (step=0002632) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.05114652157015157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2633, "loss": 0.1321454644203186, "memory_gb": 7.721559524536133, "step_time_ms": 7528.3660888671875, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:41] (step=0002633) Train Loss: 0.1972, Train Steps/Sec: 0.13, Epoch: 0.051165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2634, "loss": 0.17521819472312927, "memory_gb": 7.721559524536133, "step_time_ms": 7536.705732345581, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:49] (step=0002634) Train Loss: 0.2170, Train Steps/Sec: 0.12, Epoch: 0.05118538670812282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:46:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2635, "loss": 0.2283901870250702, "memory_gb": 7.721559524536133, "step_time_ms": 7598.1128215789795, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:46:57] (step=0002635) Train Loss: 0.2149, Train Steps/Sec: 0.12, Epoch: 0.05120481927710843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:47:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2636, "loss": 0.3667709529399872, "memory_gb": 7.721559524536133, "step_time_ms": 7503.905534744263, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:47:05] (step=0002636) Train Loss: 0.2666, Train Steps/Sec: 0.13, Epoch: 0.051224251846094054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2637, "loss": 0.20828360319137573, "memory_gb": 7.721559524536133, "step_time_ms": 7508.5155963897705, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:47:13] (step=0002637) Train Loss: 0.2060, Train Steps/Sec: 0.12, Epoch: 0.05124368441507968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:47:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2638, "loss": 0.140878826379776, "memory_gb": 7.721559524536133, "step_time_ms": 7538.140773773193, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:47:21] (step=0002638) Train Loss: 0.1527, Train Steps/Sec: 0.12, Epoch: 0.05126311698406529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:47:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2639, "loss": 0.25317075848579407, "memory_gb": 7.721559524536133, "step_time_ms": 7456.545114517212, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:47:29] (step=0002639) Train Loss: 0.2334, Train Steps/Sec: 0.12, Epoch: 0.051282549553050914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:47:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2640, "loss": 0.15570414066314697, "memory_gb": 7.721559524536133, "step_time_ms": 7506.886959075928, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:47:37] (step=0002640) Train Loss: 0.1385, Train Steps/Sec: 0.13, Epoch: 0.05130198212203654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:47:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2641, "loss": 0.19434770941734314, "memory_gb": 7.721559524536133, "step_time_ms": 7603.4650802612305, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:47:45] (step=0002641) Train Loss: 0.2696, Train Steps/Sec: 0.12, Epoch: 0.05132141469102215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:47:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2642, "loss": 0.2361484318971634, "memory_gb": 7.721559524536133, "step_time_ms": 7386.194229125977, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:47:53] (step=0002642) Train Loss: 0.1896, Train Steps/Sec: 0.13, Epoch: 0.051340847260007774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:48:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2643, "loss": 0.1736541986465454, "memory_gb": 7.721559524536133, "step_time_ms": 6792.650461196899, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:48:00] (step=0002643) Train Loss: 0.2461, Train Steps/Sec: 0.14, Epoch: 0.05136027982899339, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:48:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2644, "loss": 0.18073655664920807, "memory_gb": 7.721559524536133, "step_time_ms": 5899.839162826538, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:48:07] (step=0002644) Train Loss: 0.2266, Train Steps/Sec: 0.16, Epoch: 0.05137971239797901, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:48:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2645, "loss": 0.272230327129364, "memory_gb": 7.721559524536133, "step_time_ms": 7563.747882843018, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:48:15] (step=0002645) Train Loss: 0.2903, Train Steps/Sec: 0.13, Epoch: 0.051399144966964634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:48:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2646, "loss": 0.18738438189029694, "memory_gb": 7.721559524536133, "step_time_ms": 7554.579257965088, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:48:23] (step=0002646) Train Loss: 0.2017, Train Steps/Sec: 0.12, Epoch: 0.05141857753595025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:48:31] EFFICIENCY_METRICS:
{"epoch": 0, "step": 2647, "loss": 0.23915952444076538, "memory_gb": 7.721559524536133, "step_time_ms": 7464.816331863403, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:48:31] (step=0002647) Train Loss: 0.2276, Train Steps/Sec: 0.13, Epoch: 0.05143801010493587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:48:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2648, "loss": 0.33006787300109863, "memory_gb": 7.721559524536133, "step_time_ms": 7524.912595748901, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:48:39] (step=0002648) Train Loss: 0.2954, Train Steps/Sec: 0.12, Epoch: 0.051457442673921494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:48:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2649, "loss": 0.21472784876823425, "memory_gb": 7.721559524536133, "step_time_ms": 7619.726657867432, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:48:47] (step=0002649) Train Loss: 0.2715, Train Steps/Sec: 0.12, Epoch: 0.05147687524290711, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:48:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2650, "loss": 0.17784538865089417, "memory_gb": 7.721559524536133, "step_time_ms": 7485.929012298584, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:48:55] (step=0002650) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.05149630781189273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2651, "loss": 0.2170938104391098, "memory_gb": 7.721559524536133, "step_time_ms": 7474.57218170166, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:49:03] (step=0002651) Train Loss: 0.2815, Train Steps/Sec: 0.13, Epoch: 0.051515740380878354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2652, "loss": 0.20711687207221985, "memory_gb": 7.721559524536133, "step_time_ms": 7565.268993377686, "trainable_params": 4718592, "method": "lora"} 
[2025-07-28 23:49:11] (step=0002652) Train Loss: 0.2819, Train Steps/Sec: 0.12, Epoch: 0.05153517294986397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2653, "loss": 0.29951953887939453, "memory_gb": 7.721559524536133, "step_time_ms": 7460.9198570251465, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:49:19] (step=0002653) Train Loss: 0.2963, Train Steps/Sec: 0.12, Epoch: 0.05155460551884959, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2654, "loss": 0.3325357735157013, "memory_gb": 7.721559524536133, "step_time_ms": 7508.51845741272, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:49:27] (step=0002654) Train Loss: 0.2278, Train Steps/Sec: 0.12, Epoch: 0.051574038087835214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2655, "loss": 0.3219835162162781, "memory_gb": 7.721559524536133, "step_time_ms": 7659.984588623047, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:49:35] (step=0002655) Train Loss: 0.2991, Train Steps/Sec: 0.12, Epoch: 0.05159347065682083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2656, "loss": 0.18054142594337463, "memory_gb": 7.721559524536133, "step_time_ms": 7555.767774581909, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:49:43] (step=0002656) Train Loss: 0.2129, Train Steps/Sec: 0.13, Epoch: 0.05161290322580645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2657, "loss": 0.3027005195617676, "memory_gb": 7.721559524536133, "step_time_ms": 7541.531324386597, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:49:51] (step=0002657) Train Loss: 0.2406, Train Steps/Sec: 0.12, Epoch: 0.051632335794792074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:49:59] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2658, "loss": 0.21472452580928802, "memory_gb": 7.721559524536133, "step_time_ms": 7573.397159576416, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:49:59] (step=0002658) Train Loss: 0.2913, Train Steps/Sec: 0.12, Epoch: 0.05165176836377769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:50:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2659, "loss": 0.1839640736579895, "memory_gb": 7.721559524536133, "step_time_ms": 7467.246770858765, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:50:07] (step=0002659) Train Loss: 0.2393, Train Steps/Sec: 0.12, Epoch: 0.05167120093276331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:50:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2660, "loss": 0.30441248416900635, "memory_gb": 7.715639114379883, "step_time_ms": 7473.2842445373535, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:50:15] (step=0002660) Train Loss: 0.2600, Train Steps/Sec: 0.12, Epoch: 0.051690633501748934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:50:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2661, "loss": 0.3039066195487976, "memory_gb": 7.721559524536133, "step_time_ms": 7518.480539321899, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:50:23] (step=0002661) Train Loss: 0.2816, Train Steps/Sec: 0.12, Epoch: 0.05171006607073455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:50:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2662, "loss": 0.2640056610107422, "memory_gb": 7.721559524536133, "step_time_ms": 7442.6891803741455, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:50:31] (step=0002662) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.05172949863972017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:50:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2663, "loss": 0.09509754180908203, "memory_gb": 7.721559524536133, "step_time_ms": 7587.061643600464, "trainable_params": 4718592, 
"method": "lora"} [2025-07-28 23:50:40] (step=0002663) Train Loss: 0.1429, Train Steps/Sec: 0.12, Epoch: 0.051748931208705794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:50:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2664, "loss": 0.203754723072052, "memory_gb": 7.721559524536133, "step_time_ms": 7505.631923675537, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:50:48] (step=0002664) Train Loss: 0.2106, Train Steps/Sec: 0.12, Epoch: 0.05176836377769141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:50:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2665, "loss": 0.29669737815856934, "memory_gb": 7.721559524536133, "step_time_ms": 7384.886980056763, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:50:56] (step=0002665) Train Loss: 0.2423, Train Steps/Sec: 0.12, Epoch: 0.05178779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:51:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2666, "loss": 0.23186840116977692, "memory_gb": 7.721559524536133, "step_time_ms": 7193.591594696045, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:04] (step=0002666) Train Loss: 0.1887, Train Steps/Sec: 0.12, Epoch: 0.051807228915662654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:51:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2667, "loss": 0.19591274857521057, "memory_gb": 7.721559524536133, "step_time_ms": 7462.805986404419, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:12] (step=0002667) Train Loss: 0.2144, Train Steps/Sec: 0.12, Epoch: 0.05182666148464827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:51:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2668, "loss": 0.07306378334760666, "memory_gb": 7.721559524536133, "step_time_ms": 7378.2384395599365, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:20] (step=0002668) Train Loss: 0.1520, Train Steps/Sec: 0.12, Epoch: 0.05184609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-28 23:51:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2669, "loss": 0.23337581753730774, "memory_gb": 7.721559524536133, "step_time_ms": 7413.652658462524, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:28] (step=0002669) Train Loss: 0.2813, Train Steps/Sec: 0.12, Epoch: 0.051865526622619514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:51:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2670, "loss": 0.22546806931495667, "memory_gb": 7.721559524536133, "step_time_ms": 7493.540525436401, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:36] (step=0002670) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.05188495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:51:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2671, "loss": 0.19253508746623993, "memory_gb": 7.721559524536133, "step_time_ms": 7277.9388427734375, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:44] (step=0002671) Train Loss: 0.2170, Train Steps/Sec: 0.13, Epoch: 0.05190439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:51:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2672, "loss": 0.3013146221637726, "memory_gb": 7.721559524536133, "step_time_ms": 6567.30055809021, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:51] (step=0002672) Train Loss: 0.2042, Train Steps/Sec: 0.15, Epoch: 0.05192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:51:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2673, "loss": 0.24256937205791473, "memory_gb": 7.721559524536133, "step_time_ms": 5992.378711700439, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:51:58] (step=0002673) Train Loss: 0.2525, Train Steps/Sec: 0.14, Epoch: 0.05194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:52:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2674, "loss": 0.38375717401504517, "memory_gb": 7.721559524536133, "step_time_ms": 7440.361499786377, 
"trainable_params": 4718592, "method": "lora"} [2025-07-28 23:52:06] (step=0002674) Train Loss: 0.3163, Train Steps/Sec: 0.13, Epoch: 0.05196268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:52:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2675, "loss": 0.18728218972682953, "memory_gb": 7.721559524536133, "step_time_ms": 7507.312536239624, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:52:14] (step=0002675) Train Loss: 0.2075, Train Steps/Sec: 0.12, Epoch: 0.051982122036533226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:52:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2676, "loss": 0.23182164132595062, "memory_gb": 7.721559524536133, "step_time_ms": 7471.680641174316, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:52:22] (step=0002676) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.05200155460551885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:52:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2677, "loss": 0.2696044445037842, "memory_gb": 7.721559524536133, "step_time_ms": 7488.379955291748, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:52:30] (step=0002677) Train Loss: 0.2739, Train Steps/Sec: 0.13, Epoch: 0.05202098717450447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:52:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2678, "loss": 0.2491374909877777, "memory_gb": 7.721559524536133, "step_time_ms": 7588.27543258667, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:52:38] (step=0002678) Train Loss: 0.2211, Train Steps/Sec: 0.13, Epoch: 0.052040419743490086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:52:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2679, "loss": 0.20627495646476746, "memory_gb": 7.721559524536133, "step_time_ms": 7500.81992149353, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:52:46] (step=0002679) Train Loss: 0.2022, Train Steps/Sec: 0.12, Epoch: 0.05205985231247571, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-28 23:52:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2680, "loss": 0.30626028776168823, "memory_gb": 7.721559524536133, "step_time_ms": 7523.747444152832, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:52:54] (step=0002680) Train Loss: 0.2723, Train Steps/Sec: 0.12, Epoch: 0.05207928488146133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2681, "loss": 0.2932053506374359, "memory_gb": 7.721559524536133, "step_time_ms": 7595.426797866821, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:02] (step=0002681) Train Loss: 0.2183, Train Steps/Sec: 0.12, Epoch: 0.052098717450446946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2682, "loss": 0.24832533299922943, "memory_gb": 7.721559524536133, "step_time_ms": 7298.522472381592, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:10] (step=0002682) Train Loss: 0.1962, Train Steps/Sec: 0.13, Epoch: 0.05211815001943257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2683, "loss": 0.2073509395122528, "memory_gb": 7.721559524536133, "step_time_ms": 7490.368127822876, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:18] (step=0002683) Train Loss: 0.2030, Train Steps/Sec: 0.12, Epoch: 0.05213758258841819, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2684, "loss": 0.1772390902042389, "memory_gb": 7.721559524536133, "step_time_ms": 7532.801389694214, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:26] (step=0002684) Train Loss: 0.2242, Train Steps/Sec: 0.12, Epoch: 0.052157015157403806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2685, "loss": 0.29143285751342773, "memory_gb": 7.721559524536133, "step_time_ms": 
7506.9849491119385, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:34] (step=0002685) Train Loss: 0.2797, Train Steps/Sec: 0.12, Epoch: 0.05217644772638943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2686, "loss": 0.1830017864704132, "memory_gb": 7.721559524536133, "step_time_ms": 7475.875616073608, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:42] (step=0002686) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.05219588029537505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2687, "loss": 0.1438019573688507, "memory_gb": 7.721559524536133, "step_time_ms": 7527.060985565186, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:50] (step=0002687) Train Loss: 0.1835, Train Steps/Sec: 0.12, Epoch: 0.052215312864360666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:53:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2688, "loss": 0.2453654706478119, "memory_gb": 7.721559524536133, "step_time_ms": 7441.6823387146, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:53:58] (step=0002688) Train Loss: 0.2486, Train Steps/Sec: 0.12, Epoch: 0.05223474543334629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:54:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2689, "loss": 0.21532166004180908, "memory_gb": 7.721559524536133, "step_time_ms": 7448.896169662476, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:54:06] (step=0002689) Train Loss: 0.2346, Train Steps/Sec: 0.12, Epoch: 0.05225417800233191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:54:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2690, "loss": 0.28650617599487305, "memory_gb": 7.715639114379883, "step_time_ms": 7510.460376739502, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:54:14] (step=0002690) Train Loss: 0.2796, Train Steps/Sec: 0.12, Epoch: 0.052273610571317526, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:54:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2691, "loss": 0.3816208839416504, "memory_gb": 7.721559524536133, "step_time_ms": 7532.734632492065, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:54:22] (step=0002691) Train Loss: 0.3513, Train Steps/Sec: 0.12, Epoch: 0.05229304314030315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:54:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2692, "loss": 0.22743456065654755, "memory_gb": 7.721559524536133, "step_time_ms": 7464.798927307129, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:54:30] (step=0002692) Train Loss: 0.2200, Train Steps/Sec: 0.13, Epoch: 0.05231247570928877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:54:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2693, "loss": 0.2602335810661316, "memory_gb": 7.721559524536133, "step_time_ms": 7545.218467712402, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:54:38] (step=0002693) Train Loss: 0.2632, Train Steps/Sec: 0.12, Epoch: 0.052331908278274386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:54:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2694, "loss": 0.2886011004447937, "memory_gb": 7.721559524536133, "step_time_ms": 7458.191156387329, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:54:46] (step=0002694) Train Loss: 0.2374, Train Steps/Sec: 0.12, Epoch: 0.05235134084726001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:54:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2695, "loss": 0.2844516634941101, "memory_gb": 7.721559524536133, "step_time_ms": 7498.2922077178955, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:54:54] (step=0002695) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.05237077341624563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2696, "loss": 0.1766190528869629, "memory_gb": 7.721559524536133, 
"step_time_ms": 7562.427759170532, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:02] (step=0002696) Train Loss: 0.2279, Train Steps/Sec: 0.12, Epoch: 0.052390205985231246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2697, "loss": 0.22272959351539612, "memory_gb": 7.721559524536133, "step_time_ms": 7470.607280731201, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:10] (step=0002697) Train Loss: 0.2850, Train Steps/Sec: 0.12, Epoch: 0.05240963855421687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2698, "loss": 0.3106653094291687, "memory_gb": 7.721559524536133, "step_time_ms": 7497.658014297485, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:18] (step=0002698) Train Loss: 0.3414, Train Steps/Sec: 0.12, Epoch: 0.05242907112320249, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2699, "loss": 0.30271926522254944, "memory_gb": 7.721559524536133, "step_time_ms": 7609.096050262451, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:26] (step=0002699) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.052448503692188106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2700, "loss": 0.1484089493751526, "memory_gb": 7.721559524536133, "step_time_ms": 7436.100482940674, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:34] (step=0002700) Train Loss: 0.2082, Train Steps/Sec: 0.13, Epoch: 0.05246793626117373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2701, "loss": 0.23870964348316193, "memory_gb": 7.721559524536133, "step_time_ms": 6735.822916030884, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:41] (step=0002701) Train Loss: 0.2605, Train Steps/Sec: 0.14, Epoch: 
0.05248736883015935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2702, "loss": 0.1872246116399765, "memory_gb": 7.721559524536133, "step_time_ms": 6414.260625839233, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:48] (step=0002702) Train Loss: 0.2269, Train Steps/Sec: 0.15, Epoch: 0.052506801399144966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:55:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2703, "loss": 0.22352629899978638, "memory_gb": 7.721559524536133, "step_time_ms": 7651.854515075684, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:55:56] (step=0002703) Train Loss: 0.2182, Train Steps/Sec: 0.12, Epoch: 0.05252623396813059, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:56:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2704, "loss": 0.23993536829948425, "memory_gb": 7.721559524536133, "step_time_ms": 7545.33314704895, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:56:04] (step=0002704) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.052545666537116203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:56:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2705, "loss": 0.15360823273658752, "memory_gb": 7.721559524536133, "step_time_ms": 7482.090711593628, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:56:12] (step=0002705) Train Loss: 0.1876, Train Steps/Sec: 0.13, Epoch: 0.052565099106101826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:56:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2706, "loss": 0.22974419593811035, "memory_gb": 7.721559524536133, "step_time_ms": 7510.843753814697, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:56:20] (step=0002706) Train Loss: 0.2514, Train Steps/Sec: 0.12, Epoch: 0.05258453167508745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:56:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2707, "loss": 0.1499195396900177, 
"memory_gb": 7.721559524536133, "step_time_ms": 7591.080188751221, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:56:28] (step=0002707) Train Loss: 0.1948, Train Steps/Sec: 0.12, Epoch: 0.05260396424407306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:56:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2708, "loss": 0.28022849559783936, "memory_gb": 7.721559524536133, "step_time_ms": 7486.5171909332275, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:56:36] (step=0002708) Train Loss: 0.2708, Train Steps/Sec: 0.12, Epoch: 0.052623396813058686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:56:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2709, "loss": 0.2768806517124176, "memory_gb": 7.721559524536133, "step_time_ms": 7458.571195602417, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:56:44] (step=0002709) Train Loss: 0.2858, Train Steps/Sec: 0.12, Epoch: 0.05264282938204431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:56:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2710, "loss": 0.2442934811115265, "memory_gb": 7.721559524536133, "step_time_ms": 7586.020231246948, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:56:52] (step=0002710) Train Loss: 0.3014, Train Steps/Sec: 0.12, Epoch: 0.05266226195102992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2711, "loss": 0.24606963992118835, "memory_gb": 7.721559524536133, "step_time_ms": 7422.6975440979, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:57:00] (step=0002711) Train Loss: 0.2226, Train Steps/Sec: 0.13, Epoch: 0.052681694520015546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2712, "loss": 0.2992861270904541, "memory_gb": 7.721559524536133, "step_time_ms": 7461.232900619507, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:57:09] (step=0002712) Train Loss: 0.2577, Train 
Steps/Sec: 0.12, Epoch: 0.05270112708900117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2713, "loss": 0.2259959876537323, "memory_gb": 7.721559524536133, "step_time_ms": 7497.390031814575, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:57:17] (step=0002713) Train Loss: 0.2484, Train Steps/Sec: 0.12, Epoch: 0.05272055965798678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2714, "loss": 0.3232533037662506, "memory_gb": 7.721559524536133, "step_time_ms": 7431.044340133667, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:57:25] (step=0002714) Train Loss: 0.3099, Train Steps/Sec: 0.13, Epoch: 0.052739992226972406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2715, "loss": 0.21021848917007446, "memory_gb": 7.721559524536133, "step_time_ms": 7456.472635269165, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:57:33] (step=0002715) Train Loss: 0.2344, Train Steps/Sec: 0.12, Epoch: 0.05275942479595803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2716, "loss": 0.1131732314825058, "memory_gb": 7.721559524536133, "step_time_ms": 7550.591468811035, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:57:41] (step=0002716) Train Loss: 0.1437, Train Steps/Sec: 0.12, Epoch: 0.05277885736494364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2717, "loss": 0.18139353394508362, "memory_gb": 7.721559524536133, "step_time_ms": 7449.786424636841, "trainable_params": 4718592, "method": "lora"} [2025-07-28 23:57:49] (step=0002717) Train Loss: 0.2213, Train Steps/Sec: 0.12, Epoch: 0.052798289933929266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-28 23:57:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2718, "loss": 
0.3259878158569336, "memory_gb": 7.721559524536133, "step_time_ms": 7424.919366836548, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:57:57] (step=0002718) Train Loss: 0.3305, Train Steps/Sec: 0.12, Epoch: 0.05281772250291489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:58:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2719, "loss": 0.16053253412246704, "memory_gb": 7.721559524536133, "step_time_ms": 7514.037132263184, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:58:05] (step=0002719) Train Loss: 0.1901, Train Steps/Sec: 0.12, Epoch: 0.0528371550719005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2720, "loss": 0.19385865330696106, "memory_gb": 7.721559524536133, "step_time_ms": 7425.612211227417, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:58:13] (step=0002720) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.052856587640886125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:58:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2721, "loss": 0.29738616943359375, "memory_gb": 7.721559524536133, "step_time_ms": 7426.9421100616455, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:58:21] (step=0002721) Train Loss: 0.2094, Train Steps/Sec: 0.12, Epoch: 0.05287602020987175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:58:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2722, "loss": 0.30646997690200806, "memory_gb": 7.721559524536133, "step_time_ms": 7494.7216510772705, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:58:29] (step=0002722) Train Loss: 0.2471, Train Steps/Sec: 0.12, Epoch: 0.05289545277885736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:58:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2723, "loss": 0.1964261531829834, "memory_gb": 7.721559524536133, "step_time_ms": 7413.233995437622, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:58:37] (step=0002723) Train Loss: 0.1787, Train Steps/Sec: 0.12, Epoch: 0.052914885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:58:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2724, "loss": 0.3516056537628174, "memory_gb": 7.721559524536133, "step_time_ms": 7450.0696659088135, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:58:45] (step=0002724) Train Loss: 0.2668, Train Steps/Sec: 0.13, Epoch: 0.05293431791682861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:58:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2725, "loss": 0.10559774190187454, "memory_gb": 7.721559524536133, "step_time_ms": 7483.638763427734, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:58:53] (step=0002725) Train Loss: 0.1256, Train Steps/Sec: 0.12, Epoch: 0.05295375048581422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2726, "loss": 0.3024340569972992, "memory_gb": 7.721559524536133, "step_time_ms": 7385.49542427063, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:01] (step=0002726) Train Loss: 0.3108, Train Steps/Sec: 0.13, Epoch: 0.052973183054799845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2727, "loss": 0.13234937191009521, "memory_gb": 7.721559524536133, "step_time_ms": 7438.746929168701, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:09] (step=0002727) Train Loss: 0.1571, Train Steps/Sec: 0.13, Epoch: 0.05299261562378547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2728, "loss": 0.22579725086688995, "memory_gb": 7.721559524536133, "step_time_ms": 7545.679569244385, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:17] (step=0002728) Train Loss: 0.2310, Train Steps/Sec: 0.12, Epoch: 0.05301204819277108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2729, "loss": 0.4333333373069763, "memory_gb": 7.721559524536133, "step_time_ms": 7314.488172531128, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:25] (step=0002729) Train Loss: 0.3141, Train Steps/Sec: 0.13, Epoch: 0.053031480761756705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2730, "loss": 0.27023249864578247, "memory_gb": 7.721559524536133, "step_time_ms": 6350.96549987793, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:32] (step=0002730) Train Loss: 0.3199, Train Steps/Sec: 0.15, Epoch: 0.05305091333074233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2731, "loss": 0.16776347160339355, "memory_gb": 7.721559524536133, "step_time_ms": 6640.620708465576, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:39] (step=0002731) Train Loss: 0.1709, Train Steps/Sec: 0.14, Epoch: 0.05307034589972794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2732, "loss": 0.1927221119403839, "memory_gb": 7.721559524536133, "step_time_ms": 7443.598031997681, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:47] (step=0002732) Train Loss: 0.2277, Train Steps/Sec: 0.12, Epoch: 0.053089778468713565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-28 23:59:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2733, "loss": 0.2812788486480713, "memory_gb": 7.721559524536133, "step_time_ms": 7514.858722686768, "trainable_params": 4718592, "method": "lora"}
[2025-07-28 23:59:55] (step=0002733) Train Loss: 0.2517, Train Steps/Sec: 0.12, Epoch: 0.05310921103769918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2734, "loss": 0.2864925265312195, "memory_gb": 7.721559524536133, "step_time_ms": 7444.849252700806, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:03] (step=0002734) Train Loss: 0.2421, Train Steps/Sec: 0.12, Epoch: 0.0531286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2735, "loss": 0.22533875703811646, "memory_gb": 7.721559524536133, "step_time_ms": 7455.7044506073, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:11] (step=0002735) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.053148076175670425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2736, "loss": 0.2705046236515045, "memory_gb": 7.721559524536133, "step_time_ms": 7495.023488998413, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:19] (step=0002736) Train Loss: 0.2974, Train Steps/Sec: 0.12, Epoch: 0.05316750874465604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2737, "loss": 0.2034563422203064, "memory_gb": 7.721559524536133, "step_time_ms": 7426.651954650879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:27] (step=0002737) Train Loss: 0.2128, Train Steps/Sec: 0.12, Epoch: 0.05318694131364166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2738, "loss": 0.17568331956863403, "memory_gb": 7.721559524536133, "step_time_ms": 7451.295614242554, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:35] (step=0002738) Train Loss: 0.2346, Train Steps/Sec: 0.12, Epoch: 0.053206373882627285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2739, "loss": 0.24596552550792694, "memory_gb": 7.721559524536133, "step_time_ms": 7522.133827209473, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:43] (step=0002739) Train Loss: 0.2822, Train Steps/Sec: 0.12, Epoch: 0.0532258064516129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2740, "loss": 0.27340155839920044, "memory_gb": 7.721559524536133, "step_time_ms": 7465.588808059692, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:51] (step=0002740) Train Loss: 0.2515, Train Steps/Sec: 0.13, Epoch: 0.05324523902059852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:00:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2741, "loss": 0.2312961220741272, "memory_gb": 7.721559524536133, "step_time_ms": 7471.728563308716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:00:59] (step=0002741) Train Loss: 0.2037, Train Steps/Sec: 0.13, Epoch: 0.053264671589584145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:01:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2742, "loss": 0.22114822268486023, "memory_gb": 7.721559524536133, "step_time_ms": 7569.636821746826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:01:07] (step=0002742) Train Loss: 0.1772, Train Steps/Sec: 0.12, Epoch: 0.05328410415856976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2743, "loss": 0.17419052124023438, "memory_gb": 7.721559524536133, "step_time_ms": 7468.7840938568115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:01:16] (step=0002743) Train Loss: 0.2184, Train Steps/Sec: 0.12, Epoch: 0.05330353672755538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:01:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2744, "loss": 0.1819424033164978, "memory_gb": 7.721559524536133, "step_time_ms": 7485.9020709991455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:01:24] (step=0002744) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.053322969296541005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:01:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2745, "loss": 0.2657383680343628, "memory_gb": 7.721559524536133, "step_time_ms": 7551.3153076171875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:01:32] (step=0002745) Train Loss: 0.2546, Train Steps/Sec: 0.12, Epoch: 0.05334240186552662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:01:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2746, "loss": 0.22063162922859192, "memory_gb": 7.721559524536133, "step_time_ms": 7500.833034515381, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:01:40] (step=0002746) Train Loss: 0.2356, Train Steps/Sec: 0.12, Epoch: 0.05336183443451224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:01:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2747, "loss": 0.2731286287307739, "memory_gb": 7.721559524536133, "step_time_ms": 7363.590717315674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:01:48] (step=0002747) Train Loss: 0.2027, Train Steps/Sec: 0.12, Epoch: 0.053381267003497865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:01:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2748, "loss": 0.11822886765003204, "memory_gb": 7.721559524536133, "step_time_ms": 7575.894594192505, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:01:56] (step=0002748) Train Loss: 0.1732, Train Steps/Sec: 0.12, Epoch: 0.05340069957248348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:02:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2749, "loss": 0.17623868584632874, "memory_gb": 7.721559524536133, "step_time_ms": 7574.639320373535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:02:04] (step=0002749) Train Loss: 0.2763, Train Steps/Sec: 0.13, Epoch: 0.0534201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:02:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2750, "loss": 0.1888657510280609, "memory_gb": 7.721559524536133, "step_time_ms": 7750.596761703491, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:02:12] (step=0002750) Train Loss: 0.1899, Train Steps/Sec: 0.12, Epoch: 0.053439564710454725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2751, "loss": 0.17490127682685852, "memory_gb": 7.721559524536133, "step_time_ms": 7675.180196762085, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:02:20] (step=0002751) Train Loss: 0.1740, Train Steps/Sec: 0.12, Epoch: 0.05345899727944034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:02:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2752, "loss": 0.18214809894561768, "memory_gb": 7.721559524536133, "step_time_ms": 7623.937129974365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:02:28] (step=0002752) Train Loss: 0.2648, Train Steps/Sec: 0.12, Epoch: 0.05347842984842596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:02:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2753, "loss": 0.2587485611438751, "memory_gb": 7.721559524536133, "step_time_ms": 7507.071018218994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:02:36] (step=0002753) Train Loss: 0.2126, Train Steps/Sec: 0.12, Epoch: 0.053497862417411585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:02:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2754, "loss": 0.24427784979343414, "memory_gb": 7.721559524536133, "step_time_ms": 7529.555082321167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:02:44] (step=0002754) Train Loss: 0.2016, Train Steps/Sec: 0.12, Epoch: 0.0535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:02:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2755, "loss": 0.30573606491088867, "memory_gb": 7.721559524536133, "step_time_ms": 7451.068639755249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:02:52] (step=0002755) Train Loss: 0.2358, Train Steps/Sec: 0.13, Epoch: 0.05353672755538282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2756, "loss": 0.1380983591079712, "memory_gb": 7.721559524536133, "step_time_ms": 7464.712619781494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:00] (step=0002756) Train Loss: 0.1917, Train Steps/Sec: 0.12, Epoch: 0.053556160124368445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2757, "loss": 0.17304709553718567, "memory_gb": 7.721559524536133, "step_time_ms": 7574.18966293335, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:08] (step=0002757) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.05357559269335406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2758, "loss": 0.3013072907924652, "memory_gb": 7.721559524536133, "step_time_ms": 7370.487689971924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:16] (step=0002758) Train Loss: 0.2419, Train Steps/Sec: 0.13, Epoch: 0.05359502526233968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2759, "loss": 0.28835529088974, "memory_gb": 7.721559524536133, "step_time_ms": 6024.350166320801, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:22] (step=0002759) Train Loss: 0.2757, Train Steps/Sec: 0.16, Epoch: 0.053614457831325305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2760, "loss": 0.18172863125801086, "memory_gb": 7.721559524536133, "step_time_ms": 7225.2137660980225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:30] (step=0002760) Train Loss: 0.2158, Train Steps/Sec: 0.13, Epoch: 0.05363389040031092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2761, "loss": 0.2892201542854309, "memory_gb": 7.721559524536133, "step_time_ms": 7465.227127075195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:38] (step=0002761) Train Loss: 0.2399, Train Steps/Sec: 0.12, Epoch: 0.05365332296929654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2762, "loss": 0.2750346064567566, "memory_gb": 7.721559524536133, "step_time_ms": 7574.827671051025, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:46] (step=0002762) Train Loss: 0.3145, Train Steps/Sec: 0.12, Epoch: 0.05367275553828216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:03:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2763, "loss": 0.15767616033554077, "memory_gb": 7.721559524536133, "step_time_ms": 7495.457887649536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:03:54] (step=0002763) Train Loss: 0.1898, Train Steps/Sec: 0.12, Epoch: 0.05369218810726778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2764, "loss": 0.2819768786430359, "memory_gb": 7.721559524536133, "step_time_ms": 7429.705858230591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:02] (step=0002764) Train Loss: 0.2369, Train Steps/Sec: 0.12, Epoch: 0.0537116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2765, "loss": 0.2806149125099182, "memory_gb": 7.721559524536133, "step_time_ms": 7497.472524642944, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:10] (step=0002765) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.05373105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2766, "loss": 0.3062678575515747, "memory_gb": 7.721559524536133, "step_time_ms": 7566.412448883057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:18] (step=0002766) Train Loss: 0.2365, Train Steps/Sec: 0.12, Epoch: 0.05375048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2767, "loss": 0.3537735342979431, "memory_gb": 7.721559524536133, "step_time_ms": 7519.623756408691, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:26] (step=0002767) Train Loss: 0.2863, Train Steps/Sec: 0.12, Epoch: 0.05376991838321026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2768, "loss": 0.12550681829452515, "memory_gb": 7.721559524536133, "step_time_ms": 7531.123638153076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:34] (step=0002768) Train Loss: 0.1667, Train Steps/Sec: 0.12, Epoch: 0.05378935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2769, "loss": 0.3054458796977997, "memory_gb": 7.721559524536133, "step_time_ms": 7489.05873298645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:42] (step=0002769) Train Loss: 0.2985, Train Steps/Sec: 0.13, Epoch: 0.0538087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2770, "loss": 0.24793604016304016, "memory_gb": 7.721559524536133, "step_time_ms": 7460.8399868011475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:50] (step=0002770) Train Loss: 0.2414, Train Steps/Sec: 0.12, Epoch: 0.05382821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:04:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2771, "loss": 0.25425058603286743, "memory_gb": 7.721559524536133, "step_time_ms": 7633.847236633301, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:04:58] (step=0002771) Train Loss: 0.2262, Train Steps/Sec: 0.12, Epoch: 0.05384764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:05:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2772, "loss": 0.27710995078086853, "memory_gb": 7.721559524536133, "step_time_ms": 7535.624742507935, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:05:06] (step=0002772) Train Loss: 0.2152, Train Steps/Sec: 0.12, Epoch: 0.05386708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:05:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2773, "loss": 0.17396748065948486, "memory_gb": 7.721559524536133, "step_time_ms": 7487.017869949341, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:05:15] (step=0002773) Train Loss: 0.2507, Train Steps/Sec: 0.12, Epoch: 0.05388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:05:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2774, "loss": 0.23745501041412354, "memory_gb": 7.721559524536133, "step_time_ms": 7569.047689437866, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:05:23] (step=0002774) Train Loss: 0.2599, Train Steps/Sec: 0.12, Epoch: 0.0539059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:05:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2775, "loss": 0.1797395944595337, "memory_gb": 7.721559524536133, "step_time_ms": 7464.829444885254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:05:31] (step=0002775) Train Loss: 0.2525, Train Steps/Sec: 0.13, Epoch: 0.05392537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:05:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2776, "loss": 0.22337323427200317, "memory_gb": 7.721559524536133, "step_time_ms": 7444.644212722778, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:05:39] (step=0002776) Train Loss: 0.2028, Train Steps/Sec: 0.12, Epoch: 0.05394481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:05:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2777, "loss": 0.266021728515625, "memory_gb": 7.721559524536133, "step_time_ms": 7524.348974227905, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:05:47] (step=0002777) Train Loss: 0.3114, Train Steps/Sec: 0.12, Epoch: 0.05396424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:05:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2778, "loss": 0.14664384722709656, "memory_gb": 7.721559524536133, "step_time_ms": 7483.2799434661865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:05:55] (step=0002778) Train Loss: 0.1777, Train Steps/Sec: 0.12, Epoch: 0.05398367664205208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2779, "loss": 0.2860643267631531, "memory_gb": 7.721559524536133, "step_time_ms": 7411.915302276611, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:03] (step=0002779) Train Loss: 0.2500, Train Steps/Sec: 0.13, Epoch: 0.0540031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2780, "loss": 0.21450838446617126, "memory_gb": 7.721559524536133, "step_time_ms": 7538.347244262695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:11] (step=0002780) Train Loss: 0.1960, Train Steps/Sec: 0.12, Epoch: 0.05402254178002332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2781, "loss": 0.3050731420516968, "memory_gb": 7.721559524536133, "step_time_ms": 7504.542112350464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:19] (step=0002781) Train Loss: 0.2591, Train Steps/Sec: 0.12, Epoch: 0.05404197434900894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2782, "loss": 0.2965073883533478, "memory_gb": 7.721559524536133, "step_time_ms": 7396.326541900635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:27] (step=0002782) Train Loss: 0.2830, Train Steps/Sec: 0.12, Epoch: 0.05406140691799456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2783, "loss": 0.1616138219833374, "memory_gb": 7.721559524536133, "step_time_ms": 7181.441068649292, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:35] (step=0002783) Train Loss: 0.1683, Train Steps/Sec: 0.13, Epoch: 0.05408083948698018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2784, "loss": 0.16670192778110504, "memory_gb": 7.721559524536133, "step_time_ms": 7394.319295883179, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:43] (step=0002784) Train Loss: 0.1734, Train Steps/Sec: 0.13, Epoch: 0.0541002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2785, "loss": 0.331493079662323, "memory_gb": 7.721559524536133, "step_time_ms": 7346.989393234253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:51] (step=0002785) Train Loss: 0.2855, Train Steps/Sec: 0.12, Epoch: 0.05411970462495142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2786, "loss": 0.24711018800735474, "memory_gb": 7.721559524536133, "step_time_ms": 7323.517799377441, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:06:59] (step=0002786) Train Loss: 0.2209, Train Steps/Sec: 0.13, Epoch: 0.05413913719393704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:07:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2787, "loss": 0.17955443263053894, "memory_gb": 7.721559524536133, "step_time_ms": 7460.867166519165, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:07:07] (step=0002787) Train Loss: 0.1917, Train Steps/Sec: 0.12, Epoch: 0.05415856976292266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2788, "loss": 0.2473708987236023, "memory_gb": 7.721559524536133, "step_time_ms": 5190.429449081421, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:07:13] (step=0002788) Train Loss: 0.2081, Train Steps/Sec: 0.17, Epoch: 0.05417800233190828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:07:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2789, "loss": 0.21050751209259033, "memory_gb": 7.721559524536133, "step_time_ms": 7503.337144851685, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:07:21] (step=0002789) Train Loss: 0.1705, Train Steps/Sec: 0.12, Epoch: 0.0541974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:07:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2790, "loss": 0.20340663194656372, "memory_gb": 7.721559524536133, "step_time_ms": 7468.386173248291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:07:29] (step=0002790) Train Loss: 0.1817, Train Steps/Sec: 0.13, Epoch: 0.05421686746987952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:07:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2791, "loss": 0.18433278799057007, "memory_gb": 7.721559524536133, "step_time_ms": 7683.713912963867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:07:37] (step=0002791) Train Loss: 0.2269, Train Steps/Sec: 0.13, Epoch: 0.054236300038865135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:07:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2792, "loss": 0.18873369693756104, "memory_gb": 7.721559524536133, "step_time_ms": 7575.706720352173, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:07:45] (step=0002792) Train Loss: 0.1957, Train Steps/Sec: 0.12, Epoch: 0.05425573260785076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:07:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2793, "loss": 0.37082552909851074, "memory_gb": 7.721559524536133, "step_time_ms": 7530.691146850586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:07:53] (step=0002793) Train Loss: 0.2892, Train Steps/Sec: 0.13, Epoch: 0.05427516517683638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2794, "loss": 0.26038455963134766, "memory_gb": 7.721559524536133, "step_time_ms": 7529.932498931885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:01] (step=0002794) Train Loss: 0.2367, Train Steps/Sec: 0.12, Epoch: 0.054294597745821994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2795, "loss": 0.19246932864189148, "memory_gb": 7.721559524536133, "step_time_ms": 7579.1027545928955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:09] (step=0002795) Train Loss: 0.2865, Train Steps/Sec: 0.12, Epoch: 0.05431403031480762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2796, "loss": 0.3002273738384247, "memory_gb": 7.721559524536133, "step_time_ms": 7493.621587753296, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:17] (step=0002796) Train Loss: 0.2548, Train Steps/Sec: 0.12, Epoch: 0.05433346288379324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2797, "loss": 0.2760471999645233, "memory_gb": 7.721559524536133, "step_time_ms": 7508.193016052246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:25] (step=0002797) Train Loss: 0.2905, Train Steps/Sec: 0.12, Epoch: 0.054352895452778854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2798, "loss": 0.2199469357728958, "memory_gb": 7.721559524536133, "step_time_ms": 7555.367708206177, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:33] (step=0002798) Train Loss: 0.2175, Train Steps/Sec: 0.12, Epoch: 0.05437232802176448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2799, "loss": 0.22550895810127258, "memory_gb": 7.721559524536133, "step_time_ms": 7506.389617919922, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:41] (step=0002799) Train Loss: 0.2096, Train Steps/Sec: 0.12, Epoch: 0.0543917605907501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2800, "loss": 0.2346855103969574, "memory_gb": 7.721559524536133, "step_time_ms": 7529.287338256836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:49] (step=0002800) Train Loss: 0.2294, Train Steps/Sec: 0.12, Epoch: 0.054411193159735714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:08:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2801, "loss": 0.1474279761314392, "memory_gb": 7.721559524536133, "step_time_ms": 7590.752363204956, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:08:57] (step=0002801) Train Loss: 0.1684, Train Steps/Sec: 0.12, Epoch: 0.05443062572872134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2802, "loss": 0.24353355169296265, "memory_gb": 7.721559524536133, "step_time_ms": 7527.076959609985, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:09:05] (step=0002802) Train Loss: 0.2425, Train Steps/Sec: 0.12, Epoch: 0.05445005829770696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:09:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2803, "loss": 0.21444270014762878, "memory_gb": 7.721559524536133, "step_time_ms": 7591.093063354492, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:09:13] (step=0002803) Train Loss: 0.2546, Train Steps/Sec: 0.12, Epoch: 0.054469490866692574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:09:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2804, "loss": 0.214026540517807, "memory_gb": 7.721559524536133, "step_time_ms": 7643.187761306763, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:09:21] (step=0002804) Train Loss: 0.2173, Train Steps/Sec: 0.12, Epoch: 0.054488923435678197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:09:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2805, "loss": 0.26787781715393066, "memory_gb": 7.721559524536133, "step_time_ms": 7573.102951049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:09:29] (step=0002805) Train Loss: 0.2652, Train Steps/Sec: 0.12, Epoch: 0.05450835600466382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2806, "loss": 0.20660817623138428, "memory_gb": 7.721559524536133, "step_time_ms": 7599.229335784912, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:09:37] (step=0002806) Train Loss: 0.2200, Train Steps/Sec: 0.12, Epoch: 0.054527788573649434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:09:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2807, "loss": 0.26641565561294556, "memory_gb": 7.721559524536133, "step_time_ms": 7621.201992034912, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:09:45] (step=0002807) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.054547221142635056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:09:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2808, "loss": 0.2918027937412262, "memory_gb": 7.721559524536133, "step_time_ms": 7568.3434009552, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:09:53] (step=0002808) Train Loss: 0.3444, Train Steps/Sec: 0.12, Epoch: 0.05456665371162068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2809, "loss": 0.2286968231201172, "memory_gb": 7.721559524536133, "step_time_ms": 7587.971925735474, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:01] (step=0002809) Train Loss: 0.2724, Train Steps/Sec: 0.12, Epoch: 0.054586086280606294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2810, "loss": 0.280037522315979, "memory_gb": 7.721559524536133, "step_time_ms": 7619.360446929932, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:10] (step=0002810) Train Loss: 0.2260, Train Steps/Sec: 0.12, Epoch: 0.054605518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2811, "loss": 0.2582634389400482, "memory_gb": 7.721559524536133, "step_time_ms": 7524.188756942749, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:18] (step=0002811) Train Loss: 0.3021, Train Steps/Sec: 0.12, Epoch: 0.05462495141857754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2812, "loss": 0.28179293870925903, "memory_gb": 7.721559524536133, "step_time_ms": 7574.957370758057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:26] (step=0002812) Train Loss: 0.2109, Train Steps/Sec: 0.12, Epoch: 0.054644383987563154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2813, "loss": 0.1937112957239151, "memory_gb": 7.721559524536133, "step_time_ms": 7344.799757003784, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:34] (step=0002813) Train Loss: 0.2308, Train Steps/Sec: 0.12, Epoch: 0.054663816556548776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2814, "loss": 0.3326244354248047, "memory_gb": 7.721559524536133, "step_time_ms": 7461.060047149658, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:41] (step=0002814) Train Loss: 0.2538, Train Steps/Sec: 0.13, Epoch: 0.0546832491255344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2815, "loss": 0.2954503297805786, "memory_gb": 7.721559524536133, "step_time_ms": 7525.566101074219, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:49] (step=0002815) Train Loss: 0.2878, Train Steps/Sec: 0.13, Epoch: 0.054702681694520014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:10:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2816, "loss": 0.15586836636066437, "memory_gb": 7.721559524536133, "step_time_ms": 7474.944829940796, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:10:57] (step=0002816) Train Loss: 0.1994, Train Steps/Sec: 0.12, Epoch: 0.054722114263505636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2817, "loss": 0.23405715823173523, "memory_gb": 7.721559524536133, "step_time_ms": 5495.91064453125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:03] (step=0002817) Train Loss: 0.2419, Train Steps/Sec: 0.17, Epoch: 0.05474154683249126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2818, "loss": 0.2813357710838318, "memory_gb": 7.721559524536133, "step_time_ms": 7551.684141159058, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:11] (step=0002818) Train Loss: 0.2465, Train Steps/Sec: 0.12, Epoch: 0.054760979401476874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2819, "loss": 0.23515495657920837, "memory_gb": 7.721559524536133, "step_time_ms": 7432.022333145142, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:19] (step=0002819) Train Loss: 0.2532, Train Steps/Sec: 0.12, Epoch: 0.054780411970462496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2820, "loss": 0.26050373911857605, "memory_gb": 7.721559524536133, "step_time_ms": 7552.809953689575, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:27] (step=0002820) Train Loss: 0.2391, Train Steps/Sec: 0.12, Epoch: 0.05479984453944812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2821, "loss": 0.24980083107948303, "memory_gb": 7.721559524536133, "step_time_ms": 7482.558488845825, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:35] (step=0002821) Train Loss: 0.2786, Train Steps/Sec: 0.12, Epoch: 0.054819277108433734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2822, "loss": 0.21783556044101715, "memory_gb": 7.721559524536133, "step_time_ms": 7500.860214233398, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:44] (step=0002822) Train Loss: 0.2777, Train Steps/Sec: 0.12, Epoch: 0.054838709677419356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2823, "loss": 0.2418024241924286, "memory_gb": 7.721559524536133, "step_time_ms": 7536.400556564331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:52] (step=0002823) Train Loss: 0.2505, Train Steps/Sec: 0.12, Epoch: 0.05485814224640497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:11:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2824, "loss": 0.18723200261592865, "memory_gb": 7.721559524536133, "step_time_ms": 7454.960346221924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:11:59] (step=0002824) Train Loss: 0.1927, Train Steps/Sec: 0.13, Epoch: 0.054877574815390594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:12:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2825, "loss": 0.2804962992668152, "memory_gb": 7.721559524536133, "step_time_ms": 7439.676523208618, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:12:07] (step=0002825) Train Loss: 0.2577, Train Steps/Sec: 0.13, Epoch: 0.054897007384376216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:12:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2826, "loss": 0.3394816815853119, "memory_gb": 7.721559524536133, "step_time_ms": 7499.91250038147, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:12:15] (step=0002826) Train Loss: 0.3056, Train Steps/Sec: 0.12, Epoch: 0.05491643995336183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:12:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2827, "loss": 0.14386673271656036, "memory_gb": 7.721559524536133, "step_time_ms": 7501.336336135864, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:12:23] (step=0002827) Train Loss: 0.1802, Train Steps/Sec: 0.12, Epoch: 0.054935872522347454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:12:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2828, "loss": 0.26301735639572144, "memory_gb": 7.721559524536133, "step_time_ms": 7426.803112030029, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:12:31] (step=0002828) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.054955305091333076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:12:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2829, "loss": 0.17969340085983276, "memory_gb": 7.721559524536133, "step_time_ms": 7450.578212738037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:12:39] (step=0002829) Train Loss: 0.2411, Train Steps/Sec: 0.12, Epoch: 0.05497473766031869, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:12:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2830, "loss": 0.34327763319015503, "memory_gb": 7.721559524536133, "step_time_ms": 7505.179882049561, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:12:47] (step=0002830) Train Loss: 0.3046, Train Steps/Sec: 0.12, Epoch: 0.054994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:12:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2831, "loss": 0.18061338365077972, "memory_gb": 7.721559524536133, "step_time_ms": 7437.6702308654785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:12:55] (step=0002831) Train Loss: 0.1844, Train Steps/Sec: 0.13, Epoch: 0.055013602798289936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:13:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2832, "loss": 0.24860268831253052, "memory_gb": 7.721559524536133, "step_time_ms": 7476.784706115723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:13:04] (step=0002832) Train Loss: 0.2051, Train Steps/Sec: 0.12, Epoch: 0.05503303536727555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:13:12]
EFFICIENCY_METRICS: {"epoch": 0, "step": 2833, "loss": 0.2095024734735489, "memory_gb": 7.721559524536133, "step_time_ms": 7512.319803237915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:13:12] (step=0002833) Train Loss: 0.2241, Train Steps/Sec: 0.12, Epoch: 0.055052467936261174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2834, "loss": 0.17088881134986877, "memory_gb": 7.721559524536133, "step_time_ms": 7503.301620483398, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:13:20] (step=0002834) Train Loss: 0.2244, Train Steps/Sec: 0.12, Epoch: 0.055071900505246796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:13:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2835, "loss": 0.22825485467910767, "memory_gb": 7.721559524536133, "step_time_ms": 7435.46724319458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:13:28] (step=0002835) Train Loss: 0.2196, Train Steps/Sec: 0.13, Epoch: 0.05509133307423241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:13:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2836, "loss": 0.3073558211326599, "memory_gb": 7.721559524536133, "step_time_ms": 7562.713146209717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:13:36] (step=0002836) Train Loss: 0.2890, Train Steps/Sec: 0.12, Epoch: 0.05511076564321803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:13:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2837, "loss": 0.30102723836898804, "memory_gb": 7.721559524536133, "step_time_ms": 7451.3421058654785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:13:44] (step=0002837) Train Loss: 0.2481, Train Steps/Sec: 0.13, Epoch: 0.055130198212203656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2838, "loss": 0.1911543905735016, "memory_gb": 7.721559524536133, "step_time_ms": 7462.4974727630615, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 00:13:52] (step=0002838) Train Loss: 0.1778, Train Steps/Sec: 0.12, Epoch: 0.05514963078118927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:14:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2839, "loss": 0.16115747392177582, "memory_gb": 7.721559524536133, "step_time_ms": 7629.6586990356445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:00] (step=0002839) Train Loss: 0.1547, Train Steps/Sec: 0.12, Epoch: 0.05516906335017489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:14:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2840, "loss": 0.233836829662323, "memory_gb": 7.721559524536133, "step_time_ms": 7504.642724990845, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:08] (step=0002840) Train Loss: 0.2385, Train Steps/Sec: 0.12, Epoch: 0.055188495919160516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:14:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2841, "loss": 0.266498863697052, "memory_gb": 7.721559524536133, "step_time_ms": 7433.000802993774, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:16] (step=0002841) Train Loss: 0.2758, Train Steps/Sec: 0.12, Epoch: 0.05520792848814613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2842, "loss": 0.2670442759990692, "memory_gb": 7.721559524536133, "step_time_ms": 7528.732538223267, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:24] (step=0002842) Train Loss: 0.2806, Train Steps/Sec: 0.12, Epoch: 0.05522736105713175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2843, "loss": 0.22153280675411224, "memory_gb": 7.721559524536133, "step_time_ms": 7512.839794158936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:32] (step=0002843) Train Loss: 0.3036, Train Steps/Sec: 0.12, Epoch: 0.055246793626117376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 00:14:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2844, "loss": 0.2595699429512024, "memory_gb": 7.721559524536133, "step_time_ms": 7422.076225280762, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:40] (step=0002844) Train Loss: 0.2298, Train Steps/Sec: 0.13, Epoch: 0.05526622619510299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:14:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2845, "loss": 0.2764472961425781, "memory_gb": 7.721559524536133, "step_time_ms": 7609.221696853638, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:48] (step=0002845) Train Loss: 0.3026, Train Steps/Sec: 0.12, Epoch: 0.05528565876408861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:14:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 2846, "loss": 0.2678799629211426, "memory_gb": 7.721559524536133, "step_time_ms": 5449.7339725494385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:14:54] (step=0002846) Train Loss: 0.1978, Train Steps/Sec: 0.18, Epoch: 0.055305091333074236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 2847, "loss": 0.28358399868011475, "memory_gb": 7.721559524536133, "step_time_ms": 7559.556245803833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:02] (step=0002847) Train Loss: 0.2986, Train Steps/Sec: 0.12, Epoch: 0.05532452390205985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 2848, "loss": 0.34265685081481934, "memory_gb": 7.721559524536133, "step_time_ms": 7524.106979370117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:10] (step=0002848) Train Loss: 0.3340, Train Steps/Sec: 0.12, Epoch: 0.05534395647104547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 2849, "loss": 0.3935678005218506, "memory_gb": 7.721559524536133, "step_time_ms": 7568.721055984497, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:18] (step=0002849) Train Loss: 0.3147, Train Steps/Sec: 0.13, Epoch: 0.055363389040031095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 2850, "loss": 0.25081440806388855, "memory_gb": 7.721559524536133, "step_time_ms": 7632.342100143433, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:26] (step=0002850) Train Loss: 0.2324, Train Steps/Sec: 0.12, Epoch: 0.05538282160901671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 2851, "loss": 0.19890019297599792, "memory_gb": 7.721559524536133, "step_time_ms": 7574.509382247925, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:34] (step=0002851) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.05540225417800233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 2852, "loss": 0.2345237135887146, "memory_gb": 7.721559524536133, "step_time_ms": 7573.176860809326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:42] (step=0002852) Train Loss: 0.2204, Train Steps/Sec: 0.12, Epoch: 0.05542168674698795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 2853, "loss": 0.2733778953552246, "memory_gb": 7.721559524536133, "step_time_ms": 7635.31756401062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:50] (step=0002853) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.05544111931597357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:15:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 2854, "loss": 0.28423064947128296, "memory_gb": 7.721559524536133, "step_time_ms": 7529.942274093628, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:15:58] (step=0002854) Train Loss: 0.2459, Train Steps/Sec: 0.12, Epoch: 0.05546055188495919, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 00:16:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 2855, "loss": 0.16202276945114136, "memory_gb": 7.721559524536133, "step_time_ms": 7514.558553695679, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:16:06] (step=0002855) Train Loss: 0.1802, Train Steps/Sec: 0.12, Epoch: 0.05547998445394481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:16:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 2856, "loss": 0.2072117030620575, "memory_gb": 7.721559524536133, "step_time_ms": 7598.702669143677, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:16:14] (step=0002856) Train Loss: 0.2277, Train Steps/Sec: 0.12, Epoch: 0.05549941702293043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:16:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 2857, "loss": 0.1922311782836914, "memory_gb": 7.721559524536133, "step_time_ms": 7471.571922302246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:16:22] (step=0002857) Train Loss: 0.2376, Train Steps/Sec: 0.12, Epoch: 0.05551884959191605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 2858, "loss": 0.18413980305194855, "memory_gb": 7.721559524536133, "step_time_ms": 7528.183221817017, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:16:30] (step=0002858) Train Loss: 0.2047, Train Steps/Sec: 0.12, Epoch: 0.05553828216090167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:16:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 2859, "loss": 0.23398908972740173, "memory_gb": 7.721559524536133, "step_time_ms": 7534.610986709595, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:16:38] (step=0002859) Train Loss: 0.2388, Train Steps/Sec: 0.12, Epoch: 0.05555771472988729, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:16:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 2860, "loss": 0.17076732218265533, "memory_gb": 7.721559524536133, "step_time_ms": 
7509.409666061401, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:16:46] (step=0002860) Train Loss: 0.2158, Train Steps/Sec: 0.12, Epoch: 0.05557714729887291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:16:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2861, "loss": 0.28900882601737976, "memory_gb": 7.721559524536133, "step_time_ms": 7507.291555404663, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:16:55] (step=0002861) Train Loss: 0.2986, Train Steps/Sec: 0.12, Epoch: 0.05559657986785853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2862, "loss": 0.1971065104007721, "memory_gb": 7.721559524536133, "step_time_ms": 7604.223012924194, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:03] (step=0002862) Train Loss: 0.2242, Train Steps/Sec: 0.12, Epoch: 0.05561601243684415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2863, "loss": 0.28593844175338745, "memory_gb": 7.721559524536133, "step_time_ms": 7477.5824546813965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:11] (step=0002863) Train Loss: 0.2744, Train Steps/Sec: 0.13, Epoch: 0.05563544500582977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2864, "loss": 0.18861472606658936, "memory_gb": 7.721559524536133, "step_time_ms": 7430.9844970703125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:19] (step=0002864) Train Loss: 0.2426, Train Steps/Sec: 0.12, Epoch: 0.05565487757481539, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2865, "loss": 0.23435156047344208, "memory_gb": 7.721559524536133, "step_time_ms": 7535.542249679565, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:27] (step=0002865) Train Loss: 0.2686, Train Steps/Sec: 0.12, Epoch: 0.05567431014380101, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2866, "loss": 0.30776992440223694, "memory_gb": 7.721559524536133, "step_time_ms": 7464.9717807769775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:35] (step=0002866) Train Loss: 0.2600, Train Steps/Sec: 0.13, Epoch: 0.05569374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2867, "loss": 0.18727195262908936, "memory_gb": 7.721559524536133, "step_time_ms": 7428.257703781128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:43] (step=0002867) Train Loss: 0.2206, Train Steps/Sec: 0.12, Epoch: 0.05571317528177225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2868, "loss": 0.28457286953926086, "memory_gb": 7.721559524536133, "step_time_ms": 7497.636556625366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:51] (step=0002868) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.05573260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:17:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2869, "loss": 0.13742509484291077, "memory_gb": 7.721559524536133, "step_time_ms": 7479.808330535889, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:17:59] (step=0002869) Train Loss: 0.1986, Train Steps/Sec: 0.12, Epoch: 0.05575204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2870, "loss": 0.2754380404949188, "memory_gb": 7.721559524536133, "step_time_ms": 7446.250915527344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:18:07] (step=0002870) Train Loss: 0.2026, Train Steps/Sec: 0.12, Epoch: 0.05577147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:18:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2871, "loss": 0.22977325320243835, "memory_gb": 7.721559524536133, 
"step_time_ms": 7501.694917678833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:18:15] (step=0002871) Train Loss: 0.2508, Train Steps/Sec: 0.12, Epoch: 0.05579090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:18:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2872, "loss": 0.3035423159599304, "memory_gb": 7.721559524536133, "step_time_ms": 7421.863079071045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:18:23] (step=0002872) Train Loss: 0.3176, Train Steps/Sec: 0.13, Epoch: 0.05581033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:18:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2873, "loss": 0.16530102491378784, "memory_gb": 7.721559524536133, "step_time_ms": 7326.253414154053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:18:31] (step=0002873) Train Loss: 0.2021, Train Steps/Sec: 0.13, Epoch: 0.05582977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2874, "loss": 0.1571708768606186, "memory_gb": 7.721559524536133, "step_time_ms": 7521.334171295166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:18:39] (step=0002874) Train Loss: 0.2251, Train Steps/Sec: 0.12, Epoch: 0.05584920326467159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:18:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2875, "loss": 0.3546708822250366, "memory_gb": 7.721559524536133, "step_time_ms": 5257.204294204712, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:18:45] (step=0002875) Train Loss: 0.2554, Train Steps/Sec: 0.18, Epoch: 0.05586863583365721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:18:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2876, "loss": 0.24752555787563324, "memory_gb": 7.721559524536133, "step_time_ms": 7495.596647262573, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:18:53] (step=0002876) Train Loss: 0.2388, Train Steps/Sec: 0.12, Epoch: 
0.05588806840264283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2877, "loss": 0.2359876036643982, "memory_gb": 7.721559524536133, "step_time_ms": 7225.809574127197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:01] (step=0002877) Train Loss: 0.2394, Train Steps/Sec: 0.13, Epoch: 0.05590750097162845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2878, "loss": 0.2753611207008362, "memory_gb": 7.721559524536133, "step_time_ms": 7429.936170578003, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:09] (step=0002878) Train Loss: 0.2866, Train Steps/Sec: 0.13, Epoch: 0.05592693354061407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2879, "loss": 0.2680487632751465, "memory_gb": 7.721559524536133, "step_time_ms": 7605.448961257935, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:17] (step=0002879) Train Loss: 0.2239, Train Steps/Sec: 0.13, Epoch: 0.05594636610959969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2880, "loss": 0.24404145777225494, "memory_gb": 7.721559524536133, "step_time_ms": 7438.446044921875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:25] (step=0002880) Train Loss: 0.2361, Train Steps/Sec: 0.12, Epoch: 0.05596579867858531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2881, "loss": 0.18199749290943146, "memory_gb": 7.721559524536133, "step_time_ms": 7446.855306625366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:33] (step=0002881) Train Loss: 0.2166, Train Steps/Sec: 0.13, Epoch: 0.055985231247570925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2882, "loss": 0.14292696118354797, "memory_gb": 
7.721559524536133, "step_time_ms": 7533.202171325684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:41] (step=0002882) Train Loss: 0.1773, Train Steps/Sec: 0.12, Epoch: 0.05600466381655655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2883, "loss": 0.3678065538406372, "memory_gb": 7.721559524536133, "step_time_ms": 7424.317836761475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:49] (step=0002883) Train Loss: 0.3172, Train Steps/Sec: 0.12, Epoch: 0.05602409638554217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:19:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2884, "loss": 0.17089307308197021, "memory_gb": 7.721559524536133, "step_time_ms": 7479.207992553711, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:19:57] (step=0002884) Train Loss: 0.1925, Train Steps/Sec: 0.13, Epoch: 0.056043528954527785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:20:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2885, "loss": 0.2991904318332672, "memory_gb": 7.721559524536133, "step_time_ms": 7496.979475021362, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:20:05] (step=0002885) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.05606296152351341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:20:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2886, "loss": 0.2469393014907837, "memory_gb": 7.721559524536133, "step_time_ms": 7474.094390869141, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:20:13] (step=0002886) Train Loss: 0.2025, Train Steps/Sec: 0.12, Epoch: 0.05608239409249903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:20:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2887, "loss": 0.23600593209266663, "memory_gb": 7.721559524536133, "step_time_ms": 7438.0481243133545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:20:21] (step=0002887) Train Loss: 0.2329, Train Steps/Sec: 
0.12, Epoch: 0.056101826661484645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:20:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2888, "loss": 0.2788972556591034, "memory_gb": 7.721559524536133, "step_time_ms": 7495.514869689941, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:20:29] (step=0002888) Train Loss: 0.2100, Train Steps/Sec: 0.12, Epoch: 0.05612125923047027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:20:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2889, "loss": 0.33253946900367737, "memory_gb": 7.721559524536133, "step_time_ms": 7448.508739471436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:20:37] (step=0002889) Train Loss: 0.2372, Train Steps/Sec: 0.13, Epoch: 0.05614069179945589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2890, "loss": 0.2827922999858856, "memory_gb": 7.721559524536133, "step_time_ms": 7450.725078582764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:20:45] (step=0002890) Train Loss: 0.2640, Train Steps/Sec: 0.12, Epoch: 0.056160124368441505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:20:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2891, "loss": 0.16158297657966614, "memory_gb": 7.721559524536133, "step_time_ms": 7546.0968017578125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:20:53] (step=0002891) Train Loss: 0.1828, Train Steps/Sec: 0.12, Epoch: 0.05617955693742713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2892, "loss": 0.3007328510284424, "memory_gb": 7.721559524536133, "step_time_ms": 7494.634389877319, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:01] (step=0002892) Train Loss: 0.2417, Train Steps/Sec: 0.12, Epoch: 0.05619898950641275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2893, "loss": 
0.24455514550209045, "memory_gb": 7.721559524536133, "step_time_ms": 7514.394521713257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:09] (step=0002893) Train Loss: 0.2691, Train Steps/Sec: 0.12, Epoch: 0.056218422075398365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2894, "loss": 0.26290521025657654, "memory_gb": 7.721559524536133, "step_time_ms": 7600.966453552246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:17] (step=0002894) Train Loss: 0.2555, Train Steps/Sec: 0.12, Epoch: 0.05623785464438399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2895, "loss": 0.1052437573671341, "memory_gb": 7.721559524536133, "step_time_ms": 7534.811496734619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:25] (step=0002895) Train Loss: 0.1824, Train Steps/Sec: 0.12, Epoch: 0.05625728721336961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2896, "loss": 0.23947939276695251, "memory_gb": 7.721559524536133, "step_time_ms": 7532.034635543823, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:33] (step=0002896) Train Loss: 0.2348, Train Steps/Sec: 0.12, Epoch: 0.056276719782355225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2897, "loss": 0.16779470443725586, "memory_gb": 7.721559524536133, "step_time_ms": 7548.482894897461, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:41] (step=0002897) Train Loss: 0.2175, Train Steps/Sec: 0.12, Epoch: 0.05629615235134085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2898, "loss": 0.27952104806900024, "memory_gb": 7.721559524536133, "step_time_ms": 7510.5836391448975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:49] (step=0002898) 
Train Loss: 0.2655, Train Steps/Sec: 0.12, Epoch: 0.05631558492032647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:21:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2899, "loss": 0.3161197304725647, "memory_gb": 7.721559524536133, "step_time_ms": 7523.636817932129, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:21:57] (step=0002899) Train Loss: 0.2627, Train Steps/Sec: 0.13, Epoch: 0.056335017489312085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2900, "loss": 0.2159656286239624, "memory_gb": 7.721559524536133, "step_time_ms": 7531.445264816284, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:05] (step=0002900) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.05635445005829771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2901, "loss": 0.253274530172348, "memory_gb": 7.721559524536133, "step_time_ms": 7492.570161819458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:13] (step=0002901) Train Loss: 0.2874, Train Steps/Sec: 0.13, Epoch: 0.05637388262728333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2902, "loss": 0.24659007787704468, "memory_gb": 7.721559524536133, "step_time_ms": 7438.9472007751465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:21] (step=0002902) Train Loss: 0.2306, Train Steps/Sec: 0.13, Epoch: 0.056393315196268945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2903, "loss": 0.3065069615840912, "memory_gb": 7.721559524536133, "step_time_ms": 7641.957998275757, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:29] (step=0002903) Train Loss: 0.2998, Train Steps/Sec: 0.12, Epoch: 0.05641274776525457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:35] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 2904, "loss": 0.17336316406726837, "memory_gb": 7.721559524536133, "step_time_ms": 4888.436794281006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:35] (step=0002904) Train Loss: 0.1724, Train Steps/Sec: 0.17, Epoch: 0.05643218033424019, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2905, "loss": 0.2772355079650879, "memory_gb": 7.721559524536133, "step_time_ms": 7638.688802719116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:43] (step=0002905) Train Loss: 0.2577, Train Steps/Sec: 0.12, Epoch: 0.056451612903225805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2906, "loss": 0.21271702647209167, "memory_gb": 7.721559524536133, "step_time_ms": 7567.901134490967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:51] (step=0002906) Train Loss: 0.2796, Train Steps/Sec: 0.12, Epoch: 0.05647104547221143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:22:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2907, "loss": 0.2229236662387848, "memory_gb": 7.721559524536133, "step_time_ms": 7547.893285751343, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:22:59] (step=0002907) Train Loss: 0.1876, Train Steps/Sec: 0.12, Epoch: 0.05649047804119705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:23:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2908, "loss": 0.24881909787654877, "memory_gb": 7.721559524536133, "step_time_ms": 7639.315128326416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:23:07] (step=0002908) Train Loss: 0.2347, Train Steps/Sec: 0.12, Epoch: 0.056509910610182665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:23:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2909, "loss": 0.286672443151474, "memory_gb": 7.721559524536133, "step_time_ms": 7550.315618515015, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
00:23:15] (step=0002909) Train Loss: 0.2919, Train Steps/Sec: 0.12, Epoch: 0.05652934317916829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:23:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2910, "loss": 0.26791098713874817, "memory_gb": 7.721559524536133, "step_time_ms": 7539.675712585449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:23:23] (step=0002910) Train Loss: 0.2473, Train Steps/Sec: 0.12, Epoch: 0.05654877574815391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:23:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2911, "loss": 0.14486658573150635, "memory_gb": 7.721559524536133, "step_time_ms": 7555.732727050781, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:23:31] (step=0002911) Train Loss: 0.2040, Train Steps/Sec: 0.12, Epoch: 0.056568208317139525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:23:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2912, "loss": 0.16960470378398895, "memory_gb": 7.721559524536133, "step_time_ms": 7515.242099761963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:23:39] (step=0002912) Train Loss: 0.1533, Train Steps/Sec: 0.13, Epoch: 0.05658764088612515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:23:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2913, "loss": 0.18786457180976868, "memory_gb": 7.721559524536133, "step_time_ms": 7555.828332901001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:23:47] (step=0002913) Train Loss: 0.1886, Train Steps/Sec: 0.12, Epoch: 0.05660707345511076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:23:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2914, "loss": 0.3056851625442505, "memory_gb": 7.721559524536133, "step_time_ms": 7578.1707763671875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:23:56] (step=0002914) Train Loss: 0.2788, Train Steps/Sec: 0.12, Epoch: 0.056626506024096385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:24:04] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 2915, "loss": 0.19952890276908875, "memory_gb": 7.721559524536133, "step_time_ms": 7528.189897537231, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:24:04] (step=0002915) Train Loss: 0.2877, Train Steps/Sec: 0.12, Epoch: 0.05664593859308201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:24:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2916, "loss": 0.25546249747276306, "memory_gb": 7.721559524536133, "step_time_ms": 7611.49001121521, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:24:12] (step=0002916) Train Loss: 0.2677, Train Steps/Sec: 0.12, Epoch: 0.05666537116206762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:24:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2917, "loss": 0.3257119357585907, "memory_gb": 7.721559524536133, "step_time_ms": 7532.67240524292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:24:20] (step=0002917) Train Loss: 0.3546, Train Steps/Sec: 0.13, Epoch: 0.056684803731053245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:24:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 2918, "loss": 0.20233647525310516, "memory_gb": 7.721559524536133, "step_time_ms": 7530.097246170044, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:24:28] (step=0002918) Train Loss: 0.2330, Train Steps/Sec: 0.12, Epoch: 0.05670423630003887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:24:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 2919, "loss": 0.3793017566204071, "memory_gb": 7.715639114379883, "step_time_ms": 7465.056896209717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:24:36] (step=0002919) Train Loss: 0.3231, Train Steps/Sec: 0.12, Epoch: 0.05672366886902448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:24:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 2920, "loss": 0.2862839102745056, "memory_gb": 7.721559524536133, "step_time_ms": 7531.904935836792, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 00:24:44] (step=0002920) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.056743101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:24:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 2921, "loss": 0.3501059412956238, "memory_gb": 7.721559524536133, "step_time_ms": 7510.299444198608, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:24:52] (step=0002921) Train Loss: 0.3362, Train Steps/Sec: 0.12, Epoch: 0.05676253400699573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:25:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 2922, "loss": 0.2538430094718933, "memory_gb": 7.721559524536133, "step_time_ms": 7442.552804946899, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:00] (step=0002922) Train Loss: 0.2760, Train Steps/Sec: 0.12, Epoch: 0.05678196657598134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 2923, "loss": 0.22053299844264984, "memory_gb": 7.721559524536133, "step_time_ms": 7490.889072418213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:08] (step=0002923) Train Loss: 0.2077, Train Steps/Sec: 0.12, Epoch: 0.056801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:25:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 2924, "loss": 0.23711146414279938, "memory_gb": 7.721559524536133, "step_time_ms": 7397.807836532593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:16] (step=0002924) Train Loss: 0.2755, Train Steps/Sec: 0.13, Epoch: 0.05682083171395259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:25:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2925, "loss": 0.269853413105011, "memory_gb": 7.721559524536133, "step_time_ms": 7412.760257720947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:24] (step=0002925) Train Loss: 0.2334, Train Steps/Sec: 0.12, Epoch: 0.0568402642829382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 00:25:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2926, "loss": 0.2811400592327118, "memory_gb": 7.721559524536133, "step_time_ms": 7603.951215744019, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:32] (step=0002926) Train Loss: 0.2182, Train Steps/Sec: 0.12, Epoch: 0.056859696851923824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 2927, "loss": 0.21843384206295013, "memory_gb": 7.721559524536133, "step_time_ms": 7404.987812042236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:40] (step=0002927) Train Loss: 0.2738, Train Steps/Sec: 0.13, Epoch: 0.05687912942090945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:25:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2928, "loss": 0.3660523593425751, "memory_gb": 7.721559524536133, "step_time_ms": 7430.520296096802, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:48] (step=0002928) Train Loss: 0.3377, Train Steps/Sec: 0.12, Epoch: 0.05689856198989506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:25:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2929, "loss": 0.1988169401884079, "memory_gb": 7.721559524536133, "step_time_ms": 7461.199760437012, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:25:56] (step=0002929) Train Loss: 0.1870, Train Steps/Sec: 0.12, Epoch: 0.056917994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 2930, "loss": 0.2803990840911865, "memory_gb": 7.721559524536133, "step_time_ms": 7412.292957305908, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:04] (step=0002930) Train Loss: 0.2347, Train Steps/Sec: 0.13, Epoch: 0.05693742712786631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 2931, "loss": 0.14297235012054443, "memory_gb": 7.721559524536133, "step_time_ms": 7290.074348449707, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:12] (step=0002931) Train Loss: 0.2555, Train Steps/Sec: 0.13, Epoch: 0.05695685969685192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 2932, "loss": 0.27660447359085083, "memory_gb": 7.721559524536133, "step_time_ms": 7491.767644882202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:20] (step=0002932) Train Loss: 0.2934, Train Steps/Sec: 0.12, Epoch: 0.056976292265837544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2933, "loss": 0.3083394765853882, "memory_gb": 7.721559524536133, "step_time_ms": 4992.706775665283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:25] (step=0002933) Train Loss: 0.2529, Train Steps/Sec: 0.19, Epoch: 0.05699572483482317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2934, "loss": 0.16836059093475342, "memory_gb": 7.721559524536133, "step_time_ms": 7522.757053375244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:33] (step=0002934) Train Loss: 0.2330, Train Steps/Sec: 0.13, Epoch: 0.05701515740380878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2935, "loss": 0.22990286350250244, "memory_gb": 7.721559524536133, "step_time_ms": 7464.3707275390625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:41] (step=0002935) Train Loss: 0.2028, Train Steps/Sec: 0.12, Epoch: 0.057034589972794404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2936, "loss": 0.14138682186603546, "memory_gb": 7.721559524536133, "step_time_ms": 7449.5134353637695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:49] (step=0002936) Train Loss: 0.1707, Train Steps/Sec: 0.13, Epoch: 0.057054022541780026, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:26:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2937, "loss": 0.23047766089439392, "memory_gb": 7.721559524536133, "step_time_ms": 7504.07075881958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:26:57] (step=0002937) Train Loss: 0.2577, Train Steps/Sec: 0.13, Epoch: 0.05707345511076564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:27:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2938, "loss": 0.26452329754829407, "memory_gb": 7.721559524536133, "step_time_ms": 7495.279788970947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:27:05] (step=0002938) Train Loss: 0.2277, Train Steps/Sec: 0.13, Epoch: 0.057092887679751264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:27:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2939, "loss": 0.272264301776886, "memory_gb": 7.721559524536133, "step_time_ms": 7469.593286514282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:27:13] (step=0002939) Train Loss: 0.2220, Train Steps/Sec: 0.13, Epoch: 0.057112320248736886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:27:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2940, "loss": 0.27685046195983887, "memory_gb": 7.721559524536133, "step_time_ms": 7519.847869873047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:27:21] (step=0002940) Train Loss: 0.2255, Train Steps/Sec: 0.13, Epoch: 0.0571317528177225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:27:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2941, "loss": 0.2426305115222931, "memory_gb": 7.721559524536133, "step_time_ms": 7444.295883178711, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:27:29] (step=0002941) Train Loss: 0.2289, Train Steps/Sec: 0.13, Epoch: 0.057151185386708124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:27:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2942, "loss": 0.20539599657058716, "memory_gb": 7.721559524536133, 
"step_time_ms": 7481.672286987305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:27:37] (step=0002942) Train Loss: 0.2049, Train Steps/Sec: 0.12, Epoch: 0.05717061795569374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:27:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2943, "loss": 0.2611783742904663, "memory_gb": 7.721559524536133, "step_time_ms": 7497.174263000488, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:27:45] (step=0002943) Train Loss: 0.2547, Train Steps/Sec: 0.12, Epoch: 0.05719005052467936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:27:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2944, "loss": 0.16619911789894104, "memory_gb": 7.721559524536133, "step_time_ms": 7166.969776153564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:27:53] (step=0002944) Train Loss: 0.1881, Train Steps/Sec: 0.13, Epoch: 0.057209483093664984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2945, "loss": 0.26042377948760986, "memory_gb": 7.721559524536133, "step_time_ms": 7479.001522064209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:01] (step=0002945) Train Loss: 0.2281, Train Steps/Sec: 0.13, Epoch: 0.0572289156626506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2946, "loss": 0.230531245470047, "memory_gb": 7.721559524536133, "step_time_ms": 7521.97003364563, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:09] (step=0002946) Train Loss: 0.2034, Train Steps/Sec: 0.13, Epoch: 0.05724834823163622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 2947, "loss": 0.3711773157119751, "memory_gb": 7.721559524536133, "step_time_ms": 7516.321897506714, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:17] (step=0002947) Train Loss: 0.3243, Train Steps/Sec: 0.13, Epoch: 
0.057267780800621844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 2948, "loss": 0.24974624812602997, "memory_gb": 7.721559524536133, "step_time_ms": 7542.521953582764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:25] (step=0002948) Train Loss: 0.2499, Train Steps/Sec: 0.12, Epoch: 0.05728721336960746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 2949, "loss": 0.24417391419410706, "memory_gb": 7.721559524536133, "step_time_ms": 7592.566013336182, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:33] (step=0002949) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.05730664593859308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 2950, "loss": 0.3336576521396637, "memory_gb": 7.721559524536133, "step_time_ms": 7552.490711212158, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:41] (step=0002950) Train Loss: 0.2884, Train Steps/Sec: 0.12, Epoch: 0.057326078507578704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 2951, "loss": 0.23896241188049316, "memory_gb": 7.721559524536133, "step_time_ms": 7537.775039672852, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:49] (step=0002951) Train Loss: 0.2283, Train Steps/Sec: 0.13, Epoch: 0.05734551107656432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:28:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 2952, "loss": 0.11742173880338669, "memory_gb": 7.721559524536133, "step_time_ms": 7639.953136444092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:28:57] (step=0002952) Train Loss: 0.1815, Train Steps/Sec: 0.12, Epoch: 0.05736494364554994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:29:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2953, "loss": 0.30434486269950867, 
"memory_gb": 7.721559524536133, "step_time_ms": 7529.529809951782, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:29:05] (step=0002953) Train Loss: 0.2474, Train Steps/Sec: 0.13, Epoch: 0.057384376214535564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:29:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2954, "loss": 0.23075750470161438, "memory_gb": 7.721559524536133, "step_time_ms": 7518.0723667144775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:29:13] (step=0002954) Train Loss: 0.2106, Train Steps/Sec: 0.12, Epoch: 0.05740380878352118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:29:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2955, "loss": 0.28292328119277954, "memory_gb": 7.721559524536133, "step_time_ms": 7614.474058151245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:29:21] (step=0002955) Train Loss: 0.2100, Train Steps/Sec: 0.12, Epoch: 0.0574232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:29:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2956, "loss": 0.375477135181427, "memory_gb": 7.721559524536133, "step_time_ms": 7436.089515686035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:29:29] (step=0002956) Train Loss: 0.3550, Train Steps/Sec: 0.12, Epoch: 0.057442673921492424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:29:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2957, "loss": 0.204164057970047, "memory_gb": 7.721559524536133, "step_time_ms": 7463.860034942627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:29:37] (step=0002957) Train Loss: 0.1483, Train Steps/Sec: 0.12, Epoch: 0.05746210649047804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:29:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2958, "loss": 0.3151114583015442, "memory_gb": 7.721559524536133, "step_time_ms": 7538.135051727295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:29:45] (step=0002958) Train Loss: 0.2773, Train 
Steps/Sec: 0.12, Epoch: 0.05748153905946366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:29:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2959, "loss": 0.2341919243335724, "memory_gb": 7.721559524536133, "step_time_ms": 7492.849826812744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:29:53] (step=0002959) Train Loss: 0.2580, Train Steps/Sec: 0.13, Epoch: 0.057500971628449284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2960, "loss": 0.11606946587562561, "memory_gb": 7.721559524536133, "step_time_ms": 7291.672468185425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:01] (step=0002960) Train Loss: 0.1941, Train Steps/Sec: 0.13, Epoch: 0.0575204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2961, "loss": 0.24117371439933777, "memory_gb": 7.721559524536133, "step_time_ms": 7484.59005355835, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:09] (step=0002961) Train Loss: 0.2317, Train Steps/Sec: 0.12, Epoch: 0.05753983676642052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2962, "loss": 0.2539488971233368, "memory_gb": 7.721559524536133, "step_time_ms": 4853.55806350708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:15] (step=0002962) Train Loss: 0.2378, Train Steps/Sec: 0.18, Epoch: 0.057559269335406144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 2963, "loss": 0.18913793563842773, "memory_gb": 7.721559524536133, "step_time_ms": 7587.433099746704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:23] (step=0002963) Train Loss: 0.2597, Train Steps/Sec: 0.12, Epoch: 0.05757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 2964, "loss": 
0.25676780939102173, "memory_gb": 7.721559524536133, "step_time_ms": 7560.693979263306, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:31] (step=0002964) Train Loss: 0.2440, Train Steps/Sec: 0.12, Epoch: 0.05759813447337738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2965, "loss": 0.2789786159992218, "memory_gb": 7.721559524536133, "step_time_ms": 7511.659860610962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:39] (step=0002965) Train Loss: 0.2987, Train Steps/Sec: 0.12, Epoch: 0.057617567042363003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 2966, "loss": 0.14625577628612518, "memory_gb": 7.721559524536133, "step_time_ms": 7534.600496292114, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:47] (step=0002966) Train Loss: 0.1498, Train Steps/Sec: 0.12, Epoch: 0.05763699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:30:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 2967, "loss": 0.27546608448028564, "memory_gb": 7.721559524536133, "step_time_ms": 7599.465131759644, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:30:55] (step=0002967) Train Loss: 0.2741, Train Steps/Sec: 0.12, Epoch: 0.05765643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2968, "loss": 0.34729573130607605, "memory_gb": 7.721559524536133, "step_time_ms": 7508.973598480225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:03] (step=0002968) Train Loss: 0.3157, Train Steps/Sec: 0.12, Epoch: 0.05767586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2969, "loss": 0.16979780793190002, "memory_gb": 7.721559524536133, "step_time_ms": 7531.54182434082, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:11] (step=0002969) 
Train Loss: 0.2315, Train Steps/Sec: 0.12, Epoch: 0.05769529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2970, "loss": 0.2517613172531128, "memory_gb": 7.721559524536133, "step_time_ms": 7494.7028160095215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:19] (step=0002970) Train Loss: 0.2504, Train Steps/Sec: 0.12, Epoch: 0.0577147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 2971, "loss": 0.2487613558769226, "memory_gb": 7.721559524536133, "step_time_ms": 7431.183815002441, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:27] (step=0002971) Train Loss: 0.2506, Train Steps/Sec: 0.13, Epoch: 0.057734162456276716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2972, "loss": 0.22124984860420227, "memory_gb": 7.721559524536133, "step_time_ms": 7530.200719833374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:35] (step=0002972) Train Loss: 0.2678, Train Steps/Sec: 0.12, Epoch: 0.05775359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2973, "loss": 0.15207108855247498, "memory_gb": 7.721559524536133, "step_time_ms": 7486.51909828186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:43] (step=0002973) Train Loss: 0.2132, Train Steps/Sec: 0.12, Epoch: 0.05777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2974, "loss": 0.21331903338432312, "memory_gb": 7.721559524536133, "step_time_ms": 7446.157693862915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:51] (step=0002974) Train Loss: 0.1941, Train Steps/Sec: 0.13, Epoch: 0.057792460163233576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:31:59] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 2975, "loss": 0.2529146075248718, "memory_gb": 7.721559524536133, "step_time_ms": 7531.612396240234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:31:59] (step=0002975) Train Loss: 0.2788, Train Steps/Sec: 0.12, Epoch: 0.0578118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:32:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 2976, "loss": 0.2266179323196411, "memory_gb": 7.721559524536133, "step_time_ms": 7457.470893859863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:32:07] (step=0002976) Train Loss: 0.2710, Train Steps/Sec: 0.12, Epoch: 0.05783132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:32:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 2977, "loss": 0.3160276412963867, "memory_gb": 7.721559524536133, "step_time_ms": 7458.589792251587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:32:16] (step=0002977) Train Loss: 0.2440, Train Steps/Sec: 0.12, Epoch: 0.057850757870190436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:32:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 2978, "loss": 0.0931713879108429, "memory_gb": 7.721559524536133, "step_time_ms": 7477.985143661499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:32:24] (step=0002978) Train Loss: 0.1276, Train Steps/Sec: 0.12, Epoch: 0.05787019043917606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:32:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 2979, "loss": 0.19589385390281677, "memory_gb": 7.721559524536133, "step_time_ms": 7410.550117492676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:32:32] (step=0002979) Train Loss: 0.1726, Train Steps/Sec: 0.12, Epoch: 0.05788962300816168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:32:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 2980, "loss": 0.2150912880897522, "memory_gb": 7.721559524536133, "step_time_ms": 7400.198221206665, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:32:39] 
(step=0002980) Train Loss: 0.2643, Train Steps/Sec: 0.13, Epoch: 0.057909055577147296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:32:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 2981, "loss": 0.2252286672592163, "memory_gb": 7.721559524536133, "step_time_ms": 7535.60209274292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:32:48] (step=0002981) Train Loss: 0.2411, Train Steps/Sec: 0.12, Epoch: 0.05792848814613292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:32:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 2982, "loss": 0.29044216871261597, "memory_gb": 7.721559524536133, "step_time_ms": 7424.445390701294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:32:56] (step=0002982) Train Loss: 0.2709, Train Steps/Sec: 0.13, Epoch: 0.05794792071511854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 2983, "loss": 0.19473464787006378, "memory_gb": 7.721559524536133, "step_time_ms": 7439.847707748413, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:04] (step=0002983) Train Loss: 0.2420, Train Steps/Sec: 0.13, Epoch: 0.057967353284104156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 2984, "loss": 0.28091374039649963, "memory_gb": 7.721559524536133, "step_time_ms": 7503.560304641724, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:11] (step=0002984) Train Loss: 0.3078, Train Steps/Sec: 0.13, Epoch: 0.05798678585308978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 2985, "loss": 0.3266306221485138, "memory_gb": 7.721559524536133, "step_time_ms": 7430.448532104492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:19] (step=0002985) Train Loss: 0.2842, Train Steps/Sec: 0.13, Epoch: 0.0580062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:27] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 2986, "loss": 0.20759421586990356, "memory_gb": 7.721559524536133, "step_time_ms": 7456.380128860474, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:27] (step=0002986) Train Loss: 0.1757, Train Steps/Sec: 0.13, Epoch: 0.058025650991061016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 2987, "loss": 0.2700437903404236, "memory_gb": 7.721559524536133, "step_time_ms": 7549.298048019409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:35] (step=0002987) Train Loss: 0.2589, Train Steps/Sec: 0.12, Epoch: 0.05804508356004664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 2988, "loss": 0.3388078808784485, "memory_gb": 7.721559524536133, "step_time_ms": 7421.8909740448, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:43] (step=0002988) Train Loss: 0.2824, Train Steps/Sec: 0.13, Epoch: 0.05806451612903226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 2989, "loss": 0.3253733217716217, "memory_gb": 7.721559524536133, "step_time_ms": 7291.829824447632, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:51] (step=0002989) Train Loss: 0.3321, Train Steps/Sec: 0.13, Epoch: 0.058083948698017876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:33:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 2990, "loss": 0.18745824694633484, "memory_gb": 7.721559524536133, "step_time_ms": 7514.326810836792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:33:59] (step=0002990) Train Loss: 0.1672, Train Steps/Sec: 0.12, Epoch: 0.0581033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 2991, "loss": 0.14945411682128906, "memory_gb": 7.721559524536133, "step_time_ms": 5070.08695602417, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 00:34:05] (step=0002991) Train Loss: 0.2077, Train Steps/Sec: 0.17, Epoch: 0.05812281383598912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 2992, "loss": 0.29192066192626953, "memory_gb": 7.721559524536133, "step_time_ms": 7498.915433883667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:34:13] (step=0002992) Train Loss: 0.2718, Train Steps/Sec: 0.12, Epoch: 0.058142246404974736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:34:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 2993, "loss": 0.2807892858982086, "memory_gb": 7.721559524536133, "step_time_ms": 7474.309682846069, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:34:21] (step=0002993) Train Loss: 0.2063, Train Steps/Sec: 0.12, Epoch: 0.05816167897396036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:34:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 2994, "loss": 0.26333215832710266, "memory_gb": 7.721559524536133, "step_time_ms": 7450.325965881348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:34:29] (step=0002994) Train Loss: 0.2382, Train Steps/Sec: 0.12, Epoch: 0.05818111154294598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:34:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 2995, "loss": 0.15681082010269165, "memory_gb": 7.721559524536133, "step_time_ms": 7536.180734634399, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:34:37] (step=0002995) Train Loss: 0.1534, Train Steps/Sec: 0.12, Epoch: 0.058200544111931596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 2996, "loss": 0.20295044779777527, "memory_gb": 7.721559524536133, "step_time_ms": 7533.596992492676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:34:45] (step=0002996) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.05821997668091722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
00:34:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 2997, "loss": 0.2165926694869995, "memory_gb": 7.721559524536133, "step_time_ms": 7489.563465118408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:34:53] (step=0002997) Train Loss: 0.2770, Train Steps/Sec: 0.12, Epoch: 0.05823940924990284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 2998, "loss": 0.22943279147148132, "memory_gb": 7.721559524536133, "step_time_ms": 7545.623779296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:35:01] (step=0002998) Train Loss: 0.2310, Train Steps/Sec: 0.12, Epoch: 0.058258841818888456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 2999, "loss": 0.28877073526382446, "memory_gb": 7.721559524536133, "step_time_ms": 7546.148061752319, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:35:09] (step=0002999) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.05827827438787408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3000, "loss": 0.26645076274871826, "memory_gb": 7.721559524536133, "step_time_ms": 7491.535186767578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:35:17] (step=0003000) Train Loss: 0.2563, Train Steps/Sec: 0.13, Epoch: 0.05829770695685969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3001, "loss": 0.3673997223377228, "memory_gb": 7.721559524536133, "step_time_ms": 7585.553169250488, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:35:25] (step=0003001) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 0.058317139525845316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3002, "loss": 0.30607643723487854, "memory_gb": 7.721559524536133, "step_time_ms": 7556.196689605713, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 00:35:34] (step=0003002) Train Loss: 0.2147, Train Steps/Sec: 0.12, Epoch: 0.05833657209483094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3003, "loss": 0.25603196024894714, "memory_gb": 7.721559524536133, "step_time_ms": 7481.364011764526, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:35:42] (step=0003003) Train Loss: 0.2417, Train Steps/Sec: 0.12, Epoch: 0.05835600466381655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3004, "loss": 0.24756965041160583, "memory_gb": 7.721559524536133, "step_time_ms": 7551.609516143799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:35:50] (step=0003004) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.058375437232802176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3005, "loss": 0.24329861998558044, "memory_gb": 7.721559524536133, "step_time_ms": 7482.719898223877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:35:58] (step=0003005) Train Loss: 0.2664, Train Steps/Sec: 0.12, Epoch: 0.0583948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:36:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3006, "loss": 0.2868729531764984, "memory_gb": 7.721559524536133, "step_time_ms": 7447.386741638184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:36:06] (step=0003006) Train Loss: 0.2526, Train Steps/Sec: 0.12, Epoch: 0.05841430237077341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:36:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3007, "loss": 0.2523345649242401, "memory_gb": 7.721559524536133, "step_time_ms": 7584.125757217407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:36:14] (step=0003007) Train Loss: 0.2759, Train Steps/Sec: 0.12, Epoch: 0.058433734939759036, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 00:36:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3008, "loss": 0.250921368598938, "memory_gb": 7.721559524536133, "step_time_ms": 7520.93243598938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:36:22] (step=0003008) Train Loss: 0.2151, Train Steps/Sec: 0.13, Epoch: 0.05845316750874466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:36:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3009, "loss": 0.2849443852901459, "memory_gb": 7.721559524536133, "step_time_ms": 7514.561176300049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:36:30] (step=0003009) Train Loss: 0.2526, Train Steps/Sec: 0.12, Epoch: 0.05847260007773027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:36:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3010, "loss": 0.3362756073474884, "memory_gb": 7.721559524536133, "step_time_ms": 7548.364877700806, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:36:38] (step=0003010) Train Loss: 0.3077, Train Steps/Sec: 0.12, Epoch: 0.058492032646715895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:36:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3011, "loss": 0.3363005518913269, "memory_gb": 7.721559524536133, "step_time_ms": 7560.396194458008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:36:46] (step=0003011) Train Loss: 0.2783, Train Steps/Sec: 0.13, Epoch: 0.05851146521570152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:36:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3012, "loss": 0.18333417177200317, "memory_gb": 7.721559524536133, "step_time_ms": 7282.720327377319, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:36:54] (step=0003012) Train Loss: 0.1844, Train Steps/Sec: 0.12, Epoch: 0.05853089778468713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:37:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3013, "loss": 0.17706377804279327, "memory_gb": 7.721559524536133, "step_time_ms": 7618.745565414429, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:02] (step=0003013) Train Loss: 0.2396, Train Steps/Sec: 0.12, Epoch: 0.058550330353672755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:37:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3014, "loss": 0.1432662159204483, "memory_gb": 7.721559524536133, "step_time_ms": 7533.966779708862, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:10] (step=0003014) Train Loss: 0.1999, Train Steps/Sec: 0.12, Epoch: 0.05856976292265838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:37:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3015, "loss": 0.22283576428890228, "memory_gb": 7.721559524536133, "step_time_ms": 7641.557693481445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:18] (step=0003015) Train Loss: 0.2052, Train Steps/Sec: 0.12, Epoch: 0.05858919549164399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:37:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3016, "loss": 0.2255384474992752, "memory_gb": 7.721559524536133, "step_time_ms": 7536.484241485596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:26] (step=0003016) Train Loss: 0.2215, Train Steps/Sec: 0.12, Epoch: 0.058608628060629615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:37:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3017, "loss": 0.19300806522369385, "memory_gb": 7.721559524536133, "step_time_ms": 7494.579076766968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:34] (step=0003017) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.05862806062961524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:37:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3018, "loss": 0.22267526388168335, "memory_gb": 7.721559524536133, "step_time_ms": 7363.309621810913, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:42] (step=0003018) Train Loss: 0.1977, Train Steps/Sec: 0.13, Epoch: 0.05864749319860085, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 00:37:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3019, "loss": 0.26975947618484497, "memory_gb": 7.721559524536133, "step_time_ms": 7510.298490524292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:50] (step=0003019) Train Loss: 0.2327, Train Steps/Sec: 0.13, Epoch: 0.058666925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:37:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3020, "loss": 0.18423429131507874, "memory_gb": 7.721559524536133, "step_time_ms": 5329.103946685791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:37:56] (step=0003020) Train Loss: 0.2777, Train Steps/Sec: 0.17, Epoch: 0.0586863583365721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:38:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3021, "loss": 0.2519184947013855, "memory_gb": 7.721559524536133, "step_time_ms": 7636.875867843628, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:38:04] (step=0003021) Train Loss: 0.2353, Train Steps/Sec: 0.12, Epoch: 0.05870579090555771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:38:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3022, "loss": 0.2951897084712982, "memory_gb": 7.721559524536133, "step_time_ms": 7544.0473556518555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:38:12] (step=0003022) Train Loss: 0.2740, Train Steps/Sec: 0.12, Epoch: 0.058725223474543335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:38:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3023, "loss": 0.31468456983566284, "memory_gb": 7.721559524536133, "step_time_ms": 7498.262882232666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:38:20] (step=0003023) Train Loss: 0.2777, Train Steps/Sec: 0.13, Epoch: 0.05874465604352896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:38:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3024, "loss": 0.3122304677963257, "memory_gb": 7.721559524536133, "step_time_ms": 
7594.565391540527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:38:28] (step=0003024) Train Loss: 0.2868, Train Steps/Sec: 0.12, Epoch: 0.05876408861251457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:38:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3025, "loss": 0.24879196286201477, "memory_gb": 7.721559524536133, "step_time_ms": 7537.031412124634, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:38:36] (step=0003025) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.058783521181500195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:38:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3026, "loss": 0.15560665726661682, "memory_gb": 7.721559524536133, "step_time_ms": 7464.224815368652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:38:44] (step=0003026) Train Loss: 0.2126, Train Steps/Sec: 0.13, Epoch: 0.05880295375048582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:38:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3027, "loss": 0.2404547929763794, "memory_gb": 7.721559524536133, "step_time_ms": 7546.809434890747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:38:52] (step=0003027) Train Loss: 0.2549, Train Steps/Sec: 0.13, Epoch: 0.05882238631947143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3028, "loss": 0.1996840238571167, "memory_gb": 7.721559524536133, "step_time_ms": 7508.450984954834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:00] (step=0003028) Train Loss: 0.2613, Train Steps/Sec: 0.12, Epoch: 0.058841818888457055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3029, "loss": 0.22546032071113586, "memory_gb": 7.721559524536133, "step_time_ms": 7451.323747634888, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:08] (step=0003029) Train Loss: 0.2521, Train Steps/Sec: 0.12, Epoch: 0.05886125145744268, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3030, "loss": 0.25172457098960876, "memory_gb": 7.721559524536133, "step_time_ms": 7531.905174255371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:16] (step=0003030) Train Loss: 0.2298, Train Steps/Sec: 0.12, Epoch: 0.05888068402642829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3031, "loss": 0.1422695815563202, "memory_gb": 7.721559524536133, "step_time_ms": 7498.186349868774, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:24] (step=0003031) Train Loss: 0.1825, Train Steps/Sec: 0.12, Epoch: 0.058900116595413915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3032, "loss": 0.2599920928478241, "memory_gb": 7.721559524536133, "step_time_ms": 7476.208209991455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:32] (step=0003032) Train Loss: 0.2720, Train Steps/Sec: 0.12, Epoch: 0.05891954916439953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3033, "loss": 0.13290302455425262, "memory_gb": 7.721559524536133, "step_time_ms": 7502.892017364502, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:40] (step=0003033) Train Loss: 0.1544, Train Steps/Sec: 0.12, Epoch: 0.05893898173338515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3034, "loss": 0.21367913484573364, "memory_gb": 7.721559524536133, "step_time_ms": 7461.312770843506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:48] (step=0003034) Train Loss: 0.2146, Train Steps/Sec: 0.13, Epoch: 0.058958414302370775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:39:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3035, "loss": 0.30437910556793213, "memory_gb": 7.721559524536133, 
"step_time_ms": 7413.572549819946, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:39:56] (step=0003035) Train Loss: 0.2889, Train Steps/Sec: 0.13, Epoch: 0.05897784687135639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:40:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3036, "loss": 0.3126528263092041, "memory_gb": 7.721559524536133, "step_time_ms": 7461.444854736328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:40:04] (step=0003036) Train Loss: 0.2486, Train Steps/Sec: 0.13, Epoch: 0.05899727944034201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3037, "loss": 0.3367420732975006, "memory_gb": 7.721559524536133, "step_time_ms": 7461.901426315308, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:40:12] (step=0003037) Train Loss: 0.3188, Train Steps/Sec: 0.13, Epoch: 0.059016712009327635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:40:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3038, "loss": 0.21557411551475525, "memory_gb": 7.721559524536133, "step_time_ms": 7429.100036621094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:40:20] (step=0003038) Train Loss: 0.2095, Train Steps/Sec: 0.13, Epoch: 0.05903614457831325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:40:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3039, "loss": 0.27807673811912537, "memory_gb": 7.721559524536133, "step_time_ms": 7504.993200302124, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:40:28] (step=0003039) Train Loss: 0.2637, Train Steps/Sec: 0.12, Epoch: 0.05905557714729887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:40:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3040, "loss": 0.18219655752182007, "memory_gb": 7.721559524536133, "step_time_ms": 7513.3209228515625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:40:36] (step=0003040) Train Loss: 0.1764, Train Steps/Sec: 0.12, Epoch: 
0.059075009716284495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3041, "loss": 0.22878748178482056, "memory_gb": 7.721559524536133, "step_time_ms": 7463.174819946289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:40:44] (step=0003041) Train Loss: 0.2303, Train Steps/Sec: 0.12, Epoch: 0.05909444228527011, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:40:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3042, "loss": 0.24536745250225067, "memory_gb": 7.715639114379883, "step_time_ms": 7427.5712966918945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:40:52] (step=0003042) Train Loss: 0.2268, Train Steps/Sec: 0.13, Epoch: 0.05911387485425573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3043, "loss": 0.26180094480514526, "memory_gb": 7.721559524536133, "step_time_ms": 7454.410076141357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:00] (step=0003043) Train Loss: 0.2240, Train Steps/Sec: 0.12, Epoch: 0.059133307423241355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3044, "loss": 0.25608357787132263, "memory_gb": 7.721559524536133, "step_time_ms": 7374.632358551025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:08] (step=0003044) Train Loss: 0.2759, Train Steps/Sec: 0.12, Epoch: 0.05915273999222697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3045, "loss": 0.2523336112499237, "memory_gb": 7.721559524536133, "step_time_ms": 7422.127723693848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:16] (step=0003045) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.05917217256121259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3046, "loss": 0.174143448472023, 
"memory_gb": 7.721559524536133, "step_time_ms": 7541.043758392334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:24] (step=0003046) Train Loss: 0.2128, Train Steps/Sec: 0.12, Epoch: 0.059191605130198215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3047, "loss": 0.2193892002105713, "memory_gb": 7.721559524536133, "step_time_ms": 7302.947998046875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:32] (step=0003047) Train Loss: 0.2057, Train Steps/Sec: 0.13, Epoch: 0.05921103769918383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3048, "loss": 0.2593788504600525, "memory_gb": 7.721559524536133, "step_time_ms": 6882.368326187134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:39] (step=0003048) Train Loss: 0.2453, Train Steps/Sec: 0.14, Epoch: 0.05923047026816945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3049, "loss": 0.22546525299549103, "memory_gb": 7.721559524536133, "step_time_ms": 6240.121364593506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:46] (step=0003049) Train Loss: 0.1732, Train Steps/Sec: 0.15, Epoch: 0.059249902837155075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:41:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3050, "loss": 0.25904104113578796, "memory_gb": 7.721559524536133, "step_time_ms": 7476.43780708313, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:41:54] (step=0003050) Train Loss: 0.2945, Train Steps/Sec: 0.12, Epoch: 0.05926933540614069, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3051, "loss": 0.23615029454231262, "memory_gb": 7.721559524536133, "step_time_ms": 7492.739200592041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:02] (step=0003051) Train Loss: 0.2477, 
Train Steps/Sec: 0.12, Epoch: 0.05928876797512631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3052, "loss": 0.22495070099830627, "memory_gb": 7.721559524536133, "step_time_ms": 7454.941749572754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:10] (step=0003052) Train Loss: 0.1901, Train Steps/Sec: 0.12, Epoch: 0.059308200544111934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3053, "loss": 0.2356439083814621, "memory_gb": 7.721559524536133, "step_time_ms": 7456.207275390625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:18] (step=0003053) Train Loss: 0.1775, Train Steps/Sec: 0.12, Epoch: 0.05932763311309755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3054, "loss": 0.22255930304527283, "memory_gb": 7.721559524536133, "step_time_ms": 7484.124422073364, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:26] (step=0003054) Train Loss: 0.2743, Train Steps/Sec: 0.12, Epoch: 0.05934706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3055, "loss": 0.1917388141155243, "memory_gb": 7.721559524536133, "step_time_ms": 7420.033693313599, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:34] (step=0003055) Train Loss: 0.1591, Train Steps/Sec: 0.12, Epoch: 0.059366498251068794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3056, "loss": 0.26881811022758484, "memory_gb": 7.721559524536133, "step_time_ms": 7528.123140335083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:42] (step=0003056) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.05938593082005441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3057, "loss": 
0.31069469451904297, "memory_gb": 7.721559524536133, "step_time_ms": 7499.595642089844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:50] (step=0003057) Train Loss: 0.2961, Train Steps/Sec: 0.12, Epoch: 0.05940536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:42:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3058, "loss": 0.26399126648902893, "memory_gb": 7.721559524536133, "step_time_ms": 7422.8551387786865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:42:58] (step=0003058) Train Loss: 0.2712, Train Steps/Sec: 0.12, Epoch: 0.059424795958025654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:43:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3059, "loss": 0.26854878664016724, "memory_gb": 7.721559524536133, "step_time_ms": 7488.092422485352, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:43:06] (step=0003059) Train Loss: 0.2676, Train Steps/Sec: 0.12, Epoch: 0.05944422852701127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:43:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3060, "loss": 0.30777668952941895, "memory_gb": 7.721559524536133, "step_time_ms": 7569.792032241821, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:43:15] (step=0003060) Train Loss: 0.2909, Train Steps/Sec: 0.12, Epoch: 0.05946366109599689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3061, "loss": 0.1706099510192871, "memory_gb": 7.721559524536133, "step_time_ms": 7490.687131881714, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:43:23] (step=0003061) Train Loss: 0.1885, Train Steps/Sec: 0.12, Epoch: 0.05948309366498251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:43:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3062, "loss": 0.2666368782520294, "memory_gb": 7.721559524536133, "step_time_ms": 7517.411470413208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:43:31] (step=0003062) 
Train Loss: 0.2560, Train Steps/Sec: 0.13, Epoch: 0.05950252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:43:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3063, "loss": 0.3474052846431732, "memory_gb": 7.715639114379883, "step_time_ms": 7691.203355789185, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:43:39] (step=0003063) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.05952195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:43:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3064, "loss": 0.2808605134487152, "memory_gb": 7.721559524536133, "step_time_ms": 7500.258684158325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:43:47] (step=0003064) Train Loss: 0.2874, Train Steps/Sec: 0.12, Epoch: 0.05954139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:43:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3065, "loss": 0.2845652997493744, "memory_gb": 7.721559524536133, "step_time_ms": 7526.747703552246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:43:55] (step=0003065) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.05956082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3066, "loss": 0.17786382138729095, "memory_gb": 7.721559524536133, "step_time_ms": 7585.796356201172, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:03] (step=0003066) Train Loss: 0.1566, Train Steps/Sec: 0.12, Epoch: 0.05958025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3067, "loss": 0.29744112491607666, "memory_gb": 7.721559524536133, "step_time_ms": 7528.852701187134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:11] (step=0003067) Train Loss: 0.2679, Train Steps/Sec: 0.13, Epoch: 0.05959968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:19] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 3068, "loss": 0.26342451572418213, "memory_gb": 7.721559524536133, "step_time_ms": 7556.751012802124, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:19] (step=0003068) Train Loss: 0.2173, Train Steps/Sec: 0.12, Epoch: 0.05961912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3069, "loss": 0.16622020304203033, "memory_gb": 7.721559524536133, "step_time_ms": 7578.762531280518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:27] (step=0003069) Train Loss: 0.1949, Train Steps/Sec: 0.12, Epoch: 0.05963855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3070, "loss": 0.3134635090827942, "memory_gb": 7.721559524536133, "step_time_ms": 7578.366041183472, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:35] (step=0003070) Train Loss: 0.2736, Train Steps/Sec: 0.12, Epoch: 0.05965798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3071, "loss": 0.24216455221176147, "memory_gb": 7.721559524536133, "step_time_ms": 7520.651817321777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:43] (step=0003071) Train Loss: 0.2034, Train Steps/Sec: 0.12, Epoch: 0.05967741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3072, "loss": 0.2937394380569458, "memory_gb": 7.721559524536133, "step_time_ms": 7668.35880279541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:51] (step=0003072) Train Loss: 0.2461, Train Steps/Sec: 0.12, Epoch: 0.05969685192382433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:44:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3073, "loss": 0.2379944771528244, "memory_gb": 7.721559524536133, "step_time_ms": 7556.696891784668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:44:59] 
(step=0003073) Train Loss: 0.3102, Train Steps/Sec: 0.12, Epoch: 0.05971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:45:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3074, "loss": 0.30237534642219543, "memory_gb": 7.721559524536133, "step_time_ms": 7464.926242828369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:45:07] (step=0003074) Train Loss: 0.2861, Train Steps/Sec: 0.12, Epoch: 0.05973571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:45:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3075, "loss": 0.30122271180152893, "memory_gb": 7.721559524536133, "step_time_ms": 7620.279550552368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:45:15] (step=0003075) Train Loss: 0.2952, Train Steps/Sec: 0.12, Epoch: 0.05975514963078119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:45:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3076, "loss": 0.17591829597949982, "memory_gb": 7.721559524536133, "step_time_ms": 7411.36622428894, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:45:23] (step=0003076) Train Loss: 0.1900, Train Steps/Sec: 0.13, Epoch: 0.05977458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:45:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3077, "loss": 0.28100845217704773, "memory_gb": 7.721559524536133, "step_time_ms": 6239.962816238403, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:45:30] (step=0003077) Train Loss: 0.2390, Train Steps/Sec: 0.15, Epoch: 0.05979401476875243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:45:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3078, "loss": 0.25860172510147095, "memory_gb": 7.721559524536133, "step_time_ms": 6741.394758224487, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:45:37] (step=0003078) Train Loss: 0.2268, Train Steps/Sec: 0.13, Epoch: 0.05981344733773805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:45:45] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 3079, "loss": 0.2753086984157562, "memory_gb": 7.721559524536133, "step_time_ms": 7528.610467910767, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:45:45] (step=0003079) Train Loss: 0.2782, Train Steps/Sec: 0.12, Epoch: 0.05983287990672367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:45:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3080, "loss": 0.30202895402908325, "memory_gb": 7.721559524536133, "step_time_ms": 7616.986274719238, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:45:53] (step=0003080) Train Loss: 0.2432, Train Steps/Sec: 0.12, Epoch: 0.05985231247570929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3081, "loss": 0.3617388904094696, "memory_gb": 7.721559524536133, "step_time_ms": 7506.37412071228, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:46:01] (step=0003081) Train Loss: 0.3224, Train Steps/Sec: 0.12, Epoch: 0.05987174504469491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3082, "loss": 0.17622731626033783, "memory_gb": 7.721559524536133, "step_time_ms": 7506.863832473755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:46:09] (step=0003082) Train Loss: 0.2414, Train Steps/Sec: 0.13, Epoch: 0.05989117761368053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3083, "loss": 0.2082807719707489, "memory_gb": 7.721559524536133, "step_time_ms": 7598.352909088135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:46:17] (step=0003083) Train Loss: 0.2033, Train Steps/Sec: 0.12, Epoch: 0.05991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3084, "loss": 0.2536862790584564, "memory_gb": 7.721559524536133, "step_time_ms": 7543.77818107605, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 00:46:26] (step=0003084) Train Loss: 0.2487, Train Steps/Sec: 0.12, Epoch: 0.05993004275165177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3085, "loss": 0.2902963161468506, "memory_gb": 7.721559524536133, "step_time_ms": 7470.512866973877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:46:34] (step=0003085) Train Loss: 0.3147, Train Steps/Sec: 0.12, Epoch: 0.05994947532063739, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3086, "loss": 0.2814999520778656, "memory_gb": 7.721559524536133, "step_time_ms": 7514.740705490112, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:46:42] (step=0003086) Train Loss: 0.3000, Train Steps/Sec: 0.12, Epoch: 0.05996890788962301, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3087, "loss": 0.35270705819129944, "memory_gb": 7.721559524536133, "step_time_ms": 7446.211576461792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:46:50] (step=0003087) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.05998834045860863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:46:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3088, "loss": 0.28027859330177307, "memory_gb": 7.721559524536133, "step_time_ms": 7313.953399658203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:46:58] (step=0003088) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 0.06000777302759425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:47:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3089, "loss": 0.18585185706615448, "memory_gb": 7.721559524536133, "step_time_ms": 7526.5820026397705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:47:06] (step=0003089) Train Loss: 0.2518, Train Steps/Sec: 0.12, Epoch: 0.06002720559657987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:47:14] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 3090, "loss": 0.23553381860256195, "memory_gb": 7.721559524536133, "step_time_ms": 7419.542551040649, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:47:14] (step=0003090) Train Loss: 0.2506, Train Steps/Sec: 0.12, Epoch: 0.060046638165565484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:47:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3091, "loss": 0.21751604974269867, "memory_gb": 7.721559524536133, "step_time_ms": 7436.878442764282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:47:22] (step=0003091) Train Loss: 0.1958, Train Steps/Sec: 0.12, Epoch: 0.06006607073455111, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:47:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3092, "loss": 0.3279147744178772, "memory_gb": 7.721559524536133, "step_time_ms": 7526.440620422363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:47:30] (step=0003092) Train Loss: 0.2898, Train Steps/Sec: 0.12, Epoch: 0.06008550330353673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:47:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3093, "loss": 0.17291325330734253, "memory_gb": 7.721559524536133, "step_time_ms": 7463.869571685791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:47:38] (step=0003093) Train Loss: 0.1331, Train Steps/Sec: 0.12, Epoch: 0.060104935872522344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:47:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3094, "loss": 0.22145403921604156, "memory_gb": 7.721559524536133, "step_time_ms": 7460.991621017456, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:47:46] (step=0003094) Train Loss: 0.2153, Train Steps/Sec: 0.12, Epoch: 0.06012436844150797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:47:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3095, "loss": 0.2055715024471283, "memory_gb": 7.721559524536133, "step_time_ms": 7518.505334854126, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 00:47:54] (step=0003095) Train Loss: 0.1870, Train Steps/Sec: 0.12, Epoch: 0.06014380101049359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:48:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3096, "loss": 0.3149285316467285, "memory_gb": 7.721559524536133, "step_time_ms": 7484.44128036499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:48:02] (step=0003096) Train Loss: 0.2413, Train Steps/Sec: 0.12, Epoch: 0.060163233579479204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:48:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3097, "loss": 0.29842039942741394, "memory_gb": 7.721559524536133, "step_time_ms": 7510.052919387817, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:48:10] (step=0003097) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 0.060182666148464826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:48:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3098, "loss": 0.3504525423049927, "memory_gb": 7.721559524536133, "step_time_ms": 7491.774797439575, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:48:18] (step=0003098) Train Loss: 0.3453, Train Steps/Sec: 0.12, Epoch: 0.06020209871745045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:48:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3099, "loss": 0.3262724280357361, "memory_gb": 7.721559524536133, "step_time_ms": 7443.412780761719, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:48:26] (step=0003099) Train Loss: 0.2567, Train Steps/Sec: 0.12, Epoch: 0.060221531286436064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 00:48:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3100, "loss": 0.19557979702949524, "memory_gb": 7.721559524536133, "step_time_ms": 7437.017202377319, "trainable_params": 4718592, "method": "lora"} [2025-07-29 00:48:34] (step=0003100) Train Loss: 0.1832, Train Steps/Sec: 0.12, Epoch: 0.060240963855421686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 00:48:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3101, "loss": 0.20786048471927643, "memory_gb": 7.721559524536133, "step_time_ms": 7463.618755340576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:48:42] (step=0003101) Train Loss: 0.2413, Train Steps/Sec: 0.13, Epoch: 0.06026039642440731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:48:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3102, "loss": 0.2646002769470215, "memory_gb": 7.721559524536133, "step_time_ms": 7459.797620773315, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:48:50] (step=0003102) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.060279828993392924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:48:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3103, "loss": 0.2585667371749878, "memory_gb": 7.721559524536133, "step_time_ms": 7504.876136779785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:48:59] (step=0003103) Train Loss: 0.3160, Train Steps/Sec: 0.12, Epoch: 0.060299261562378546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:49:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3104, "loss": 0.282371461391449, "memory_gb": 7.721559524536133, "step_time_ms": 7316.18332862854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:49:06] (step=0003104) Train Loss: 0.2616, Train Steps/Sec: 0.13, Epoch: 0.06031869413136417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:49:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3105, "loss": 0.21537987887859344, "memory_gb": 7.721559524536133, "step_time_ms": 7334.638357162476, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:49:14] (step=0003105) Train Loss: 0.2198, Train Steps/Sec: 0.13, Epoch: 0.060338126700349784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:49:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3106, "loss": 0.2599662244319916, "memory_gb": 7.721559524536133, "step_time_ms": 5684.381723403931, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:49:20] (step=0003106) Train Loss: 0.2303, Train Steps/Sec: 0.17, Epoch: 0.060357559269335406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:49:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3107, "loss": 0.15133273601531982, "memory_gb": 7.721559524536133, "step_time_ms": 7139.111757278442, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:49:28] (step=0003107) Train Loss: 0.1847, Train Steps/Sec: 0.13, Epoch: 0.06037699183832103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:49:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3108, "loss": 0.15974724292755127, "memory_gb": 7.721559524536133, "step_time_ms": 7393.890619277954, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:49:36] (step=0003108) Train Loss: 0.2318, Train Steps/Sec: 0.12, Epoch: 0.060396424407306644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:49:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3109, "loss": 0.27410516142845154, "memory_gb": 7.721559524536133, "step_time_ms": 7505.50651550293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:49:44] (step=0003109) Train Loss: 0.2437, Train Steps/Sec: 0.12, Epoch: 0.060415856976292266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:49:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3110, "loss": 0.2655949592590332, "memory_gb": 7.721559524536133, "step_time_ms": 7411.560773849487, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:49:52] (step=0003110) Train Loss: 0.2327, Train Steps/Sec: 0.13, Epoch: 0.06043528954527789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3111, "loss": 0.31487852334976196, "memory_gb": 7.721559524536133, "step_time_ms": 7399.026393890381, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:00] (step=0003111) Train Loss: 0.2718, Train Steps/Sec: 0.12, Epoch: 0.060454722114263504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3112, "loss": 0.2726457417011261, "memory_gb": 7.721559524536133, "step_time_ms": 7329.1802406311035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:08] (step=0003112) Train Loss: 0.2450, Train Steps/Sec: 0.12, Epoch: 0.060474154683249126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3113, "loss": 0.22843021154403687, "memory_gb": 7.721559524536133, "step_time_ms": 7413.201093673706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:16] (step=0003113) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.06049358725223475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3114, "loss": 0.1437610685825348, "memory_gb": 7.721559524536133, "step_time_ms": 7420.642852783203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:24] (step=0003114) Train Loss: 0.2008, Train Steps/Sec: 0.12, Epoch: 0.060513019821220364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3115, "loss": 0.288047194480896, "memory_gb": 7.721559524536133, "step_time_ms": 7498.945951461792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:32] (step=0003115) Train Loss: 0.3207, Train Steps/Sec: 0.12, Epoch: 0.060532452390205986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3116, "loss": 0.2482529580593109, "memory_gb": 7.721559524536133, "step_time_ms": 7488.968372344971, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:40] (step=0003116) Train Loss: 0.2371, Train Steps/Sec: 0.12, Epoch: 0.06055188495919161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3117, "loss": 0.18228405714035034, "memory_gb": 7.715639114379883, "step_time_ms": 7430.462121963501, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:48] (step=0003117) Train Loss: 0.2178, Train Steps/Sec: 0.12, Epoch: 0.060571317528177224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:50:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3118, "loss": 0.27094390988349915, "memory_gb": 7.721559524536133, "step_time_ms": 7540.981769561768, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:50:57] (step=0003118) Train Loss: 0.2286, Train Steps/Sec: 0.12, Epoch: 0.060590750097162846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3119, "loss": 0.1820705235004425, "memory_gb": 7.721559524536133, "step_time_ms": 7563.684940338135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:51:05] (step=0003119) Train Loss: 0.2142, Train Steps/Sec: 0.12, Epoch: 0.06061018266614847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:51:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3120, "loss": 0.284282922744751, "memory_gb": 7.721559524536133, "step_time_ms": 7523.947954177856, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:51:13] (step=0003120) Train Loss: 0.2944, Train Steps/Sec: 0.12, Epoch: 0.060629615235134084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:51:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3121, "loss": 0.18824833631515503, "memory_gb": 7.721559524536133, "step_time_ms": 7574.461460113525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:51:21] (step=0003121) Train Loss: 0.2034, Train Steps/Sec: 0.12, Epoch: 0.060649047804119706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:51:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3122, "loss": 0.2432153970003128, "memory_gb": 7.721559524536133, "step_time_ms": 7576.7316818237305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:51:29] (step=0003122) Train Loss: 0.1982, Train Steps/Sec: 0.12, Epoch: 0.06066848037310532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:51:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3123, "loss": 0.22162196040153503, "memory_gb": 7.721559524536133, "step_time_ms": 7512.185573577881, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:51:37] (step=0003123) Train Loss: 0.2867, Train Steps/Sec: 0.12, Epoch: 0.060687912942090944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:51:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3124, "loss": 0.22375501692295074, "memory_gb": 7.721559524536133, "step_time_ms": 7594.376564025879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:51:45] (step=0003124) Train Loss: 0.1955, Train Steps/Sec: 0.12, Epoch: 0.060707345511076566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:51:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3125, "loss": 0.2533271908760071, "memory_gb": 7.721559524536133, "step_time_ms": 7589.048862457275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:51:53] (step=0003125) Train Loss: 0.2378, Train Steps/Sec: 0.12, Epoch: 0.06072677808006218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3126, "loss": 0.19296616315841675, "memory_gb": 7.721559524536133, "step_time_ms": 7547.515630722046, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:01] (step=0003126) Train Loss: 0.1847, Train Steps/Sec: 0.12, Epoch: 0.060746210649047803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3127, "loss": 0.14209291338920593, "memory_gb": 7.721559524536133, "step_time_ms": 7598.557710647583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:09] (step=0003127) Train Loss: 0.1707, Train Steps/Sec: 0.12, Epoch: 0.060765643218033426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3128, "loss": 0.24087414145469666, "memory_gb": 7.721559524536133, "step_time_ms": 7630.01012802124, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:17] (step=0003128) Train Loss: 0.2672, Train Steps/Sec: 0.12, Epoch: 0.06078507578701904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3129, "loss": 0.23954810202121735, "memory_gb": 7.721559524536133, "step_time_ms": 7551.440477371216, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:25] (step=0003129) Train Loss: 0.2483, Train Steps/Sec: 0.13, Epoch: 0.06080450835600466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3130, "loss": 0.2601560652256012, "memory_gb": 7.721559524536133, "step_time_ms": 7577.461957931519, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:33] (step=0003130) Train Loss: 0.2105, Train Steps/Sec: 0.12, Epoch: 0.060823940924990286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3131, "loss": 0.2621685266494751, "memory_gb": 7.721559524536133, "step_time_ms": 7596.340656280518, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:41] (step=0003131) Train Loss: 0.2225, Train Steps/Sec: 0.12, Epoch: 0.0608433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3132, "loss": 0.3015144169330597, "memory_gb": 7.721559524536133, "step_time_ms": 7555.148124694824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:49] (step=0003132) Train Loss: 0.3097, Train Steps/Sec: 0.12, Epoch: 0.06086280606296152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:52:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3133, "loss": 0.38189125061035156, "memory_gb": 7.721559524536133, "step_time_ms": 7435.011148452759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:52:57] (step=0003133) Train Loss: 0.2428, Train Steps/Sec: 0.13, Epoch: 0.060882238631947146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3134, "loss": 0.20806577801704407, "memory_gb": 7.721559524536133, "step_time_ms": 7536.522388458252, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:05] (step=0003134) Train Loss: 0.2222, Train Steps/Sec: 0.12, Epoch: 0.06090167120093276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3135, "loss": 0.34126016497612, "memory_gb": 7.721559524536133, "step_time_ms": 5263.48090171814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:11] (step=0003135) Train Loss: 0.3026, Train Steps/Sec: 0.18, Epoch: 0.06092110376991838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3136, "loss": 0.2735961675643921, "memory_gb": 7.721559524536133, "step_time_ms": 7507.909297943115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:19] (step=0003136) Train Loss: 0.2715, Train Steps/Sec: 0.12, Epoch: 0.060940536338904006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3137, "loss": 0.2597894072532654, "memory_gb": 7.721559524536133, "step_time_ms": 7438.0433559417725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:27] (step=0003137) Train Loss: 0.2339, Train Steps/Sec: 0.13, Epoch: 0.06095996890788962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3138, "loss": 0.20339348912239075, "memory_gb": 7.721559524536133, "step_time_ms": 7479.501485824585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:35] (step=0003138) Train Loss: 0.2813, Train Steps/Sec: 0.12, Epoch: 0.06097940147687524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3139, "loss": 0.22069412469863892, "memory_gb": 7.715639114379883, "step_time_ms": 7444.889545440674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:43] (step=0003139) Train Loss: 0.2262, Train Steps/Sec: 0.12, Epoch: 0.060998834045860865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3140, "loss": 0.3388078510761261, "memory_gb": 7.721559524536133, "step_time_ms": 7398.427486419678, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:51] (step=0003140) Train Loss: 0.3392, Train Steps/Sec: 0.13, Epoch: 0.06101826661484648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:53:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3141, "loss": 0.29128915071487427, "memory_gb": 7.721559524536133, "step_time_ms": 7440.218448638916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:53:59] (step=0003141) Train Loss: 0.2522, Train Steps/Sec: 0.13, Epoch: 0.0610376991838321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:54:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3142, "loss": 0.25035661458969116, "memory_gb": 7.721559524536133, "step_time_ms": 7481.724262237549, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:54:07] (step=0003142) Train Loss: 0.2360, Train Steps/Sec: 0.12, Epoch: 0.061057131752817725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3143, "loss": 0.27649158239364624, "memory_gb": 7.721559524536133, "step_time_ms": 7463.640213012695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:54:15] (step=0003143) Train Loss: 0.2359, Train Steps/Sec: 0.13, Epoch: 0.06107656432180334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:54:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3144, "loss": 0.39968541264533997, "memory_gb": 7.721559524536133, "step_time_ms": 7471.499681472778, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:54:23] (step=0003144) Train Loss: 0.2944, Train Steps/Sec: 0.12, Epoch: 0.06109599689078896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:54:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3145, "loss": 0.1604645550251007, "memory_gb": 7.721559524536133, "step_time_ms": 7435.487270355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:54:31] (step=0003145) Train Loss: 0.2095, Train Steps/Sec: 0.13, Epoch: 0.061115429459774585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:54:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3146, "loss": 0.13718703389167786, "memory_gb": 7.721559524536133, "step_time_ms": 7211.065292358398, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:54:39] (step=0003146) Train Loss: 0.1860, Train Steps/Sec: 0.13, Epoch: 0.0611348620287602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:54:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3147, "loss": 0.1364811360836029, "memory_gb": 7.721559524536133, "step_time_ms": 7464.722394943237, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:54:47] (step=0003147) Train Loss: 0.1734, Train Steps/Sec: 0.13, Epoch: 0.06115429459774582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3148, "loss": 0.2241651713848114, "memory_gb": 7.721559524536133, "step_time_ms": 7487.682104110718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:54:55] (step=0003148) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.061173727166731445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3149, "loss": 0.228080153465271, "memory_gb": 7.721559524536133, "step_time_ms": 7452.053070068359, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:03] (step=0003149) Train Loss: 0.2196, Train Steps/Sec: 0.12, Epoch: 0.06119315973571706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3150, "loss": 0.21995171904563904, "memory_gb": 7.721559524536133, "step_time_ms": 7623.922348022461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:11] (step=0003150) Train Loss: 0.2990, Train Steps/Sec: 0.12, Epoch: 0.06121259230470268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3151, "loss": 0.36892813444137573, "memory_gb": 7.721559524536133, "step_time_ms": 7470.076560974121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:19] (step=0003151) Train Loss: 0.2748, Train Steps/Sec: 0.13, Epoch: 0.0612320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3152, "loss": 0.218917578458786, "memory_gb": 7.721559524536133, "step_time_ms": 7426.451921463013, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:27] (step=0003152) Train Loss: 0.2425, Train Steps/Sec: 0.12, Epoch: 0.06125145744267392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3153, "loss": 0.339947909116745, "memory_gb": 7.721559524536133, "step_time_ms": 7464.677572250366, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:35] (step=0003153) Train Loss: 0.3406, Train Steps/Sec: 0.12, Epoch: 0.06127089001165954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3154, "loss": 0.18647602200508118, "memory_gb": 7.721559524536133, "step_time_ms": 7523.8306522369385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:43] (step=0003154) Train Loss: 0.2257, Train Steps/Sec: 0.12, Epoch: 0.06129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3155, "loss": 0.2895260453224182, "memory_gb": 7.721559524536133, "step_time_ms": 7466.860294342041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:51] (step=0003155) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.06130975514963078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:55:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3156, "loss": 0.3071952164173126, "memory_gb": 7.721559524536133, "step_time_ms": 7446.179628372192, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:55:59] (step=0003156) Train Loss: 0.2614, Train Steps/Sec: 0.12, Epoch: 0.0613291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:56:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3157, "loss": 0.1752939224243164, "memory_gb": 7.721559524536133, "step_time_ms": 7498.5480308532715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:56:07] (step=0003157) Train Loss: 0.2330, Train Steps/Sec: 0.12, Epoch: 0.06134862028760202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:56:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3158, "loss": 0.1657712608575821, "memory_gb": 7.721559524536133, "step_time_ms": 7417.621850967407, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:56:15] (step=0003158) Train Loss: 0.2023, Train Steps/Sec: 0.12, Epoch: 0.06136805285658764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:56:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3159, "loss": 0.25555747747421265, "memory_gb": 7.721559524536133, "step_time_ms": 7460.453510284424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:56:23] (step=0003159) Train Loss: 0.2376, Train Steps/Sec: 0.12, Epoch: 0.06138748542557326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:56:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3160, "loss": 0.29097533226013184, "memory_gb": 7.721559524536133, "step_time_ms": 7550.02498626709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:56:31] (step=0003160) Train Loss: 0.2573, Train Steps/Sec: 0.12, Epoch: 0.06140691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:56:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3161, "loss": 0.27224814891815186, "memory_gb": 7.721559524536133, "step_time_ms": 7536.842346191406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:56:39] (step=0003161) Train Loss: 0.2643, Train Steps/Sec: 0.12, Epoch: 0.0614263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:56:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3162, "loss": 0.23879729211330414, "memory_gb": 7.721559524536133, "step_time_ms": 7440.155506134033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:56:47] (step=0003162) Train Loss: 0.2008, Train Steps/Sec: 0.13, Epoch: 0.06144578313253012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:56:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3163, "loss": 0.26135724782943726, "memory_gb": 7.721559524536133, "step_time_ms": 7526.660203933716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:56:55] (step=0003163) Train Loss: 0.3223, Train Steps/Sec: 0.12, Epoch: 0.06146521570151574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3164, "loss": 0.1801469922065735, "memory_gb": 7.721559524536133, "step_time_ms": 5084.8658084869385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:01] (step=0003164) Train Loss: 0.2604, Train Steps/Sec: 0.17, Epoch: 0.06148464827050136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3165, "loss": 0.3408915102481842, "memory_gb": 7.721559524536133, "step_time_ms": 7606.300592422485, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:09] (step=0003165) Train Loss: 0.2627, Train Steps/Sec: 0.12, Epoch: 0.06150408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3166, "loss": 0.17130976915359497, "memory_gb": 7.721559524536133, "step_time_ms": 7548.388957977295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:17] (step=0003166) Train Loss: 0.2027, Train Steps/Sec: 0.13, Epoch: 0.0615235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3167, "loss": 0.2531871199607849, "memory_gb": 7.721559524536133, "step_time_ms": 7558.659315109253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:25] (step=0003167) Train Loss: 0.2563, Train Steps/Sec: 0.12, Epoch: 0.06154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3168, "loss": 0.28760647773742676, "memory_gb": 7.721559524536133, "step_time_ms": 7610.005140304565, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:33] (step=0003168) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.06156237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3169, "loss": 0.38099074363708496, "memory_gb": 7.715639114379883, "step_time_ms": 7484.891653060913, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:41] (step=0003169) Train Loss: 0.3085, Train Steps/Sec: 0.12, Epoch: 0.06158181111542946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3170, "loss": 0.1716383993625641, "memory_gb": 7.721559524536133, "step_time_ms": 7545.118808746338, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:49] (step=0003170) Train Loss: 0.1572, Train Steps/Sec: 0.12, Epoch: 0.06160124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:57:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3171, "loss": 0.18292617797851562, "memory_gb": 7.721559524536133, "step_time_ms": 7579.38289642334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:57:57] (step=0003171) Train Loss: 0.2361, Train Steps/Sec: 0.12, Epoch: 0.0616206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3172, "loss": 0.19308750331401825, "memory_gb": 7.721559524536133, "step_time_ms": 7510.823488235474, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:58:06] (step=0003172) Train Loss: 0.2371, Train Steps/Sec: 0.12, Epoch: 0.06164010882238632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3173, "loss": 0.17739704251289368, "memory_gb": 7.721559524536133, "step_time_ms": 7491.954326629639, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:58:14] (step=0003173) Train Loss: 0.2617, Train Steps/Sec: 0.13, Epoch: 0.06165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:58:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3174, "loss": 0.28760650753974915, "memory_gb": 7.721559524536133, "step_time_ms": 7524.561643600464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:58:22] (step=0003174) Train Loss: 0.3090, Train Steps/Sec: 0.13, Epoch: 0.06167897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:58:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3175, "loss": 0.19343698024749756, "memory_gb": 7.721559524536133, "step_time_ms": 7460.074663162231, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:58:30] (step=0003175) Train Loss: 0.2331, Train Steps/Sec: 0.12, Epoch: 0.06169840652934318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3176, "loss": 0.23736156523227692, "memory_gb": 7.721559524536133, "step_time_ms": 7450.375318527222, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:58:38] (step=0003176) Train Loss: 0.1979, Train Steps/Sec: 0.12, Epoch: 0.0617178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3177, "loss": 0.22330567240715027, "memory_gb": 7.721559524536133, "step_time_ms": 7496.8836307525635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:58:46] (step=0003177) Train Loss: 0.2809, Train Steps/Sec: 0.12, Epoch: 0.06173727166731442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:58:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3178, "loss": 0.23685342073440552, "memory_gb": 7.721559524536133, "step_time_ms": 7482.175350189209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:58:54] (step=0003178) Train Loss: 0.2614, Train Steps/Sec: 0.13, Epoch: 0.06175670423630004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3179, "loss": 0.3004380166530609, "memory_gb": 7.721559524536133, "step_time_ms": 7503.647804260254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:02] (step=0003179) Train Loss: 0.2657, Train Steps/Sec: 0.13, Epoch: 0.06177613680528566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3180, "loss": 0.2778829336166382, "memory_gb": 7.721559524536133, "step_time_ms": 7510.76340675354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:10] (step=0003180) Train Loss: 0.2511, Train Steps/Sec: 0.12, Epoch: 0.061795569374271275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3181, "loss": 0.2276640236377716, "memory_gb": 7.721559524536133, "step_time_ms": 7478.999614715576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:18] (step=0003181) Train Loss: 0.2878, Train Steps/Sec: 0.13, Epoch: 0.0618150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3182, "loss": 0.24766750633716583, "memory_gb": 7.721559524536133, "step_time_ms": 7458.659410476685, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:26] (step=0003182) Train Loss: 0.2635, Train Steps/Sec: 0.13, Epoch: 0.06183443451224252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3183, "loss": 0.1782844364643097, "memory_gb": 7.721559524536133, "step_time_ms": 7539.206266403198, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:34] (step=0003183) Train Loss: 0.1850, Train Steps/Sec: 0.12, Epoch: 0.061853867081228135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3184, "loss": 0.22165051102638245, "memory_gb": 7.721559524536133, "step_time_ms": 7458.782196044922, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:42] (step=0003184) Train Loss: 0.1890, Train Steps/Sec: 0.13, Epoch: 0.06187329965021376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3185, "loss": 0.3318208158016205, "memory_gb": 7.721559524536133, "step_time_ms": 7497.724294662476, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:50] (step=0003185) Train Loss: 0.2978, Train Steps/Sec: 0.12, Epoch: 0.06189273221919938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 00:59:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3186, "loss": 0.29263848066329956, "memory_gb": 7.721559524536133, "step_time_ms": 7531.615257263184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 00:59:58] (step=0003186) Train Loss: 0.3066, Train Steps/Sec: 0.12, Epoch: 0.061912164788184995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:00:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3187, "loss": 0.24218645691871643, "memory_gb": 7.721559524536133, "step_time_ms": 7461.629390716553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:00:06] (step=0003187) Train Loss: 0.2505, Train Steps/Sec: 0.12, Epoch: 0.06193159735717062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:00:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3188, "loss": 0.2720572352409363, "memory_gb": 7.721559524536133, "step_time_ms": 7473.64354133606, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:00:14] (step=0003188) Train Loss: 0.2423, Train Steps/Sec: 0.13, Epoch: 0.06195102992615624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3189, "loss": 0.17307177186012268, "memory_gb": 7.721559524536133, "step_time_ms": 7531.4061641693115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:00:22] (step=0003189) Train Loss: 0.2260, Train Steps/Sec: 0.12, Epoch: 0.061970462495141855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:00:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3190, "loss": 0.3019225299358368, "memory_gb": 7.721559524536133, "step_time_ms": 7491.5807247161865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:00:30] (step=0003190) Train Loss: 0.2508, Train Steps/Sec: 0.12, Epoch: 0.06198989506412748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:00:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3191, "loss": 0.32085686922073364, "memory_gb": 7.715639114379883, "step_time_ms": 7482.544898986816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:00:38] (step=0003191) Train Loss: 0.3039, Train Steps/Sec: 0.13, Epoch: 0.0620093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:00:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3192, "loss": 0.17100641131401062, "memory_gb": 7.721559524536133, "step_time_ms": 7482.452154159546, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:00:46] (step=0003192) Train Loss: 0.1995, Train Steps/Sec: 0.12, Epoch: 0.062028760202098715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:00:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3193, "loss": 0.18039929866790771, "memory_gb": 7.721559524536133, "step_time_ms": 5425.337553024292, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:00:52] (step=0003193) Train Loss: 0.2148, Train Steps/Sec: 0.17, Epoch: 0.06204819277108434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3194, "loss": 0.2977176308631897, "memory_gb": 7.721559524536133, "step_time_ms": 7514.197826385498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:00] (step=0003194) Train Loss: 0.2675, Train Steps/Sec: 0.12, Epoch: 0.06206762534006996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3195, "loss": 0.23476997017860413, "memory_gb": 7.721559524536133, "step_time_ms": 7442.976951599121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:08] (step=0003195) Train Loss: 0.2581, Train Steps/Sec: 0.12, Epoch: 0.062087057909055575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3196, "loss": 0.3738844096660614, "memory_gb": 7.721559524536133, "step_time_ms": 7484.8809242248535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:16] (step=0003196) Train Loss: 0.2791, Train Steps/Sec: 0.12, Epoch: 0.0621064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3197, "loss": 0.281622976064682, "memory_gb": 7.721559524536133, "step_time_ms": 7545.933723449707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:24] (step=0003197) Train Loss: 0.2640, Train Steps/Sec: 0.12, Epoch: 0.06212592304702682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3198, "loss": 0.34061017632484436, "memory_gb": 7.721559524536133, "step_time_ms": 7490.302085876465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:32] (step=0003198) Train Loss: 0.2530, Train Steps/Sec: 0.12, Epoch: 0.062145355616012435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3199, "loss": 0.12098240852355957, "memory_gb": 7.721559524536133, "step_time_ms": 7512.366056442261, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:40] (step=0003199) Train Loss: 0.2168, Train Steps/Sec: 0.12, Epoch: 0.06216478818499806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3200, "loss": 0.20427949726581573, "memory_gb": 7.721559524536133, "step_time_ms": 7531.310558319092, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:48] (step=0003200) Train Loss: 0.2107, Train Steps/Sec: 0.12, Epoch: 0.06218422075398368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:01:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3201, "loss": 0.28220897912979126, "memory_gb": 7.721559524536133, "step_time_ms": 7437.365293502808, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:01:56] (step=0003201) Train Loss: 0.2383, Train Steps/Sec: 0.12, Epoch: 0.062203653322969295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:02:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3202, "loss": 0.23290331661701202, "memory_gb": 7.721559524536133, "step_time_ms": 7474.860191345215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:02:04] (step=0003202) Train Loss: 0.3022, Train Steps/Sec: 0.12, Epoch: 0.06222308589195492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:02:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3203, "loss": 0.2545030415058136, "memory_gb": 7.721559524536133, "step_time_ms": 7533.424854278564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:02:12] (step=0003203) Train Loss: 0.2407, Train Steps/Sec: 0.12, Epoch: 0.06224251846094054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3204, "loss": 0.2257801592350006, "memory_gb": 7.721559524536133, "step_time_ms": 7398.753881454468, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:02:20] (step=0003204) Train Loss: 0.2076, Train Steps/Sec: 0.13, Epoch:
0.062261951029926155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:02:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3205, "loss": 0.20236164331436157, "memory_gb": 7.721559524536133, "step_time_ms": 7444.6752071380615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:02:28] (step=0003205) Train Loss: 0.2256, Train Steps/Sec: 0.12, Epoch: 0.06228138359891178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:02:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3206, "loss": 0.23269401490688324, "memory_gb": 7.721559524536133, "step_time_ms": 7552.822828292847, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:02:37] (step=0003206) Train Loss: 0.2359, Train Steps/Sec: 0.12, Epoch: 0.0623008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3207, "loss": 0.18871521949768066, "memory_gb": 7.721559524536133, "step_time_ms": 7459.97428894043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:02:45] (step=0003207) Train Loss: 0.2021, Train Steps/Sec: 0.12, Epoch: 0.062320248736883015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:02:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3208, "loss": 0.2744583189487457, "memory_gb": 7.721559524536133, "step_time_ms": 7503.575801849365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:02:53] (step=0003208) Train Loss: 0.2919, Train Steps/Sec: 0.12, Epoch: 0.06233968130586864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3209, "loss": 0.2758607864379883, "memory_gb": 7.721559524536133, "step_time_ms": 7614.988088607788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:01] (step=0003209) Train Loss: 0.2363, Train Steps/Sec: 0.12, Epoch: 0.06235911387485425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3210, "loss": 0.24289122223854065, 
"memory_gb": 7.721559524536133, "step_time_ms": 7554.898500442505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:09] (step=0003210) Train Loss: 0.2555, Train Steps/Sec: 0.12, Epoch: 0.062378546443839875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3211, "loss": 0.23975035548210144, "memory_gb": 7.721559524536133, "step_time_ms": 7527.201175689697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:17] (step=0003211) Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.0623979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3212, "loss": 0.19929805397987366, "memory_gb": 7.721559524536133, "step_time_ms": 7540.072202682495, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:25] (step=0003212) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.06241741158181111, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3213, "loss": 0.28884178400039673, "memory_gb": 7.721559524536133, "step_time_ms": 7287.058115005493, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:33] (step=0003213) Train Loss: 0.3025, Train Steps/Sec: 0.12, Epoch: 0.062436844150796734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3214, "loss": 0.2187788486480713, "memory_gb": 7.721559524536133, "step_time_ms": 7457.522630691528, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:41] (step=0003214) Train Loss: 0.2227, Train Steps/Sec: 0.12, Epoch: 0.06245627671978236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3215, "loss": 0.26907700300216675, "memory_gb": 7.721559524536133, "step_time_ms": 7508.591651916504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:49] (step=0003215) Train Loss: 0.2423, 
Train Steps/Sec: 0.12, Epoch: 0.06247570928876797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:03:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3216, "loss": 0.3160291314125061, "memory_gb": 7.721559524536133, "step_time_ms": 7474.76601600647, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:03:57] (step=0003216) Train Loss: 0.2862, Train Steps/Sec: 0.13, Epoch: 0.062495141857753594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3217, "loss": 0.2343696653842926, "memory_gb": 7.721559524536133, "step_time_ms": 7447.029590606689, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:05] (step=0003217) Train Loss: 0.1978, Train Steps/Sec: 0.13, Epoch: 0.06251457442673922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3218, "loss": 0.2858157157897949, "memory_gb": 7.721559524536133, "step_time_ms": 7539.692163467407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:13] (step=0003218) Train Loss: 0.2364, Train Steps/Sec: 0.12, Epoch: 0.06253400699572484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3219, "loss": 0.23356759548187256, "memory_gb": 7.721559524536133, "step_time_ms": 7451.375484466553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:21] (step=0003219) Train Loss: 0.2313, Train Steps/Sec: 0.12, Epoch: 0.06255343956471046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3220, "loss": 0.19265855848789215, "memory_gb": 7.721559524536133, "step_time_ms": 7268.953561782837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:29] (step=0003220) Train Loss: 0.2624, Train Steps/Sec: 0.13, Epoch: 0.06257287213369607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3221, "loss": 
0.33300167322158813, "memory_gb": 7.721559524536133, "step_time_ms": 7482.4535846710205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:37] (step=0003221) Train Loss: 0.3285, Train Steps/Sec: 0.12, Epoch: 0.06259230470268169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3222, "loss": 0.1739245355129242, "memory_gb": 7.721559524536133, "step_time_ms": 5059.547662734985, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:43] (step=0003222) Train Loss: 0.1666, Train Steps/Sec: 0.17, Epoch: 0.06261173727166731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3223, "loss": 0.1691497266292572, "memory_gb": 7.721559524536133, "step_time_ms": 7492.1369552612305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:51] (step=0003223) Train Loss: 0.2047, Train Steps/Sec: 0.12, Epoch: 0.06263116984065294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:04:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3224, "loss": 0.2931514084339142, "memory_gb": 7.721559524536133, "step_time_ms": 7447.054386138916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:04:59] (step=0003224) Train Loss: 0.2603, Train Steps/Sec: 0.13, Epoch: 0.06265060240963856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:05:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3225, "loss": 0.31291478872299194, "memory_gb": 7.721559524536133, "step_time_ms": 7441.413640975952, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:05:07] (step=0003225) Train Loss: 0.2703, Train Steps/Sec: 0.13, Epoch: 0.06267003497862417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3226, "loss": 0.31001558899879456, "memory_gb": 7.721559524536133, "step_time_ms": 7505.492210388184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:05:15] (step=0003226) 
Train Loss: 0.2525, Train Steps/Sec: 0.12, Epoch: 0.06268946754760979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:05:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3227, "loss": 0.30751124024391174, "memory_gb": 7.721559524536133, "step_time_ms": 7475.398778915405, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:05:23] (step=0003227) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 0.06270890011659541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:05:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3228, "loss": 0.29468750953674316, "memory_gb": 7.721559524536133, "step_time_ms": 7430.862903594971, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:05:31] (step=0003228) Train Loss: 0.3005, Train Steps/Sec: 0.13, Epoch: 0.06272833268558103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:05:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3229, "loss": 0.28058016300201416, "memory_gb": 7.721559524536133, "step_time_ms": 7508.559703826904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:05:39] (step=0003229) Train Loss: 0.2755, Train Steps/Sec: 0.12, Epoch: 0.06274776525456666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:05:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3230, "loss": 0.14156058430671692, "memory_gb": 7.721559524536133, "step_time_ms": 7481.940507888794, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:05:47] (step=0003230) Train Loss: 0.1613, Train Steps/Sec: 0.13, Epoch: 0.06276719782355228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:05:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3231, "loss": 0.19079630076885223, "memory_gb": 7.721559524536133, "step_time_ms": 7471.530914306641, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:05:55] (step=0003231) Train Loss: 0.2204, Train Steps/Sec: 0.12, Epoch: 0.06278663039253789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:03] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 3232, "loss": 0.21964995563030243, "memory_gb": 7.721559524536133, "step_time_ms": 7551.833868026733, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:06:03] (step=0003232) Train Loss: 0.1903, Train Steps/Sec: 0.12, Epoch: 0.06280606296152351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3233, "loss": 0.18985247611999512, "memory_gb": 7.721559524536133, "step_time_ms": 7465.682029724121, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:06:11] (step=0003233) Train Loss: 0.2414, Train Steps/Sec: 0.13, Epoch: 0.06282549553050913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3234, "loss": 0.17862193286418915, "memory_gb": 7.721559524536133, "step_time_ms": 7467.492580413818, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:06:19] (step=0003234) Train Loss: 0.2619, Train Steps/Sec: 0.12, Epoch: 0.06284492809949475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3235, "loss": 0.29905641078948975, "memory_gb": 7.721559524536133, "step_time_ms": 7509.6776485443115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:06:27] (step=0003235) Train Loss: 0.2443, Train Steps/Sec: 0.12, Epoch: 0.06286436066848038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3236, "loss": 0.2388560175895691, "memory_gb": 7.721559524536133, "step_time_ms": 7506.306409835815, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:06:35] (step=0003236) Train Loss: 0.2436, Train Steps/Sec: 0.12, Epoch: 0.062883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3237, "loss": 0.18045225739479065, "memory_gb": 7.721559524536133, "step_time_ms": 7415.5402183532715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
01:06:43] (step=0003237) Train Loss: 0.2198, Train Steps/Sec: 0.12, Epoch: 0.06290322580645161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3238, "loss": 0.2338162213563919, "memory_gb": 7.721559524536133, "step_time_ms": 7503.338813781738, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:06:51] (step=0003238) Train Loss: 0.2281, Train Steps/Sec: 0.12, Epoch: 0.06292265837543723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3239, "loss": 0.3302777409553528, "memory_gb": 7.721559524536133, "step_time_ms": 7609.7986698150635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:06:59] (step=0003239) Train Loss: 0.3087, Train Steps/Sec: 0.13, Epoch: 0.06294209094442285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:07:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3240, "loss": 0.2946301996707916, "memory_gb": 7.721559524536133, "step_time_ms": 7417.449951171875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:07:07] (step=0003240) Train Loss: 0.2307, Train Steps/Sec: 0.13, Epoch: 0.06296152351340847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:07:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3241, "loss": 0.29065948724746704, "memory_gb": 7.715639114379883, "step_time_ms": 7464.513778686523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:07:16] (step=0003241) Train Loss: 0.2838, Train Steps/Sec: 0.13, Epoch: 0.0629809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3242, "loss": 0.27482539415359497, "memory_gb": 7.721559524536133, "step_time_ms": 7486.260414123535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:07:24] (step=0003242) Train Loss: 0.2628, Train Steps/Sec: 0.12, Epoch: 0.06300038865137972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:07:32] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 3243, "loss": 0.3072884678840637, "memory_gb": 7.721559524536133, "step_time_ms": 7435.84680557251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:07:32] (step=0003243) Train Loss: 0.3123, Train Steps/Sec: 0.12, Epoch: 0.06301982122036533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:07:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3244, "loss": 0.17919233441352844, "memory_gb": 7.721559524536133, "step_time_ms": 7535.350561141968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:07:40] (step=0003244) Train Loss: 0.1762, Train Steps/Sec: 0.12, Epoch: 0.06303925378935095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:07:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3245, "loss": 0.17677156627178192, "memory_gb": 7.721559524536133, "step_time_ms": 7520.845651626587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:07:48] (step=0003245) Train Loss: 0.1816, Train Steps/Sec: 0.12, Epoch: 0.06305868635833657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3246, "loss": 0.31092777848243713, "memory_gb": 7.721559524536133, "step_time_ms": 7461.777925491333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:07:56] (step=0003246) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.0630781189273222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3247, "loss": 0.2814646363258362, "memory_gb": 7.721559524536133, "step_time_ms": 7511.096477508545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:08:04] (step=0003247) Train Loss: 0.2509, Train Steps/Sec: 0.13, Epoch: 0.06309755149630782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:08:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3248, "loss": 0.16970640420913696, "memory_gb": 7.721559524536133, "step_time_ms": 7495.5902099609375, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 01:08:12] (step=0003248) Train Loss: 0.2669, Train Steps/Sec: 0.13, Epoch: 0.06311698406529344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:08:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3249, "loss": 0.2764495611190796, "memory_gb": 7.721559524536133, "step_time_ms": 7331.693887710571, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:08:20] (step=0003249) Train Loss: 0.2114, Train Steps/Sec: 0.13, Epoch: 0.06313641663427905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:08:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3250, "loss": 0.3803122043609619, "memory_gb": 7.721559524536133, "step_time_ms": 7548.978805541992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:08:28] (step=0003250) Train Loss: 0.3241, Train Steps/Sec: 0.13, Epoch: 0.06315584920326467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:08:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3251, "loss": 0.19812078773975372, "memory_gb": 7.721559524536133, "step_time_ms": 5080.116987228394, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:08:34] (step=0003251) Train Loss: 0.2528, Train Steps/Sec: 0.17, Epoch: 0.06317528177225029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:08:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3252, "loss": 0.24094276130199432, "memory_gb": 7.721559524536133, "step_time_ms": 7522.783994674683, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:08:42] (step=0003252) Train Loss: 0.2839, Train Steps/Sec: 0.12, Epoch: 0.06319471434123591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:08:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3253, "loss": 0.2503008246421814, "memory_gb": 7.721559524536133, "step_time_ms": 7481.4817905426025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:08:50] (step=0003253) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.06321414691022154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 01:08:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3254, "loss": 0.17907777428627014, "memory_gb": 7.721559524536133, "step_time_ms": 7437.697649002075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:08:58] (step=0003254) Train Loss: 0.2208, Train Steps/Sec: 0.12, Epoch: 0.06323357947920714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:09:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3255, "loss": 0.19718845188617706, "memory_gb": 7.721559524536133, "step_time_ms": 7521.753787994385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:09:06] (step=0003255) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.06325301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:09:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3256, "loss": 0.2711528539657593, "memory_gb": 7.721559524536133, "step_time_ms": 7523.405313491821, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:09:14] (step=0003256) Train Loss: 0.2601, Train Steps/Sec: 0.12, Epoch: 0.06327244461717839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:09:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3257, "loss": 0.23350374400615692, "memory_gb": 7.721559524536133, "step_time_ms": 7449.585914611816, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:09:22] (step=0003257) Train Loss: 0.1834, Train Steps/Sec: 0.13, Epoch: 0.06329187718616401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3258, "loss": 0.21253517270088196, "memory_gb": 7.721559524536133, "step_time_ms": 7519.865036010742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:09:30] (step=0003258) Train Loss: 0.2755, Train Steps/Sec: 0.13, Epoch: 0.06331130975514963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:09:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3259, "loss": 0.27639690041542053, "memory_gb": 7.721559524536133, "step_time_ms": 7528.64933013916, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 01:09:38] (step=0003259) Train Loss: 0.2259, Train Steps/Sec: 0.12, Epoch: 0.06333074232413526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:09:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3260, "loss": 0.19886964559555054, "memory_gb": 7.721559524536133, "step_time_ms": 7471.494197845459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:09:46] (step=0003260) Train Loss: 0.2524, Train Steps/Sec: 0.12, Epoch: 0.06335017489312086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:09:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3261, "loss": 0.28310155868530273, "memory_gb": 7.721559524536133, "step_time_ms": 7496.349573135376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:09:54] (step=0003261) Train Loss: 0.2534, Train Steps/Sec: 0.12, Epoch: 0.06336960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:10:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3262, "loss": 0.2566530108451843, "memory_gb": 7.721559524536133, "step_time_ms": 7456.565141677856, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:02] (step=0003262) Train Loss: 0.2283, Train Steps/Sec: 0.12, Epoch: 0.06338904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:10:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3263, "loss": 0.25966402888298035, "memory_gb": 7.721559524536133, "step_time_ms": 7401.815891265869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:10] (step=0003263) Train Loss: 0.2442, Train Steps/Sec: 0.12, Epoch: 0.06340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:10:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3264, "loss": 0.23549702763557434, "memory_gb": 7.721559524536133, "step_time_ms": 7461.213827133179, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:18] (step=0003264) Train Loss: 0.3038, Train Steps/Sec: 0.12, Epoch: 0.06342790516906335, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 01:10:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3265, "loss": 0.217537522315979, "memory_gb": 7.721559524536133, "step_time_ms": 7476.135015487671, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:26] (step=0003265) Train Loss: 0.2390, Train Steps/Sec: 0.12, Epoch: 0.06344733773804898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:10:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3266, "loss": 0.3245978355407715, "memory_gb": 7.721559524536133, "step_time_ms": 7416.050910949707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:34] (step=0003266) Train Loss: 0.2681, Train Steps/Sec: 0.13, Epoch: 0.06346677030703458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:10:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3267, "loss": 0.28209611773490906, "memory_gb": 7.721559524536133, "step_time_ms": 7519.7882652282715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:42] (step=0003267) Train Loss: 0.2810, Train Steps/Sec: 0.12, Epoch: 0.0634862028760202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:10:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3268, "loss": 0.18119582533836365, "memory_gb": 7.721559524536133, "step_time_ms": 7525.959014892578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:50] (step=0003268) Train Loss: 0.1888, Train Steps/Sec: 0.12, Epoch: 0.06350563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:10:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3269, "loss": 0.19224971532821655, "memory_gb": 7.721559524536133, "step_time_ms": 7408.883333206177, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:10:58] (step=0003269) Train Loss: 0.1989, Train Steps/Sec: 0.12, Epoch: 0.06352506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:11:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3270, "loss": 0.30284395813941956, "memory_gb": 7.721559524536133, "step_time_ms": 
7476.186990737915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:11:06] (step=0003270) Train Loss: 0.2360, Train Steps/Sec: 0.13, Epoch: 0.06354450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:11:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3271, "loss": 0.2891615033149719, "memory_gb": 7.721559524536133, "step_time_ms": 7466.647386550903, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:11:14] (step=0003271) Train Loss: 0.2702, Train Steps/Sec: 0.13, Epoch: 0.0635639331519627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:11:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3272, "loss": 0.17461849749088287, "memory_gb": 7.721559524536133, "step_time_ms": 7399.806499481201, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:11:22] (step=0003272) Train Loss: 0.1802, Train Steps/Sec: 0.13, Epoch: 0.0635833657209483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:11:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3273, "loss": 0.2916930913925171, "memory_gb": 7.721559524536133, "step_time_ms": 7479.571342468262, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:11:30] (step=0003273) Train Loss: 0.3015, Train Steps/Sec: 0.13, Epoch: 0.06360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:11:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3274, "loss": 0.13904573023319244, "memory_gb": 7.721559524536133, "step_time_ms": 7508.4967613220215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:11:38] (step=0003274) Train Loss: 0.1903, Train Steps/Sec: 0.12, Epoch: 0.06362223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:11:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3275, "loss": 0.1603422611951828, "memory_gb": 7.721559524536133, "step_time_ms": 7419.448137283325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:11:46] (step=0003275) Train Loss: 0.1845, Train Steps/Sec: 0.13, Epoch: 0.06364166342790517, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:11:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3276, "loss": 0.20455411076545715, "memory_gb": 7.721559524536133, "step_time_ms": 7436.751127243042, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:11:55] (step=0003276) Train Loss: 0.2416, Train Steps/Sec: 0.12, Epoch: 0.06366109599689079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3277, "loss": 0.17611920833587646, "memory_gb": 7.721559524536133, "step_time_ms": 7458.635091781616, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:03] (step=0003277) Train Loss: 0.2571, Train Steps/Sec: 0.12, Epoch: 0.06368052856587642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3278, "loss": 0.19524914026260376, "memory_gb": 7.721559524536133, "step_time_ms": 7269.166707992554, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:10] (step=0003278) Train Loss: 0.1954, Train Steps/Sec: 0.13, Epoch: 0.06369996113486202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3279, "loss": 0.15725132822990417, "memory_gb": 7.721559524536133, "step_time_ms": 7374.650478363037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:18] (step=0003279) Train Loss: 0.1565, Train Steps/Sec: 0.13, Epoch: 0.06371939370384765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3280, "loss": 0.2167624533176422, "memory_gb": 7.721559524536133, "step_time_ms": 4465.246200561523, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:24] (step=0003280) Train Loss: 0.2088, Train Steps/Sec: 0.17, Epoch: 0.06373882627283327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3281, "loss": 0.17223861813545227, "memory_gb": 7.721559524536133, "step_time_ms": 7536.205053329468, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:32] (step=0003281) Train Loss: 0.2155, Train Steps/Sec: 0.12, Epoch: 0.06375825884181889, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3282, "loss": 0.23805451393127441, "memory_gb": 7.721559524536133, "step_time_ms": 7448.523998260498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:40] (step=0003282) Train Loss: 0.1949, Train Steps/Sec: 0.13, Epoch: 0.06377769141080451, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3283, "loss": 0.26397842168807983, "memory_gb": 7.721559524536133, "step_time_ms": 7391.176462173462, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:48] (step=0003283) Train Loss: 0.2476, Train Steps/Sec: 0.13, Epoch: 0.06379712397979014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:12:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3284, "loss": 0.20295877754688263, "memory_gb": 7.721559524536133, "step_time_ms": 7471.829891204834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:12:56] (step=0003284) Train Loss: 0.2091, Train Steps/Sec: 0.12, Epoch: 0.06381655654877574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:13:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3285, "loss": 0.28894224762916565, "memory_gb": 7.721559524536133, "step_time_ms": 7405.049085617065, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:13:04] (step=0003285) Train Loss: 0.2045, Train Steps/Sec: 0.13, Epoch: 0.06383598911776137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:13:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3286, "loss": 0.24110472202301025, "memory_gb": 7.721559524536133, "step_time_ms": 7419.297456741333, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:13:12] (step=0003286) Train Loss: 0.2361, Train Steps/Sec: 0.12, Epoch: 0.06385542168674699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3287, "loss": 0.17160716652870178, "memory_gb": 7.721559524536133, "step_time_ms": 7488.925457000732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:13:20] (step=0003287) Train Loss: 0.2388, Train Steps/Sec: 0.12, Epoch: 0.06387485425573261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:13:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3288, "loss": 0.16872748732566833, "memory_gb": 7.721559524536133, "step_time_ms": 7457.488536834717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:13:28] (step=0003288) Train Loss: 0.1515, Train Steps/Sec: 0.12, Epoch: 0.06389428682471823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:13:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3289, "loss": 0.20777051150798798, "memory_gb": 7.721559524536133, "step_time_ms": 7419.955492019653, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:13:36] (step=0003289) Train Loss: 0.2426, Train Steps/Sec: 0.12, Epoch: 0.06391371939370384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:13:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3290, "loss": 0.1955685168504715, "memory_gb": 7.721559524536133, "step_time_ms": 7502.108097076416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:13:44] (step=0003290) Train Loss: 0.2325, Train Steps/Sec: 0.12, Epoch: 0.06393315196268946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3291, "loss": 0.29749777913093567, "memory_gb": 7.721559524536133, "step_time_ms": 7459.094047546387, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:13:52] (step=0003291) Train Loss: 0.2452, Train Steps/Sec: 0.13, Epoch: 0.06395258453167509, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3292, "loss": 0.17542299628257751, "memory_gb": 7.721559524536133, "step_time_ms": 7395.283460617065, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:00] (step=0003292) Train Loss: 0.2320, Train Steps/Sec: 0.12, Epoch: 0.06397201710066071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3293, "loss": 0.17367668449878693, "memory_gb": 7.721559524536133, "step_time_ms": 7527.858257293701, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:08] (step=0003293) Train Loss: 0.2031, Train Steps/Sec: 0.12, Epoch: 0.06399144966964633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3294, "loss": 0.24543088674545288, "memory_gb": 7.721559524536133, "step_time_ms": 7450.384616851807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:16] (step=0003294) Train Loss: 0.2760, Train Steps/Sec: 0.12, Epoch: 0.06401088223863195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3295, "loss": 0.20491068065166473, "memory_gb": 7.721559524536133, "step_time_ms": 7412.371635437012, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:24] (step=0003295) Train Loss: 0.2309, Train Steps/Sec: 0.12, Epoch: 0.06403031480761756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3296, "loss": 0.2444084733724594, "memory_gb": 7.721559524536133, "step_time_ms": 7535.3264808654785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:33] (step=0003296) Train Loss: 0.2743, Train Steps/Sec: 0.12, Epoch: 0.06404974737660318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3297, "loss": 0.19998134672641754, "memory_gb": 7.721559524536133, "step_time_ms": 7441.619157791138, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:41] (step=0003297) Train Loss: 0.2228, Train Steps/Sec: 0.13, Epoch: 0.0640691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3298, "loss": 0.22761115431785583, "memory_gb": 7.721559524536133, "step_time_ms": 7450.719356536865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:49] (step=0003298) Train Loss: 0.2270, Train Steps/Sec: 0.13, Epoch: 0.06408861251457443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:14:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3299, "loss": 0.24094288051128387, "memory_gb": 7.721559524536133, "step_time_ms": 7523.760795593262, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:14:57] (step=0003299) Train Loss: 0.2326, Train Steps/Sec: 0.13, Epoch: 0.06410804508356005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:15:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3300, "loss": 0.30540746450424194, "memory_gb": 7.721559524536133, "step_time_ms": 7461.616992950439, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:15:05] (step=0003300) Train Loss: 0.2725, Train Steps/Sec: 0.12, Epoch: 0.06412747765254567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:15:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3301, "loss": 0.1499125361442566, "memory_gb": 7.721559524536133, "step_time_ms": 7553.9374351501465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:15:13] (step=0003301) Train Loss: 0.1662, Train Steps/Sec: 0.12, Epoch: 0.06414691022153128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:15:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3302, "loss": 0.20822423696517944, "memory_gb": 7.721559524536133, "step_time_ms": 7607.369899749756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:15:21] (step=0003302) Train Loss: 0.2388, Train Steps/Sec: 0.12, Epoch: 0.0641663427905169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3303, "loss": 0.29566267132759094, "memory_gb": 7.721559524536133, "step_time_ms": 7602.686882019043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:15:29] (step=0003303) Train Loss: 0.2483, Train Steps/Sec: 0.12, Epoch: 0.06418577535950253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:15:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3304, "loss": 0.19172464311122894, "memory_gb": 7.721559524536133, "step_time_ms": 7462.562084197998, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:15:37] (step=0003304) Train Loss: 0.2346, Train Steps/Sec: 0.13, Epoch: 0.06420520792848815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:15:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3305, "loss": 0.24639394879341125, "memory_gb": 7.721559524536133, "step_time_ms": 7574.026823043823, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:15:45] (step=0003305) Train Loss: 0.2730, Train Steps/Sec: 0.12, Epoch: 0.06422464049747377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:15:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3306, "loss": 0.27767205238342285, "memory_gb": 7.721559524536133, "step_time_ms": 7574.6824741363525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:15:53] (step=0003306) Train Loss: 0.2990, Train Steps/Sec: 0.12, Epoch: 0.06424407306645939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3307, "loss": 0.20314568281173706, "memory_gb": 7.721559524536133, "step_time_ms": 7345.357656478882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:01] (step=0003307) Train Loss: 0.1951, Train Steps/Sec: 0.13, Epoch: 0.064263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3308, "loss": 0.20790018141269684, "memory_gb": 7.721559524536133, "step_time_ms": 7618.050098419189, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:09] (step=0003308) Train Loss: 0.1882, Train Steps/Sec: 0.12, Epoch: 0.06428293820443062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3309, "loss": 0.22183829545974731, "memory_gb": 7.721559524536133, "step_time_ms": 5140.785217285156, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:14] (step=0003309) Train Loss: 0.1971, Train Steps/Sec: 0.19, Epoch: 0.06430237077341625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3310, "loss": 0.19102682173252106, "memory_gb": 7.721559524536133, "step_time_ms": 7620.77522277832, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:22] (step=0003310) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.06432180334240187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3311, "loss": 0.17913028597831726, "memory_gb": 7.721559524536133, "step_time_ms": 7597.993850708008, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:30] (step=0003311) Train Loss: 0.2094, Train Steps/Sec: 0.13, Epoch: 0.06434123591138749, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3312, "loss": 0.19986924529075623, "memory_gb": 7.721559524536133, "step_time_ms": 7501.615285873413, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:38] (step=0003312) Train Loss: 0.2564, Train Steps/Sec: 0.12, Epoch: 0.06436066848037311, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3313, "loss": 0.2116682529449463, "memory_gb": 7.721559524536133, "step_time_ms": 7558.675765991211, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:46] (step=0003313) Train Loss: 0.2341, Train Steps/Sec: 0.12, Epoch: 0.06438010104935872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:16:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3314, "loss": 0.3573121130466461, "memory_gb": 7.715639114379883, "step_time_ms": 7436.6295337677, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:16:55] (step=0003314) Train Loss: 0.2783, Train Steps/Sec: 0.12, Epoch: 0.06439953361834434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3315, "loss": 0.2797318398952484, "memory_gb": 7.721559524536133, "step_time_ms": 7481.333494186401, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:03] (step=0003315) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.06441896618732997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3316, "loss": 0.21584033966064453, "memory_gb": 7.721559524536133, "step_time_ms": 7513.335704803467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:11] (step=0003316) Train Loss: 0.1714, Train Steps/Sec: 0.12, Epoch: 0.06443839875631559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3317, "loss": 0.3364943265914917, "memory_gb": 7.721559524536133, "step_time_ms": 7455.489158630371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:19] (step=0003317) Train Loss: 0.3068, Train Steps/Sec: 0.12, Epoch: 0.06445783132530121, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3318, "loss": 0.2960401773452759, "memory_gb": 7.721559524536133, "step_time_ms": 7462.458372116089, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:27] (step=0003318) Train Loss: 0.2826, Train Steps/Sec: 0.12, Epoch: 0.06447726389428682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3319, "loss": 0.29711300134658813, "memory_gb": 7.721559524536133, "step_time_ms": 7497.306108474731, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:35] (step=0003319) Train Loss: 0.2924, Train Steps/Sec: 0.12, Epoch: 0.06449669646327244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3320, "loss": 0.2914198040962219, "memory_gb": 7.721559524536133, "step_time_ms": 7465.101718902588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:43] (step=0003320) Train Loss: 0.3168, Train Steps/Sec: 0.12, Epoch: 0.06451612903225806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3321, "loss": 0.24649184942245483, "memory_gb": 7.721559524536133, "step_time_ms": 7416.142702102661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:51] (step=0003321) Train Loss: 0.2186, Train Steps/Sec: 0.12, Epoch: 0.06453556160124369, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:17:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3322, "loss": 0.15071940422058105, "memory_gb": 7.721559524536133, "step_time_ms": 7499.618768692017, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:17:59] (step=0003322) Train Loss: 0.1983, Train Steps/Sec: 0.12, Epoch: 0.06455499417022931, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3323, "loss": 0.26228219270706177, "memory_gb": 7.721559524536133, "step_time_ms": 7442.439079284668, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:18:07] (step=0003323) Train Loss: 0.2823, Train Steps/Sec: 0.13, Epoch: 0.06457442673921493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:18:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3324, "loss": 0.25791341066360474, "memory_gb": 7.721559524536133, "step_time_ms": 7417.332887649536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:18:15] (step=0003324) Train Loss: 0.2498, Train Steps/Sec: 0.13, Epoch: 0.06459385930820054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:18:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3325, "loss": 0.20284992456436157, "memory_gb": 7.721559524536133, "step_time_ms": 7466.442823410034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:18:23] (step=0003325) Train Loss: 0.1947, Train Steps/Sec: 0.13, Epoch: 0.06461329187718616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:18:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3326, "loss": 0.138470858335495, "memory_gb": 7.721559524536133, "step_time_ms": 7595.483303070068, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:18:31] (step=0003326) Train Loss: 0.2530, Train Steps/Sec: 0.13, Epoch: 0.06463272444617178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3327, "loss": 0.28455451130867004, "memory_gb": 7.721559524536133, "step_time_ms": 7389.498233795166, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:18:39] (step=0003327) Train Loss: 0.2856, Train Steps/Sec: 0.13, Epoch: 0.0646521570151574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:18:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3328, "loss": 0.15948256850242615, "memory_gb": 7.721559524536133, "step_time_ms": 7444.967031478882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:18:47] (step=0003328) Train Loss: 0.1737, Train Steps/Sec: 0.12, Epoch: 0.06467158958414303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:18:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3329, "loss": 0.2863282561302185, "memory_gb": 7.721559524536133, "step_time_ms": 7467.053651809692, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:18:55] (step=0003329) Train Loss: 0.2131, Train Steps/Sec: 0.13, Epoch: 0.06469102215312865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3330, "loss": 0.2813405394554138, "memory_gb": 7.721559524536133, "step_time_ms": 7435.347318649292, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:03] (step=0003330) Train Loss: 0.2293, Train Steps/Sec: 0.12, Epoch: 0.06471045472211426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3331, "loss": 0.3279578387737274, "memory_gb": 7.721559524536133, "step_time_ms": 7461.199522018433, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:11] (step=0003331) Train Loss: 0.2796, Train Steps/Sec: 0.12, Epoch: 0.06472988729109988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3332, "loss": 0.2875244617462158, "memory_gb": 7.721559524536133, "step_time_ms": 7485.564708709717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:19] (step=0003332) Train Loss: 0.2378, Train Steps/Sec: 0.12, Epoch: 0.0647493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3333, "loss": 0.19126316905021667, "memory_gb": 7.721559524536133, "step_time_ms": 7448.339462280273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:27] (step=0003333) Train Loss: 0.2037, Train Steps/Sec: 0.12, Epoch: 0.06476875242907112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3334, "loss": 0.26560091972351074, "memory_gb": 7.721559524536133, "step_time_ms": 7450.141429901123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:35] (step=0003334) Train Loss: 0.2180, Train Steps/Sec: 0.12, Epoch: 0.06478818499805675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3335, "loss": 0.37124502658843994, "memory_gb": 7.721559524536133, "step_time_ms": 7262.207746505737, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:43] (step=0003335) Train Loss: 0.3031, Train Steps/Sec: 0.13, Epoch: 0.06480761756704237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3336, "loss": 0.2110939770936966, "memory_gb": 7.721559524536133, "step_time_ms": 7147.475957870483, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:51] (step=0003336) Train Loss: 0.2157, Train Steps/Sec: 0.13, Epoch: 0.06482705013602798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:19:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3337, "loss": 0.3057873845100403, "memory_gb": 7.721559524536133, "step_time_ms": 7295.40228843689, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:19:59] (step=0003337) Train Loss: 0.1990, Train Steps/Sec: 0.13, Epoch: 0.0648464827050136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:20:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3338, "loss": 0.2633005976676941, "memory_gb": 7.721559524536133, "step_time_ms": 5487.040996551514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:20:05] (step=0003338) Train Loss: 0.2239, Train Steps/Sec: 0.16, Epoch: 0.06486591527399922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:20:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3339, "loss": 0.34568238258361816, "memory_gb": 7.721559524536133, "step_time_ms": 7479.565858840942, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:20:13] (step=0003339) Train Loss: 0.2959, Train Steps/Sec: 0.12, Epoch: 0.06488534784298484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:20:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3340, "loss": 0.14361126720905304, "memory_gb": 7.721559524536133, "step_time_ms": 7497.015476226807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:20:21] (step=0003340) Train Loss: 0.1922, Train Steps/Sec: 0.12, Epoch: 0.06490478041197047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:20:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3341, "loss": 0.2712307572364807, "memory_gb": 7.721559524536133, "step_time_ms": 7395.691633224487, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:20:29] (step=0003341) Train Loss: 0.2468, Train Steps/Sec: 0.12, Epoch: 0.06492421298095609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:20:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3342, "loss": 0.22005876898765564, "memory_gb": 7.721559524536133, "step_time_ms": 7478.297233581543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:20:37] (step=0003342) Train Loss: 0.2544, Train Steps/Sec: 0.12, Epoch: 0.0649436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3343, "loss": 0.24089178442955017, "memory_gb": 7.721559524536133, "step_time_ms": 7522.815465927124, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:20:45] (step=0003343) Train Loss: 0.2033, Train Steps/Sec: 0.12, Epoch: 0.06496307811892732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:20:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3344, "loss": 0.15880757570266724, "memory_gb": 7.721559524536133, "step_time_ms": 7422.809362411499, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:20:53] (step=0003344) Train Loss: 0.1959, Train Steps/Sec: 0.12, Epoch: 0.06498251068791294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3345, "loss": 0.1763082891702652, "memory_gb": 7.721559524536133, "step_time_ms": 7457.547426223755, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:01] (step=0003345) Train Loss: 0.2030, Train Steps/Sec: 0.12, Epoch: 0.06500194325689856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3346, "loss": 0.1847657412290573, "memory_gb": 7.721559524536133, "step_time_ms": 7513.249158859253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:09] (step=0003346) Train Loss: 0.1692, Train Steps/Sec: 0.13, Epoch: 0.06502137582588419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3347, "loss": 0.2637310028076172, "memory_gb": 7.721559524536133, "step_time_ms": 7470.412254333496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:17] (step=0003347) Train Loss: 0.2885, Train Steps/Sec: 0.13, Epoch: 0.0650408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3348, "loss": 0.23672722280025482, "memory_gb": 7.721559524536133, "step_time_ms": 7560.828924179077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:25] (step=0003348) Train Loss: 0.1963, Train Steps/Sec: 0.12, Epoch: 0.06506024096385542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3349, "loss": 0.24372930824756622, "memory_gb": 7.721559524536133, "step_time_ms": 7598.1385707855225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:33] (step=0003349) Train Loss: 0.2509, Train Steps/Sec: 0.12, Epoch: 0.06507967353284104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3350, "loss": 0.18305708467960358, "memory_gb": 7.721559524536133, "step_time_ms": 7214.079141616821, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:41] (step=0003350) Train Loss: 0.2568, Train Steps/Sec: 0.13, Epoch: 0.06509910610182666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3351, "loss": 0.1452523022890091, "memory_gb": 7.721559524536133, "step_time_ms": 7480.590343475342, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:49] (step=0003351) Train Loss: 0.2095, Train Steps/Sec: 0.12, Epoch: 0.06511853867081228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:21:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3352, "loss": 0.1640194058418274, "memory_gb": 7.721559524536133, "step_time_ms": 7467.571020126343, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:21:57] (step=0003352) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.0651379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:22:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3353, "loss": 0.16887933015823364, "memory_gb": 7.721559524536133, "step_time_ms": 7408.878803253174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:22:05] (step=0003353) Train Loss: 0.2267, Train Steps/Sec: 0.13, Epoch: 0.06515740380878352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:22:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3354, "loss": 0.3835141062736511, "memory_gb": 7.721559524536133, "step_time_ms": 7487.472295761108, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:22:13] (step=0003354) Train Loss: 0.2834, Train Steps/Sec: 0.12, Epoch: 0.06517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:22:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3355, "loss": 0.12127449363470078, "memory_gb": 7.721559524536133, "step_time_ms": 7490.755796432495, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:22:21] (step=0003355) Train Loss: 0.1869, Train Steps/Sec: 0.13, Epoch: 0.06519626894675476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:22:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3356, "loss": 0.21690818667411804, "memory_gb": 7.721559524536133, "step_time_ms": 7457.185983657837, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:22:29] (step=0003356) Train Loss: 0.1973, Train Steps/Sec: 0.12, Epoch: 0.06521570151574038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:22:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3357, "loss": 0.3695397675037384, "memory_gb": 7.721559524536133, "step_time_ms": 7487.437009811401, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:22:37] (step=0003357) Train Loss: 0.2931, Train Steps/Sec: 0.12, Epoch: 0.065235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:22:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3358, "loss": 0.2077142894268036, "memory_gb": 7.721559524536133, "step_time_ms": 7494.27342414856, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:22:45] (step=0003358) Train Loss: 0.2016, Train Steps/Sec: 0.12, Epoch: 0.06525456665371163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:22:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3359, "loss": 0.27389320731163025, "memory_gb": 7.721559524536133, "step_time_ms": 7479.958772659302, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:22:53] (step=0003359) Train Loss: 0.2640, Train Steps/Sec: 0.12, Epoch: 0.06527399922269723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3360, "loss": 0.19521188735961914, "memory_gb": 7.721559524536133, "step_time_ms": 7484.77840423584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:01] (step=0003360) Train Loss: 0.2134, Train Steps/Sec: 0.12, Epoch: 0.06529343179168286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3361, "loss": 0.20618392527103424, "memory_gb": 7.721559524536133, "step_time_ms": 7532.55295753479, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:09] (step=0003361) Train Loss: 0.2221, Train Steps/Sec: 0.12, Epoch: 0.06531286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3362, "loss": 0.24511483311653137, "memory_gb": 7.721559524536133, "step_time_ms": 7431.162595748901, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:18] (step=0003362) Train Loss: 0.2198, Train Steps/Sec: 0.12, Epoch: 0.0653322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3363, "loss": 0.24790425598621368, "memory_gb": 7.721559524536133, "step_time_ms": 7492.069482803345, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:26] (step=0003363) Train Loss: 0.2286, Train Steps/Sec: 0.12, Epoch: 0.06535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3364, "loss": 0.2540413737297058, "memory_gb": 7.721559524536133, "step_time_ms": 7531.404256820679, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:34] (step=0003364) Train Loss: 0.2675, Train Steps/Sec: 0.12, Epoch: 0.06537116206762535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3365, "loss": 0.37595483660697937, "memory_gb": 7.721559524536133, "step_time_ms": 7310.6369972229, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:42] (step=0003365) Train Loss: 0.2740, Train Steps/Sec: 0.13, Epoch: 0.06539059463661095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3366, "loss": 0.3480421304702759, "memory_gb": 7.721559524536133, "step_time_ms": 7349.584341049194, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:49] (step=0003366) Train Loss: 0.3080, Train Steps/Sec: 0.13, Epoch: 0.06541002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:23:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3367, "loss": 0.17693205177783966, "memory_gb": 7.721559524536133, "step_time_ms": 6013.659954071045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:23:55] (step=0003367) Train Loss: 0.2184, Train Steps/Sec: 0.16, Epoch: 0.0654294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:24:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3368, "loss": 0.19654464721679688, "memory_gb": 7.721559524536133, "step_time_ms": 7472.5892543792725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:24:03] (step=0003368) Train Loss: 0.2046, Train Steps/Sec: 0.12, Epoch: 0.06544889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:24:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3369, "loss": 0.3307081460952759, "memory_gb": 7.721559524536133, "step_time_ms": 7470.527648925781, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:24:11] (step=0003369) Train Loss: 0.3446, Train Steps/Sec: 0.12, Epoch: 0.06546832491255344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:24:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3370, "loss": 0.27998995780944824, "memory_gb": 7.721559524536133, "step_time_ms": 7389.237403869629, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:24:19] (step=0003370) Train Loss: 0.2247, Train Steps/Sec: 0.13, Epoch: 0.06548775748153907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:24:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3371, "loss": 0.2743198573589325, "memory_gb": 7.721559524536133, "step_time_ms": 7450.347423553467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:24:27] (step=0003371) Train Loss: 0.2367, Train Steps/Sec: 0.12, Epoch: 0.06550719005052467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:24:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3372, "loss": 0.1878209114074707, "memory_gb": 7.721559524536133, "step_time_ms": 7423.567771911621, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:24:36] (step=0003372) Train Loss: 0.2502, Train Steps/Sec: 0.12, Epoch: 0.0655266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:24:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3373, "loss": 0.252902626991272, "memory_gb": 7.721559524536133, "step_time_ms": 7432.6348304748535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:24:44] (step=0003373) Train Loss: 0.2392, Train Steps/Sec: 0.12, Epoch: 0.06554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:24:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3374, "loss": 0.30882906913757324, "memory_gb": 7.721559524536133, "step_time_ms": 7479.46834564209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:24:52] (step=0003374) Train Loss: 0.2633, Train Steps/Sec: 0.12, Epoch: 0.06556548775748154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3375, "loss": 0.15896065533161163, "memory_gb": 7.721559524536133, "step_time_ms": 7472.750186920166, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:00] (step=0003375) Train Loss: 0.2143, Train Steps/Sec: 0.12, Epoch: 0.06558492032646716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3376, "loss": 0.3798752427101135, "memory_gb": 7.715639114379883, "step_time_ms": 7361.560344696045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:08] (step=0003376) Train Loss: 0.2585, Train Steps/Sec: 0.13, Epoch: 0.06560435289545277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3377, "loss": 0.2602166533470154, "memory_gb": 7.721559524536133, "step_time_ms": 7466.089725494385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:16] (step=0003377) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.0656237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3378, "loss": 0.34730732440948486, "memory_gb": 7.721559524536133, "step_time_ms": 7486.327886581421, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:24] (step=0003378) Train Loss: 0.3357, Train Steps/Sec: 0.12, Epoch: 0.06564321803342402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3379, "loss": 0.3337368965148926, "memory_gb": 7.721559524536133, "step_time_ms": 7445.611476898193, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:32] (step=0003379) Train Loss: 0.2793, Train Steps/Sec: 0.12, Epoch: 0.06566265060240964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3380, "loss": 0.22661221027374268, "memory_gb": 7.721559524536133, "step_time_ms": 7454.930305480957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:40] (step=0003380) Train Loss: 0.2305, Train Steps/Sec: 0.12, Epoch: 0.06568208317139526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3381, "loss": 0.2561280131340027, "memory_gb": 7.721559524536133, "step_time_ms": 7576.576471328735, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:48] (step=0003381) Train Loss: 0.2140, Train Steps/Sec: 0.12, Epoch: 0.06570151574038088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:25:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3382, "loss": 0.22306329011917114, "memory_gb": 7.715639114379883, "step_time_ms": 7468.391895294189, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:25:56] (step=0003382) Train Loss: 0.2449, Train Steps/Sec: 0.12, Epoch: 0.06572094830936649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:26:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3383, "loss": 0.12304101139307022, "memory_gb": 7.721559524536133, "step_time_ms": 7471.256971359253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:26:04] (step=0003383) Train Loss: 0.2003, Train Steps/Sec: 0.12, Epoch: 0.06574038087835211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3384, "loss": 0.2211875170469284, "memory_gb": 7.721559524536133, "step_time_ms": 7538.183689117432, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:26:12] (step=0003384) Train Loss: 0.2326, Train Steps/Sec: 0.12, Epoch: 0.06575981344733774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:26:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3385, "loss": 0.33082735538482666, "memory_gb":
7.721559524536133, "step_time_ms": 7490.298509597778, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:26:20] (step=0003385) Train Loss: 0.2950, Train Steps/Sec: 0.12, Epoch: 0.06577924601632336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:26:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3386, "loss": 0.3246231973171234, "memory_gb": 7.721559524536133, "step_time_ms": 7530.3521156311035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:26:28] (step=0003386) Train Loss: 0.2681, Train Steps/Sec: 0.12, Epoch: 0.06579867858530898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:26:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3387, "loss": 0.19246815145015717, "memory_gb": 7.721559524536133, "step_time_ms": 7572.856426239014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:26:36] (step=0003387) Train Loss: 0.2501, Train Steps/Sec: 0.12, Epoch: 0.0658181111542946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:26:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3388, "loss": 0.18271705508232117, "memory_gb": 7.721559524536133, "step_time_ms": 7565.479278564453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:26:44] (step=0003388) Train Loss: 0.2111, Train Steps/Sec: 0.12, Epoch: 0.06583754372328021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:26:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3389, "loss": 0.2536906599998474, "memory_gb": 7.721559524536133, "step_time_ms": 7544.522047042847, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:26:52] (step=0003389) Train Loss: 0.2609, Train Steps/Sec: 0.13, Epoch: 0.06585697629226583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3390, "loss": 0.29802677035331726, "memory_gb": 7.721559524536133, "step_time_ms": 7557.301759719849, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:00] (step=0003390) Train Loss: 0.2582, Train Steps/Sec: 
0.12, Epoch: 0.06587640886125146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3391, "loss": 0.3257317543029785, "memory_gb": 7.721559524536133, "step_time_ms": 7531.476736068726, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:08] (step=0003391) Train Loss: 0.2424, Train Steps/Sec: 0.12, Epoch: 0.06589584143023708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3392, "loss": 0.24740368127822876, "memory_gb": 7.721559524536133, "step_time_ms": 7517.760515213013, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:16] (step=0003392) Train Loss: 0.2325, Train Steps/Sec: 0.12, Epoch: 0.0659152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3393, "loss": 0.1842801570892334, "memory_gb": 7.721559524536133, "step_time_ms": 7551.627635955811, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:24] (step=0003393) Train Loss: 0.2316, Train Steps/Sec: 0.12, Epoch: 0.06593470656820832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3394, "loss": 0.25539031624794006, "memory_gb": 7.721559524536133, "step_time_ms": 7419.443607330322, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:32] (step=0003394) Train Loss: 0.2155, Train Steps/Sec: 0.13, Epoch: 0.06595413913719393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3395, "loss": 0.15851429104804993, "memory_gb": 7.721559524536133, "step_time_ms": 7022.094249725342, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:40] (step=0003395) Train Loss: 0.1711, Train Steps/Sec: 0.14, Epoch: 0.06597357170617955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3396, "loss": 0.31944429874420166, 
"memory_gb": 7.721559524536133, "step_time_ms": 6506.003379821777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:46] (step=0003396) Train Loss: 0.2841, Train Steps/Sec: 0.15, Epoch: 0.06599300427516518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:27:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3397, "loss": 0.17583008110523224, "memory_gb": 7.721559524536133, "step_time_ms": 7563.384056091309, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:27:54] (step=0003397) Train Loss: 0.1949, Train Steps/Sec: 0.12, Epoch: 0.0660124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3398, "loss": 0.3743908107280731, "memory_gb": 7.721559524536133, "step_time_ms": 7607.119083404541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:02] (step=0003398) Train Loss: 0.2866, Train Steps/Sec: 0.12, Epoch: 0.06603186941313642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3399, "loss": 0.29840517044067383, "memory_gb": 7.721559524536133, "step_time_ms": 7523.61273765564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:10] (step=0003399) Train Loss: 0.2332, Train Steps/Sec: 0.13, Epoch: 0.06605130198212204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3400, "loss": 0.3033098578453064, "memory_gb": 7.721559524536133, "step_time_ms": 7554.008483886719, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:19] (step=0003400) Train Loss: 0.3225, Train Steps/Sec: 0.12, Epoch: 0.06607073455110765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3401, "loss": 0.23006832599639893, "memory_gb": 7.721559524536133, "step_time_ms": 7652.75764465332, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:27] (step=0003401) Train Loss: 0.2403, Train 
Steps/Sec: 0.12, Epoch: 0.06609016712009327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3402, "loss": 0.19839131832122803, "memory_gb": 7.721559524536133, "step_time_ms": 7553.453207015991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:35] (step=0003402) Train Loss: 0.2246, Train Steps/Sec: 0.12, Epoch: 0.0661095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3403, "loss": 0.28582730889320374, "memory_gb": 7.721559524536133, "step_time_ms": 7474.176406860352, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:43] (step=0003403) Train Loss: 0.2962, Train Steps/Sec: 0.12, Epoch: 0.06612903225806452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3404, "loss": 0.3385664224624634, "memory_gb": 7.721559524536133, "step_time_ms": 7555.364370346069, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:51] (step=0003404) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.06614846482705014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:28:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3405, "loss": 0.23171979188919067, "memory_gb": 7.721559524536133, "step_time_ms": 7547.479152679443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:28:59] (step=0003405) Train Loss: 0.2471, Train Steps/Sec: 0.12, Epoch: 0.06616789739603575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:29:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3406, "loss": 0.28230804204940796, "memory_gb": 7.721559524536133, "step_time_ms": 7520.159721374512, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:29:07] (step=0003406) Train Loss: 0.2452, Train Steps/Sec: 0.13, Epoch: 0.06618732996502137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:29:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3407, "loss": 
0.2717049717903137, "memory_gb": 7.721559524536133, "step_time_ms": 7596.781015396118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:29:15] (step=0003407) Train Loss: 0.2669, Train Steps/Sec: 0.12, Epoch: 0.066206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:29:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3408, "loss": 0.1792604625225067, "memory_gb": 7.721559524536133, "step_time_ms": 7462.329626083374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:29:23] (step=0003408) Train Loss: 0.2184, Train Steps/Sec: 0.12, Epoch: 0.06622619510299262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:29:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3409, "loss": 0.26263803243637085, "memory_gb": 7.721559524536133, "step_time_ms": 7444.661378860474, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:29:31] (step=0003409) Train Loss: 0.3094, Train Steps/Sec: 0.13, Epoch: 0.06624562767197824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:29:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3410, "loss": 0.18448922038078308, "memory_gb": 7.721559524536133, "step_time_ms": 7515.789270401001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:29:39] (step=0003410) Train Loss: 0.2050, Train Steps/Sec: 0.12, Epoch: 0.06626506024096386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:29:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3411, "loss": 0.2578701674938202, "memory_gb": 7.721559524536133, "step_time_ms": 7483.925580978394, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:29:47] (step=0003411) Train Loss: 0.2500, Train Steps/Sec: 0.13, Epoch: 0.06628449280994947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:29:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3412, "loss": 0.14827367663383484, "memory_gb": 7.721559524536133, "step_time_ms": 7464.972496032715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:29:55] (step=0003412) Train 
Loss: 0.2227, Train Steps/Sec: 0.13, Epoch: 0.06630392537893509, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3413, "loss": 0.2586948573589325, "memory_gb": 7.721559524536133, "step_time_ms": 7568.508863449097, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:03] (step=0003413) Train Loss: 0.2766, Train Steps/Sec: 0.12, Epoch: 0.06632335794792071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3414, "loss": 0.22270090878009796, "memory_gb": 7.721559524536133, "step_time_ms": 7448.533296585083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:11] (step=0003414) Train Loss: 0.2473, Train Steps/Sec: 0.13, Epoch: 0.06634279051690634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3415, "loss": 0.22176162898540497, "memory_gb": 7.721559524536133, "step_time_ms": 7535.732984542847, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:19] (step=0003415) Train Loss: 0.1952, Train Steps/Sec: 0.13, Epoch: 0.06636222308589196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3416, "loss": 0.16928189992904663, "memory_gb": 7.721559524536133, "step_time_ms": 7481.429815292358, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:27] (step=0003416) Train Loss: 0.1930, Train Steps/Sec: 0.12, Epoch: 0.06638165565487758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3417, "loss": 0.3191676437854767, "memory_gb": 7.721559524536133, "step_time_ms": 7248.146057128906, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:35] (step=0003417) Train Loss: 0.2587, Train Steps/Sec: 0.13, Epoch: 0.06640108822386319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 
3418, "loss": 0.2372860610485077, "memory_gb": 7.721559524536133, "step_time_ms": 7459.29741859436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:43] (step=0003418) Train Loss: 0.2018, Train Steps/Sec: 0.13, Epoch: 0.06642052079284881, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3419, "loss": 0.2657541334629059, "memory_gb": 7.721559524536133, "step_time_ms": 7516.766786575317, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:51] (step=0003419) Train Loss: 0.2277, Train Steps/Sec: 0.12, Epoch: 0.06643995336183443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:30:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3420, "loss": 0.3134475350379944, "memory_gb": 7.721559524536133, "step_time_ms": 7445.0860023498535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:30:59] (step=0003420) Train Loss: 0.2964, Train Steps/Sec: 0.13, Epoch: 0.06645938593082006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:31:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3421, "loss": 0.27742087841033936, "memory_gb": 7.721559524536133, "step_time_ms": 7445.819139480591, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:31:07] (step=0003421) Train Loss: 0.2329, Train Steps/Sec: 0.12, Epoch: 0.06647881849980568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:31:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3422, "loss": 0.20401762425899506, "memory_gb": 7.721559524536133, "step_time_ms": 7533.952474594116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:31:15] (step=0003422) Train Loss: 0.2523, Train Steps/Sec: 0.12, Epoch: 0.0664982510687913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:31:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3423, "loss": 0.20245198905467987, "memory_gb": 7.721559524536133, "step_time_ms": 7332.803726196289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:31:23] 
(step=0003423) Train Loss: 0.1896, Train Steps/Sec: 0.13, Epoch: 0.06651768363777691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:31:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3424, "loss": 0.2701091170310974, "memory_gb": 7.721559524536133, "step_time_ms": 6254.059791564941, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:31:29] (step=0003424) Train Loss: 0.2611, Train Steps/Sec: 0.15, Epoch: 0.06653711620676253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:31:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3425, "loss": 0.262264609336853, "memory_gb": 7.721559524536133, "step_time_ms": 6831.8140506744385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:31:36] (step=0003425) Train Loss: 0.2589, Train Steps/Sec: 0.14, Epoch: 0.06655654877574815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:31:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3426, "loss": 0.27767279744148254, "memory_gb": 7.721559524536133, "step_time_ms": 7480.7329177856445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:31:44] (step=0003426) Train Loss: 0.2339, Train Steps/Sec: 0.12, Epoch: 0.06657598134473378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:31:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3427, "loss": 0.2858128547668457, "memory_gb": 7.721559524536133, "step_time_ms": 7529.767274856567, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:31:53] (step=0003427) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.0665954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3428, "loss": 0.3159804940223694, "memory_gb": 7.721559524536133, "step_time_ms": 7447.305917739868, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:32:01] (step=0003428) Train Loss: 0.2965, Train Steps/Sec: 0.12, Epoch: 0.06661484648270502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:09] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 3429, "loss": 0.13453596830368042, "memory_gb": 7.721559524536133, "step_time_ms": 7524.708986282349, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:32:09] (step=0003429) Train Loss: 0.1595, Train Steps/Sec: 0.12, Epoch: 0.06663427905169063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3430, "loss": 0.15563614666461945, "memory_gb": 7.721559524536133, "step_time_ms": 7553.409099578857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:32:17] (step=0003430) Train Loss: 0.1322, Train Steps/Sec: 0.12, Epoch: 0.06665371162067625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3431, "loss": 0.28962084650993347, "memory_gb": 7.721559524536133, "step_time_ms": 7406.75950050354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:32:25] (step=0003431) Train Loss: 0.2497, Train Steps/Sec: 0.12, Epoch: 0.06667314418966187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3432, "loss": 0.1847565472126007, "memory_gb": 7.721559524536133, "step_time_ms": 7460.00599861145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:32:33] (step=0003432) Train Loss: 0.2220, Train Steps/Sec: 0.12, Epoch: 0.0666925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3433, "loss": 0.2852519154548645, "memory_gb": 7.721559524536133, "step_time_ms": 7520.255565643311, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:32:41] (step=0003433) Train Loss: 0.2549, Train Steps/Sec: 0.12, Epoch: 0.06671200932763312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3434, "loss": 0.16876843571662903, "memory_gb": 7.721559524536133, "step_time_ms": 7424.906015396118, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 01:32:49] (step=0003434) Train Loss: 0.2045, Train Steps/Sec: 0.13, Epoch: 0.06673144189661873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:32:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3435, "loss": 0.30964407324790955, "memory_gb": 7.721559524536133, "step_time_ms": 7437.588214874268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:32:57] (step=0003435) Train Loss: 0.2919, Train Steps/Sec: 0.12, Epoch: 0.06675087446560435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:33:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3436, "loss": 0.2944042980670929, "memory_gb": 7.721559524536133, "step_time_ms": 7551.555871963501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:33:05] (step=0003436) Train Loss: 0.2685, Train Steps/Sec: 0.12, Epoch: 0.06677030703458997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:33:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3437, "loss": 0.22775080800056458, "memory_gb": 7.721559524536133, "step_time_ms": 7474.412679672241, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:33:13] (step=0003437) Train Loss: 0.1943, Train Steps/Sec: 0.13, Epoch: 0.06678973960357559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:33:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3438, "loss": 0.34297335147857666, "memory_gb": 7.721559524536133, "step_time_ms": 7477.7281284332275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:33:21] (step=0003438) Train Loss: 0.3167, Train Steps/Sec: 0.12, Epoch: 0.06680917217256122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:33:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3439, "loss": 0.2806040048599243, "memory_gb": 7.721559524536133, "step_time_ms": 7568.11785697937, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:33:29] (step=0003439) Train Loss: 0.2821, Train Steps/Sec: 0.12, Epoch: 0.06682860474154684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:33:37] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 3440, "loss": 0.18903520703315735, "memory_gb": 7.721559524536133, "step_time_ms": 7517.980337142944, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:33:37] (step=0003440) Train Loss: 0.1814, Train Steps/Sec: 0.12, Epoch: 0.06684803731053245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:33:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3441, "loss": 0.24937140941619873, "memory_gb": 7.721559524536133, "step_time_ms": 7539.301872253418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:33:45] (step=0003441) Train Loss: 0.2410, Train Steps/Sec: 0.13, Epoch: 0.06686746987951807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:33:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3442, "loss": 0.34694284200668335, "memory_gb": 7.721559524536133, "step_time_ms": 7659.23285484314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:33:53] (step=0003442) Train Loss: 0.2357, Train Steps/Sec: 0.12, Epoch: 0.06688690244850369, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3443, "loss": 0.2753880023956299, "memory_gb": 7.721559524536133, "step_time_ms": 7525.813817977905, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:34:01] (step=0003443) Train Loss: 0.2261, Train Steps/Sec: 0.12, Epoch: 0.06690633501748931, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3444, "loss": 0.30196043848991394, "memory_gb": 7.721559524536133, "step_time_ms": 7459.904909133911, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:34:09] (step=0003444) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.06692576758647494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3445, "loss": 0.24099594354629517, "memory_gb": 7.721559524536133, "step_time_ms": 7575.183153152466, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 01:34:17] (step=0003445) Train Loss: 0.2540, Train Steps/Sec: 0.12, Epoch: 0.06694520015546056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3446, "loss": 0.2619393765926361, "memory_gb": 7.721559524536133, "step_time_ms": 7432.928800582886, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:34:25] (step=0003446) Train Loss: 0.2121, Train Steps/Sec: 0.13, Epoch: 0.06696463272444617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3447, "loss": 0.20929047465324402, "memory_gb": 7.721559524536133, "step_time_ms": 7432.463884353638, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:34:33] (step=0003447) Train Loss: 0.1905, Train Steps/Sec: 0.12, Epoch: 0.06698406529343179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3448, "loss": 0.25705093145370483, "memory_gb": 7.721559524536133, "step_time_ms": 7531.5375328063965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:34:41] (step=0003448) Train Loss: 0.2906, Train Steps/Sec: 0.12, Epoch: 0.06700349786241741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3449, "loss": 0.27811819314956665, "memory_gb": 7.721559524536133, "step_time_ms": 7484.1718673706055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:34:49] (step=0003449) Train Loss: 0.2818, Train Steps/Sec: 0.12, Epoch: 0.06702293043140303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:34:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3450, "loss": 0.236289381980896, "memory_gb": 7.721559524536133, "step_time_ms": 7485.443592071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:34:58] (step=0003450) Train Loss: 0.2116, Train Steps/Sec: 0.12, Epoch: 0.06704236300038866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 01:35:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3451, "loss": 0.23832228779792786, "memory_gb": 7.721559524536133, "step_time_ms": 7556.090354919434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:35:06] (step=0003451) Train Loss: 0.2435, Train Steps/Sec: 0.12, Epoch: 0.06706179556937428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:35:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3452, "loss": 0.3141840100288391, "memory_gb": 7.721559524536133, "step_time_ms": 7377.249240875244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:35:14] (step=0003452) Train Loss: 0.2723, Train Steps/Sec: 0.13, Epoch: 0.06708122813835989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:35:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3453, "loss": 0.29133135080337524, "memory_gb": 7.721559524536133, "step_time_ms": 6250.301122665405, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:35:20] (step=0003453) Train Loss: 0.2608, Train Steps/Sec: 0.15, Epoch: 0.06710066070734551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:35:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3454, "loss": 0.23210451006889343, "memory_gb": 7.721559524536133, "step_time_ms": 6451.324224472046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:35:27] (step=0003454) Train Loss: 0.3016, Train Steps/Sec: 0.14, Epoch: 0.06712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:35:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3455, "loss": 0.2839217483997345, "memory_gb": 7.721559524536133, "step_time_ms": 7555.888891220093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:35:35] (step=0003455) Train Loss: 0.3037, Train Steps/Sec: 0.12, Epoch: 0.06713952584531675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:35:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3456, "loss": 0.34802404046058655, "memory_gb": 7.721559524536133, "step_time_ms": 7483.758449554443, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 01:35:43] (step=0003456) Train Loss: 0.2411, Train Steps/Sec: 0.12, Epoch: 0.06715895841430237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:35:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3457, "loss": 0.24822786450386047, "memory_gb": 7.721559524536133, "step_time_ms": 7398.655891418457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:35:51] (step=0003457) Train Loss: 0.2248, Train Steps/Sec: 0.12, Epoch: 0.067178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:36:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3458, "loss": 0.3435811400413513, "memory_gb": 7.721559524536133, "step_time_ms": 7448.884725570679, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:36:00] (step=0003458) Train Loss: 0.3034, Train Steps/Sec: 0.12, Epoch: 0.0671978235522736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:36:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3459, "loss": 0.27236104011535645, "memory_gb": 7.721559524536133, "step_time_ms": 7455.937623977661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:36:08] (step=0003459) Train Loss: 0.2265, Train Steps/Sec: 0.12, Epoch: 0.06721725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3460, "loss": 0.2440408170223236, "memory_gb": 7.721559524536133, "step_time_ms": 7399.069786071777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:36:16] (step=0003460) Train Loss: 0.2606, Train Steps/Sec: 0.12, Epoch: 0.06723668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:36:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3461, "loss": 0.25559931993484497, "memory_gb": 7.721559524536133, "step_time_ms": 7389.706373214722, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:36:24] (step=0003461) Train Loss: 0.2386, Train Steps/Sec: 0.13, Epoch: 0.06725612125923047, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592
[2025-07-29 01:36:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3462, "loss": 0.22291073203086853, "memory_gb": 7.721559524536133, "step_time_ms": 7467.435121536255, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:36:32] (step=0003462) Train Loss: 0.2459, Train Steps/Sec: 0.12, Epoch: 0.0672755538282161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:36:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3463, "loss": 0.28020790219306946, "memory_gb": 7.721559524536133, "step_time_ms": 7456.179618835449, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:36:40] (step=0003463) Train Loss: 0.2195, Train Steps/Sec: 0.12, Epoch: 0.0672949863972017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3464, "loss": 0.25170713663101196, "memory_gb": 7.721559524536133, "step_time_ms": 7400.8026123046875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:36:48] (step=0003464) Train Loss: 0.2309, Train Steps/Sec: 0.13, Epoch: 0.06731441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:36:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3465, "loss": 0.20594072341918945, "memory_gb": 7.721559524536133, "step_time_ms": 7471.579551696777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:36:56] (step=0003465) Train Loss: 0.1898, Train Steps/Sec: 0.12, Epoch: 0.06733385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:37:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3466, "loss": 0.3143274188041687, "memory_gb": 7.721559524536133, "step_time_ms": 7455.797910690308, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:37:04] (step=0003466) Train Loss: 0.2831, Train Steps/Sec: 0.12, Epoch: 0.06735328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:37:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3467, "loss": 0.24933043122291565, "memory_gb": 7.721559524536133, "step_time_ms": 7491.0337924957275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:37:12] (step=0003467) Train Loss: 0.2472, Train Steps/Sec: 0.12, Epoch: 0.06737271667314419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:37:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3468, "loss": 0.26707732677459717, "memory_gb": 7.721559524536133, "step_time_ms": 7494.769811630249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:37:20] (step=0003468) Train Loss: 0.2926, Train Steps/Sec: 0.12, Epoch: 0.06739214924212981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:37:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3469, "loss": 0.1913895308971405, "memory_gb": 7.721559524536133, "step_time_ms": 7449.315309524536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:37:28] (step=0003469) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.06741158181111542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:37:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3470, "loss": 0.22970671951770782, "memory_gb": 7.721559524536133, "step_time_ms": 7371.767997741699, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:37:36] (step=0003470) Train Loss: 0.2797, Train Steps/Sec: 0.13, Epoch: 0.06743101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:37:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3471, "loss": 0.26712557673454285, "memory_gb": 7.721559524536133, "step_time_ms": 7478.512763977051, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:37:44] (step=0003471) Train Loss: 0.2611, Train Steps/Sec: 0.12, Epoch: 0.06745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:37:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3472, "loss": 0.2698224186897278, "memory_gb": 7.721559524536133, "step_time_ms": 7414.080619812012, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:37:52] (step=0003472) Train Loss: 0.2583, Train Steps/Sec: 0.13, Epoch: 0.06746987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3473, "loss": 0.24066439270973206, "memory_gb": 7.721559524536133, "step_time_ms": 7418.271780014038, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:00] (step=0003473) Train Loss: 0.2480, Train Steps/Sec: 0.12, Epoch: 0.06748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3474, "loss": 0.224318265914917, "memory_gb": 7.721559524536133, "step_time_ms": 7551.8553256988525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:08] (step=0003474) Train Loss: 0.2119, Train Steps/Sec: 0.12, Epoch: 0.06750874465604353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3475, "loss": 0.2689334452152252, "memory_gb": 7.721559524536133, "step_time_ms": 7483.266353607178, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:16] (step=0003475) Train Loss: 0.2825, Train Steps/Sec: 0.12, Epoch: 0.06752817722502914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3476, "loss": 0.23514235019683838, "memory_gb": 7.721559524536133, "step_time_ms": 7476.147890090942, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:24] (step=0003476) Train Loss: 0.1885, Train Steps/Sec: 0.12, Epoch: 0.06754760979401477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3477, "loss": 0.321260929107666, "memory_gb": 7.721559524536133, "step_time_ms": 7566.016435623169, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:32] (step=0003477) Train Loss: 0.2922, Train Steps/Sec: 0.12, Epoch: 0.06756704236300039, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3478, "loss": 0.2934567928314209, "memory_gb": 7.721559524536133, "step_time_ms": 7503.175973892212, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:40] (step=0003478) Train Loss: 0.2327, Train Steps/Sec: 0.13, Epoch: 0.06758647493198601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3479, "loss": 0.2231384813785553, "memory_gb": 7.721559524536133, "step_time_ms": 7454.720973968506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:48] (step=0003479) Train Loss: 0.2287, Train Steps/Sec: 0.12, Epoch: 0.06760590750097163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:38:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3480, "loss": 0.2052239626646042, "memory_gb": 7.721559524536133, "step_time_ms": 7530.334234237671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:38:56] (step=0003480) Train Loss: 0.2230, Train Steps/Sec: 0.12, Epoch: 0.06762534006995725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3481, "loss": 0.16654090583324432, "memory_gb": 7.721559524536133, "step_time_ms": 7387.904405593872, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:04] (step=0003481) Train Loss: 0.1548, Train Steps/Sec: 0.13, Epoch: 0.06764477263894286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3482, "loss": 0.3097503185272217, "memory_gb": 7.721559524536133, "step_time_ms": 6075.935363769531, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:10] (step=0003482) Train Loss: 0.2739, Train Steps/Sec: 0.16, Epoch: 0.06766420520792849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3483, "loss": 0.28601330518722534, "memory_gb": 7.721559524536133, "step_time_ms": 6529.723405838013, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:18] (step=0003483) Train Loss: 0.2625, Train Steps/Sec: 0.14, Epoch: 0.06768363777691411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3484, "loss": 0.22545965015888214, "memory_gb": 7.721559524536133, "step_time_ms": 7474.609375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:26] (step=0003484) Train Loss: 0.2425, Train Steps/Sec: 0.13, Epoch: 0.06770307034589973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3485, "loss": 0.15667015314102173, "memory_gb": 7.721559524536133, "step_time_ms": 7385.677814483643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:34] (step=0003485) Train Loss: 0.1498, Train Steps/Sec: 0.12, Epoch: 0.06772250291488535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3486, "loss": 0.21422460675239563, "memory_gb": 7.721559524536133, "step_time_ms": 7529.844522476196, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:42] (step=0003486) Train Loss: 0.2231, Train Steps/Sec: 0.12, Epoch: 0.06774193548387097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3487, "loss": 0.2518528699874878, "memory_gb": 7.721559524536133, "step_time_ms": 7556.269407272339, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:50] (step=0003487) Train Loss: 0.2413, Train Steps/Sec: 0.12, Epoch: 0.06776136805285658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:39:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3488, "loss": 0.15341228246688843, "memory_gb": 7.721559524536133, "step_time_ms": 7620.217323303223, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:39:58] (step=0003488) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.0677808006218422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:40:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3489, "loss": 0.3027969300746918, "memory_gb": 7.721559524536133, "step_time_ms": 7506.559371948242, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:40:06] (step=0003489) Train Loss: 0.2356, Train Steps/Sec: 0.12, Epoch: 0.06780023319082783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:40:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3490, "loss": 0.2695120573043823, "memory_gb": 7.721559524536133, "step_time_ms": 7471.766471862793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:40:14] (step=0003490) Train Loss: 0.2212, Train Steps/Sec: 0.12, Epoch: 0.06781966575981345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:40:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3491, "loss": 0.27982455492019653, "memory_gb": 7.721559524536133, "step_time_ms": 7562.437057495117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:40:22] (step=0003491) Train Loss: 0.2769, Train Steps/Sec: 0.12, Epoch: 0.06783909832879907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3492, "loss": 0.16424331068992615, "memory_gb": 7.721559524536133, "step_time_ms": 7500.807046890259, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:40:30] (step=0003492) Train Loss: 0.2090, Train Steps/Sec: 0.12, Epoch: 0.0678585308977847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:40:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3493, "loss": 0.28996706008911133, "memory_gb": 7.721559524536133, "step_time_ms": 7527.783393859863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:40:38] (step=0003493) Train Loss: 0.2833, Train Steps/Sec: 0.12, Epoch: 0.0678779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:40:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3494, "loss": 0.21824604272842407, "memory_gb": 7.721559524536133, "step_time_ms": 7610.854625701904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:40:46] (step=0003494) Train Loss: 0.2122, Train Steps/Sec: 0.12, Epoch: 0.06789739603575592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:40:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3495, "loss": 0.2264660745859146, "memory_gb": 7.721559524536133, "step_time_ms": 7495.764255523682, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:40:54] (step=0003495) Train Loss: 0.2370, Train Steps/Sec: 0.13, Epoch: 0.06791682860474155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3496, "loss": 0.2964188754558563, "memory_gb": 7.721559524536133, "step_time_ms": 7492.390394210815, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:02] (step=0003496) Train Loss: 0.2688, Train Steps/Sec: 0.13, Epoch: 0.06793626117372717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3497, "loss": 0.25573331117630005, "memory_gb": 7.721559524536133, "step_time_ms": 7599.3475914001465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:11] (step=0003497) Train Loss: 0.2378, Train Steps/Sec: 0.12, Epoch: 0.06795569374271279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3498, "loss": 0.31485480070114136, "memory_gb": 7.721559524536133, "step_time_ms": 7507.095813751221, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:19] (step=0003498) Train Loss: 0.2423, Train Steps/Sec: 0.13, Epoch: 0.0679751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3499, "loss": 0.24136438965797424, "memory_gb": 7.721559524536133, "step_time_ms": 7429.406642913818, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:27] (step=0003499) Train Loss: 0.2028, Train Steps/Sec: 0.13, Epoch: 0.06799455888068402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3500, "loss": 0.29628807306289673, "memory_gb": 7.721559524536133, "step_time_ms": 7478.215932846069, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:35] (step=0003500) Train Loss: 0.2902, Train Steps/Sec: 0.12, Epoch: 0.06801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3501, "loss": 0.17565086483955383, "memory_gb": 7.721559524536133, "step_time_ms": 7404.785633087158, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:43] (step=0003501) Train Loss: 0.2247, Train Steps/Sec: 0.13, Epoch: 0.06803342401865527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3502, "loss": 0.37952589988708496, "memory_gb": 7.721559524536133, "step_time_ms": 7486.785888671875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:51] (step=0003502) Train Loss: 0.3069, Train Steps/Sec: 0.13, Epoch: 0.06805285658764089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:41:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3503, "loss": 0.21557967364788055, "memory_gb": 7.721559524536133, "step_time_ms": 7444.91171836853, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:41:59] (step=0003503) Train Loss: 0.2165, Train Steps/Sec: 0.12, Epoch: 0.06807228915662651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:42:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3504, "loss": 0.24852395057678223, "memory_gb": 7.721559524536133, "step_time_ms": 7409.237384796143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:42:07] (step=0003504) Train Loss: 0.2081, Train Steps/Sec: 0.13, Epoch: 0.06809172172561212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:42:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3505, "loss": 0.25804799795150757, "memory_gb": 7.721559524536133, "step_time_ms": 7420.1929569244385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:42:15] (step=0003505) Train Loss: 0.2903, Train Steps/Sec: 0.12, Epoch: 0.06811115429459774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:42:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3506, "loss": 0.21382658183574677, "memory_gb": 7.721559524536133, "step_time_ms": 7517.887830734253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:42:23] (step=0003506) Train Loss: 0.2084, Train Steps/Sec: 0.12, Epoch: 0.06813058686358336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:42:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3507, "loss": 0.242039754986763, "memory_gb": 7.721559524536133, "step_time_ms": 7485.330581665039, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:42:31] (step=0003507) Train Loss: 0.2144, Train Steps/Sec: 0.12, Epoch: 0.06815001943256899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:42:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3508, "loss": 0.21746064722537994, "memory_gb": 7.721559524536133, "step_time_ms": 7391.828775405884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:42:39] (step=0003508) Train Loss: 0.1644, Train Steps/Sec: 0.13, Epoch: 0.06816945200155461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:42:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3509, "loss": 0.3059677481651306, "memory_gb": 7.721559524536133, "step_time_ms": 7469.313144683838, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:42:47] (step=0003509) Train Loss: 0.2283, Train Steps/Sec: 0.12, Epoch: 0.06818888457054023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:42:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3510, "loss": 0.24861669540405273, "memory_gb": 7.721559524536133, "step_time_ms": 7343.57213973999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:42:55] (step=0003510) Train Loss: 0.2243, Train Steps/Sec: 0.13, Epoch: 0.06820831713952584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3511, "loss": 0.2009061574935913, "memory_gb": 7.721559524536133, "step_time_ms": 5656.710863113403, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:01] (step=0003511) Train Loss: 0.2275, Train Steps/Sec: 0.17, Epoch: 0.06822774970851146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3512, "loss": 0.27459093928337097, "memory_gb": 7.721559524536133, "step_time_ms": 7453.229665756226, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:09] (step=0003512) Train Loss: 0.2663, Train Steps/Sec: 0.12, Epoch: 0.06824718227749708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3513, "loss": 0.2035190463066101, "memory_gb": 7.721559524536133, "step_time_ms": 7429.251194000244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:17] (step=0003513) Train Loss: 0.2067, Train Steps/Sec: 0.12, Epoch: 0.0682666148464827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3514, "loss": 0.2549586892127991, "memory_gb": 7.721559524536133, "step_time_ms": 7487.438917160034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:25] (step=0003514) Train Loss: 0.2437, Train Steps/Sec: 0.12, Epoch: 0.06828604741546833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3515, "loss": 0.23171216249465942, "memory_gb": 7.721559524536133, "step_time_ms": 7440.636157989502, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:33] (step=0003515) Train Loss: 0.2603, Train Steps/Sec: 0.13, Epoch: 0.06830547998445395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3516, "loss": 0.19025133550167084, "memory_gb": 7.721559524536133, "step_time_ms": 7454.951286315918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:41] (step=0003516) Train Loss: 0.1935, Train Steps/Sec: 0.13, Epoch: 0.06832491255343956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3517, "loss": 0.14446431398391724, "memory_gb": 7.721559524536133, "step_time_ms": 7499.326705932617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:49] (step=0003517) Train Loss: 0.1856, Train Steps/Sec: 0.12, Epoch: 0.06834434512242518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:43:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3518, "loss": 0.31619924306869507, "memory_gb": 7.721559524536133, "step_time_ms": 7430.3131103515625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:43:57] (step=0003518) Train Loss: 0.2397, Train Steps/Sec: 0.12, Epoch: 0.0683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:44:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3519, "loss": 0.2164827436208725, "memory_gb": 7.721559524536133, "step_time_ms": 7428.09796333313, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:44:05] (step=0003519) Train Loss: 0.1935, Train Steps/Sec: 0.12, Epoch: 0.06838321026039643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:44:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3520, "loss": 0.28606319427490234, "memory_gb": 7.721559524536133, "step_time_ms": 7533.636093139648, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:44:13] (step=0003520) Train Loss: 0.2560, Train Steps/Sec: 0.12, Epoch: 0.06840264282938205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:44:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3521, "loss": 0.18988122045993805, "memory_gb": 7.721559524536133, "step_time_ms": 7481.929540634155, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:44:21] (step=0003521) Train Loss: 0.2029, Train Steps/Sec: 0.12, Epoch: 0.06842207539836767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:44:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3522, "loss": 0.1267922818660736, "memory_gb": 7.721559524536133, "step_time_ms": 7505.743503570557, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:44:29] (step=0003522) Train Loss: 0.1433, Train Steps/Sec: 0.12, Epoch: 0.06844150796735328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:44:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3523, "loss": 0.1307787299156189, "memory_gb": 7.721559524536133, "step_time_ms": 7533.736705780029, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:44:37] (step=0003523) Train Loss: 0.2122, Train Steps/Sec: 0.12, Epoch: 0.0684609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:44:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3524, "loss": 0.27044862508773804, "memory_gb": 7.715639114379883, "step_time_ms": 7514.898777008057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:44:45] (step=0003524) Train Loss: 0.2189, Train Steps/Sec: 0.13, Epoch: 0.06848037310532452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:44:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3525, "loss": 0.2024897336959839, "memory_gb": 7.721559524536133, "step_time_ms": 7487.77437210083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:44:53] (step=0003525) Train Loss: 0.2167, Train Steps/Sec: 0.12, Epoch: 0.06849980567431015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3526, "loss": 0.1828959882259369, "memory_gb": 7.721559524536133, "step_time_ms": 7559.398412704468, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:01] (step=0003526) Train Loss: 0.1999, Train Steps/Sec: 0.12, Epoch: 0.06851923824329577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3527, "loss": 0.14622093737125397, "memory_gb": 7.721559524536133, "step_time_ms": 7587.446928024292, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:09] (step=0003527) Train Loss: 0.2286, Train Steps/Sec: 0.12, Epoch: 0.06853867081228138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3528, "loss": 0.30386021733283997, "memory_gb": 7.721559524536133, "step_time_ms": 7502.747058868408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:17] (step=0003528) Train Loss: 0.2929, Train Steps/Sec: 0.12, Epoch: 0.068558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3529, "loss": 0.2805849015712738, "memory_gb": 7.721559524536133, "step_time_ms": 7554.232358932495, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:25] (step=0003529) Train Loss: 0.2780, Train Steps/Sec: 0.12, Epoch: 0.06857753595025262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3530, "loss": 0.21821460127830505, "memory_gb": 7.721559524536133, "step_time_ms": 7558.705806732178, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:33] (step=0003530) Train Loss: 0.2573, Train Steps/Sec: 0.12, Epoch: 0.06859696851923824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3531, "loss": 0.1794266253709793, "memory_gb": 7.721559524536133, "step_time_ms": 7496.699810028076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:41] (step=0003531) Train Loss: 0.2416, Train Steps/Sec: 0.13, Epoch: 0.06861640108822387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3532, "loss": 0.2042919248342514, "memory_gb": 7.721559524536133, "step_time_ms": 7539.530515670776, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:49] (step=0003532) Train Loss: 0.2009, Train Steps/Sec: 0.12, Epoch: 0.06863583365720949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:45:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3533, "loss": 0.3424583971500397, "memory_gb": 7.721559524536133, "step_time_ms": 7503.012895584106, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:45:57] (step=0003533) Train Loss: 0.2253, Train Steps/Sec: 0.12, Epoch: 0.0686552662261951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3534, "loss": 0.1617843508720398, "memory_gb": 7.721559524536133, "step_time_ms": 7456.628799438477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:05] (step=0003534) Train Loss: 0.2000, Train Steps/Sec: 0.13, Epoch: 0.06867469879518072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3535, "loss": 0.29246431589126587, "memory_gb": 7.721559524536133, "step_time_ms": 7569.5154666900635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:13] (step=0003535) Train Loss: 0.2632, Train Steps/Sec: 0.12, Epoch: 0.06869413136416634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3536, "loss": 0.2674221694469452, "memory_gb": 7.721559524536133, "step_time_ms": 7541.837215423584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:21] (step=0003536) Train Loss: 0.2363, Train Steps/Sec: 0.12, Epoch: 0.06871356393315196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3537, "loss": 0.16699166595935822, "memory_gb": 7.721559524536133, "step_time_ms": 7554.3036460876465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:30] (step=0003537) Train Loss: 0.1945, Train Steps/Sec: 0.12, Epoch: 0.06873299650213759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3538, "loss": 0.1613936722278595, "memory_gb": 7.721559524536133, "step_time_ms": 7399.028062820435, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:38] (step=0003538) Train Loss: 0.2494, Train Steps/Sec: 0.13, Epoch: 0.06875242907112321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3539, "loss": 0.20631510019302368, "memory_gb": 7.721559524536133, "step_time_ms": 7605.669975280762, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:46] (step=0003539) Train Loss: 0.2610, Train Steps/Sec: 0.12, Epoch: 0.06877186164010882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3540, "loss": 0.19538533687591553, "memory_gb": 7.721559524536133, "step_time_ms": 5258.460521697998, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:51] (step=0003540) Train Loss: 0.2436, Train Steps/Sec: 0.18, Epoch: 0.06879129420909444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3541, "loss": 0.2752354145050049, "memory_gb": 7.721559524536133, "step_time_ms": 7486.826181411743, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:46:59] (step=0003541) Train Loss: 0.2384, Train Steps/Sec: 0.13, Epoch: 0.06881072677808006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:47:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3542, "loss": 0.2783590257167816, "memory_gb": 7.721559524536133, "step_time_ms": 7438.889503479004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:47:07] (step=0003542) Train Loss: 0.2776, Train Steps/Sec: 0.13, Epoch: 0.06883015934706568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:47:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3543, "loss": 0.22333739697933197, "memory_gb": 7.721559524536133, "step_time_ms": 7641.952276229858, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:47:15] (step=0003543) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.0688495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:47:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3544, "loss": 0.2882053256034851, "memory_gb": 7.721559524536133, "step_time_ms": 7459.5019817352295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:47:23] (step=0003544) Train Loss: 0.2404, Train Steps/Sec: 0.12, Epoch: 0.06886902448503693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3545, "loss": 0.20156097412109375, "memory_gb": 7.721559524536133, "step_time_ms": 7465.5914306640625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:47:31] (step=0003545) Train Loss: 0.2055, Train Steps/Sec: 0.13, Epoch: 0.06888845705402254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:47:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3546, "loss": 0.1860182136297226, "memory_gb": 7.721559524536133, "step_time_ms": 7523.666858673096, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:47:39] (step=0003546) Train Loss: 0.2035, Train Steps/Sec: 0.12, Epoch: 0.06890788962300816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:47:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3547, "loss": 0.23915939033031464, "memory_gb": 7.715639114379883, "step_time_ms": 7477.85758972168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:47:47] (step=0003547) Train Loss: 0.2142, Train Steps/Sec: 0.12, Epoch: 0.06892732219199378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:47:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3548, "loss": 0.168701171875, "memory_gb": 7.721559524536133, "step_time_ms": 7445.276498794556, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:47:55] (step=0003548) Train Loss: 0.1909, Train Steps/Sec: 0.13, Epoch: 0.0689467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3549, "loss": 0.24997037649154663, "memory_gb": 7.721559524536133, "step_time_ms": 7477.961540222168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:03] (step=0003549) Train Loss: 0.2307, Train Steps/Sec: 0.12, Epoch: 0.06896618732996503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3550, "loss": 0.2534857392311096, "memory_gb": 7.721559524536133, "step_time_ms": 7482.1507930755615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:11] (step=0003550) Train Loss: 0.2591, Train Steps/Sec: 0.12, Epoch: 0.06898561989895065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3551, "loss": 0.24072617292404175, "memory_gb": 7.721559524536133, "step_time_ms": 7417.860746383667, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:19] (step=0003551) Train Loss: 0.2603, Train Steps/Sec: 0.12, Epoch: 0.06900505246793626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3552, "loss": 0.26149827241897583, "memory_gb": 7.721559524536133, "step_time_ms": 7474.472761154175, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:27] (step=0003552) Train Loss: 0.2168, Train Steps/Sec: 0.12, Epoch: 0.06902448503692188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3553, "loss": 0.2923552393913269, "memory_gb": 7.721559524536133, "step_time_ms": 7434.062719345093, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:35] (step=0003553) Train Loss: 0.2859, Train Steps/Sec: 0.12, Epoch: 0.0690439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3554, "loss": 0.22785133123397827, "memory_gb": 7.721559524536133, "step_time_ms": 7371.613502502441, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:43] (step=0003554) Train Loss: 0.2181, Train Steps/Sec: 0.13, Epoch: 0.06906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3555, "loss": 0.218852698802948, "memory_gb": 7.721559524536133, "step_time_ms": 7480.7703495025635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:51] (step=0003555) Train Loss: 0.2593, Train Steps/Sec: 0.13, Epoch: 0.06908278274387875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:48:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3556, "loss": 0.2742499113082886, "memory_gb": 7.721559524536133, "step_time_ms": 7470.07417678833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:48:59] (step=0003556) Train Loss: 0.2268, Train Steps/Sec: 0.13, Epoch: 0.06910221531286435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:49:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3557, "loss": 0.3408239483833313, "memory_gb": 7.721559524536133, "step_time_ms": 7390.8562660217285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:49:07] (step=0003557) Train Loss: 0.2833, Train Steps/Sec: 0.13, Epoch: 0.06912164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:49:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3558, "loss": 0.3124944567680359, "memory_gb": 7.721559524536133, "step_time_ms": 7423.342227935791, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:49:15] (step=0003558) Train Loss: 0.2792, Train Steps/Sec: 0.13, Epoch: 0.0691410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:49:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3559, "loss": 0.29633432626724243, "memory_gb": 7.721559524536133, "step_time_ms": 7444.075584411621, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:49:23] (step=0003559) Train Loss: 0.2802, Train Steps/Sec: 0.12, Epoch: 0.06916051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:49:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3560, "loss": 0.18870025873184204, "memory_gb": 7.721559524536133, "step_time_ms": 7413.221120834351, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:49:31] (step=0003560) Train Loss: 0.2460, Train Steps/Sec: 0.13, Epoch: 0.06917994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:49:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3561, "loss": 0.3596838712692261, "memory_gb": 7.721559524536133, "step_time_ms": 7469.661474227905, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:49:39] (step=0003561) Train Loss: 0.2990, Train Steps/Sec: 0.12, Epoch: 0.06919937815779247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:49:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3562, "loss": 0.1720016598701477, "memory_gb": 7.721559524536133, "step_time_ms": 7503.177881240845, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:49:47] (step=0003562) Train Loss: 0.2092, Train Steps/Sec: 0.12, Epoch: 0.06921881072677807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:49:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3563, "loss": 0.2563587725162506, "memory_gb": 7.721559524536133, "step_time_ms": 7421.619653701782, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:49:55] (step=0003563) Train Loss: 0.2285, Train Steps/Sec: 0.13, Epoch: 0.0692382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3564, "loss": 0.22052179276943207, "memory_gb": 7.721559524536133, "step_time_ms": 7436.152219772339, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:03] (step=0003564) Train Loss: 0.1972, Train Steps/Sec: 0.12, Epoch: 0.06925767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3565, "loss": 0.29034268856048584, "memory_gb": 7.721559524536133, "step_time_ms": 7520.714998245239, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:11] (step=0003565) Train Loss: 0.3201, Train Steps/Sec: 0.12, Epoch: 0.06927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3566, "loss": 0.23564212024211884, "memory_gb": 7.721559524536133, "step_time_ms": 7437.70170211792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:19] (step=0003566) Train Loss: 0.2272, Train Steps/Sec: 0.12, Epoch: 0.06929654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3567, "loss": 0.34839481115341187, "memory_gb": 7.721559524536133, "step_time_ms": 7343.413352966309, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:27] (step=0003567) Train Loss: 0.3149, Train Steps/Sec: 0.13, Epoch: 0.06931597357170619, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3568, "loss": 0.19807937741279602, "memory_gb": 7.721559524536133, "step_time_ms": 7496.679306030273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:35] (step=0003568) Train Loss: 0.2294, Train Steps/Sec: 0.12, Epoch: 0.0693354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3569, "loss": 0.1952347755432129, "memory_gb": 7.721559524536133, "step_time_ms": 5224.03359413147, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:41] (step=0003569) Train Loss: 0.2552, Train Steps/Sec: 0.17, Epoch: 0.06935483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3570, "loss": 0.19049224257469177, "memory_gb": 7.721559524536133, "step_time_ms": 7518.152475357056, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:49] (step=0003570) Train Loss: 0.2151, Train Steps/Sec: 0.12, Epoch: 0.06937427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:50:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3571, "loss": 0.15341052412986755, "memory_gb": 7.721559524536133, "step_time_ms": 7431.43892288208, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:50:57] (step=0003571) Train Loss: 0.1729, Train Steps/Sec: 0.12, Epoch: 0.06939370384764866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3572, "loss": 0.19162516295909882, "memory_gb": 7.721559524536133, "step_time_ms": 7510.838747024536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:51:05] (step=0003572) Train Loss: 0.2108, Train Steps/Sec: 0.12, Epoch: 0.06941313641663428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:51:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3573, "loss": 0.21290616691112518, "memory_gb": 7.721559524536133, "step_time_ms": 7533.16855430603, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:51:13] (step=0003573) Train Loss: 0.2760, Train Steps/Sec: 0.12, Epoch: 0.0694325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:51:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3574, "loss": 0.3724725842475891, "memory_gb": 7.721559524536133, "step_time_ms": 7448.327302932739, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:51:21] (step=0003574) Train Loss: 0.2481, Train Steps/Sec: 0.13, Epoch: 0.06945200155460551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:51:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3575, "loss": 0.3550527095794678, "memory_gb": 7.721559524536133, "step_time_ms": 7523.745059967041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:51:29] (step=0003575) Train Loss: 0.2951, Train Steps/Sec: 0.12, Epoch: 0.06947143412359114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 01:51:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3576, "loss": 0.3113369941711426, "memory_gb": 7.721559524536133, "step_time_ms": 7606.436014175415, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 01:51:37] (step=0003576) Train Loss: 0.2794, Train Steps/Sec: 0.12, Epoch: 0.06949086669257676, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 01:51:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3577, "loss": 0.21799355745315552, "memory_gb": 7.721559524536133, "step_time_ms": 7574.487209320068, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:51:45] (step=0003577) Train Loss: 0.2173, Train Steps/Sec: 0.12, Epoch: 0.06951029926156238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:51:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3578, "loss": 0.1632646918296814, "memory_gb": 7.721559524536133, "step_time_ms": 7573.868274688721, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:51:53] (step=0003578) Train Loss: 0.1985, Train Steps/Sec: 0.12, Epoch: 0.069529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3579, "loss": 0.20318983495235443, "memory_gb": 7.721559524536133, "step_time_ms": 7539.653778076172, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:01] (step=0003579) Train Loss: 0.2070, Train Steps/Sec: 0.12, Epoch: 0.06954916439953363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3580, "loss": 0.3565901517868042, "memory_gb": 7.721559524536133, "step_time_ms": 7510.272264480591, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:09] (step=0003580) Train Loss: 0.3161, Train Steps/Sec: 0.13, Epoch: 0.06956859696851923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3581, "loss": 0.2827020287513733, "memory_gb": 7.721559524536133, "step_time_ms": 7557.293891906738, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:17] (step=0003581) Train Loss: 0.3116, Train Steps/Sec: 0.13, Epoch: 0.06958802953750486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3582, "loss": 0.33970576524734497, "memory_gb": 7.721559524536133, "step_time_ms": 
7582.857847213745, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:25] (step=0003582) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.06960746210649048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3583, "loss": 0.13989782333374023, "memory_gb": 7.721559524536133, "step_time_ms": 7514.461517333984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:33] (step=0003583) Train Loss: 0.1985, Train Steps/Sec: 0.13, Epoch: 0.0696268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3584, "loss": 0.39640000462532043, "memory_gb": 7.721559524536133, "step_time_ms": 7530.295610427856, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:41] (step=0003584) Train Loss: 0.3153, Train Steps/Sec: 0.13, Epoch: 0.06964632724446172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3585, "loss": 0.22946356236934662, "memory_gb": 7.721559524536133, "step_time_ms": 7536.686420440674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:50] (step=0003585) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.06966575981344733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:52:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3586, "loss": 0.18624749779701233, "memory_gb": 7.721559524536133, "step_time_ms": 7473.412036895752, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:52:58] (step=0003586) Train Loss: 0.2752, Train Steps/Sec: 0.13, Epoch: 0.06968519238243295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3587, "loss": 0.22651153802871704, "memory_gb": 7.721559524536133, "step_time_ms": 7590.041160583496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:53:06] (step=0003587) Train Loss: 0.2546, Train Steps/Sec: 0.12, Epoch: 0.06970462495141858, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3588, "loss": 0.33328989148139954, "memory_gb": 7.715639114379883, "step_time_ms": 7596.997022628784, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:53:14] (step=0003588) Train Loss: 0.2862, Train Steps/Sec: 0.12, Epoch: 0.0697240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:53:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3589, "loss": 0.3058061897754669, "memory_gb": 7.721559524536133, "step_time_ms": 7565.573453903198, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:53:22] (step=0003589) Train Loss: 0.2370, Train Steps/Sec: 0.12, Epoch: 0.06974349008938982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:53:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3590, "loss": 0.2065485417842865, "memory_gb": 7.721559524536133, "step_time_ms": 7541.71085357666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:53:30] (step=0003590) Train Loss: 0.2433, Train Steps/Sec: 0.13, Epoch: 0.06976292265837544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:53:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3591, "loss": 0.20808416604995728, "memory_gb": 7.721559524536133, "step_time_ms": 7672.884225845337, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:53:38] (step=0003591) Train Loss: 0.2413, Train Steps/Sec: 0.13, Epoch: 0.06978235522736105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:53:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3592, "loss": 0.25166070461273193, "memory_gb": 7.721559524536133, "step_time_ms": 7488.36088180542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:53:46] (step=0003592) Train Loss: 0.3037, Train Steps/Sec: 0.12, Epoch: 0.06980178779634667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:53:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3593, "loss": 0.2243688702583313, "memory_gb": 7.721559524536133, 
"step_time_ms": 7562.425136566162, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:53:54] (step=0003593) Train Loss: 0.2658, Train Steps/Sec: 0.13, Epoch: 0.0698212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3594, "loss": 0.21427232027053833, "memory_gb": 7.721559524536133, "step_time_ms": 7473.50811958313, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:01] (step=0003594) Train Loss: 0.2058, Train Steps/Sec: 0.13, Epoch: 0.06984065293431792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3595, "loss": 0.26879745721817017, "memory_gb": 7.721559524536133, "step_time_ms": 7456.274509429932, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:09] (step=0003595) Train Loss: 0.2255, Train Steps/Sec: 0.13, Epoch: 0.06986008550330354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3596, "loss": 0.2583106458187103, "memory_gb": 7.721559524536133, "step_time_ms": 7368.719816207886, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:17] (step=0003596) Train Loss: 0.2307, Train Steps/Sec: 0.13, Epoch: 0.06987951807228916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3597, "loss": 0.2397819459438324, "memory_gb": 7.721559524536133, "step_time_ms": 7522.494554519653, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:25] (step=0003597) Train Loss: 0.2852, Train Steps/Sec: 0.12, Epoch: 0.06989895064127477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3598, "loss": 0.23024532198905945, "memory_gb": 7.721559524536133, "step_time_ms": 5201.190948486328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:31] (step=0003598) Train Loss: 0.2306, Train Steps/Sec: 0.17, Epoch: 
0.06991838321026039, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3599, "loss": 0.21660837531089783, "memory_gb": 7.721559524536133, "step_time_ms": 7497.698783874512, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:39] (step=0003599) Train Loss: 0.2365, Train Steps/Sec: 0.12, Epoch: 0.06993781577924602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3600, "loss": 0.22897721827030182, "memory_gb": 7.721559524536133, "step_time_ms": 7456.799507141113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:47] (step=0003600) Train Loss: 0.2794, Train Steps/Sec: 0.12, Epoch: 0.06995724834823164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3601, "loss": 0.2509381175041199, "memory_gb": 7.721559524536133, "step_time_ms": 7407.831430435181, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:54:55] (step=0003601) Train Loss: 0.2112, Train Steps/Sec: 0.13, Epoch: 0.06997668091721726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:55:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3602, "loss": 0.2663651704788208, "memory_gb": 7.715639114379883, "step_time_ms": 7446.958780288696, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:55:03] (step=0003602) Train Loss: 0.2738, Train Steps/Sec: 0.12, Epoch: 0.06999611348620288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:55:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3603, "loss": 0.2320694625377655, "memory_gb": 7.721559524536133, "step_time_ms": 7390.347003936768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:55:11] (step=0003603) Train Loss: 0.2664, Train Steps/Sec: 0.13, Epoch: 0.07001554605518849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:55:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3604, "loss": 0.2164405882358551, "memory_gb": 
7.721559524536133, "step_time_ms": 7422.633171081543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:55:19] (step=0003604) Train Loss: 0.1949, Train Steps/Sec: 0.12, Epoch: 0.07003497862417411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:55:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3605, "loss": 0.19841285049915314, "memory_gb": 7.721559524536133, "step_time_ms": 7506.326198577881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:55:27] (step=0003605) Train Loss: 0.2330, Train Steps/Sec: 0.12, Epoch: 0.07005441119315974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:55:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3606, "loss": 0.25402677059173584, "memory_gb": 7.721559524536133, "step_time_ms": 7407.603979110718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:55:35] (step=0003606) Train Loss: 0.2721, Train Steps/Sec: 0.12, Epoch: 0.07007384376214536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:55:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3607, "loss": 0.20869958400726318, "memory_gb": 7.721559524536133, "step_time_ms": 7416.9981479644775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:55:43] (step=0003607) Train Loss: 0.2304, Train Steps/Sec: 0.12, Epoch: 0.07009327633113098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:55:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3608, "loss": 0.25105005502700806, "memory_gb": 7.721559524536133, "step_time_ms": 7506.2079429626465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:55:52] (step=0003608) Train Loss: 0.2167, Train Steps/Sec: 0.12, Epoch: 0.0701127089001166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3609, "loss": 0.16577228903770447, "memory_gb": 7.721559524536133, "step_time_ms": 7375.423192977905, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:00] (step=0003609) Train Loss: 0.1911, Train 
Steps/Sec: 0.12, Epoch: 0.07013214146910221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3610, "loss": 0.3518011271953583, "memory_gb": 7.721559524536133, "step_time_ms": 7450.714349746704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:08] (step=0003610) Train Loss: 0.3410, Train Steps/Sec: 0.13, Epoch: 0.07015157403808783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3611, "loss": 0.1610155701637268, "memory_gb": 7.721559524536133, "step_time_ms": 7541.466951370239, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:16] (step=0003611) Train Loss: 0.1692, Train Steps/Sec: 0.12, Epoch: 0.07017100660707346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3612, "loss": 0.12321685254573822, "memory_gb": 7.721559524536133, "step_time_ms": 7492.368698120117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:24] (step=0003612) Train Loss: 0.1846, Train Steps/Sec: 0.12, Epoch: 0.07019043917605908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3613, "loss": 0.2471543550491333, "memory_gb": 7.721559524536133, "step_time_ms": 7472.393274307251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:32] (step=0003613) Train Loss: 0.2153, Train Steps/Sec: 0.12, Epoch: 0.0702098717450447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3614, "loss": 0.38233011960983276, "memory_gb": 7.721559524536133, "step_time_ms": 7483.019828796387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:40] (step=0003614) Train Loss: 0.3171, Train Steps/Sec: 0.12, Epoch: 0.07022930431403031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3615, "loss": 
0.23452870547771454, "memory_gb": 7.721559524536133, "step_time_ms": 7438.206672668457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:48] (step=0003615) Train Loss: 0.2612, Train Steps/Sec: 0.12, Epoch: 0.07024873688301593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:56:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3616, "loss": 0.27528563141822815, "memory_gb": 7.721559524536133, "step_time_ms": 7465.712308883667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:56:56] (step=0003616) Train Loss: 0.2666, Train Steps/Sec: 0.12, Epoch: 0.07026816945200155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:57:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3617, "loss": 0.2815808057785034, "memory_gb": 7.721559524536133, "step_time_ms": 7538.367509841919, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:57:04] (step=0003617) Train Loss: 0.2217, Train Steps/Sec: 0.12, Epoch: 0.07028760202098717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3618, "loss": 0.2205192744731903, "memory_gb": 7.721559524536133, "step_time_ms": 7498.8112449646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:57:12] (step=0003618) Train Loss: 0.2091, Train Steps/Sec: 0.13, Epoch: 0.0703070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:57:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3619, "loss": 0.2347232848405838, "memory_gb": 7.721559524536133, "step_time_ms": 7287.314176559448, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:57:20] (step=0003619) Train Loss: 0.2683, Train Steps/Sec: 0.12, Epoch: 0.07032646715895842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:57:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3620, "loss": 0.2614704370498657, "memory_gb": 7.721559524536133, "step_time_ms": 7489.380598068237, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:57:28] (step=0003620) Train 
Loss: 0.2412, Train Steps/Sec: 0.13, Epoch: 0.07034589972794403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:57:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3621, "loss": 0.1929936558008194, "memory_gb": 7.721559524536133, "step_time_ms": 7453.61590385437, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:57:36] (step=0003621) Train Loss: 0.2224, Train Steps/Sec: 0.12, Epoch: 0.07036533229692965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:57:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3622, "loss": 0.24653418362140656, "memory_gb": 7.721559524536133, "step_time_ms": 7480.3102016448975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:57:44] (step=0003622) Train Loss: 0.2449, Train Steps/Sec: 0.12, Epoch: 0.07038476486591527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:57:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3623, "loss": 0.34127333760261536, "memory_gb": 7.721559524536133, "step_time_ms": 7507.662773132324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:57:52] (step=0003623) Train Loss: 0.2843, Train Steps/Sec: 0.12, Epoch: 0.0704041974349009, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3624, "loss": 0.17620813846588135, "memory_gb": 7.721559524536133, "step_time_ms": 7452.11386680603, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:00] (step=0003624) Train Loss: 0.1475, Train Steps/Sec: 0.13, Epoch: 0.07042363000388652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3625, "loss": 0.3301585614681244, "memory_gb": 7.721559524536133, "step_time_ms": 7334.280967712402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:08] (step=0003625) Train Loss: 0.3161, Train Steps/Sec: 0.13, Epoch: 0.07044306257287214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3626, 
"loss": 0.15766534209251404, "memory_gb": 7.721559524536133, "step_time_ms": 7577.731370925903, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:16] (step=0003626) Train Loss: 0.2099, Train Steps/Sec: 0.12, Epoch: 0.07046249514185775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3627, "loss": 0.1946476846933365, "memory_gb": 7.721559524536133, "step_time_ms": 5191.858530044556, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:22] (step=0003627) Train Loss: 0.1901, Train Steps/Sec: 0.17, Epoch: 0.07048192771084337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3628, "loss": 0.33539915084838867, "memory_gb": 7.721559524536133, "step_time_ms": 7593.127012252808, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:30] (step=0003628) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 0.07050136027982899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3629, "loss": 0.18379713594913483, "memory_gb": 7.721559524536133, "step_time_ms": 7479.671239852905, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:38] (step=0003629) Train Loss: 0.1668, Train Steps/Sec: 0.13, Epoch: 0.07052079284881461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3630, "loss": 0.2690458595752716, "memory_gb": 7.721559524536133, "step_time_ms": 7453.794956207275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:46] (step=0003630) Train Loss: 0.2671, Train Steps/Sec: 0.13, Epoch: 0.07054022541780024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:58:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3631, "loss": 0.28345298767089844, "memory_gb": 7.721559524536133, "step_time_ms": 7555.138349533081, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:58:54] 
(step=0003631) Train Loss: 0.3124, Train Steps/Sec: 0.12, Epoch: 0.07055965798678586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3632, "loss": 0.24631384015083313, "memory_gb": 7.721559524536133, "step_time_ms": 7502.493381500244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:02] (step=0003632) Train Loss: 0.2691, Train Steps/Sec: 0.13, Epoch: 0.07057909055577147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3633, "loss": 0.34148257970809937, "memory_gb": 7.721559524536133, "step_time_ms": 7501.077890396118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:10] (step=0003633) Train Loss: 0.2629, Train Steps/Sec: 0.12, Epoch: 0.07059852312475709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3634, "loss": 0.34458884596824646, "memory_gb": 7.721559524536133, "step_time_ms": 7599.850416183472, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:18] (step=0003634) Train Loss: 0.2694, Train Steps/Sec: 0.12, Epoch: 0.07061795569374271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3635, "loss": 0.2532378137111664, "memory_gb": 7.721559524536133, "step_time_ms": 7449.538469314575, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:26] (step=0003635) Train Loss: 0.2921, Train Steps/Sec: 0.12, Epoch: 0.07063738826272833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3636, "loss": 0.13229073584079742, "memory_gb": 7.721559524536133, "step_time_ms": 7511.105537414551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:34] (step=0003636) Train Loss: 0.1194, Train Steps/Sec: 0.12, Epoch: 0.07065682083171396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:42] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 3637, "loss": 0.2613126039505005, "memory_gb": 7.721559524536133, "step_time_ms": 7514.490365982056, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:42] (step=0003637) Train Loss: 0.2072, Train Steps/Sec: 0.12, Epoch: 0.07067625340069958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3638, "loss": 0.3310670852661133, "memory_gb": 7.721559524536133, "step_time_ms": 7430.192470550537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:50] (step=0003638) Train Loss: 0.3126, Train Steps/Sec: 0.13, Epoch: 0.07069568596968519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 01:59:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3639, "loss": 0.26081106066703796, "memory_gb": 7.715639114379883, "step_time_ms": 7546.33641242981, "trainable_params": 4718592, "method": "lora"} [2025-07-29 01:59:58] (step=0003639) Train Loss: 0.2611, Train Steps/Sec: 0.13, Epoch: 0.07071511853867081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:00:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3640, "loss": 0.3804610073566437, "memory_gb": 7.721559524536133, "step_time_ms": 7511.339426040649, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:00:06] (step=0003640) Train Loss: 0.3136, Train Steps/Sec: 0.13, Epoch: 0.07073455110765643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:00:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3641, "loss": 0.30568167567253113, "memory_gb": 7.721559524536133, "step_time_ms": 7458.494186401367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:00:14] (step=0003641) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.07075398367664205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3642, "loss": 0.25467750430107117, "memory_gb": 7.721559524536133, "step_time_ms": 7434.595584869385, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 02:00:22] (step=0003642) Train Loss: 0.2446, Train Steps/Sec: 0.12, Epoch: 0.07077341624562768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:00:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3643, "loss": 0.18223631381988525, "memory_gb": 7.721559524536133, "step_time_ms": 7509.927272796631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:00:30] (step=0003643) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.07079284881461329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:00:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3644, "loss": 0.27311933040618896, "memory_gb": 7.721559524536133, "step_time_ms": 7446.02108001709, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:00:39] (step=0003644) Train Loss: 0.2234, Train Steps/Sec: 0.12, Epoch: 0.07081228138359891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:00:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3645, "loss": 0.26353463530540466, "memory_gb": 7.721559524536133, "step_time_ms": 7411.192417144775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:00:47] (step=0003645) Train Loss: 0.2726, Train Steps/Sec: 0.13, Epoch: 0.07083171395258453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3646, "loss": 0.24639008939266205, "memory_gb": 7.715639114379883, "step_time_ms": 7491.372346878052, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:00:55] (step=0003646) Train Loss: 0.2638, Train Steps/Sec: 0.12, Epoch: 0.07085114652157015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:01:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3647, "loss": 0.21048663556575775, "memory_gb": 7.721559524536133, "step_time_ms": 7421.920299530029, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:01:03] (step=0003647) Train Loss: 0.2260, Train Steps/Sec: 0.13, Epoch: 0.07087057909055577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:01:10] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 3648, "loss": 0.29084473848342896, "memory_gb": 7.721559524536133, "step_time_ms": 7402.671575546265, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:01:11] (step=0003648) Train Loss: 0.3362, Train Steps/Sec: 0.13, Epoch: 0.0708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:01:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3649, "loss": 0.2582692503929138, "memory_gb": 7.721559524536133, "step_time_ms": 7469.2089557647705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:01:19] (step=0003649) Train Loss: 0.2566, Train Steps/Sec: 0.12, Epoch: 0.070909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:01:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3650, "loss": 0.17228978872299194, "memory_gb": 7.721559524536133, "step_time_ms": 7470.235824584961, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:01:27] (step=0003650) Train Loss: 0.1936, Train Steps/Sec: 0.12, Epoch: 0.07092887679751263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:01:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3651, "loss": 0.3377382159233093, "memory_gb": 7.721559524536133, "step_time_ms": 7411.005973815918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:01:35] (step=0003651) Train Loss: 0.3411, Train Steps/Sec: 0.13, Epoch: 0.07094830936649825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:01:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3652, "loss": 0.25660261511802673, "memory_gb": 7.721559524536133, "step_time_ms": 7515.218496322632, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:01:43] (step=0003652) Train Loss: 0.2359, Train Steps/Sec: 0.12, Epoch: 0.07096774193548387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:01:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3653, "loss": 0.25105854868888855, "memory_gb": 7.721559524536133, "step_time_ms": 7462.501049041748, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:01:51] (step=0003653) Train Loss: 0.2419, Train Steps/Sec: 0.12, Epoch: 0.0709871745044695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:01:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3654, "loss": 0.12746071815490723, "memory_gb": 7.721559524536133, "step_time_ms": 7317.472457885742, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:01:59] (step=0003654) Train Loss: 0.1782, Train Steps/Sec: 0.13, Epoch: 0.07100660707345512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:02:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3655, "loss": 0.2851969003677368, "memory_gb": 7.721559524536133, "step_time_ms": 7404.207229614258, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:02:06] (step=0003655) Train Loss: 0.2298, Train Steps/Sec: 0.13, Epoch: 0.07102603964244072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:02:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3656, "loss": 0.2340995967388153, "memory_gb": 7.721559524536133, "step_time_ms": 5187.446355819702, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:02:12] (step=0003656) Train Loss: 0.2391, Train Steps/Sec: 0.17, Epoch: 0.07104547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3657, "loss": 0.21608121693134308, "memory_gb": 7.721559524536133, "step_time_ms": 7467.708110809326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:02:20] (step=0003657) Train Loss: 0.2438, Train Steps/Sec: 0.12, Epoch: 0.07106490478041197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:02:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3658, "loss": 0.3098578453063965, "memory_gb": 7.721559524536133, "step_time_ms": 7471.325159072876, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:02:28] (step=0003658) Train Loss: 0.3320, Train Steps/Sec: 0.12, Epoch: 0.07108433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:02:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3659, "loss": 0.218131884932518, "memory_gb": 7.721559524536133, "step_time_ms": 7431.18691444397, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:02:36] (step=0003659) Train Loss: 0.2079, Train Steps/Sec: 0.13, Epoch: 0.07110376991838321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3660, "loss": 0.30208325386047363, "memory_gb": 7.721559524536133, "step_time_ms": 7506.0577392578125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:02:45] (step=0003660) Train Loss: 0.2591, Train Steps/Sec: 0.12, Epoch: 0.07112320248736884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:02:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3661, "loss": 0.16101785004138947, "memory_gb": 7.721559524536133, "step_time_ms": 7530.824184417725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:02:53] (step=0003661) Train Loss: 0.2158, Train Steps/Sec: 0.12, Epoch: 0.07114263505635444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3662, "loss": 0.24081499874591827, "memory_gb": 7.721559524536133, "step_time_ms": 7436.956405639648, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:01] (step=0003662) Train Loss: 0.2732, Train Steps/Sec: 0.12, Epoch: 0.07116206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3663, "loss": 0.1579144150018692, "memory_gb": 7.721559524536133, "step_time_ms": 7504.523038864136, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:09] (step=0003663) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.07118150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3664, "loss": 0.1356569528579712, "memory_gb": 7.715639114379883, "step_time_ms": 7482.39803314209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:17] (step=0003664) Train Loss: 0.1807, Train Steps/Sec: 0.12, Epoch: 0.07120093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3665, "loss": 0.21844446659088135, "memory_gb": 7.721559524536133, "step_time_ms": 7439.651012420654, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:25] (step=0003665) Train Loss: 0.1998, Train Steps/Sec: 0.13, Epoch: 0.07122036533229693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3666, "loss": 0.28811079263687134, "memory_gb": 7.721559524536133, "step_time_ms": 7522.360801696777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:33] (step=0003666) Train Loss: 0.2925, Train Steps/Sec: 0.12, Epoch: 0.07123979790128256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3667, "loss": 0.2758421301841736, "memory_gb": 7.721559524536133, "step_time_ms": 7512.14337348938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:41] (step=0003667) Train Loss: 0.2966, Train Steps/Sec: 0.12, Epoch: 0.07125923047026816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3668, "loss": 0.25379443168640137, "memory_gb": 7.721559524536133, "step_time_ms": 7446.862459182739, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:49] (step=0003668) Train Loss: 0.2246, Train Steps/Sec: 0.12, Epoch: 0.07127866303925379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:03:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3669, "loss": 0.219587504863739, "memory_gb": 7.715639114379883, "step_time_ms": 7461.2791538238525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:03:57] (step=0003669) Train Loss: 0.1882, Train Steps/Sec: 0.13, Epoch: 0.07129809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:04:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3670, "loss": 0.22222763299942017, "memory_gb": 7.721559524536133, "step_time_ms": 7450.243949890137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:04:05] (step=0003670) Train Loss: 0.2637, Train Steps/Sec: 0.12, Epoch: 0.07131752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:04:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3671, "loss": 0.14451844990253448, "memory_gb": 7.721559524536133, "step_time_ms": 7437.416315078735, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:04:13] (step=0003671) Train Loss: 0.1567, Train Steps/Sec: 0.13, Epoch: 0.07133696074621065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3672, "loss": 0.242099791765213, "memory_gb": 7.721559524536133, "step_time_ms": 7486.136436462402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:04:21] (step=0003672) Train Loss: 0.2567, Train Steps/Sec: 0.13, Epoch: 0.07135639331519626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:04:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3673, "loss": 0.23821213841438293, "memory_gb": 7.721559524536133, "step_time_ms": 7579.3797969818115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:04:29] (step=0003673) Train Loss: 0.2586, Train Steps/Sec: 0.12, Epoch: 0.07137582588418188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:04:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3674, "loss": 0.1420394480228424, "memory_gb": 7.721559524536133, "step_time_ms": 7529.25705909729, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:04:37] (step=0003674) Train Loss: 0.1607, Train Steps/Sec: 0.12, Epoch: 0.0713952584531675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:04:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3675, "loss": 0.2971978187561035, "memory_gb": 7.721559524536133, "step_time_ms": 7506.631851196289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:04:45] (step=0003675) Train Loss: 0.2557, Train Steps/Sec: 0.12, Epoch: 0.07141469102215313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:04:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3676, "loss": 0.22718992829322815, "memory_gb": 7.721559524536133, "step_time_ms": 7550.980567932129, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:04:53] (step=0003676) Train Loss: 0.1916, Train Steps/Sec: 0.12, Epoch: 0.07143412359113875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3677, "loss": 0.3016306757926941, "memory_gb": 7.721559524536133, "step_time_ms": 7531.328439712524, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:01] (step=0003677) Train Loss: 0.2345, Train Steps/Sec: 0.12, Epoch: 0.07145355616012437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3678, "loss": 0.19653812050819397, "memory_gb": 7.721559524536133, "step_time_ms": 7530.113935470581, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:09] (step=0003678) Train Loss: 0.2018, Train Steps/Sec: 0.13, Epoch: 0.07147298872910998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3679, "loss": 0.31481724977493286, "memory_gb": 7.721559524536133, "step_time_ms": 7672.543048858643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:17] (step=0003679) Train Loss: 0.2220, Train Steps/Sec: 0.13, Epoch: 0.0714924212980956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3680, "loss": 0.2699674367904663, "memory_gb": 7.721559524536133, "step_time_ms": 7458.730936050415, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:25] (step=0003680) Train Loss: 0.2620, Train Steps/Sec: 0.12, Epoch: 0.07151185386708123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3681, "loss": 0.20650173723697662, "memory_gb": 7.721559524536133, "step_time_ms": 7565.003395080566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:33] (step=0003681) Train Loss: 0.2047, Train Steps/Sec: 0.12, Epoch: 0.07153128643606685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3682, "loss": 0.2620643973350525, "memory_gb": 7.721559524536133, "step_time_ms": 7306.751012802124, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:41] (step=0003682) Train Loss: 0.2728, Train Steps/Sec: 0.12, Epoch: 0.07155071900505247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3683, "loss": 0.2331664115190506, "memory_gb": 7.721559524536133, "step_time_ms": 7326.46918296814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:49] (step=0003683) Train Loss: 0.2570, Train Steps/Sec: 0.13, Epoch: 0.0715701515740381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:05:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3684, "loss": 0.24121138453483582, "memory_gb": 7.721559524536133, "step_time_ms": 7172.171115875244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:05:57] (step=0003684) Train Loss: 0.2603, Train Steps/Sec: 0.13, Epoch: 0.0715895841430237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3685, "loss": 0.27118033170700073, "memory_gb": 7.715639114379883, "step_time_ms": 5614.984750747681, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:03] (step=0003685) Train Loss: 0.2796, Train Steps/Sec: 0.17, Epoch: 0.07160901671200932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3686, "loss": 0.2813236117362976, "memory_gb": 7.721559524536133, "step_time_ms": 7500.2264976501465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:11] (step=0003686) Train Loss: 0.2423, Train Steps/Sec: 0.13, Epoch: 0.07162844928099495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3687, "loss": 0.23058022558689117, "memory_gb": 7.721559524536133, "step_time_ms": 7473.810195922852, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:19] (step=0003687) Train Loss: 0.2173, Train Steps/Sec: 0.13, Epoch: 0.07164788184998057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3688, "loss": 0.2849883735179901, "memory_gb": 7.721559524536133, "step_time_ms": 7500.743627548218, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:27] (step=0003688) Train Loss: 0.2804, Train Steps/Sec: 0.12, Epoch: 0.07166731441896619, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3689, "loss": 0.16086240112781525, "memory_gb": 7.721559524536133, "step_time_ms": 7513.439178466797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:35] (step=0003689) Train Loss: 0.2132, Train Steps/Sec: 0.12, Epoch: 0.07168674698795181, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3690, "loss": 0.2742519974708557, "memory_gb": 7.721559524536133, "step_time_ms": 7451.740026473999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:43] (step=0003690) Train Loss: 0.2327, Train Steps/Sec: 0.13, Epoch: 0.07170617955693742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3691, "loss": 0.19598177075386047, "memory_gb": 7.721559524536133, "step_time_ms": 7401.670932769775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:51] (step=0003691) Train Loss: 0.2005, Train Steps/Sec: 0.13, Epoch: 0.07172561212592304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3692, "loss": 0.2567809224128723, "memory_gb": 7.721559524536133, "step_time_ms": 7448.721408843994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:06:59] (step=0003692) Train Loss: 0.2409, Train Steps/Sec: 0.13, Epoch: 0.07174504469490867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:07:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3693, "loss": 0.16723516583442688, "memory_gb": 7.721559524536133, "step_time_ms": 7475.594282150269, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:07:07] (step=0003693) Train Loss: 0.2372, Train Steps/Sec: 0.12, Epoch: 0.07176447726389429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:07:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3694, "loss": 0.28677132725715637, "memory_gb": 7.721559524536133, "step_time_ms": 7288.6857986450195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:07:15] (step=0003694) Train Loss: 0.2683, Train Steps/Sec: 0.13, Epoch: 0.07178390983287991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:07:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3695, "loss": 0.3187665641307831, "memory_gb": 7.721559524536133, "step_time_ms": 7434.5550537109375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:07:23] (step=0003695) Train Loss: 0.2803, Train Steps/Sec: 0.12, Epoch: 0.07180334240186553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:07:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3696, "loss": 0.23974914848804474, "memory_gb": 7.721559524536133, "step_time_ms": 7466.613531112671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:07:31] (step=0003696) Train Loss: 0.2450, Train Steps/Sec: 0.12, Epoch: 0.07182277497085114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:07:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3697, "loss": 0.18752804398536682, "memory_gb": 7.721559524536133, "step_time_ms": 7440.849304199219, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:07:39] (step=0003697) Train Loss: 0.2066, Train Steps/Sec: 0.12, Epoch: 0.07184220753983676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:07:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3698, "loss": 0.283003568649292, "memory_gb": 7.721559524536133, "step_time_ms": 7426.3012409210205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:07:47] (step=0003698) Train Loss: 0.2657, Train Steps/Sec: 0.13, Epoch: 0.07186164010882239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:07:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3699, "loss": 0.222402423620224, "memory_gb": 7.721559524536133, "step_time_ms": 7448.128938674927, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:07:55] (step=0003699) Train Loss: 0.2641, Train Steps/Sec: 0.13, Epoch: 0.07188107267780801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3700, "loss": 0.3007274568080902, "memory_gb": 7.721559524536133, "step_time_ms": 7411.657333374023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:03] (step=0003700) Train Loss: 0.3520, Train Steps/Sec: 0.13, Epoch: 0.07190050524679363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3701, "loss": 0.209965780377388, "memory_gb": 7.721559524536133, "step_time_ms": 7227.442026138306, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:11] (step=0003701) Train Loss: 0.2296, Train Steps/Sec: 0.12, Epoch: 0.07191993781577925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3702, "loss": 0.23635472357273102, "memory_gb": 7.721559524536133, "step_time_ms": 7441.690921783447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:19] (step=0003702) Train Loss: 0.2264, Train Steps/Sec: 0.12, Epoch: 0.07193937038476486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3703, "loss": 0.23919910192489624, "memory_gb": 7.721559524536133, "step_time_ms": 7425.928354263306, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:27] (step=0003703) Train Loss: 0.2263, Train Steps/Sec: 0.13, Epoch: 0.07195880295375048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3704, "loss": 0.16849499940872192, "memory_gb": 7.721559524536133, "step_time_ms": 7529.2253494262695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:35] (step=0003704) Train Loss: 0.1976, Train Steps/Sec: 0.12, Epoch: 0.0719782355227361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3705, "loss": 0.1691206693649292, "memory_gb": 7.721559524536133, "step_time_ms": 7575.83475112915, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:43] (step=0003705) Train Loss: 0.1786, Train Steps/Sec: 0.12, Epoch: 0.07199766809172173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3706, "loss": 0.36375030875205994, "memory_gb": 7.721559524536133, "step_time_ms": 7468.113899230957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:51] (step=0003706) Train Loss: 0.2798, Train Steps/Sec: 0.12, Epoch: 0.07201710066070735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:08:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3707, "loss": 0.21607860922813416, "memory_gb": 7.721559524536133, "step_time_ms": 7540.950298309326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:08:59] (step=0003707) Train Loss: 0.2178, Train Steps/Sec: 0.12, Epoch: 0.07203653322969296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:09:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3708, "loss": 0.1498483568429947, "memory_gb": 7.721559524536133, "step_time_ms": 7583.876371383667, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:09:07] (step=0003708) Train Loss: 0.1810, Train Steps/Sec: 0.12, Epoch: 0.07205596579867858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:09:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3709, "loss": 0.11604183912277222, "memory_gb": 7.721559524536133, "step_time_ms": 7475.33655166626, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:09:15] (step=0003709) Train Loss: 0.1816, Train Steps/Sec: 0.13, Epoch: 0.0720753983676642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3710, "loss": 0.28036659955978394, "memory_gb": 7.721559524536133, "step_time_ms": 7544.0380573272705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:09:23] (step=0003710) Train Loss: 0.2665, Train Steps/Sec: 0.12, Epoch: 0.07209483093664983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:09:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3711, "loss": 0.27500006556510925, "memory_gb": 7.721559524536133, "step_time_ms": 7544.147729873657, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:09:31] (step=0003711) Train Loss: 0.2330, Train Steps/Sec: 0.12, Epoch: 0.07211426350563545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:09:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3712, "loss": 0.21025905013084412, "memory_gb": 7.721559524536133, "step_time_ms": 7359.551429748535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:09:39] (step=0003712) Train Loss: 0.2107, Train Steps/Sec: 0.13, Epoch: 0.07213369607462107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:09:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3713, "loss": 0.32826894521713257, "memory_gb": 7.721559524536133, "step_time_ms": 7294.299602508545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:09:47] (step=0003713) Train Loss: 0.3031, Train Steps/Sec: 0.13, Epoch: 0.07215312864360668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:09:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3714, "loss": 0.2684416174888611, "memory_gb": 7.721559524536133, "step_time_ms": 5475.590944290161, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:09:54] (step=0003714) Train Loss: 0.2404, Train Steps/Sec: 0.14, Epoch: 0.0721725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3715, "loss": 0.15191105008125305, "memory_gb": 7.721559524536133, "step_time_ms": 7556.039333343506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:02] (step=0003715) Train Loss: 0.2154, Train Steps/Sec: 0.12, Epoch: 0.07219199378157792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3716, "loss": 0.3384191691875458, "memory_gb": 7.721559524536133, "step_time_ms": 7601.480960845947, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:10] (step=0003716) Train Loss: 0.2564, Train Steps/Sec: 0.12, Epoch: 0.07221142635056355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3717, "loss": 0.29065659642219543, "memory_gb": 7.721559524536133, "step_time_ms": 7489.189147949219, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:18] (step=0003717) Train Loss: 0.2552, Train Steps/Sec: 0.12, Epoch: 0.07223085891954917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3718, "loss": 0.30718088150024414, "memory_gb": 7.721559524536133, "step_time_ms": 7493.683576583862, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:26] (step=0003718) Train Loss: 0.2463, Train Steps/Sec: 0.13, Epoch: 0.07225029148853479, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3719, "loss": 0.15062209963798523, "memory_gb": 7.721559524536133, "step_time_ms": 7507.31348991394, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:34] (step=0003719) Train Loss: 0.2228, Train Steps/Sec: 0.12, Epoch: 0.0722697240575204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3720, "loss": 0.2316216379404068, "memory_gb": 7.721559524536133, "step_time_ms": 7531.068801879883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:42] (step=0003720) Train Loss: 0.2010, Train Steps/Sec: 0.12, Epoch: 0.07228915662650602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3721, "loss": 0.2939237356185913, "memory_gb": 7.721559524536133, "step_time_ms": 7533.625602722168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:50] (step=0003721) Train Loss: 0.3103, Train Steps/Sec: 0.13, Epoch: 0.07230858919549164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:10:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3722, "loss": 0.22611366212368011, "memory_gb": 7.721559524536133, "step_time_ms": 7611.124753952026, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:10:58] (step=0003722) Train Loss: 0.2904, Train Steps/Sec: 0.12, Epoch: 0.07232802176447727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:11:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3723, "loss": 0.36455589532852173, "memory_gb": 7.721559524536133, "step_time_ms": 7457.371711730957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:11:06] (step=0003723) Train Loss: 0.3271, Train Steps/Sec: 0.12, Epoch: 0.07234745433346289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:11:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3724, "loss": 0.21802498400211334, "memory_gb": 7.721559524536133, "step_time_ms": 7476.761341094971, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:11:14] (step=0003724) Train Loss: 0.2471, Train Steps/Sec: 0.12, Epoch: 0.07236688690244851, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:11:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3725, "loss": 0.18666309118270874, "memory_gb": 7.721559524536133, "step_time_ms": 7503.21102142334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:11:22] (step=0003725) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.07238631947143412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:11:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3726, "loss": 0.18617503345012665, "memory_gb": 7.721559524536133, "step_time_ms": 7665.6365394592285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:11:30] (step=0003726) Train Loss: 0.1744, Train Steps/Sec: 0.12, Epoch: 0.07240575204041974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:11:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3727, "loss": 0.32243603467941284, "memory_gb": 7.721559524536133, "step_time_ms": 7427.371501922607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:11:38] (step=0003727) Train Loss: 0.3097, Train Steps/Sec: 0.13, Epoch: 0.07242518460940536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:11:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3728, "loss": 0.08573491871356964, "memory_gb": 7.721559524536133, "step_time_ms": 7476.020336151123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:11:46] (step=0003728) Train Loss: 0.1666, Train Steps/Sec: 0.12, Epoch: 0.07244461717839099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:11:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3729, "loss": 0.2004626840353012, "memory_gb": 7.721559524536133, "step_time_ms": 7495.328664779663, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:11:54] (step=0003729) Train Loss: 0.1942, Train Steps/Sec: 0.12, Epoch: 0.07246404974737661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3730, "loss": 0.2874099910259247, "memory_gb": 7.715639114379883, "step_time_ms": 7415.322065353394, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:02] (step=0003730) Train Loss: 0.2536, Train Steps/Sec: 0.13, Epoch: 0.07248348231636223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3731, "loss": 0.2643130421638489, "memory_gb": 7.721559524536133, "step_time_ms": 7467.299461364746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:10] (step=0003731) Train Loss: 0.2574, Train Steps/Sec: 0.12, Epoch: 0.07250291488534784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3732, "loss": 0.3157790005207062, "memory_gb": 7.721559524536133, "step_time_ms": 7468.400001525879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:18] (step=0003732) Train Loss: 0.2515, Train Steps/Sec: 0.12, Epoch: 0.07252234745433346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3733, "loss": 0.2987205982208252, "memory_gb": 7.721559524536133, "step_time_ms": 7415.208578109741, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:26] (step=0003733) Train Loss: 0.2705, Train Steps/Sec: 0.13, Epoch: 0.07254178002331908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3734, "loss": 0.2806982100009918, "memory_gb": 7.721559524536133, "step_time_ms": 7458.9855670928955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:34] (step=0003734) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.0725612125923047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3735, "loss": 0.22616013884544373, "memory_gb": 7.721559524536133, "step_time_ms": 7423.353433609009, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:42] (step=0003735) Train Loss: 0.1942, Train Steps/Sec: 0.12, Epoch: 0.07258064516129033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3736, "loss": 0.21629656851291656, "memory_gb": 7.721559524536133, "step_time_ms": 7417.428731918335, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:50] (step=0003736) Train Loss: 0.2291, Train Steps/Sec: 0.12, Epoch: 0.07260007773027594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:12:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3737, "loss": 0.1783219575881958, "memory_gb": 7.721559524536133, "step_time_ms": 7508.260250091553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:12:58] (step=0003737) Train Loss: 0.1897, Train Steps/Sec: 0.12, Epoch: 0.07261951029926156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:13:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3738, "loss": 0.2626298666000366, "memory_gb": 7.721559524536133, "step_time_ms": 7389.320850372314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:13:06] (step=0003738) Train Loss: 0.2952, Train Steps/Sec: 0.13, Epoch: 0.07263894286824718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:13:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3739, "loss": 0.2911940813064575, "memory_gb": 7.721559524536133, "step_time_ms": 7450.108289718628, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:13:14] (step=0003739) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.0726583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:13:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3740, "loss": 0.28860101103782654, "memory_gb": 7.721559524536133, "step_time_ms": 7479.260683059692, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:13:22] (step=0003740) Train Loss: 0.2670, Train Steps/Sec: 0.12, Epoch: 0.07267780800621843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:13:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3741, "loss": 0.19543829560279846, "memory_gb": 7.721559524536133, "step_time_ms": 7324.038982391357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:13:30] (step=0003741) Train Loss: 0.2366, Train Steps/Sec: 0.13, Epoch: 0.07269724057520405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:13:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3742, "loss": 0.2116086333990097, "memory_gb": 7.721559524536133, "step_time_ms": 6333.757400512695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:13:37] (step=0003742) Train Loss: 0.2340, Train Steps/Sec: 0.15, Epoch: 0.07271667314418966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:13:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3743, "loss": 0.20267504453659058, "memory_gb": 7.721559524536133, "step_time_ms": 6492.8154945373535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:13:44] (step=0003743) Train Loss: 0.2688, Train Steps/Sec: 0.14, Epoch: 0.07273610571317528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3744, "loss": 0.25914180278778076, "memory_gb": 7.721559524536133, "step_time_ms": 7476.123332977295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:13:52] (step=0003744) Train Loss: 0.2246, Train Steps/Sec: 0.12, Epoch: 0.0727555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3745, "loss": 0.19945836067199707, "memory_gb": 7.721559524536133, "step_time_ms": 7492.745161056519, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:00] (step=0003745) Train Loss: 0.2462, Train Steps/Sec: 0.12, Epoch: 0.07277497085114652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3746, "loss": 0.2674853205680847, "memory_gb": 7.721559524536133, "step_time_ms": 7446.662425994873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:08] (step=0003746) Train Loss: 0.3016, Train Steps/Sec: 0.12, Epoch: 0.07279440342013214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3747, "loss": 0.22387118637561798, "memory_gb": 7.721559524536133, "step_time_ms": 7446.844100952148, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:16] (step=0003747) Train Loss: 0.2620, Train Steps/Sec: 0.13, Epoch: 0.07281383598911777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3748, "loss": 0.26107192039489746, "memory_gb": 7.721559524536133, "step_time_ms": 7512.201309204102, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:24] (step=0003748) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.07283326855810338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3749, "loss": 0.22474907338619232, "memory_gb": 7.721559524536133, "step_time_ms": 7451.229810714722, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:32] (step=0003749) Train Loss: 0.2051, Train Steps/Sec: 0.12, Epoch: 0.072852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3750, "loss": 0.3270626366138458, "memory_gb": 7.721559524536133, "step_time_ms": 7437.926769256592, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:40] (step=0003750) Train Loss: 0.2923, Train Steps/Sec: 0.13, Epoch: 0.07287213369607462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3751, "loss": 0.2588677704334259, "memory_gb": 7.721559524536133, "step_time_ms": 7489.454746246338, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:48] (step=0003751) Train Loss: 0.2417, Train Steps/Sec: 0.12, Epoch: 0.07289156626506024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:14:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3752, "loss": 0.25886136293411255, "memory_gb": 7.721559524536133, "step_time_ms": 7502.206563949585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:14:56] (step=0003752) Train Loss: 0.2640, Train Steps/Sec: 0.12, Epoch: 0.07291099883404586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:15:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3753, "loss": 0.2709663510322571, "memory_gb": 7.721559524536133, "step_time_ms": 7480.081081390381, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:15:04] (step=0003753) Train Loss: 0.2484, Train Steps/Sec: 0.13, Epoch: 0.07293043140303149, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:15:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3754, "loss": 0.47032177448272705, "memory_gb": 7.715639114379883, "step_time_ms": 7310.453414916992, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:15:12] (step=0003754) Train Loss: 0.3795, Train Steps/Sec: 0.13, Epoch: 0.0729498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:15:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3755, "loss": 0.1350226104259491, "memory_gb": 7.721559524536133, "step_time_ms": 7529.088258743286, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:15:20] (step=0003755) Train Loss: 0.1893, Train Steps/Sec: 0.13, Epoch: 0.07296929654100272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:15:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3756, "loss": 0.21492008864879608, "memory_gb": 7.721559524536133, "step_time_ms": 7536.250829696655, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:15:28] (step=0003756) Train Loss: 0.2537, Train Steps/Sec: 0.12, Epoch: 0.07298872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:15:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3757, "loss": 0.3303007483482361, "memory_gb": 7.721559524536133, "step_time_ms": 7549.323558807373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:15:36] (step=0003757) Train Loss: 0.2776, Train Steps/Sec: 0.12, Epoch: 0.07300816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:15:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3758, "loss": 0.3380850553512573, "memory_gb": 7.721559524536133, "step_time_ms": 7502.67481803894, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:15:44] (step=0003758) Train Loss: 0.3267, Train Steps/Sec: 0.12, Epoch: 0.07302759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:15:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3759, "loss": 0.20348933339118958, "memory_gb": 7.721559524536133, "step_time_ms": 7498.961687088013, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:15:52] (step=0003759) Train Loss: 0.1939, Train Steps/Sec: 0.13, Epoch: 0.07304702681694521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:16:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3760, "loss": 0.3090338110923767, "memory_gb": 7.721559524536133, "step_time_ms": 7598.515272140503, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:16:00] (step=0003760) Train Loss: 0.2505, Train Steps/Sec: 0.12, Epoch: 0.07306645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:16:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3761, "loss": 0.2865521013736725, "memory_gb": 7.721559524536133, "step_time_ms": 7600.923776626587, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:16:08] (step=0003761) Train Loss: 0.2670, Train Steps/Sec: 0.12, Epoch: 0.07308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:16:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3762, "loss": 0.2346886694431305, "memory_gb": 7.721559524536133, "step_time_ms": 7503.891229629517, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:16:16] (step=0003762) Train Loss: 0.2433, Train Steps/Sec: 0.12, Epoch: 0.07310532452390206, LR: 0.001, Memory:
7.72GB, Params: 4,718,592 [2025-07-29 02:16:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3763, "loss": 0.26032620668411255, "memory_gb": 7.721559524536133, "step_time_ms": 7540.825605392456, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:16:24] (step=0003763) Train Loss: 0.2756, Train Steps/Sec: 0.12, Epoch: 0.07312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:16:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3764, "loss": 0.16795606911182404, "memory_gb": 7.721559524536133, "step_time_ms": 7499.635934829712, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:16:32] (step=0003764) Train Loss: 0.1652, Train Steps/Sec: 0.13, Epoch: 0.0731441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:16:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3765, "loss": 0.28943970799446106, "memory_gb": 7.721559524536133, "step_time_ms": 7439.242601394653, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:16:40] (step=0003765) Train Loss: 0.2971, Train Steps/Sec: 0.12, Epoch: 0.07316362223085891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:16:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3766, "loss": 0.279386043548584, "memory_gb": 7.721559524536133, "step_time_ms": 7580.574035644531, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:16:48] (step=0003766) Train Loss: 0.3047, Train Steps/Sec: 0.12, Epoch: 0.07318305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:16:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3767, "loss": 0.2514088749885559, "memory_gb": 7.721559524536133, "step_time_ms": 7680.77826499939, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:16:57] (step=0003767) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.07320248736883016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3768, "loss": 0.31410232186317444, "memory_gb": 7.721559524536133, "step_time_ms": 
7482.399702072144, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:05] (step=0003768) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.07322191993781578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3769, "loss": 0.1908373236656189, "memory_gb": 7.721559524536133, "step_time_ms": 7577.4946212768555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:13] (step=0003769) Train Loss: 0.1767, Train Steps/Sec: 0.12, Epoch: 0.0732413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3770, "loss": 0.23839730024337769, "memory_gb": 7.721559524536133, "step_time_ms": 7352.608680725098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:21] (step=0003770) Train Loss: 0.2884, Train Steps/Sec: 0.13, Epoch: 0.07326078507578702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3771, "loss": 0.1853049099445343, "memory_gb": 7.721559524536133, "step_time_ms": 6067.546367645264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:27] (step=0003771) Train Loss: 0.1799, Train Steps/Sec: 0.16, Epoch: 0.07328021764477263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3772, "loss": 0.23592866957187653, "memory_gb": 7.721559524536133, "step_time_ms": 6658.155679702759, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:34] (step=0003772) Train Loss: 0.2309, Train Steps/Sec: 0.14, Epoch: 0.07329965021375826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3773, "loss": 0.24892961978912354, "memory_gb": 7.721559524536133, "step_time_ms": 7480.232000350952, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:42] (step=0003773) Train Loss: 0.2350, Train Steps/Sec: 0.13, Epoch: 0.07331908278274388, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3774, "loss": 0.29983818531036377, "memory_gb": 7.721559524536133, "step_time_ms": 7547.628164291382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:50] (step=0003774) Train Loss: 0.2813, Train Steps/Sec: 0.12, Epoch: 0.0733385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:17:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3775, "loss": 0.2708410620689392, "memory_gb": 7.721559524536133, "step_time_ms": 7486.915111541748, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:17:58] (step=0003775) Train Loss: 0.2851, Train Steps/Sec: 0.12, Epoch: 0.07335794792071512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:18:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3776, "loss": 0.20807364583015442, "memory_gb": 7.721559524536133, "step_time_ms": 7437.514543533325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:18:06] (step=0003776) Train Loss: 0.2076, Train Steps/Sec: 0.13, Epoch: 0.07337738048970074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:18:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3777, "loss": 0.2915487289428711, "memory_gb": 7.721559524536133, "step_time_ms": 7468.387603759766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:18:14] (step=0003777) Train Loss: 0.3137, Train Steps/Sec: 0.12, Epoch: 0.07339681305868635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:18:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3778, "loss": 0.23759055137634277, "memory_gb": 7.721559524536133, "step_time_ms": 7376.619100570679, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:18:22] (step=0003778) Train Loss: 0.2198, Train Steps/Sec: 0.12, Epoch: 0.07341624562767197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:18:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3779, "loss": 0.24633896350860596, "memory_gb": 7.721559524536133, 
"step_time_ms": 7436.051607131958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:18:30] (step=0003779) Train Loss: 0.2290, Train Steps/Sec: 0.12, Epoch: 0.0734356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:18:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3780, "loss": 0.2589915692806244, "memory_gb": 7.721559524536133, "step_time_ms": 7488.624811172485, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:18:38] (step=0003780) Train Loss: 0.2449, Train Steps/Sec: 0.13, Epoch: 0.07345511076564322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:18:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3781, "loss": 0.21741768717765808, "memory_gb": 7.721559524536133, "step_time_ms": 7420.238733291626, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:18:46] (step=0003781) Train Loss: 0.2134, Train Steps/Sec: 0.13, Epoch: 0.07347454333462884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:18:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3782, "loss": 0.2931998372077942, "memory_gb": 7.721559524536133, "step_time_ms": 7443.31431388855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:18:54] (step=0003782) Train Loss: 0.2554, Train Steps/Sec: 0.13, Epoch: 0.07349397590361446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3783, "loss": 0.24831296503543854, "memory_gb": 7.721559524536133, "step_time_ms": 7523.15354347229, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:02] (step=0003783) Train Loss: 0.2998, Train Steps/Sec: 0.12, Epoch: 0.07351340847260007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3784, "loss": 0.2148357778787613, "memory_gb": 7.721559524536133, "step_time_ms": 7460.0512981414795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:10] (step=0003784) Train Loss: 0.2348, Train Steps/Sec: 0.13, Epoch: 
0.0735328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3785, "loss": 0.30538925528526306, "memory_gb": 7.721559524536133, "step_time_ms": 7400.260448455811, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:18] (step=0003785) Train Loss: 0.2282, Train Steps/Sec: 0.13, Epoch: 0.07355227361057132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3786, "loss": 0.13347209990024567, "memory_gb": 7.721559524536133, "step_time_ms": 7454.19454574585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:26] (step=0003786) Train Loss: 0.1454, Train Steps/Sec: 0.13, Epoch: 0.07357170617955694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3787, "loss": 0.30909520387649536, "memory_gb": 7.721559524536133, "step_time_ms": 7393.677234649658, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:34] (step=0003787) Train Loss: 0.2309, Train Steps/Sec: 0.13, Epoch: 0.07359113874854256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3788, "loss": 0.21472199261188507, "memory_gb": 7.721559524536133, "step_time_ms": 7442.285537719727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:42] (step=0003788) Train Loss: 0.2429, Train Steps/Sec: 0.12, Epoch: 0.07361057131752818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3789, "loss": 0.2634967565536499, "memory_gb": 7.721559524536133, "step_time_ms": 7467.863321304321, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:50] (step=0003789) Train Loss: 0.2530, Train Steps/Sec: 0.12, Epoch: 0.07363000388651379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:19:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3790, "loss": 0.20307737588882446, "memory_gb": 
7.721559524536133, "step_time_ms": 7399.407625198364, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:19:58] (step=0003790) Train Loss: 0.1981, Train Steps/Sec: 0.13, Epoch: 0.07364943645549941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:20:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3791, "loss": 0.22974760830402374, "memory_gb": 7.721559524536133, "step_time_ms": 7406.404256820679, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:20:06] (step=0003791) Train Loss: 0.2246, Train Steps/Sec: 0.13, Epoch: 0.07366886902448504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:20:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3792, "loss": 0.289087176322937, "memory_gb": 7.721559524536133, "step_time_ms": 7455.772161483765, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:20:14] (step=0003792) Train Loss: 0.3004, Train Steps/Sec: 0.13, Epoch: 0.07368830159347066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:20:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3793, "loss": 0.20957425236701965, "memory_gb": 7.721559524536133, "step_time_ms": 7411.33975982666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:20:22] (step=0003793) Train Loss: 0.2108, Train Steps/Sec: 0.13, Epoch: 0.07370773416245628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:20:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3794, "loss": 0.13769149780273438, "memory_gb": 7.721559524536133, "step_time_ms": 7410.242319107056, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:20:30] (step=0003794) Train Loss: 0.1885, Train Steps/Sec: 0.13, Epoch: 0.07372716673144189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:20:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3795, "loss": 0.24896077811717987, "memory_gb": 7.721559524536133, "step_time_ms": 7510.5156898498535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:20:38] (step=0003795) Train Loss: 0.2039, Train Steps/Sec: 
0.12, Epoch: 0.07374659930042751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:20:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3796, "loss": 0.14042896032333374, "memory_gb": 7.721559524536133, "step_time_ms": 7456.026554107666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:20:46] (step=0003796) Train Loss: 0.1774, Train Steps/Sec: 0.13, Epoch: 0.07376603186941313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:20:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3797, "loss": 0.37491756677627563, "memory_gb": 7.721559524536133, "step_time_ms": 7455.902814865112, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:20:54] (step=0003797) Train Loss: 0.2669, Train Steps/Sec: 0.13, Epoch: 0.07378546443839876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3798, "loss": 0.356425940990448, "memory_gb": 7.721559524536133, "step_time_ms": 7608.816623687744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:02] (step=0003798) Train Loss: 0.2974, Train Steps/Sec: 0.12, Epoch: 0.07380489700738438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3799, "loss": 0.16644346714019775, "memory_gb": 7.721559524536133, "step_time_ms": 7391.71576499939, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:10] (step=0003799) Train Loss: 0.2067, Train Steps/Sec: 0.13, Epoch: 0.07382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3800, "loss": 0.19054356217384338, "memory_gb": 7.721559524536133, "step_time_ms": 5905.369758605957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:16] (step=0003800) Train Loss: 0.2141, Train Steps/Sec: 0.16, Epoch: 0.07384376214535561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3801, "loss": 0.20532746613025665, 
"memory_gb": 7.721559524536133, "step_time_ms": 6822.011947631836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:24] (step=0003801) Train Loss: 0.3241, Train Steps/Sec: 0.14, Epoch: 0.07386319471434123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3802, "loss": 0.17476128041744232, "memory_gb": 7.721559524536133, "step_time_ms": 7516.700029373169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:32] (step=0003802) Train Loss: 0.2436, Train Steps/Sec: 0.12, Epoch: 0.07388262728332685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3803, "loss": 0.2263389229774475, "memory_gb": 7.721559524536133, "step_time_ms": 7589.0350341796875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:40] (step=0003803) Train Loss: 0.2605, Train Steps/Sec: 0.12, Epoch: 0.07390205985231248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3804, "loss": 0.2414672076702118, "memory_gb": 7.721559524536133, "step_time_ms": 7544.005393981934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:48] (step=0003804) Train Loss: 0.2621, Train Steps/Sec: 0.12, Epoch: 0.0739214924212981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:21:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3805, "loss": 0.23289132118225098, "memory_gb": 7.721559524536133, "step_time_ms": 7521.61979675293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:21:56] (step=0003805) Train Loss: 0.2269, Train Steps/Sec: 0.12, Epoch: 0.07394092499028372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:22:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3806, "loss": 0.21550413966178894, "memory_gb": 7.721559524536133, "step_time_ms": 7642.341613769531, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:22:04] (step=0003806) Train Loss: 0.2414, Train 
Steps/Sec: 0.12, Epoch: 0.07396035755926933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:22:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3807, "loss": 0.2316872924566269, "memory_gb": 7.721559524536133, "step_time_ms": 7583.3728313446045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:22:12] (step=0003807) Train Loss: 0.2537, Train Steps/Sec: 0.12, Epoch: 0.07397979012825495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:22:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3808, "loss": 0.17353788018226624, "memory_gb": 7.721559524536133, "step_time_ms": 7609.461307525635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:22:20] (step=0003808) Train Loss: 0.2098, Train Steps/Sec: 0.13, Epoch: 0.07399922269724057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:22:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3809, "loss": 0.2720314860343933, "memory_gb": 7.721559524536133, "step_time_ms": 7658.693790435791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:22:28] (step=0003809) Train Loss: 0.2822, Train Steps/Sec: 0.12, Epoch: 0.0740186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:22:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3810, "loss": 0.2564764618873596, "memory_gb": 7.721559524536133, "step_time_ms": 7576.193809509277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:22:36] (step=0003810) Train Loss: 0.2462, Train Steps/Sec: 0.13, Epoch: 0.07403808783521182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:22:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3811, "loss": 0.1719937026500702, "memory_gb": 7.721559524536133, "step_time_ms": 7592.164516448975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:22:44] (step=0003811) Train Loss: 0.1955, Train Steps/Sec: 0.12, Epoch: 0.07405752040419744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:22:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3812, "loss": 
0.26121532917022705, "memory_gb": 7.721559524536133, "step_time_ms": 7638.175010681152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:22:52] (step=0003812) Train Loss: 0.2234, Train Steps/Sec: 0.12, Epoch: 0.07407695297318305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3813, "loss": 0.29405906796455383, "memory_gb": 7.721559524536133, "step_time_ms": 7560.280323028564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:00] (step=0003813) Train Loss: 0.3201, Train Steps/Sec: 0.12, Epoch: 0.07409638554216867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3814, "loss": 0.20852753520011902, "memory_gb": 7.721559524536133, "step_time_ms": 7557.085275650024, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:08] (step=0003814) Train Loss: 0.1648, Train Steps/Sec: 0.12, Epoch: 0.0741158181111543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3815, "loss": 0.22597545385360718, "memory_gb": 7.721559524536133, "step_time_ms": 7731.52494430542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:16] (step=0003815) Train Loss: 0.2798, Train Steps/Sec: 0.12, Epoch: 0.07413525068013992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3816, "loss": 0.2883671224117279, "memory_gb": 7.721559524536133, "step_time_ms": 7603.768348693848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:24] (step=0003816) Train Loss: 0.2717, Train Steps/Sec: 0.12, Epoch: 0.07415468324912554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3817, "loss": 0.2825632095336914, "memory_gb": 7.721559524536133, "step_time_ms": 7516.290903091431, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:32] (step=0003817) Train 
Loss: 0.2787, Train Steps/Sec: 0.12, Epoch: 0.07417411581811116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3818, "loss": 0.19708186388015747, "memory_gb": 7.721559524536133, "step_time_ms": 7563.840866088867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:40] (step=0003818) Train Loss: 0.2043, Train Steps/Sec: 0.12, Epoch: 0.07419354838709677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3819, "loss": 0.3177998661994934, "memory_gb": 7.721559524536133, "step_time_ms": 7224.614381790161, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:48] (step=0003819) Train Loss: 0.2311, Train Steps/Sec: 0.13, Epoch: 0.07421298095608239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:23:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3820, "loss": 0.30853134393692017, "memory_gb": 7.721559524536133, "step_time_ms": 7463.228225708008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:23:56] (step=0003820) Train Loss: 0.3019, Train Steps/Sec: 0.13, Epoch: 0.07423241352506801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:24:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3821, "loss": 0.2564027011394501, "memory_gb": 7.721559524536133, "step_time_ms": 7494.309425354004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:24:04] (step=0003821) Train Loss: 0.2356, Train Steps/Sec: 0.13, Epoch: 0.07425184609405364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:24:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3822, "loss": 0.22188203036785126, "memory_gb": 7.721559524536133, "step_time_ms": 7496.324300765991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:24:12] (step=0003822) Train Loss: 0.1888, Train Steps/Sec: 0.12, Epoch: 0.07427127866303926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:24:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 
3823, "loss": 0.24136421084403992, "memory_gb": 7.721559524536133, "step_time_ms": 7477.530717849731, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:24:20] (step=0003823) Train Loss: 0.2707, Train Steps/Sec: 0.12, Epoch: 0.07429071123202487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:24:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3824, "loss": 0.13659627735614777, "memory_gb": 7.721559524536133, "step_time_ms": 7498.105525970459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:24:28] (step=0003824) Train Loss: 0.2410, Train Steps/Sec: 0.12, Epoch: 0.07431014380101049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:24:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3825, "loss": 0.29004624485969543, "memory_gb": 7.721559524536133, "step_time_ms": 7448.498010635376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:24:36] (step=0003825) Train Loss: 0.2382, Train Steps/Sec: 0.13, Epoch: 0.07432957636999611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:24:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3826, "loss": 0.21564923226833344, "memory_gb": 7.721559524536133, "step_time_ms": 7442.923545837402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:24:44] (step=0003826) Train Loss: 0.2073, Train Steps/Sec: 0.12, Epoch: 0.07434900893898173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:24:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3827, "loss": 0.24459108710289001, "memory_gb": 7.721559524536133, "step_time_ms": 7517.5793170928955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:24:52] (step=0003827) Train Loss: 0.2175, Train Steps/Sec: 0.13, Epoch: 0.07436844150796736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:25:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3828, "loss": 0.12918560206890106, "memory_gb": 7.721559524536133, "step_time_ms": 7455.03044128418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:25:00] 
(step=0003828) Train Loss: 0.1926, Train Steps/Sec: 0.13, Epoch: 0.07438787407695298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:25:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3829, "loss": 0.2343614548444748, "memory_gb": 7.721559524536133, "step_time_ms": 5633.581161499023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:25:06] (step=0003829) Train Loss: 0.2649, Train Steps/Sec: 0.17, Epoch: 0.07440730664593859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:25:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3830, "loss": 0.35573795437812805, "memory_gb": 7.721559524536133, "step_time_ms": 7473.077774047852, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:25:14] (step=0003830) Train Loss: 0.3624, Train Steps/Sec: 0.13, Epoch: 0.07442673921492421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:25:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3831, "loss": 0.3761652112007141, "memory_gb": 7.721559524536133, "step_time_ms": 7420.240879058838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:25:22] (step=0003831) Train Loss: 0.3407, Train Steps/Sec: 0.13, Epoch: 0.07444617178390983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:25:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3832, "loss": 0.1878845989704132, "memory_gb": 7.721559524536133, "step_time_ms": 7467.426300048828, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:25:30] (step=0003832) Train Loss: 0.1980, Train Steps/Sec: 0.12, Epoch: 0.07446560435289545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:25:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3833, "loss": 0.23498927056789398, "memory_gb": 7.721559524536133, "step_time_ms": 7457.907199859619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:25:38] (step=0003833) Train Loss: 0.2101, Train Steps/Sec: 0.12, Epoch: 0.07448503692188108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:25:46] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 3834, "loss": 0.324917197227478, "memory_gb": 7.721559524536133, "step_time_ms": 7391.066551208496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:25:46] (step=0003834) Train Loss: 0.3308, Train Steps/Sec: 0.13, Epoch: 0.0745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:25:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3835, "loss": 0.21430468559265137, "memory_gb": 7.721559524536133, "step_time_ms": 7501.1115074157715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:25:54] (step=0003835) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.0745239020598523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3836, "loss": 0.24557556211948395, "memory_gb": 7.721559524536133, "step_time_ms": 7436.096668243408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:02] (step=0003836) Train Loss: 0.2483, Train Steps/Sec: 0.13, Epoch: 0.07454333462883793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3837, "loss": 0.14746294915676117, "memory_gb": 7.721559524536133, "step_time_ms": 7380.485057830811, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:10] (step=0003837) Train Loss: 0.1869, Train Steps/Sec: 0.13, Epoch: 0.07456276719782355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3838, "loss": 0.16271977126598358, "memory_gb": 7.721559524536133, "step_time_ms": 7428.678035736084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:18] (step=0003838) Train Loss: 0.2320, Train Steps/Sec: 0.13, Epoch: 0.07458219976680917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3839, "loss": 0.2287847250699997, "memory_gb": 7.721559524536133, "step_time_ms": 7449.359893798828, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:26] (step=0003839) Train Loss: 0.2184, Train Steps/Sec: 0.13, Epoch: 0.0746016323357948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3840, "loss": 0.23789082467556, "memory_gb": 7.721559524536133, "step_time_ms": 7399.179220199585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:34] (step=0003840) Train Loss: 0.2318, Train Steps/Sec: 0.13, Epoch: 0.07462106490478042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3841, "loss": 0.13251709938049316, "memory_gb": 7.721559524536133, "step_time_ms": 7453.268766403198, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:42] (step=0003841) Train Loss: 0.1620, Train Steps/Sec: 0.13, Epoch: 0.07464049747376603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3842, "loss": 0.2583341598510742, "memory_gb": 7.721559524536133, "step_time_ms": 7467.71240234375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:50] (step=0003842) Train Loss: 0.2549, Train Steps/Sec: 0.13, Epoch: 0.07465993004275165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:26:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3843, "loss": 0.15788868069648743, "memory_gb": 7.721559524536133, "step_time_ms": 7454.74648475647, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:26:58] (step=0003843) Train Loss: 0.2290, Train Steps/Sec: 0.12, Epoch: 0.07467936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3844, "loss": 0.28215813636779785, "memory_gb": 7.721559524536133, "step_time_ms": 7457.933664321899, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:27:06] (step=0003844) Train Loss: 0.2493, Train Steps/Sec: 0.13, Epoch: 0.0746987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:27:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3845, "loss": 0.26321667432785034, "memory_gb": 7.721559524536133, "step_time_ms": 7494.410514831543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:27:14] (step=0003845) Train Loss: 0.2130, Train Steps/Sec: 0.13, Epoch: 0.07471822774970852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:27:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3846, "loss": 0.21122345328330994, "memory_gb": 7.721559524536133, "step_time_ms": 7412.3711585998535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:27:22] (step=0003846) Train Loss: 0.1861, Train Steps/Sec: 0.12, Epoch: 0.07473766031869414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:27:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3847, "loss": 0.33333295583724976, "memory_gb": 7.721559524536133, "step_time_ms": 7474.492788314819, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:27:30] (step=0003847) Train Loss: 0.3335, Train Steps/Sec: 0.12, Epoch: 0.07475709288767975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:27:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3848, "loss": 0.14839781820774078, "memory_gb": 7.721559524536133, "step_time_ms": 7562.94846534729, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:27:38] (step=0003848) Train Loss: 0.1565, Train Steps/Sec: 0.12, Epoch: 0.07477652545666537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3849, "loss": 0.16163429617881775, "memory_gb": 7.721559524536133, "step_time_ms": 7444.165229797363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:27:46] (step=0003849) Train Loss: 0.2050, Train Steps/Sec: 0.13, Epoch: 0.07479595802565099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:27:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3850, "loss": 0.2928745746612549, "memory_gb": 7.721559524536133, "step_time_ms": 7511.39497756958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:27:55] (step=0003850) Train Loss: 0.2885, Train Steps/Sec: 0.12, Epoch: 0.07481539059463661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3851, "loss": 0.19610168039798737, "memory_gb": 7.721559524536133, "step_time_ms": 7558.712482452393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:03] (step=0003851) Train Loss: 0.1985, Train Steps/Sec: 0.12, Epoch: 0.07483482316362224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3852, "loss": 0.39577844738960266, "memory_gb": 7.721559524536133, "step_time_ms": 7527.898550033569, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:11] (step=0003852) Train Loss: 0.3354, Train Steps/Sec: 0.13, Epoch: 0.07485425573260784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3853, "loss": 0.29473382234573364, "memory_gb": 7.721559524536133, "step_time_ms": 7484.084367752075, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:19] (step=0003853) Train Loss: 0.2976, Train Steps/Sec: 0.12, Epoch: 0.07487368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3854, "loss": 0.36067909002304077, "memory_gb": 7.721559524536133, "step_time_ms": 7498.7633228302, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:27] (step=0003854) Train Loss: 0.3457, Train Steps/Sec: 0.12, Epoch: 0.07489312087057909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3855, "loss": 0.22484686970710754, "memory_gb": 7.721559524536133, "step_time_ms": 7571.258544921875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:35] (step=0003855) Train Loss: 0.2343, Train Steps/Sec: 0.12, Epoch: 0.07491255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3856, "loss": 0.1859300434589386, "memory_gb": 7.721559524536133, "step_time_ms": 7439.193248748779, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:43] (step=0003856) Train Loss: 0.1856, Train Steps/Sec: 0.13, Epoch: 0.07493198600855033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3857, "loss": 0.25253596901893616, "memory_gb": 7.721559524536133, "step_time_ms": 7538.183212280273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:51] (step=0003857) Train Loss: 0.2892, Train Steps/Sec: 0.12, Epoch: 0.07495141857753596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:28:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3858, "loss": 0.20735397934913635, "memory_gb": 7.721559524536133, "step_time_ms": 5068.434476852417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:28:57] (step=0003858) Train Loss: 0.2582, Train Steps/Sec: 0.16, Epoch: 0.07497085114652156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:29:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3859, "loss": 0.20244644582271576, "memory_gb": 7.721559524536133, "step_time_ms": 7587.429523468018, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:29:05] (step=0003859) Train Loss: 0.2319, Train Steps/Sec: 0.12, Epoch: 0.07499028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:29:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3860, "loss": 0.21982061862945557, "memory_gb": 7.721559524536133, "step_time_ms": 7497.569561004639, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:29:13] (step=0003860) Train Loss: 0.2479, Train Steps/Sec: 0.12, Epoch: 0.07500971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:29:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3861, "loss": 0.24038933217525482, "memory_gb": 7.715639114379883, "step_time_ms": 7538.565635681152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:29:21] (step=0003861) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.07502914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:29:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3862, "loss": 0.25389620661735535, "memory_gb": 7.721559524536133, "step_time_ms": 7552.363872528076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:29:29] (step=0003862) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.07504858142246405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:29:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3863, "loss": 0.13359390199184418, "memory_gb": 7.721559524536133, "step_time_ms": 7431.65397644043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:29:37] (step=0003863) Train Loss: 0.1881, Train Steps/Sec: 0.12, Epoch: 0.07506801399144968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:29:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3864, "loss": 0.2583059072494507, "memory_gb": 7.721559524536133, "step_time_ms": 7500.590562820435, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:29:45] (step=0003864) Train Loss: 0.2287, Train Steps/Sec: 0.12, Epoch: 0.07508744656043528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:29:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3865, "loss": 0.2763218879699707, "memory_gb": 7.721559524536133, "step_time_ms": 7539.198875427246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:29:53] (step=0003865) Train Loss: 0.3168, Train Steps/Sec: 0.12, Epoch: 0.0751068791294209, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3866, "loss": 0.336861252784729, "memory_gb": 7.721559524536133, "step_time_ms": 7504.357814788818, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:01] (step=0003866) Train Loss: 0.2905, Train Steps/Sec: 0.12, Epoch: 0.07512631169840653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3867, "loss": 0.13098043203353882, "memory_gb": 7.721559524536133, "step_time_ms": 7543.107271194458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:09] (step=0003867) Train Loss: 0.1759, Train Steps/Sec: 0.12, Epoch: 0.07514574426739215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3868, "loss": 0.2929267883300781, "memory_gb": 7.721559524536133, "step_time_ms": 7508.355379104614, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:17] (step=0003868) Train Loss: 0.2499, Train Steps/Sec: 0.12, Epoch: 0.07516517683637777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3869, "loss": 0.200516939163208, "memory_gb": 7.721559524536133, "step_time_ms": 7443.804979324341, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:25] (step=0003869) Train Loss: 0.2222, Train Steps/Sec: 0.12, Epoch: 0.0751846094053634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3870, "loss": 0.2828409671783447, "memory_gb": 7.721559524536133, "step_time_ms": 7467.511892318726, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:33] (step=0003870) Train Loss: 0.1983, Train Steps/Sec: 0.12, Epoch: 0.075204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3871, "loss": 0.17223361134529114, "memory_gb": 7.721559524536133, "step_time_ms": 7490.865230560303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:41] (step=0003871) Train Loss: 0.1909, Train Steps/Sec: 0.12, Epoch: 0.07522347454333463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3872, "loss": 0.24678227305412292, "memory_gb": 7.721559524536133, "step_time_ms": 7482.439994812012, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:49] (step=0003872) Train Loss: 0.2693, Train Steps/Sec: 0.12, Epoch: 0.07524290711232025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:30:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3873, "loss": 0.1685725748538971, "memory_gb": 7.721559524536133, "step_time_ms": 7463.026523590088, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:30:57] (step=0003873) Train Loss: 0.1794, Train Steps/Sec: 0.13, Epoch: 0.07526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:31:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3874, "loss": 0.2464265674352646, "memory_gb": 7.721559524536133, "step_time_ms": 7500.4260540008545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:31:05] (step=0003874) Train Loss: 0.2174, Train Steps/Sec: 0.13, Epoch: 0.07528177225029149, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:31:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3875, "loss": 0.1878218650817871, "memory_gb": 7.715639114379883, "step_time_ms": 7437.892436981201, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:31:13] (step=0003875) Train Loss: 0.2041, Train Steps/Sec: 0.13, Epoch: 0.07530120481927711, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:31:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3876, "loss": 0.14182038605213165, "memory_gb": 7.721559524536133, "step_time_ms": 7467.971563339233, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:31:22] (step=0003876) Train Loss: 0.1692, Train Steps/Sec: 0.12, Epoch: 0.07532063738826272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:31:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3877, "loss": 0.22985075414180756, "memory_gb": 7.721559524536133, "step_time_ms": 7526.183843612671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:31:30] (step=0003877) Train Loss: 0.2080, Train Steps/Sec: 0.12, Epoch: 0.07534006995724835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:31:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3878, "loss": 0.26943767070770264, "memory_gb": 7.721559524536133, "step_time_ms": 7451.329469680786, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:31:38] (step=0003878) Train Loss: 0.2644, Train Steps/Sec: 0.13, Epoch: 0.07535950252623397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:31:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3879, "loss": 0.12004058063030243, "memory_gb": 7.721559524536133, "step_time_ms": 7418.297529220581, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:31:46] (step=0003879) Train Loss: 0.1743, Train Steps/Sec: 0.13, Epoch: 0.07537893509521959, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:31:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3880, "loss": 0.24993394315242767, "memory_gb": 7.721559524536133, "step_time_ms": 7488.749742507935, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:31:54] (step=0003880) Train Loss: 0.2238, Train Steps/Sec: 0.12, Epoch: 0.07539836766420521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3881, "loss": 0.30828866362571716, "memory_gb": 7.721559524536133, "step_time_ms": 7387.898921966553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:02] (step=0003881) Train Loss: 0.3064, Train Steps/Sec: 0.13, Epoch: 0.07541780023319082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3882, "loss": 0.3592415452003479, "memory_gb": 7.721559524536133, "step_time_ms": 7371.133327484131, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:10] (step=0003882) Train Loss: 0.2993, Train Steps/Sec: 0.13, Epoch: 0.07543723280217644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3883, "loss": 0.2924049198627472, "memory_gb": 7.721559524536133, "step_time_ms": 7491.819620132446, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:18] (step=0003883) Train Loss: 0.2633, Train Steps/Sec: 0.12, Epoch: 0.07545666537116207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3884, "loss": 0.26646023988723755, "memory_gb": 7.721559524536133, "step_time_ms": 7271.001100540161, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:26] (step=0003884) Train Loss: 0.2362, Train Steps/Sec: 0.12, Epoch: 0.07547609794014769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3885, "loss": 0.32497939467430115, "memory_gb": 7.721559524536133, "step_time_ms": 7337.764501571655, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:34] (step=0003885) Train Loss: 0.3456, Train Steps/Sec: 0.13, Epoch: 0.07549553050913331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3886, "loss": 0.2201680988073349, "memory_gb": 7.715639114379883, "step_time_ms": 7470.128059387207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:42] (step=0003886) Train Loss: 0.2301, Train Steps/Sec: 0.12, Epoch: 0.07551496307811893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3887, "loss": 0.13628485798835754, "memory_gb": 7.721559524536133, "step_time_ms": 5705.685377120972, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:47] (step=0003887) Train Loss: 0.1833, Train Steps/Sec: 0.17, Epoch: 0.07553439564710454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:32:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3888, "loss": 0.2518245577812195, "memory_gb": 7.721559524536133, "step_time_ms": 7467.091083526611, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:32:55] (step=0003888) Train Loss: 0.2185, Train Steps/Sec: 0.12, Epoch: 0.07555382821609016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:33:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3889, "loss": 0.3282056152820587, "memory_gb": 7.721559524536133, "step_time_ms": 7440.653562545776, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:33:04] (step=0003889) Train Loss: 0.3097, Train Steps/Sec: 0.12, Epoch: 0.07557326078507579, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:33:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3890, "loss": 0.14110445976257324, "memory_gb": 7.721559524536133, "step_time_ms": 7431.748390197754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:33:12] (step=0003890) Train Loss: 0.1771, Train Steps/Sec: 0.12, Epoch: 0.07559269335406141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:33:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3891, "loss": 0.26011645793914795, "memory_gb": 7.721559524536133, "step_time_ms": 7485.722780227661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:33:20] (step=0003891) Train Loss: 0.2307, Train Steps/Sec: 0.12, Epoch: 0.07561212592304703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:33:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3892, "loss": 0.22749221324920654, "memory_gb": 7.721559524536133, "step_time_ms": 7491.110563278198, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:33:28] (step=0003892) Train Loss: 0.1806, Train Steps/Sec: 0.12, Epoch: 0.07563155849203265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:33:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3893, "loss": 0.28264349699020386, "memory_gb": 7.721559524536133, "step_time_ms": 7493.502855300903, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:33:36] (step=0003893) Train Loss: 0.2748, Train Steps/Sec: 0.12, Epoch: 0.07565099106101826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:33:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3894, "loss": 0.14192408323287964, "memory_gb": 7.721559524536133, "step_time_ms": 7514.889240264893, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:33:44] (step=0003894) Train Loss: 0.2007, Train Steps/Sec: 0.12, Epoch: 0.07567042363000388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:33:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3895, "loss": 0.32798314094543457, "memory_gb": 7.721559524536133, "step_time_ms": 7497.515439987183, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:33:52] (step=0003895) Train Loss: 0.2383, Train Steps/Sec: 0.12, Epoch: 0.0756898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3896, "loss": 0.27878832817077637, "memory_gb": 7.721559524536133, "step_time_ms": 7444.006681442261, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:00] (step=0003896) Train Loss: 0.2340, Train Steps/Sec: 0.12, Epoch: 0.07570928876797513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3897, "loss": 0.23684191703796387, "memory_gb": 7.721559524536133, "step_time_ms": 7337.515115737915, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:08] (step=0003897) Train Loss: 0.2224, Train Steps/Sec: 0.12, Epoch: 0.07572872133696075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 3898, "loss": 0.1995270550251007, "memory_gb": 7.721559524536133, "step_time_ms": 7420.704126358032, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:16] (step=0003898) Train Loss: 0.1961, Train Steps/Sec: 0.13, Epoch: 0.07574815390594637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3899, "loss": 0.3504161834716797, "memory_gb": 7.721559524536133, "step_time_ms": 7427.8857707977295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:24] (step=0003899) Train Loss: 0.2742, Train Steps/Sec: 0.12, Epoch: 0.07576758647493198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3900, "loss": 0.32258129119873047, "memory_gb": 7.721559524536133, "step_time_ms": 7517.724275588989, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:32] (step=0003900) Train Loss: 0.2740, Train Steps/Sec: 0.12, Epoch: 0.0757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3901, "loss": 0.23065859079360962, "memory_gb": 7.721559524536133, "step_time_ms": 7440.215110778809, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:40] (step=0003901) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.07580645161290323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 3902, "loss": 0.322879433631897, "memory_gb": 7.721559524536133, "step_time_ms": 7632.786750793457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:48] (step=0003902) Train Loss: 0.3365, Train Steps/Sec: 0.12, Epoch: 0.07582588418188885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:34:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 3903, "loss": 0.25059765577316284, "memory_gb": 7.721559524536133, "step_time_ms": 7555.34291267395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:34:56] (step=0003903) Train Loss: 0.2607, Train Steps/Sec: 0.12, Epoch: 0.07584531675087447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:35:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 3904, "loss": 0.358134388923645, "memory_gb": 7.721559524536133, "step_time_ms": 7560.211658477783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:35:04] (step=0003904) Train Loss: 0.2712, Train Steps/Sec: 0.12, Epoch: 0.07586474931986009, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:35:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3905, "loss": 0.2181081473827362, "memory_gb": 7.721559524536133, "step_time_ms": 7473.26397895813, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:35:12] (step=0003905) Train Loss: 0.2237, Train Steps/Sec: 0.13, Epoch: 0.0758841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:35:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3906, "loss": 0.18834638595581055, "memory_gb": 7.721559524536133, "step_time_ms": 7614.205121994019, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:35:20] (step=0003906) Train Loss: 0.2533, Train Steps/Sec: 0.12, Epoch: 0.07590361445783132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:35:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3907, "loss": 0.21681702136993408, "memory_gb": 7.721559524536133, "step_time_ms": 7485.4700565338135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:35:28] (step=0003907) Train Loss: 0.2275, Train Steps/Sec: 0.12, Epoch: 0.07592304702681694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:35:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3908, "loss": 0.19013667106628418, "memory_gb": 7.721559524536133, "step_time_ms": 7498.5058307647705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:35:36] (step=0003908) Train Loss: 0.2228, Train Steps/Sec: 0.12, Epoch: 0.07594247959580257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:35:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3909, "loss": 0.18548652529716492, "memory_gb": 7.721559524536133, "step_time_ms": 7574.217319488525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:35:44] (step=0003909) Train Loss: 0.1689, Train Steps/Sec: 0.12, Epoch: 0.07596191216478819, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:35:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3910, "loss": 0.14499324560165405, "memory_gb": 7.721559524536133, "step_time_ms": 7556.771755218506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:35:52] (step=0003910) Train Loss: 0.1230, Train Steps/Sec: 0.13, Epoch: 0.07598134473377381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3911, "loss": 0.3022674322128296, "memory_gb": 7.721559524536133, "step_time_ms": 7434.507369995117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:00] (step=0003911) Train Loss: 0.2622, Train Steps/Sec: 0.13, Epoch: 0.07600077730275942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 3912, "loss": 0.27464067935943604, "memory_gb": 7.721559524536133, "step_time_ms": 7558.891296386719, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:09] (step=0003912) Train Loss: 0.2679, Train Steps/Sec: 0.12, Epoch: 0.07602020987174504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3913, "loss": 0.20569312572479248, "memory_gb": 7.721559524536133, "step_time_ms": 7564.902305603027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:17] (step=0003913) Train Loss: 0.2520, Train Steps/Sec: 0.12, Epoch: 0.07603964244073066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3914, "loss": 0.3535405993461609, "memory_gb": 7.721559524536133, "step_time_ms": 7367.217063903809, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:24] (step=0003914) Train Loss: 0.2824, Train Steps/Sec: 0.13, Epoch: 0.07605907500971629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 3915, "loss": 0.36980947852134705, "memory_gb": 7.721559524536133, "step_time_ms": 7574.60880279541, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:32] (step=0003915) Train Loss: 0.3303, Train Steps/Sec: 0.12, Epoch: 0.07607850757870191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3916, "loss": 0.2092430144548416, "memory_gb": 7.721559524536133, "step_time_ms": 5441.925287246704, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:38] (step=0003916) Train Loss: 0.2395, Train Steps/Sec: 0.18, Epoch: 0.07609794014768752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3917, "loss": 0.27734172344207764, "memory_gb": 7.721559524536133, "step_time_ms": 7504.847049713135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:46] (step=0003917) Train Loss: 0.1989, Train Steps/Sec: 0.12, Epoch: 0.07611737271667314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:36:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3918, "loss": 0.2042672336101532, "memory_gb": 7.721559524536133, "step_time_ms": 7485.244989395142, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:36:54] (step=0003918) Train Loss: 0.2617, Train Steps/Sec: 0.12, Epoch: 0.07613680528565876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3919, "loss": 0.29820072650909424, "memory_gb": 7.721559524536133, "step_time_ms": 7465.976715087891, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:02] (step=0003919) Train Loss: 0.2641, Train Steps/Sec: 0.13, Epoch: 0.07615623785464438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3920, "loss": 0.2076585590839386, "memory_gb": 7.721559524536133, "step_time_ms": 7560.095310211182, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:10] (step=0003920) Train Loss: 0.1799, Train Steps/Sec: 0.12, Epoch: 0.07617567042363001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3921, "loss": 0.1716257929801941, "memory_gb": 7.721559524536133, "step_time_ms": 7445.322513580322, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:18] (step=0003921) Train Loss: 0.2577, Train Steps/Sec: 0.13, Epoch: 0.07619510299261563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3922, "loss": 0.31583353877067566, "memory_gb": 7.721559524536133, "step_time_ms": 7439.500570297241, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:26] (step=0003922) Train Loss: 0.2634, Train Steps/Sec: 0.13, Epoch: 0.07621453556160124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3923, "loss": 0.2981279194355011, "memory_gb": 7.721559524536133, "step_time_ms": 7458.491563796997, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:34] (step=0003923) Train Loss: 0.2320, Train Steps/Sec: 0.12, Epoch: 0.07623396813058686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3924, "loss": 0.20596177875995636, "memory_gb": 7.721559524536133, "step_time_ms": 7393.970966339111, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:42] (step=0003924) Train Loss: 0.2137, Train Steps/Sec: 0.13, Epoch: 0.07625340069957248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3925, "loss": 0.2803293764591217, "memory_gb": 7.721559524536133, "step_time_ms": 7372.430086135864, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:50] (step=0003925) Train Loss: 0.2133, Train Steps/Sec: 0.13, Epoch: 0.0762728332685581, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:37:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3926, "loss": 0.2871454656124115, "memory_gb": 7.721559524536133, "step_time_ms": 7436.44380569458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:37:58] (step=0003926) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.07629226583754373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:38:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 3927, "loss": 0.34189552068710327, "memory_gb": 7.721559524536133, "step_time_ms": 7388.191699981689, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:38:06] (step=0003927) Train Loss: 0.2740, Train Steps/Sec: 0.13, Epoch: 0.07631169840652935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:38:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3928, "loss": 0.16774772107601166, "memory_gb": 7.721559524536133, "step_time_ms": 7396.341800689697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:38:14] (step=0003928) Train Loss: 0.1668, Train Steps/Sec: 0.13, Epoch: 0.07633113097551496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:38:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3929, "loss": 0.24894775450229645, "memory_gb": 7.721559524536133, "step_time_ms": 7467.068910598755, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:38:22] (step=0003929) Train Loss: 0.2935, Train Steps/Sec: 0.13, Epoch: 0.07635056354450058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:38:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 3930, "loss": 0.19229814410209656, "memory_gb": 7.721559524536133, "step_time_ms": 7434.218645095825, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:38:30] (step=0003930) Train Loss: 0.1832, Train Steps/Sec: 0.12, Epoch: 0.0763699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:38:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 3931, "loss": 0.130928635597229, "memory_gb": 7.721559524536133, "step_time_ms": 7402.395248413086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:38:38] (step=0003931) Train Loss: 0.2160, Train Steps/Sec: 0.12, Epoch: 0.07638942868247182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:38:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 3932, "loss": 0.21131631731987, "memory_gb": 7.721559524536133, "step_time_ms": 7496.624946594238, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:38:46] (step=0003932) Train Loss: 0.2354, Train Steps/Sec: 0.12, Epoch: 0.07640886125145745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:38:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 3933, "loss": 0.13384665548801422, "memory_gb": 7.721559524536133, "step_time_ms": 7461.047410964966, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:38:54] (step=0003933) Train Loss: 0.1816, Train Steps/Sec: 0.13, Epoch: 0.07642829382044307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 3934, "loss": 0.2217138707637787, "memory_gb": 7.721559524536133, "step_time_ms": 7525.022268295288, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:02] (step=0003934) Train Loss: 0.2562, Train Steps/Sec: 0.13, Epoch: 0.07644772638942868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 3935, "loss": 0.18143188953399658, "memory_gb": 7.721559524536133, "step_time_ms": 7514.350414276123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:10] (step=0003935) Train Loss: 0.2604, Train Steps/Sec: 0.13, Epoch: 0.0764671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 3936, "loss": 0.1952795684337616, "memory_gb": 7.721559524536133, "step_time_ms": 7551.139116287231, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:18] (step=0003936) Train Loss: 0.1921, Train Steps/Sec: 0.12, Epoch: 0.07648659152739992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 3937, "loss": 0.2627585232257843, "memory_gb": 7.721559524536133, "step_time_ms": 7484.555006027222, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:26] (step=0003937) Train Loss: 0.2676, Train Steps/Sec: 0.12, Epoch: 0.07650602409638554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 3938, "loss": 0.2620227336883545, "memory_gb": 7.721559524536133, "step_time_ms": 7498.940706253052, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:34] (step=0003938) Train Loss: 0.2378, Train Steps/Sec: 0.12, Epoch: 0.07652545666537117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 3939, "loss": 0.17954891920089722, "memory_gb": 7.721559524536133, "step_time_ms": 7551.47910118103, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:42] (step=0003939) Train Loss: 0.2003, Train Steps/Sec: 0.12, Epoch: 0.07654488923435679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 3940, "loss": 0.1837671399116516, "memory_gb": 7.721559524536133, "step_time_ms": 7441.465854644775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:50] (step=0003940) Train Loss: 0.2100, Train Steps/Sec: 0.12, Epoch: 0.0765643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:39:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 3941, "loss": 0.30475062131881714, "memory_gb": 7.721559524536133, "step_time_ms": 7462.890863418579, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:39:59] (step=0003941) Train Loss: 0.3267, Train Steps/Sec: 0.12, Epoch: 0.07658375437232802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:40:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3942, "loss": 0.16203007102012634, "memory_gb": 7.721559524536133, "step_time_ms": 7496.842384338379, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:40:07] (step=0003942) Train Loss: 0.2366, Train Steps/Sec: 0.12, Epoch: 0.07660318694131364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:40:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 3943, "loss": 0.3060278594493866, "memory_gb": 7.721559524536133, "step_time_ms": 7537.327289581299, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:40:14] (step=0003943) Train Loss: 0.3115, Train Steps/Sec: 0.13, Epoch: 0.07662261951029926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:40:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 3944, "loss": 0.3199455738067627, "memory_gb": 7.721559524536133, "step_time_ms": 7366.330146789551, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:40:22] (step=0003944) Train Loss: 0.2519, Train Steps/Sec: 0.13, Epoch: 0.07664205207928489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:40:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3945, "loss": 0.25073471665382385, "memory_gb": 7.721559524536133, "step_time_ms": 5316.549301147461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:40:28] (step=0003945) Train Loss: 0.2797, Train Steps/Sec: 0.16, Epoch: 0.0766614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:40:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3946, "loss": 0.236097514629364, "memory_gb": 7.721559524536133, "step_time_ms": 7502.034664154053, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:40:36] (step=0003946) Train Loss: 0.1898, Train Steps/Sec: 0.12, Epoch: 0.07668091721725612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3947, "loss": 0.10992961376905441, "memory_gb": 7.721559524536133, "step_time_ms": 7541.8860912323, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:40:44] (step=0003947) Train Loss: 0.1177, Train Steps/Sec: 0.12, Epoch: 0.07670034978624174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:40:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3948, "loss": 0.22278879582881927, "memory_gb": 7.721559524536133, "step_time_ms": 7462.654590606689, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:40:52] (step=0003948) Train Loss: 0.2105, Train Steps/Sec: 0.12, Epoch: 0.07671978235522736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:41:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 3949, "loss": 0.3140854835510254, "memory_gb": 7.721559524536133, "step_time_ms": 7530.254602432251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:00] (step=0003949) Train Loss: 0.2696, Train Steps/Sec: 0.12, Epoch: 0.07673921492421298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:41:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3950, "loss": 0.19530120491981506, "memory_gb": 7.721559524536133, "step_time_ms": 7370.889902114868, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:09] (step=0003950) Train Loss: 0.1990, Train Steps/Sec: 0.12, Epoch: 0.0767586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:41:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3951, "loss": 0.1685798317193985, "memory_gb": 7.721559524536133, "step_time_ms": 7550.677537918091, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:17] (step=0003951) Train Loss: 0.1600, Train Steps/Sec: 0.12, Epoch: 0.07677808006218421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:41:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3952, "loss": 0.11646530032157898, "memory_gb": 7.721559524536133, "step_time_ms": 7525.489568710327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:25] (step=0003952) Train Loss: 0.2034, Train Steps/Sec: 0.12, Epoch: 0.07679751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:41:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3953, "loss": 0.2856384515762329, "memory_gb": 7.721559524536133, "step_time_ms": 7646.73924446106, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:33] (step=0003953) Train Loss: 0.2339, Train Steps/Sec: 0.12, Epoch: 0.07681694520015546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:41:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3954, "loss": 0.19967959821224213, "memory_gb": 7.721559524536133, "step_time_ms": 7505.373477935791, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:41] (step=0003954) Train Loss: 0.1735, Train Steps/Sec: 0.13, Epoch: 0.07683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:41:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3955, "loss": 0.18458548188209534, "memory_gb": 7.721559524536133, "step_time_ms": 7517.797470092773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:49] (step=0003955) Train Loss: 0.2586, Train Steps/Sec: 0.12, Epoch: 0.0768558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:41:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3956, "loss": 0.31072384119033813, "memory_gb": 7.721559524536133, "step_time_ms": 7586.3306522369385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:41:57] (step=0003956) Train Loss: 0.2396, Train Steps/Sec: 0.12, Epoch: 0.07687524290711233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:42:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3957, "loss": 0.2468772977590561, "memory_gb": 7.721559524536133, "step_time_ms": 7496.384143829346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:42:05] (step=0003957) Train Loss: 0.2175, Train Steps/Sec: 0.13, Epoch: 0.07689467547609793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:42:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 3958, "loss": 0.25255054235458374, "memory_gb": 7.721559524536133, "step_time_ms": 7460.439443588257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:42:13] (step=0003958) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.07691410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:42:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 3959, "loss": 0.3118944764137268, "memory_gb": 7.721559524536133, "step_time_ms": 7498.800992965698, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:42:21] (step=0003959) Train Loss: 0.2462, Train Steps/Sec: 0.12, Epoch: 0.07693354061406918, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 02:42:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 3960, "loss": 0.2380945384502411, "memory_gb": 7.721559524536133, "step_time_ms": 7371.495246887207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:42:29] (step=0003960) Train Loss: 0.2281, Train Steps/Sec: 0.13, Epoch: 0.0769529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:42:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 3961, "loss": 0.11572272330522537, "memory_gb": 7.721559524536133, "step_time_ms": 7390.520334243774, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:42:37] (step=0003961) Train Loss: 0.1708, Train Steps/Sec: 0.13, Epoch: 0.07697240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:42:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 3962, "loss": 0.17331033945083618, "memory_gb": 7.721559524536133, "step_time_ms": 7432.585954666138, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:42:45] (step=0003962) Train Loss: 0.2082, Train Steps/Sec: 0.12, Epoch: 0.07699183832102605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:42:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 3963, "loss": 0.21558243036270142, "memory_gb": 7.721559524536133, "step_time_ms": 7359.494209289551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:42:53] (step=0003963) Train Loss: 0.2173, Train Steps/Sec: 0.13, Epoch: 0.07701127089001165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 3964, "loss": 0.37539228796958923, "memory_gb": 7.721559524536133, "step_time_ms": 7432.754755020142, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:01] (step=0003964) Train Loss: 0.2899, Train Steps/Sec: 0.12, Epoch: 0.07703070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 3965, "loss": 0.2842504382133484, "memory_gb": 7.721559524536133, "step_time_ms": 
7457.152366638184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:09] (step=0003965) Train Loss: 0.2755, Train Steps/Sec: 0.12, Epoch: 0.0770501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 3966, "loss": 0.3333461880683899, "memory_gb": 7.721559524536133, "step_time_ms": 7408.764362335205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:17] (step=0003966) Train Loss: 0.2965, Train Steps/Sec: 0.13, Epoch: 0.07706956859696852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 3967, "loss": 0.3002792000770569, "memory_gb": 7.721559524536133, "step_time_ms": 7457.764625549316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:25] (step=0003967) Train Loss: 0.3145, Train Steps/Sec: 0.13, Epoch: 0.07708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 3968, "loss": 0.15410326421260834, "memory_gb": 7.721559524536133, "step_time_ms": 7542.7820682525635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:33] (step=0003968) Train Loss: 0.1964, Train Steps/Sec: 0.12, Epoch: 0.07710843373493977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 3969, "loss": 0.2888188362121582, "memory_gb": 7.721559524536133, "step_time_ms": 7461.905479431152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:41] (step=0003969) Train Loss: 0.2252, Train Steps/Sec: 0.12, Epoch: 0.07712786630392537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 3970, "loss": 0.268026739358902, "memory_gb": 7.721559524536133, "step_time_ms": 7495.734453201294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:49] (step=0003970) Train Loss: 0.2476, Train Steps/Sec: 0.12, Epoch: 0.077147298872911, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:43:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 3971, "loss": 0.24363389611244202, "memory_gb": 7.721559524536133, "step_time_ms": 7487.201690673828, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:43:57] (step=0003971) Train Loss: 0.2663, Train Steps/Sec: 0.12, Epoch: 0.07716673144189662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 3972, "loss": 0.19453871250152588, "memory_gb": 7.721559524536133, "step_time_ms": 7256.367206573486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:05] (step=0003972) Train Loss: 0.2199, Train Steps/Sec: 0.13, Epoch: 0.07718616401088224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 3973, "loss": 0.19831416010856628, "memory_gb": 7.721559524536133, "step_time_ms": 6851.18293762207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:12] (step=0003973) Train Loss: 0.1793, Train Steps/Sec: 0.14, Epoch: 0.07720559657986786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 3974, "loss": 0.23472747206687927, "memory_gb": 7.721559524536133, "step_time_ms": 5949.374675750732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:19] (step=0003974) Train Loss: 0.2266, Train Steps/Sec: 0.15, Epoch: 0.07722502914885347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 3975, "loss": 0.350824236869812, "memory_gb": 7.721559524536133, "step_time_ms": 7457.5817584991455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:27] (step=0003975) Train Loss: 0.3108, Train Steps/Sec: 0.12, Epoch: 0.0772444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 3976, "loss": 0.231800377368927, "memory_gb": 7.721559524536133, 
"step_time_ms": 7460.804462432861, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:35] (step=0003976) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.07726389428682472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 3977, "loss": 0.22149163484573364, "memory_gb": 7.721559524536133, "step_time_ms": 7366.276025772095, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:43] (step=0003977) Train Loss: 0.2257, Train Steps/Sec: 0.13, Epoch: 0.07728332685581034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 3978, "loss": 0.23087936639785767, "memory_gb": 7.721559524536133, "step_time_ms": 7403.294801712036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:51] (step=0003978) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.07730275942479596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:44:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3979, "loss": 0.1853885054588318, "memory_gb": 7.721559524536133, "step_time_ms": 7480.695962905884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:44:59] (step=0003979) Train Loss: 0.1892, Train Steps/Sec: 0.12, Epoch: 0.07732219199378158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:45:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3980, "loss": 0.2169056534767151, "memory_gb": 7.721559524536133, "step_time_ms": 7391.977310180664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:45:07] (step=0003980) Train Loss: 0.1835, Train Steps/Sec: 0.12, Epoch: 0.07734162456276719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:45:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3981, "loss": 0.23298653960227966, "memory_gb": 7.721559524536133, "step_time_ms": 7478.952884674072, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:45:15] (step=0003981) Train Loss: 0.2494, Train Steps/Sec: 0.12, Epoch: 
0.07736105713175281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:45:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 3982, "loss": 0.3051553964614868, "memory_gb": 7.721559524536133, "step_time_ms": 7503.51357460022, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:45:23] (step=0003982) Train Loss: 0.2511, Train Steps/Sec: 0.12, Epoch: 0.07738048970073844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:45:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3983, "loss": 0.23629815876483917, "memory_gb": 7.721559524536133, "step_time_ms": 7448.56595993042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:45:31] (step=0003983) Train Loss: 0.2096, Train Steps/Sec: 0.12, Epoch: 0.07739992226972406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:45:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 3984, "loss": 0.27209603786468506, "memory_gb": 7.721559524536133, "step_time_ms": 7459.823846817017, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:45:39] (step=0003984) Train Loss: 0.2346, Train Steps/Sec: 0.13, Epoch: 0.07741935483870968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:45:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 3985, "loss": 0.2824678122997284, "memory_gb": 7.721559524536133, "step_time_ms": 7575.93297958374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:45:47] (step=0003985) Train Loss: 0.2728, Train Steps/Sec: 0.12, Epoch: 0.0774387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:45:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 3986, "loss": 0.35038334131240845, "memory_gb": 7.721559524536133, "step_time_ms": 7521.021127700806, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:45:55] (step=0003986) Train Loss: 0.2703, Train Steps/Sec: 0.12, Epoch: 0.07745821997668091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 3987, "loss": 0.30300232768058777, "memory_gb": 
7.721559524536133, "step_time_ms": 7502.785921096802, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:03] (step=0003987) Train Loss: 0.3150, Train Steps/Sec: 0.12, Epoch: 0.07747765254566653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 3988, "loss": 0.2755384147167206, "memory_gb": 7.721559524536133, "step_time_ms": 7540.3172969818115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:11] (step=0003988) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.07749708511465216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 3989, "loss": 0.3267001211643219, "memory_gb": 7.721559524536133, "step_time_ms": 7471.7254638671875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:20] (step=0003989) Train Loss: 0.2873, Train Steps/Sec: 0.12, Epoch: 0.07751651768363778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 3990, "loss": 0.2355678528547287, "memory_gb": 7.721559524536133, "step_time_ms": 7441.572189331055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:28] (step=0003990) Train Loss: 0.2835, Train Steps/Sec: 0.13, Epoch: 0.0775359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 3991, "loss": 0.2566157281398773, "memory_gb": 7.721559524536133, "step_time_ms": 7634.350776672363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:36] (step=0003991) Train Loss: 0.2557, Train Steps/Sec: 0.12, Epoch: 0.07755538282160902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 3992, "loss": 0.12851834297180176, "memory_gb": 7.721559524536133, "step_time_ms": 7468.849420547485, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:44] (step=0003992) Train Loss: 0.1586, Train Steps/Sec: 
0.12, Epoch: 0.07757481539059463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 3993, "loss": 0.2237461507320404, "memory_gb": 7.721559524536133, "step_time_ms": 7451.5204429626465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:52] (step=0003993) Train Loss: 0.2730, Train Steps/Sec: 0.13, Epoch: 0.07759424795958025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 3994, "loss": 0.3047283887863159, "memory_gb": 7.721559524536133, "step_time_ms": 7538.568735122681, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:46:59] (step=0003994) Train Loss: 0.2940, Train Steps/Sec: 0.13, Epoch: 0.07761368052856588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:47:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 3995, "loss": 0.1738586127758026, "memory_gb": 7.721559524536133, "step_time_ms": 7500.791311264038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:47:07] (step=0003995) Train Loss: 0.1718, Train Steps/Sec: 0.13, Epoch: 0.0776331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:47:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 3996, "loss": 0.318981796503067, "memory_gb": 7.721559524536133, "step_time_ms": 7528.4271240234375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:47:15] (step=0003996) Train Loss: 0.2632, Train Steps/Sec: 0.12, Epoch: 0.07765254566653712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:47:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 3997, "loss": 0.19886699318885803, "memory_gb": 7.721559524536133, "step_time_ms": 7579.524040222168, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:47:24] (step=0003997) Train Loss: 0.1973, Train Steps/Sec: 0.12, Epoch: 0.07767197823552274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 3998, "loss": 0.25356337428092957, 
"memory_gb": 7.721559524536133, "step_time_ms": 7509.932279586792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:47:31] (step=0003998) Train Loss: 0.2835, Train Steps/Sec: 0.13, Epoch: 0.07769141080450835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:47:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 3999, "loss": 0.3305709958076477, "memory_gb": 7.721559524536133, "step_time_ms": 7487.1532917022705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:47:40] (step=0003999) Train Loss: 0.2776, Train Steps/Sec: 0.12, Epoch: 0.07771084337349397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:47:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4000, "loss": 0.22046944499015808, "memory_gb": 7.721559524536133, "step_time_ms": 7525.816917419434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:47:48] (step=0004000) Train Loss: 0.2616, Train Steps/Sec: 0.12, Epoch: 0.0777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:47:48] Saved checkpoint to /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0004000/ [2025-07-29 02:47:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4001, "loss": 0.24583379924297333, "memory_gb": 7.721559524536133, "step_time_ms": 7319.339990615845, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:47:56] (step=0004001) Train Loss: 0.2789, Train Steps/Sec: 0.13, Epoch: 0.07774970851146522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4002, "loss": 0.3364184498786926, "memory_gb": 7.721559524536133, "step_time_ms": 6037.143230438232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:02] (step=0004002) Train Loss: 0.2687, Train Steps/Sec: 0.16, Epoch: 0.07776914108045084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4003, "loss": 0.2612519860267639, "memory_gb": 7.721559524536133, "step_time_ms": 
6699.934244155884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:09] (step=0004003) Train Loss: 0.3055, Train Steps/Sec: 0.14, Epoch: 0.07778857364943645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4004, "loss": 0.15757811069488525, "memory_gb": 7.721559524536133, "step_time_ms": 7469.675540924072, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:17] (step=0004004) Train Loss: 0.2161, Train Steps/Sec: 0.13, Epoch: 0.07780800621842207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4005, "loss": 0.31014323234558105, "memory_gb": 7.721559524536133, "step_time_ms": 7544.794082641602, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:25] (step=0004005) Train Loss: 0.3008, Train Steps/Sec: 0.12, Epoch: 0.0778274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4006, "loss": 0.152938574552536, "memory_gb": 7.721559524536133, "step_time_ms": 7418.1718826293945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:33] (step=0004006) Train Loss: 0.1915, Train Steps/Sec: 0.13, Epoch: 0.07784687135639332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4007, "loss": 0.3144870400428772, "memory_gb": 7.721559524536133, "step_time_ms": 7403.214693069458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:41] (step=0004007) Train Loss: 0.2565, Train Steps/Sec: 0.12, Epoch: 0.07786630392537894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4008, "loss": 0.3360612690448761, "memory_gb": 7.721559524536133, "step_time_ms": 7484.91644859314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:49] (step=0004008) Train Loss: 0.2983, Train Steps/Sec: 0.12, Epoch: 0.07788573649436456, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:48:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4009, "loss": 0.2488279640674591, "memory_gb": 7.721559524536133, "step_time_ms": 7395.684719085693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:48:57] (step=0004009) Train Loss: 0.2695, Train Steps/Sec: 0.12, Epoch: 0.07790516906335017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:49:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4010, "loss": 0.14256463944911957, "memory_gb": 7.721559524536133, "step_time_ms": 7446.582555770874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:49:05] (step=0004010) Train Loss: 0.1735, Train Steps/Sec: 0.13, Epoch: 0.07792460163233579, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:49:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4011, "loss": 0.289208322763443, "memory_gb": 7.721559524536133, "step_time_ms": 7521.892547607422, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:49:13] (step=0004011) Train Loss: 0.2736, Train Steps/Sec: 0.12, Epoch: 0.07794403420132141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:49:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4012, "loss": 0.3604787588119507, "memory_gb": 7.715639114379883, "step_time_ms": 7411.733150482178, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:49:21] (step=0004012) Train Loss: 0.3021, Train Steps/Sec: 0.12, Epoch: 0.07796346677030704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4013, "loss": 0.2873713970184326, "memory_gb": 7.721559524536133, "step_time_ms": 7223.117828369141, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:49:29] (step=0004013) Train Loss: 0.2642, Train Steps/Sec: 0.12, Epoch: 0.07798289933929266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:49:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4014, "loss": 0.1816750168800354, "memory_gb": 7.721559524536133, 
"step_time_ms": 7534.728527069092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:49:38] (step=0004014) Train Loss: 0.2723, Train Steps/Sec: 0.12, Epoch: 0.07800233190827828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:49:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4015, "loss": 0.2694758176803589, "memory_gb": 7.721559524536133, "step_time_ms": 7402.8637409210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:49:46] (step=0004015) Train Loss: 0.2733, Train Steps/Sec: 0.12, Epoch: 0.07802176447726389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4016, "loss": 0.29566073417663574, "memory_gb": 7.721559524536133, "step_time_ms": 7406.825542449951, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:49:54] (step=0004016) Train Loss: 0.2650, Train Steps/Sec: 0.12, Epoch: 0.07804119704624951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:50:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4017, "loss": 0.17634882032871246, "memory_gb": 7.721559524536133, "step_time_ms": 7506.228923797607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:50:02] (step=0004017) Train Loss: 0.1863, Train Steps/Sec: 0.12, Epoch: 0.07806062961523513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:50:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4018, "loss": 0.2750549614429474, "memory_gb": 7.721559524536133, "step_time_ms": 7419.375419616699, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:50:10] (step=0004018) Train Loss: 0.3010, Train Steps/Sec: 0.12, Epoch: 0.07808006218422076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 02:50:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4019, "loss": 0.21388527750968933, "memory_gb": 7.721559524536133, "step_time_ms": 7423.596620559692, "trainable_params": 4718592, "method": "lora"} [2025-07-29 02:50:18] (step=0004019) Train Loss: 0.1681, Train Steps/Sec: 0.13, Epoch: 
0.07809949475320638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4020, "loss": 0.23360884189605713, "memory_gb": 7.721559524536133, "step_time_ms": 7536.570072174072, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:50:26] (step=0004020) Train Loss: 0.1990, Train Steps/Sec: 0.12, Epoch: 0.078118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:50:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4021, "loss": 0.2925340533256531, "memory_gb": 7.721559524536133, "step_time_ms": 7486.999273300171, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:50:34] (step=0004021) Train Loss: 0.2541, Train Steps/Sec: 0.13, Epoch: 0.07813835989117761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:50:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4022, "loss": 0.25724953413009644, "memory_gb": 7.721559524536133, "step_time_ms": 7481.934547424316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:50:41] (step=0004022) Train Loss: 0.2472, Train Steps/Sec: 0.13, Epoch: 0.07815779246016323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:50:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4023, "loss": 0.24733120203018188, "memory_gb": 7.721559524536133, "step_time_ms": 7507.153272628784, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:50:49] (step=0004023) Train Loss: 0.2278, Train Steps/Sec: 0.12, Epoch: 0.07817722502914885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:50:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4024, "loss": 0.11401443183422089, "memory_gb": 7.721559524536133, "step_time_ms": 7440.741777420044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:50:57] (step=0004024) Train Loss: 0.1837, Train Steps/Sec: 0.13, Epoch: 0.07819665759813448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4025, "loss": 0.3476385772228241, "memory_gb": 7.721559524536133, "step_time_ms": 7477.131605148315, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:05] (step=0004025) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.0782160901671201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4026, "loss": 0.2761092483997345, "memory_gb": 7.721559524536133, "step_time_ms": 7467.8449630737305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:14] (step=0004026) Train Loss: 0.2687, Train Steps/Sec: 0.12, Epoch: 0.07823552273610572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4027, "loss": 0.15528954565525055, "memory_gb": 7.721559524536133, "step_time_ms": 7432.371139526367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:22] (step=0004027) Train Loss: 0.2024, Train Steps/Sec: 0.13, Epoch: 0.07825495530509133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4028, "loss": 0.23060518503189087, "memory_gb": 7.721559524536133, "step_time_ms": 7482.051372528076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:30] (step=0004028) Train Loss: 0.2196, Train Steps/Sec: 0.12, Epoch: 0.07827438787407695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4029, "loss": 0.2495070993900299, "memory_gb": 7.721559524536133, "step_time_ms": 7490.773677825928, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:38] (step=0004029) Train Loss: 0.2311, Train Steps/Sec: 0.13, Epoch: 0.07829382044306257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4030, "loss": 0.23536202311515808, "memory_gb": 7.721559524536133, "step_time_ms": 7344.590902328491, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:46] (step=0004030) Train Loss: 0.2072, Train Steps/Sec: 0.13, Epoch: 0.0783132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4031, "loss": 0.2572489380836487, "memory_gb": 7.721559524536133, "step_time_ms": 5655.249118804932, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:51] (step=0004031) Train Loss: 0.2648, Train Steps/Sec: 0.17, Epoch: 0.07833268558103382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:51:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4032, "loss": 0.20249664783477783, "memory_gb": 7.721559524536133, "step_time_ms": 7109.103202819824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:51:59] (step=0004032) Train Loss: 0.1959, Train Steps/Sec: 0.13, Epoch: 0.07835211815001943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:52:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4033, "loss": 0.23776008188724518, "memory_gb": 7.721559524536133, "step_time_ms": 7470.658302307129, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:52:07] (step=0004033) Train Loss: 0.2394, Train Steps/Sec: 0.12, Epoch: 0.07837155071900505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:52:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4034, "loss": 0.19246625900268555, "memory_gb": 7.721559524536133, "step_time_ms": 7512.9029750823975, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:52:15] (step=0004034) Train Loss: 0.1816, Train Steps/Sec: 0.12, Epoch: 0.07839098328799067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:52:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4035, "loss": 0.24304983019828796, "memory_gb": 7.721559524536133, "step_time_ms": 7670.217037200928, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:52:23] (step=0004035) Train Loss: 0.2856, Train Steps/Sec: 0.12, Epoch: 0.07841041585697629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:52:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4036, "loss": 0.2758437693119049, "memory_gb": 7.721559524536133, "step_time_ms": 7442.638874053955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:52:31] (step=0004036) Train Loss: 0.2517, Train Steps/Sec: 0.12, Epoch: 0.07842984842596191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:52:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4037, "loss": 0.3042692542076111, "memory_gb": 7.721559524536133, "step_time_ms": 7530.116558074951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:52:40] (step=0004037) Train Loss: 0.3317, Train Steps/Sec: 0.12, Epoch: 0.07844928099494754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:52:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4038, "loss": 0.2101926952600479, "memory_gb": 7.721559524536133, "step_time_ms": 7478.953838348389, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:52:48] (step=0004038) Train Loss: 0.2133, Train Steps/Sec: 0.12, Epoch: 0.07846871356393315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:52:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4039, "loss": 0.2703113555908203, "memory_gb": 7.721559524536133, "step_time_ms": 7453.723430633545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:52:56] (step=0004039) Train Loss: 0.2464, Train Steps/Sec: 0.13, Epoch: 0.07848814613291877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4040, "loss": 0.24783191084861755, "memory_gb": 7.721559524536133, "step_time_ms": 7547.615528106689, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:04] (step=0004040) Train Loss: 0.2022, Train Steps/Sec: 0.12, Epoch: 0.07850757870190439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4041, "loss": 0.19919002056121826, "memory_gb": 7.721559524536133, "step_time_ms": 7549.593210220337, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:12] (step=0004041) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.07852701127089001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4042, "loss": 0.23153233528137207, "memory_gb": 7.721559524536133, "step_time_ms": 7453.798055648804, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:20] (step=0004042) Train Loss: 0.2584, Train Steps/Sec: 0.13, Epoch: 0.07854644383987563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4043, "loss": 0.18612247705459595, "memory_gb": 7.715639114379883, "step_time_ms": 7533.379554748535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:28] (step=0004043) Train Loss: 0.1894, Train Steps/Sec: 0.12, Epoch: 0.07856587640886126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4044, "loss": 0.278794527053833, "memory_gb": 7.721559524536133, "step_time_ms": 7480.0684452056885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:36] (step=0004044) Train Loss: 0.2206, Train Steps/Sec: 0.13, Epoch: 0.07858530897784687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4045, "loss": 0.15040439367294312, "memory_gb": 7.721559524536133, "step_time_ms": 7453.0463218688965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:43] (step=0004045) Train Loss: 0.1871, Train Steps/Sec: 0.13, Epoch: 0.07860474154683249, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4046, "loss": 0.17932316660881042, "memory_gb": 7.721559524536133, "step_time_ms": 7505.997657775879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:51] (step=0004046) Train Loss: 0.2437, Train Steps/Sec: 0.13, Epoch: 0.07862417411581811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:53:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4047, "loss": 0.2541833817958832, "memory_gb": 7.721559524536133, "step_time_ms": 7511.989116668701, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:53:59] (step=0004047) Train Loss: 0.2649, Train Steps/Sec: 0.12, Epoch: 0.07864360668480373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:54:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4048, "loss": 0.3553749918937683, "memory_gb": 7.721559524536133, "step_time_ms": 7461.3330364227295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:54:07] (step=0004048) Train Loss: 0.2617, Train Steps/Sec: 0.13, Epoch: 0.07866303925378935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4049, "loss": 0.22465738654136658, "memory_gb": 7.721559524536133, "step_time_ms": 7503.0837059021, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:54:15] (step=0004049) Train Loss: 0.2044, Train Steps/Sec: 0.12, Epoch: 0.07868247182277498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:54:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4050, "loss": 0.3049260973930359, "memory_gb": 7.721559524536133, "step_time_ms": 7492.4774169921875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:54:24] (step=0004050) Train Loss: 0.2474, Train Steps/Sec: 0.12, Epoch: 0.07870190439176059, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:54:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4051, "loss": 0.29079532623291016, "memory_gb": 7.721559524536133, "step_time_ms": 7428.375244140625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:54:32] (step=0004051) Train Loss: 0.2562, Train Steps/Sec: 0.12, Epoch: 0.07872133696074621, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4052, "loss": 0.1534481644630432, "memory_gb": 7.721559524536133, "step_time_ms": 7489.373445510864, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:54:40] (step=0004052) Train Loss: 0.1718, Train Steps/Sec: 0.12, Epoch: 0.07874076952973183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4053, "loss": 0.23409491777420044, "memory_gb": 7.721559524536133, "step_time_ms": 7458.024263381958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:54:48] (step=0004053) Train Loss: 0.2635, Train Steps/Sec: 0.12, Epoch: 0.07876020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:54:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4054, "loss": 0.2871318459510803, "memory_gb": 7.721559524536133, "step_time_ms": 7404.232025146484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:54:56] (step=0004054) Train Loss: 0.3003, Train Steps/Sec: 0.12, Epoch: 0.07877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4055, "loss": 0.28572824597358704, "memory_gb": 7.715639114379883, "step_time_ms": 7419.559001922607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:04] (step=0004055) Train Loss: 0.2509, Train Steps/Sec: 0.12, Epoch: 0.0787990672366887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4056, "loss": 0.15983358025550842, "memory_gb": 7.721559524536133, "step_time_ms": 7447.227954864502, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:12] (step=0004056) Train Loss: 0.1821, Train Steps/Sec: 0.12, Epoch: 0.0788184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4057, "loss": 0.34782475233078003, "memory_gb": 7.721559524536133, "step_time_ms": 7403.705358505249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:20] (step=0004057) Train Loss: 0.2660, Train Steps/Sec: 0.12, Epoch: 0.07883793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4058, "loss": 0.376037061214447, "memory_gb": 7.721559524536133, "step_time_ms": 7332.7460289001465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:28] (step=0004058) Train Loss: 0.2761, Train Steps/Sec: 0.13, Epoch: 0.07885736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4059, "loss": 0.20783869922161102, "memory_gb": 7.721559524536133, "step_time_ms": 7474.526166915894, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:36] (step=0004059) Train Loss: 0.2162, Train Steps/Sec: 0.12, Epoch: 0.07887679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4060, "loss": 0.24507565796375275, "memory_gb": 7.721559524536133, "step_time_ms": 5447.114944458008, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:42] (step=0004060) Train Loss: 0.2458, Train Steps/Sec: 0.17, Epoch: 0.0788962300816168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4061, "loss": 0.24775710701942444, "memory_gb": 7.721559524536133, "step_time_ms": 7461.800098419189, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:50] (step=0004061) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.0789156626506024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:55:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4062, "loss": 0.3087562322616577, "memory_gb": 7.721559524536133, "step_time_ms": 7393.915414810181, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:55:58] (step=0004062) Train Loss: 0.2491, Train Steps/Sec: 0.12, Epoch: 0.07893509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:56:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4063, "loss": 0.3124343752861023, "memory_gb": 7.721559524536133, "step_time_ms": 7447.075843811035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:56:06] (step=0004063) Train Loss: 0.2748, Train Steps/Sec: 0.12, Epoch: 0.07895452778857365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:56:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4064, "loss": 0.2858063280582428, "memory_gb": 7.721559524536133, "step_time_ms": 7481.240272521973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:56:14] (step=0004064) Train Loss: 0.2690, Train Steps/Sec: 0.12, Epoch: 0.07897396035755927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:56:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4065, "loss": 0.1900252401828766, "memory_gb": 7.721559524536133, "step_time_ms": 7413.043737411499, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:56:22] (step=0004065) Train Loss: 0.2651, Train Steps/Sec: 0.13, Epoch: 0.07899339292654489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:56:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4066, "loss": 0.2236599326133728, "memory_gb": 7.721559524536133, "step_time_ms": 7419.45481300354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:56:30] (step=0004066) Train Loss: 0.1737, Train Steps/Sec: 0.13, Epoch: 0.07901282549553051, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:56:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4067, "loss": 0.32085177302360535, "memory_gb": 7.721559524536133, "step_time_ms": 7496.506929397583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:56:38] (step=0004067) Train Loss: 0.2834, Train Steps/Sec: 0.12, Epoch: 0.07903225806451612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:56:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4068, "loss": 0.3163086771965027, "memory_gb": 7.721559524536133, "step_time_ms": 7445.082187652588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:56:46] (step=0004068) Train Loss: 0.3085, Train Steps/Sec: 0.13, Epoch: 0.07905169063350174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:56:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4069, "loss": 0.2664092183113098, "memory_gb": 7.721559524536133, "step_time_ms": 7440.198659896851, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:56:54] (step=0004069) Train Loss: 0.2748, Train Steps/Sec: 0.13, Epoch: 0.07907112320248737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4070, "loss": 0.21725237369537354, "memory_gb": 7.721559524536133, "step_time_ms": 7481.302499771118, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:02] (step=0004070) Train Loss: 0.2517, Train Steps/Sec: 0.12, Epoch: 0.07909055577147299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4071, "loss": 0.18616153299808502, "memory_gb": 7.721559524536133, "step_time_ms": 7442.115545272827, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:10] (step=0004071) Train Loss: 0.2372, Train Steps/Sec: 0.13, Epoch: 0.07910998834045861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4072, "loss": 0.27590644359588623, "memory_gb": 7.721559524536133, "step_time_ms": 7496.953964233398, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:18] (step=0004072) Train Loss: 0.2236, Train Steps/Sec: 0.12, Epoch: 0.07912942090944423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4073, "loss": 0.24060086905956268, "memory_gb": 7.721559524536133, "step_time_ms": 7509.244203567505, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:26] (step=0004073) Train Loss: 0.2351, Train Steps/Sec: 0.12, Epoch: 0.07914885347842984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4074, "loss": 0.34945613145828247, "memory_gb": 7.721559524536133, "step_time_ms": 7473.149538040161, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:34] (step=0004074) Train Loss: 0.3211, Train Steps/Sec: 0.13, Epoch: 0.07916828604741546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4075, "loss": 0.25523507595062256, "memory_gb": 7.721559524536133, "step_time_ms": 7472.372055053711, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:42] (step=0004075) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.07918771861640109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4076, "loss": 0.24181431531906128, "memory_gb": 7.721559524536133, "step_time_ms": 7662.530422210693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:50] (step=0004076) Train Loss: 0.2187, Train Steps/Sec: 0.12, Epoch: 0.07920715118538671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:57:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4077, "loss": 0.2294107973575592, "memory_gb": 7.721559524536133, "step_time_ms": 7236.325025558472, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:57:58] (step=0004077) Train Loss: 0.2743, Train Steps/Sec: 0.13, Epoch: 0.07922658375437233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4078, "loss": 0.30295825004577637, "memory_gb": 7.721559524536133, "step_time_ms": 7472.657918930054, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:58:06] (step=0004078) Train Loss: 0.2570, Train Steps/Sec: 0.12, Epoch: 0.07924601632335795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:58:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4079, "loss": 0.33427494764328003, "memory_gb": 7.721559524536133, "step_time_ms": 7501.633882522583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:58:14] (step=0004079) Train Loss: 0.2575, Train Steps/Sec: 0.12, Epoch: 0.07926544889234356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:58:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4080, "loss": 0.20051836967468262, "memory_gb": 7.721559524536133, "step_time_ms": 7438.8861656188965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:58:22] (step=0004080) Train Loss: 0.2323, Train Steps/Sec: 0.12, Epoch: 0.07928488146132918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:58:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4081, "loss": 0.3070297837257385, "memory_gb": 7.721559524536133, "step_time_ms": 7454.071521759033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:58:30] (step=0004081) Train Loss: 0.2731, Train Steps/Sec: 0.13, Epoch: 0.07930431403031481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4082, "loss": 0.26521921157836914, "memory_gb": 7.721559524536133, "step_time_ms": 7487.795829772949, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:58:38] (step=0004082) Train Loss: 0.2598, Train Steps/Sec: 0.12, Epoch: 0.07932374659930043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4083, "loss": 0.20458173751831055, "memory_gb": 7.721559524536133, "step_time_ms": 7420.7603931427, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:58:46] (step=0004083) Train Loss: 0.1928, Train Steps/Sec: 0.12, Epoch: 0.07934317916828605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:58:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4084, "loss": 0.2841864228248596, "memory_gb": 7.721559524536133, "step_time_ms": 7471.124649047852, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:58:54] (step=0004084) Train Loss: 0.2621, Train Steps/Sec: 0.13, Epoch: 0.07936261173727167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4085, "loss": 0.2847479581832886, "memory_gb": 7.721559524536133, "step_time_ms": 7516.231060028076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:02] (step=0004085) Train Loss: 0.2516, Train Steps/Sec: 0.12, Epoch: 0.07938204430625728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4086, "loss": 0.23316268622875214, "memory_gb": 7.721559524536133, "step_time_ms": 7447.012186050415, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:10] (step=0004086) Train Loss: 0.2298, Train Steps/Sec: 0.12, Epoch: 0.0794014768752429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4087, "loss": 0.24235600233078003, "memory_gb": 7.721559524536133, "step_time_ms": 7364.049911499023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:18] (step=0004087) Train Loss: 0.2492, Train Steps/Sec: 0.13, Epoch: 0.07942090944422853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4088, "loss": 0.3011642098426819, "memory_gb": 7.721559524536133, "step_time_ms": 7553.236484527588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:26] (step=0004088) Train Loss: 0.2710, Train Steps/Sec: 0.12, Epoch: 0.07944034201321415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4089, "loss": 0.2770613431930542, "memory_gb": 7.721559524536133, "step_time_ms": 5063.262939453125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:32] (step=0004089) Train Loss: 0.3227, Train Steps/Sec: 0.18, Epoch: 0.07945977458219977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4090, "loss": 0.1863703727722168, "memory_gb": 7.721559524536133, "step_time_ms": 7516.218662261963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:40] (step=0004090) Train Loss: 0.2440, Train Steps/Sec: 0.13, Epoch: 0.07947920715118538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4091, "loss": 0.19912025332450867, "memory_gb": 7.721559524536133, "step_time_ms": 7419.753313064575, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:48] (step=0004091) Train Loss: 0.2008, Train Steps/Sec: 0.12, Epoch: 0.079498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 02:59:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4092, "loss": 0.17572365701198578, "memory_gb": 7.721559524536133, "step_time_ms": 7471.979141235352, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 02:59:56] (step=0004092) Train Loss: 0.1756, Train Steps/Sec: 0.12, Epoch: 0.07951807228915662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4093, "loss": 0.12102517485618591, "memory_gb": 7.721559524536133, "step_time_ms": 7527.66752243042, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:00:04] (step=0004093) Train Loss: 0.1828, Train Steps/Sec: 0.12, Epoch: 0.07953750485814225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:00:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4094, "loss": 0.23415660858154297, "memory_gb": 7.721559524536133, "step_time_ms": 7468.377113342285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:00:12] (step=0004094) Train Loss: 0.2172, Train Steps/Sec: 0.13, Epoch: 0.07955693742712787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:00:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4095, "loss": 0.1965639740228653, "memory_gb": 7.721559524536133, "step_time_ms": 7532.99617767334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:00:20] (step=0004095) Train Loss: 0.2467, Train Steps/Sec: 0.13, Epoch: 0.07957636999611349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:00:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4096, "loss": 0.21449948847293854, "memory_gb": 7.721559524536133, "step_time_ms": 7489.980697631836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:00:28] (step=0004096) Train Loss: 0.2129, Train Steps/Sec: 0.12, Epoch: 0.0795958025650991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:00:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4097, "loss": 0.2919987142086029, "memory_gb": 7.721559524536133, "step_time_ms": 7442.969083786011, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:00:36] (step=0004097) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.07961523513408472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:00:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4098, "loss": 0.19508755207061768, "memory_gb": 7.721559524536133, "step_time_ms": 7457.192420959473, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:00:44] (step=0004098) Train Loss: 0.2747, Train Steps/Sec: 0.13, Epoch: 0.07963466770307034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:00:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4099, "loss": 0.23857465386390686, "memory_gb": 7.721559524536133, "step_time_ms": 7414.68071937561, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:00:52] (step=0004099) Train Loss: 0.2692, Train Steps/Sec: 0.13, Epoch: 0.07965410027205597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4100, "loss": 0.20721334218978882, "memory_gb": 7.721559524536133, "step_time_ms": 7405.450105667114, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:00] (step=0004100) Train Loss: 0.1697, Train Steps/Sec: 0.13, Epoch: 0.07967353284104159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4101, "loss": 0.23199833929538727, "memory_gb": 7.721559524536133, "step_time_ms": 7506.179332733154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:08] (step=0004101) Train Loss: 0.2015, Train Steps/Sec: 0.12, Epoch: 0.07969296541002721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4102, "loss": 0.3246292471885681, "memory_gb": 7.721559524536133, "step_time_ms": 7537.466049194336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:16] (step=0004102) Train Loss: 0.2183, Train Steps/Sec: 0.12, Epoch: 0.07971239797901282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4103, "loss": 0.2501201033592224, "memory_gb": 7.721559524536133, "step_time_ms": 7445.016384124756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:24] (step=0004103) Train Loss: 0.2479, Train Steps/Sec: 0.13, Epoch: 0.07973183054799844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4104, "loss": 0.24398772418498993, "memory_gb": 7.721559524536133, "step_time_ms": 7526.248455047607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:32] (step=0004104) Train Loss: 0.1957, Train Steps/Sec: 0.12, Epoch: 0.07975126311698406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4105, "loss": 0.2078322023153305, "memory_gb": 7.721559524536133, "step_time_ms": 7492.8343296051025, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:40] (step=0004105) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.07977069568596969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4106, "loss": 0.22472091019153595, "memory_gb": 7.721559524536133, "step_time_ms": 7397.934913635254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:48] (step=0004106) Train Loss: 0.2576, Train Steps/Sec: 0.13, Epoch: 0.07979012825495531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:01:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4107, "loss": 0.2158975601196289, "memory_gb": 7.721559524536133, "step_time_ms": 7425.6272315979, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:01:56] (step=0004107) Train Loss: 0.2738, Train Steps/Sec: 0.13, Epoch: 0.07980956082394093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:02:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4108, "loss": 0.22304829955101013, "memory_gb": 7.721559524536133, "step_time_ms": 7477.234840393066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:02:04] (step=0004108) Train Loss: 0.2456, Train Steps/Sec: 0.13, Epoch: 0.07982899339292654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:02:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4109, "loss": 0.2699939012527466, "memory_gb": 7.721559524536133, "step_time_ms": 7453.9501667022705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:02:12] (step=0004109) Train Loss: 0.2662, Train Steps/Sec: 0.13, Epoch: 0.07984842596191216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4110, "loss": 0.22104071080684662, "memory_gb": 7.721559524536133, "step_time_ms": 7479.301452636719, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:02:20] (step=0004110) Train Loss: 0.2292, Train Steps/Sec: 0.12, Epoch: 0.07986785853089778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:02:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4111, "loss": 0.28613823652267456, "memory_gb": 7.721559524536133, "step_time_ms": 7477.025508880615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:02:28] (step=0004111) Train Loss: 0.2729, Train Steps/Sec: 0.13, Epoch: 0.0798872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:02:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4112, "loss": 0.25607195496559143, "memory_gb": 7.721559524536133, "step_time_ms": 7426.518440246582, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:02:36] (step=0004112) Train Loss: 0.1995, Train Steps/Sec: 0.13, Epoch: 0.07990672366886903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:02:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4113, "loss": 0.23477129638195038, "memory_gb": 7.721559524536133, "step_time_ms": 7441.338777542114, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:02:44] (step=0004113) Train Loss: 0.1973, Train Steps/Sec: 0.13, Epoch: 0.07992615623785465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:02:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4114, "loss": 0.3018319010734558, "memory_gb": 7.721559524536133, "step_time_ms": 7523.104906082153, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:02:52] (step=0004114) Train Loss: 0.2733, Train Steps/Sec: 0.12, Epoch: 0.07994558880684026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4115, "loss": 0.37098073959350586, "memory_gb": 7.721559524536133, "step_time_ms": 7415.789842605591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:00] (step=0004115) Train Loss: 0.2884, Train Steps/Sec: 0.13, Epoch: 0.07996502137582588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4116, "loss": 0.33122631907463074, "memory_gb": 7.721559524536133, "step_time_ms": 7340.9388065338135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:08] (step=0004116) Train Loss: 0.2777, Train Steps/Sec: 0.13, Epoch: 0.0799844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4117, "loss": 0.2733907103538513, "memory_gb": 7.721559524536133, "step_time_ms": 7535.950660705566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:16] (step=0004117) Train Loss: 0.2627, Train Steps/Sec: 0.12, Epoch: 0.08000388651379713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4118, "loss": 0.22896859049797058, "memory_gb": 7.721559524536133, "step_time_ms": 5417.3431396484375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:22] (step=0004118) Train Loss: 0.2477, Train Steps/Sec: 0.17, Epoch: 0.08002331908278275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4119, "loss": 0.24666696786880493, "memory_gb": 7.721559524536133, "step_time_ms": 7481.3573360443115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:30] (step=0004119) Train Loss: 0.2562, Train Steps/Sec: 0.12, Epoch: 0.08004275165176837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4120, "loss": 0.3099837005138397, "memory_gb": 7.721559524536133, "step_time_ms": 7412.322282791138, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:38] (step=0004120) Train Loss: 0.3031, Train Steps/Sec: 0.12, Epoch: 0.08006218422075398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4121, "loss": 0.1654963493347168, "memory_gb": 7.721559524536133, "step_time_ms": 7444.458961486816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:46] (step=0004121) Train Loss: 0.1713, Train Steps/Sec: 0.12, Epoch: 0.0800816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:03:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4122, "loss": 0.2672466039657593, "memory_gb": 7.721559524536133, "step_time_ms": 7493.580341339111, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:03:54] (step=0004122) Train Loss: 0.2624, Train Steps/Sec: 0.12, Epoch: 0.08010104935872522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4123, "loss": 0.19776089489459991, "memory_gb": 7.721559524536133, "step_time_ms": 7636.5978717803955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:02] (step=0004123) Train Loss: 0.1938, Train Steps/Sec: 0.12, Epoch: 0.08012048192771085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4124, "loss": 0.2639446258544922, "memory_gb": 7.721559524536133, "step_time_ms": 7467.118978500366, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:10] (step=0004124) Train Loss: 0.3028, Train Steps/Sec: 0.12, Epoch: 0.08013991449669647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4125, "loss": 0.21411101520061493, "memory_gb": 7.721559524536133, "step_time_ms": 7525.865793228149, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:18] (step=0004125) Train Loss: 0.2166, Train Steps/Sec: 0.12, Epoch: 0.08015934706568208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4126, "loss": 0.16752371191978455, "memory_gb": 7.721559524536133, "step_time_ms": 7477.489709854126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:26] (step=0004126) Train Loss: 0.2460, Train Steps/Sec: 0.13, Epoch: 0.0801787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4127, "loss": 0.26033955812454224, "memory_gb": 7.721559524536133, "step_time_ms": 7509.980201721191, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:34] (step=0004127) Train Loss: 0.2680, Train Steps/Sec: 0.13, Epoch: 0.08019821220365332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4128, "loss": 0.13838431239128113, "memory_gb": 7.721559524536133, "step_time_ms": 7581.022500991821, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:42] (step=0004128) Train Loss: 0.1737, Train Steps/Sec: 0.13, Epoch: 0.08021764477263894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4129, "loss": 0.311795711517334, "memory_gb": 7.721559524536133, "step_time_ms": 7506.49881362915, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:50] (step=0004129) Train Loss: 0.2184, Train Steps/Sec: 0.13, Epoch: 0.08023707734162457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:04:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4130, "loss": 0.3361005187034607, "memory_gb": 7.721559524536133, "step_time_ms": 7512.49623298645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:04:58] (step=0004130) Train Loss: 0.3036, Train Steps/Sec: 0.13, Epoch: 0.08025650991061019, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:05:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4131, "loss": 0.2680862247943878, "memory_gb": 7.721559524536133, "step_time_ms": 7617.343902587891, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:05:06] (step=0004131) Train Loss: 0.2567, Train Steps/Sec: 0.12, Epoch: 0.0802759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:05:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4132, "loss": 0.22261476516723633, "memory_gb": 7.721559524536133, "step_time_ms": 7528.95188331604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:05:14] (step=0004132) Train Loss: 0.2634, Train Steps/Sec: 0.12, Epoch: 0.08029537504858142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4133, "loss": 0.16574391722679138, "memory_gb": 7.721559524536133, "step_time_ms": 7502.315044403076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:05:22] (step=0004133) Train Loss: 0.2218, Train Steps/Sec: 0.12, Epoch: 0.08031480761756704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:05:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4134, "loss": 0.26010841131210327, "memory_gb": 7.721559524536133, "step_time_ms": 7593.122482299805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:05:30] (step=0004134) Train Loss: 0.3102, Train Steps/Sec: 0.12, 
Epoch: 0.08033424018655266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:05:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4135, "loss": 0.26000237464904785, "memory_gb": 7.721559524536133, "step_time_ms": 7513.484954833984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:05:38] (step=0004135) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.08035367275553829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:05:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4136, "loss": 0.2172047346830368, "memory_gb": 7.721559524536133, "step_time_ms": 7498.260259628296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:05:46] (step=0004136) Train Loss: 0.2083, Train Steps/Sec: 0.12, Epoch: 0.08037310532452391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:05:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4137, "loss": 0.264595627784729, "memory_gb": 7.721559524536133, "step_time_ms": 7551.582813262939, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:05:54] (step=0004137) Train Loss: 0.2719, Train Steps/Sec: 0.12, Epoch: 0.08039253789350952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4138, "loss": 0.17431221902370453, "memory_gb": 7.721559524536133, "step_time_ms": 7500.44846534729, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:02] (step=0004138) Train Loss: 0.2289, Train Steps/Sec: 0.12, Epoch: 0.08041197046249514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4139, "loss": 0.2145467847585678, "memory_gb": 7.721559524536133, "step_time_ms": 7516.522407531738, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:10] (step=0004139) Train Loss: 0.2351, Train Steps/Sec: 0.12, Epoch: 0.08043140303148076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4140, "loss": 0.31655755639076233, 
"memory_gb": 7.721559524536133, "step_time_ms": 7538.913726806641, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:19] (step=0004140) Train Loss: 0.2452, Train Steps/Sec: 0.12, Epoch: 0.08045083560046638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4141, "loss": 0.2121269404888153, "memory_gb": 7.721559524536133, "step_time_ms": 7469.923496246338, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:27] (step=0004141) Train Loss: 0.2362, Train Steps/Sec: 0.12, Epoch: 0.080470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4142, "loss": 0.2849580645561218, "memory_gb": 7.721559524536133, "step_time_ms": 7492.115020751953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:35] (step=0004142) Train Loss: 0.2616, Train Steps/Sec: 0.12, Epoch: 0.08048970073843763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4143, "loss": 0.13888613879680634, "memory_gb": 7.721559524536133, "step_time_ms": 7531.9459438323975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:43] (step=0004143) Train Loss: 0.2146, Train Steps/Sec: 0.12, Epoch: 0.08050913330742324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4144, "loss": 0.20765864849090576, "memory_gb": 7.721559524536133, "step_time_ms": 7445.617914199829, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:51] (step=0004144) Train Loss: 0.2049, Train Steps/Sec: 0.12, Epoch: 0.08052856587640886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4145, "loss": 0.27738672494888306, "memory_gb": 7.721559524536133, "step_time_ms": 7314.0411376953125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:06:59] (step=0004145) Train Loss: 0.2759, Train 
Steps/Sec: 0.13, Epoch: 0.08054799844539448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:07:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4146, "loss": 0.316834032535553, "memory_gb": 7.721559524536133, "step_time_ms": 7489.900827407837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:07:07] (step=0004146) Train Loss: 0.2308, Train Steps/Sec: 0.12, Epoch: 0.0805674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:07:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4147, "loss": 0.354121595621109, "memory_gb": 7.721559524536133, "step_time_ms": 5048.907041549683, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:07:12] (step=0004147) Train Loss: 0.2997, Train Steps/Sec: 0.18, Epoch: 0.08058686358336573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:07:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4148, "loss": 0.2625115215778351, "memory_gb": 7.721559524536133, "step_time_ms": 7468.663215637207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:07:20] (step=0004148) Train Loss: 0.2498, Train Steps/Sec: 0.13, Epoch: 0.08060629615235135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:07:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4149, "loss": 0.15766820311546326, "memory_gb": 7.721559524536133, "step_time_ms": 7421.824216842651, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:07:28] (step=0004149) Train Loss: 0.2223, Train Steps/Sec: 0.13, Epoch: 0.08062572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:07:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4150, "loss": 0.28566545248031616, "memory_gb": 7.721559524536133, "step_time_ms": 7487.6062870025635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:07:36] (step=0004150) Train Loss: 0.2870, Train Steps/Sec: 0.12, Epoch: 0.08064516129032258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:07:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4151, "loss": 
0.1825573444366455, "memory_gb": 7.721559524536133, "step_time_ms": 7505.166530609131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:07:44] (step=0004151) Train Loss: 0.2464, Train Steps/Sec: 0.12, Epoch: 0.0806645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:07:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4152, "loss": 0.34617719054222107, "memory_gb": 7.721559524536133, "step_time_ms": 7430.725574493408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:07:52] (step=0004152) Train Loss: 0.3125, Train Steps/Sec: 0.13, Epoch: 0.08068402642829382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4153, "loss": 0.20576141774654388, "memory_gb": 7.721559524536133, "step_time_ms": 7462.761402130127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:00] (step=0004153) Train Loss: 0.2632, Train Steps/Sec: 0.13, Epoch: 0.08070345899727945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4154, "loss": 0.34427833557128906, "memory_gb": 7.721559524536133, "step_time_ms": 7450.956106185913, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:08] (step=0004154) Train Loss: 0.2842, Train Steps/Sec: 0.13, Epoch: 0.08072289156626505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4155, "loss": 0.13174934685230255, "memory_gb": 7.721559524536133, "step_time_ms": 7462.278604507446, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:16] (step=0004155) Train Loss: 0.2197, Train Steps/Sec: 0.13, Epoch: 0.08074232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4156, "loss": 0.1716352254152298, "memory_gb": 7.721559524536133, "step_time_ms": 7420.026779174805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:24] (step=0004156) Train 
Loss: 0.2279, Train Steps/Sec: 0.13, Epoch: 0.0807617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4157, "loss": 0.1881839483976364, "memory_gb": 7.721559524536133, "step_time_ms": 7464.730739593506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:32] (step=0004157) Train Loss: 0.2091, Train Steps/Sec: 0.13, Epoch: 0.08078118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4158, "loss": 0.3117012679576874, "memory_gb": 7.721559524536133, "step_time_ms": 7412.226676940918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:40] (step=0004158) Train Loss: 0.2832, Train Steps/Sec: 0.13, Epoch: 0.08080062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4159, "loss": 0.1422755867242813, "memory_gb": 7.721559524536133, "step_time_ms": 7456.410646438599, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:48] (step=0004159) Train Loss: 0.2118, Train Steps/Sec: 0.13, Epoch: 0.08082005441119317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:08:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4160, "loss": 0.33577263355255127, "memory_gb": 7.721559524536133, "step_time_ms": 7453.714609146118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:08:56] (step=0004160) Train Loss: 0.3235, Train Steps/Sec: 0.13, Epoch: 0.08083948698017877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:09:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4161, "loss": 0.2267504334449768, "memory_gb": 7.721559524536133, "step_time_ms": 7360.061168670654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:09:04] (step=0004161) Train Loss: 0.2557, Train Steps/Sec: 0.13, Epoch: 0.0808589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:09:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4162, 
"loss": 0.20428213477134705, "memory_gb": 7.721559524536133, "step_time_ms": 7415.8265590667725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:09:12] (step=0004162) Train Loss: 0.2434, Train Steps/Sec: 0.13, Epoch: 0.08087835211815002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:09:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4163, "loss": 0.36259323358535767, "memory_gb": 7.721559524536133, "step_time_ms": 7552.872657775879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:09:20] (step=0004163) Train Loss: 0.3384, Train Steps/Sec: 0.12, Epoch: 0.08089778468713564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:09:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4164, "loss": 0.2098638117313385, "memory_gb": 7.721559524536133, "step_time_ms": 7595.813989639282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:09:28] (step=0004164) Train Loss: 0.2333, Train Steps/Sec: 0.13, Epoch: 0.08091721725612126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:09:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4165, "loss": 0.2690773606300354, "memory_gb": 7.721559524536133, "step_time_ms": 7393.437623977661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:09:36] (step=0004165) Train Loss: 0.2042, Train Steps/Sec: 0.13, Epoch: 0.08093664982510688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:09:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4166, "loss": 0.29170897603034973, "memory_gb": 7.721559524536133, "step_time_ms": 7484.897613525391, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:09:44] (step=0004166) Train Loss: 0.2641, Train Steps/Sec: 0.12, Epoch: 0.0809560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:09:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4167, "loss": 0.2997320294380188, "memory_gb": 7.721559524536133, "step_time_ms": 7513.3421421051025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:09:52] 
(step=0004167) Train Loss: 0.2996, Train Steps/Sec: 0.12, Epoch: 0.08097551496307812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4168, "loss": 0.2860199511051178, "memory_gb": 7.721559524536133, "step_time_ms": 7483.805894851685, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:00] (step=0004168) Train Loss: 0.3135, Train Steps/Sec: 0.12, Epoch: 0.08099494753206374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4169, "loss": 0.2680911123752594, "memory_gb": 7.721559524536133, "step_time_ms": 7544.262409210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:08] (step=0004169) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.08101438010104936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4170, "loss": 0.15695148706436157, "memory_gb": 7.721559524536133, "step_time_ms": 7509.272575378418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:16] (step=0004170) Train Loss: 0.2036, Train Steps/Sec: 0.12, Epoch: 0.08103381267003498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4171, "loss": 0.24706797301769257, "memory_gb": 7.721559524536133, "step_time_ms": 7454.955101013184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:24] (step=0004171) Train Loss: 0.2782, Train Steps/Sec: 0.12, Epoch: 0.0810532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4172, "loss": 0.282697856426239, "memory_gb": 7.721559524536133, "step_time_ms": 7495.309829711914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:32] (step=0004172) Train Loss: 0.2385, Train Steps/Sec: 0.12, Epoch: 0.08107267780800621, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:40] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 4173, "loss": 0.21215929090976715, "memory_gb": 7.721559524536133, "step_time_ms": 7513.314962387085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:40] (step=0004173) Train Loss: 0.2659, Train Steps/Sec: 0.12, Epoch: 0.08109211037699184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4174, "loss": 0.3062252700328827, "memory_gb": 7.721559524536133, "step_time_ms": 7321.530103683472, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:48] (step=0004174) Train Loss: 0.2434, Train Steps/Sec: 0.13, Epoch: 0.08111154294597746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:10:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4175, "loss": 0.2784934639930725, "memory_gb": 7.721559524536133, "step_time_ms": 7571.809768676758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:10:56] (step=0004175) Train Loss: 0.2133, Train Steps/Sec: 0.12, Epoch: 0.08113097551496308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:11:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4176, "loss": 0.25561797618865967, "memory_gb": 7.721559524536133, "step_time_ms": 5105.376720428467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:11:02] (step=0004176) Train Loss: 0.2111, Train Steps/Sec: 0.18, Epoch: 0.0811504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:11:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4177, "loss": 0.25532981753349304, "memory_gb": 7.721559524536133, "step_time_ms": 7610.764980316162, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:11:10] (step=0004177) Train Loss: 0.2575, Train Steps/Sec: 0.12, Epoch: 0.08116984065293432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:11:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4178, "loss": 0.21538513898849487, "memory_gb": 7.721559524536133, "step_time_ms": 7544.842720031738, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 03:11:18] (step=0004178) Train Loss: 0.2386, Train Steps/Sec: 0.13, Epoch: 0.08118927322191993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:11:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4179, "loss": 0.25511109828948975, "memory_gb": 7.721559524536133, "step_time_ms": 7501.983642578125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:11:26] (step=0004179) Train Loss: 0.2357, Train Steps/Sec: 0.13, Epoch: 0.08120870579090556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:11:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4180, "loss": 0.24796243011951447, "memory_gb": 7.721559524536133, "step_time_ms": 7639.425039291382, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:11:34] (step=0004180) Train Loss: 0.2454, Train Steps/Sec: 0.12, Epoch: 0.08122813835989118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:11:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4181, "loss": 0.2979927062988281, "memory_gb": 7.721559524536133, "step_time_ms": 7562.77322769165, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:11:42] (step=0004181) Train Loss: 0.2533, Train Steps/Sec: 0.12, Epoch: 0.0812475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:11:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4182, "loss": 0.20144608616828918, "memory_gb": 7.721559524536133, "step_time_ms": 7567.337274551392, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:11:50] (step=0004182) Train Loss: 0.2103, Train Steps/Sec: 0.12, Epoch: 0.08126700349786242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:11:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4183, "loss": 0.36707985401153564, "memory_gb": 7.721559524536133, "step_time_ms": 7549.3810176849365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:11:58] (step=0004183) Train Loss: 0.3023, Train Steps/Sec: 0.12, Epoch: 0.08128643606684803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:12:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4184, "loss": 0.26595550775527954, "memory_gb": 7.721559524536133, "step_time_ms": 7552.759885787964, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:12:06] (step=0004184) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.08130586863583365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:12:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4185, "loss": 0.2586755156517029, "memory_gb": 7.721559524536133, "step_time_ms": 7453.737020492554, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:12:14] (step=0004185) Train Loss: 0.2656, Train Steps/Sec: 0.12, Epoch: 0.08132530120481928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:12:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4186, "loss": 0.2997041344642639, "memory_gb": 7.721559524536133, "step_time_ms": 7515.00940322876, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:12:22] (step=0004186) Train Loss: 0.2632, Train Steps/Sec: 0.12, Epoch: 0.0813447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:12:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4187, "loss": 0.19058465957641602, "memory_gb": 7.715639114379883, "step_time_ms": 7451.25412940979, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:12:30] (step=0004187) Train Loss: 0.2789, Train Steps/Sec: 0.12, Epoch: 0.08136416634279052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:12:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4188, "loss": 0.28474318981170654, "memory_gb": 7.721559524536133, "step_time_ms": 7422.7705001831055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:12:38] (step=0004188) Train Loss: 0.2922, Train Steps/Sec: 0.12, Epoch: 0.08138359891177614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:12:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4189, "loss": 0.1867961585521698, "memory_gb": 7.721559524536133, "step_time_ms": 7481.903791427612, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:12:46] (step=0004189) Train Loss: 0.2165, Train Steps/Sec: 0.12, Epoch: 0.08140303148076175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:12:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4190, "loss": 0.2646065354347229, "memory_gb": 7.721559524536133, "step_time_ms": 7470.7581996917725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:12:54] (step=0004190) Train Loss: 0.2691, Train Steps/Sec: 0.12, Epoch: 0.08142246404974737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4191, "loss": 0.24292364716529846, "memory_gb": 7.721559524536133, "step_time_ms": 7364.668130874634, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:02] (step=0004191) Train Loss: 0.2246, Train Steps/Sec: 0.13, Epoch: 0.081441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4192, "loss": 0.25195738673210144, "memory_gb": 7.721559524536133, "step_time_ms": 7272.855043411255, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:10] (step=0004192) Train Loss: 0.2030, Train Steps/Sec: 0.12, Epoch: 0.08146132918771862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4193, "loss": 0.17608997225761414, "memory_gb": 7.721559524536133, "step_time_ms": 7451.490879058838, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:18] (step=0004193) Train Loss: 0.2294, Train Steps/Sec: 0.13, Epoch: 0.08148076175670424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4194, "loss": 0.3419552445411682, "memory_gb": 7.721559524536133, "step_time_ms": 7407.485723495483, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:26] (step=0004194) Train Loss: 0.2499, Train Steps/Sec: 0.13, Epoch: 0.08150019432568986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4195, "loss": 0.11429786682128906, "memory_gb": 7.721559524536133, "step_time_ms": 7459.4972133636475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:34] (step=0004195) Train Loss: 0.1200, Train Steps/Sec: 0.13, Epoch: 0.08151962689467547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4196, "loss": 0.12227702140808105, "memory_gb": 7.721559524536133, "step_time_ms": 7481.269598007202, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:42] (step=0004196) Train Loss: 0.1824, Train Steps/Sec: 0.13, Epoch: 0.08153905946366109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4197, "loss": 0.172019362449646, "memory_gb": 7.721559524536133, "step_time_ms": 7425.087451934814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:50] (step=0004197) Train Loss: 0.1781, Train Steps/Sec: 0.13, Epoch: 0.08155849203264671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:13:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4198, "loss": 0.3306633234024048, "memory_gb": 7.721559524536133, "step_time_ms": 7460.024833679199, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:13:58] (step=0004198) Train Loss: 0.2848, Train Steps/Sec: 0.12, Epoch: 0.08157792460163234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:14:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4199, "loss": 0.2842916250228882, "memory_gb": 7.721559524536133, "step_time_ms": 7490.095376968384, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:14:06] (step=0004199) Train Loss: 0.2606, Train Steps/Sec: 0.12, Epoch: 0.08159735717061796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:14:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4200, "loss": 0.22779321670532227, "memory_gb": 7.721559524536133, "step_time_ms": 7419.274091720581, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:14:14] (step=0004200) Train Loss: 0.2394, Train Steps/Sec: 0.13, Epoch: 0.08161678973960358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:14:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4201, "loss": 0.20891724526882172, "memory_gb": 7.721559524536133, "step_time_ms": 7425.222396850586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:14:22] (step=0004201) Train Loss: 0.2321, Train Steps/Sec: 0.13, Epoch: 0.08163622230858919, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:14:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4202, "loss": 0.2484395056962967, "memory_gb": 7.721559524536133, "step_time_ms": 7473.259210586548, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:14:30] (step=0004202) Train Loss: 0.2559, Train Steps/Sec: 0.12, Epoch: 0.08165565487757481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:14:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4203, "loss": 0.23100519180297852, "memory_gb": 7.721559524536133, "step_time_ms": 7270.897150039673, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:14:38] (step=0004203) Train Loss: 0.2322, Train Steps/Sec: 0.12, Epoch: 0.08167508744656043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:14:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4204, "loss": 0.3183366656303406, "memory_gb": 7.721559524536133, "step_time_ms": 6715.760707855225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:14:45] (step=0004204) Train Loss: 0.2584, Train Steps/Sec: 0.14, Epoch: 0.08169452001554606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:14:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4205, "loss": 0.22996678948402405, "memory_gb": 7.721559524536133, "step_time_ms": 6116.291522979736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:14:52] (step=0004205) Train Loss: 0.2074, Train Steps/Sec: 0.14, Epoch: 0.08171395258453168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4206, "loss": 0.30133187770843506, "memory_gb": 7.721559524536133, "step_time_ms": 7427.915096282959, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:00] (step=0004206) Train Loss: 0.2678, Train Steps/Sec: 0.13, Epoch: 0.0817333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4207, "loss": 0.1907477080821991, "memory_gb": 7.721559524536133, "step_time_ms": 7510.772466659546, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:08] (step=0004207) Train Loss: 0.1982, Train Steps/Sec: 0.12, Epoch: 0.08175281772250291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4208, "loss": 0.2717151641845703, "memory_gb": 7.721559524536133, "step_time_ms": 7423.728466033936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:16] (step=0004208) Train Loss: 0.2665, Train Steps/Sec: 0.13, Epoch: 0.08177225029148853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4209, "loss": 0.2660340666770935, "memory_gb": 7.721559524536133, "step_time_ms": 7522.174835205078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:24] (step=0004209) Train Loss: 0.2167, Train Steps/Sec: 0.12, Epoch: 0.08179168286047415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4210, "loss": 0.2272334098815918, "memory_gb": 7.721559524536133, "step_time_ms": 7546.451807022095, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:32] (step=0004210) Train Loss: 0.2852, Train Steps/Sec: 0.12, Epoch: 0.08181111542945978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4211, "loss": 0.18229177594184875, "memory_gb": 7.721559524536133, "step_time_ms": 7464.340448379517, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:40] (step=0004211) Train Loss: 0.2416, Train Steps/Sec: 0.12, Epoch: 0.0818305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4212, "loss": 0.16919177770614624, "memory_gb": 7.721559524536133, "step_time_ms": 7631.300926208496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:48] (step=0004212) Train Loss: 0.2497, Train Steps/Sec: 0.13, Epoch: 0.08184998056743101, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:15:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4213, "loss": 0.2505947947502136, "memory_gb": 7.721559524536133, "step_time_ms": 7541.046857833862, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:15:56] (step=0004213) Train Loss: 0.2749, Train Steps/Sec: 0.13, Epoch: 0.08186941313641663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:16:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4214, "loss": 0.23519515991210938, "memory_gb": 7.721559524536133, "step_time_ms": 7220.884561538696, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:16:04] (step=0004214) Train Loss: 0.2557, Train Steps/Sec: 0.13, Epoch: 0.08188884570540225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4215, "loss": 0.1409253478050232, "memory_gb": 7.721559524536133, "step_time_ms": 7479.305982589722, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:16:12] (step=0004215) Train Loss: 0.1433, Train Steps/Sec: 0.13, Epoch: 0.08190827827438787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:16:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4216, "loss": 0.3693784177303314, "memory_gb": 7.721559524536133, "step_time_ms": 7585.7861042022705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:16:20] (step=0004216) Train Loss: 0.3020, Train Steps/Sec: 0.12, Epoch: 0.0819277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:16:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4217, "loss": 0.25188344717025757, "memory_gb": 7.721559524536133, "step_time_ms": 7499.7076988220215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:16:28] (step=0004217) Train Loss: 0.1939, Train Steps/Sec: 0.12, Epoch: 0.08194714341235912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:16:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4218, "loss": 0.3014396131038666, "memory_gb": 7.721559524536133, "step_time_ms": 7480.275869369507, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:16:36] (step=0004218) Train Loss: 0.2557, Train Steps/Sec: 0.12, Epoch: 0.08196657598134473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:16:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4219, "loss": 0.29575228691101074, "memory_gb": 7.721559524536133, "step_time_ms": 7598.771333694458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:16:45] (step=0004219) Train Loss: 0.2478, Train Steps/Sec: 0.12, Epoch: 0.08198600855033035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:16:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4220, "loss": 0.2601732611656189, "memory_gb": 7.721559524536133, "step_time_ms": 7523.321151733398, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:16:52] (step=0004220) Train Loss: 0.2289, Train Steps/Sec: 0.13, Epoch: 0.08200544111931597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:17:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4221, "loss": 0.13476119935512543, "memory_gb": 7.721559524536133, "step_time_ms": 7505.566596984863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:17:00] (step=0004221) Train Loss: 0.1803, Train Steps/Sec: 0.13, Epoch: 0.0820248736883016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:17:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4222, "loss": 0.23599296808242798, "memory_gb": 7.721559524536133, "step_time_ms": 7555.7098388671875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:17:08] (step=0004222) Train Loss: 0.2263, Train Steps/Sec: 0.12, Epoch: 0.08204430625728722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:17:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4223, "loss": 0.25526162981987, "memory_gb": 7.721559524536133, "step_time_ms": 7502.8016567230225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:17:16] (step=0004223) Train Loss: 0.3056, Train Steps/Sec: 0.12, Epoch: 0.08206373882627284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:17:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4224, "loss": 0.20818710327148438, "memory_gb": 7.721559524536133, "step_time_ms": 7496.57416343689, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:17:24] (step=0004224) Train Loss: 0.2126, Train Steps/Sec: 0.13, Epoch: 0.08208317139525845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:17:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4225, "loss": 0.17568352818489075, "memory_gb": 7.721559524536133, "step_time_ms": 7594.137907028198, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:17:32] (step=0004225) Train Loss: 0.2031, Train Steps/Sec: 0.12, Epoch: 0.08210260396424407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:17:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4226, "loss": 0.2646785378456116, "memory_gb": 7.721559524536133, "step_time_ms": 7521.415710449219, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:17:41] (step=0004226) Train Loss: 0.2209, Train Steps/Sec: 0.12, Epoch: 0.08212203653322969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:17:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4227, "loss": 0.1782497763633728, "memory_gb": 7.721559524536133, "step_time_ms": 7495.980739593506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:17:49] (step=0004227) Train Loss: 0.2106, Train Steps/Sec: 0.13, Epoch:
0.08214146910221531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:17:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4228, "loss": 0.21665726602077484, "memory_gb": 7.721559524536133, "step_time_ms": 7541.927099227905, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:17:57] (step=0004228) Train Loss: 0.2242, Train Steps/Sec: 0.12, Epoch: 0.08216090167120094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4229, "loss": 0.2165023535490036, "memory_gb": 7.721559524536133, "step_time_ms": 7487.746715545654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:05] (step=0004229) Train Loss: 0.1633, Train Steps/Sec: 0.13, Epoch: 0.08218033424018656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4230, "loss": 0.20924711227416992, "memory_gb": 7.721559524536133, "step_time_ms": 7477.0348072052, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:13] (step=0004230) Train Loss: 0.1798, Train Steps/Sec: 0.13, Epoch: 0.08219976680917217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4231, "loss": 0.26733070611953735, "memory_gb": 7.721559524536133, "step_time_ms": 7580.948829650879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:21] (step=0004231) Train Loss: 0.2676, Train Steps/Sec: 0.12, Epoch: 0.08221919937815779, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4232, "loss": 0.28841787576675415, "memory_gb": 7.721559524536133, "step_time_ms": 7315.57822227478, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:28] (step=0004232) Train Loss: 0.2395, Train Steps/Sec: 0.13, Epoch: 0.08223863194714341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4233, "loss": 0.15157654881477356, "memory_gb": 
7.721559524536133, "step_time_ms": 6601.578235626221, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:35] (step=0004233) Train Loss: 0.1669, Train Steps/Sec: 0.15, Epoch: 0.08225806451612903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4234, "loss": 0.24242419004440308, "memory_gb": 7.721559524536133, "step_time_ms": 6178.084850311279, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:42] (step=0004234) Train Loss: 0.2299, Train Steps/Sec: 0.14, Epoch: 0.08227749708511466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4235, "loss": 0.31300777196884155, "memory_gb": 7.721559524536133, "step_time_ms": 7436.34557723999, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:50] (step=0004235) Train Loss: 0.2772, Train Steps/Sec: 0.13, Epoch: 0.08229692965410028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:18:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4236, "loss": 0.20652258396148682, "memory_gb": 7.721559524536133, "step_time_ms": 7514.596223831177, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:18:58] (step=0004236) Train Loss: 0.2023, Train Steps/Sec: 0.12, Epoch: 0.08231636222308589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:19:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4237, "loss": 0.21307863295078278, "memory_gb": 7.721559524536133, "step_time_ms": 7468.66774559021, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:19:06] (step=0004237) Train Loss: 0.2595, Train Steps/Sec: 0.12, Epoch: 0.08233579479207151, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:19:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4238, "loss": 0.24297696352005005, "memory_gb": 7.721559524536133, "step_time_ms": 7414.151191711426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:19:14] (step=0004238) Train Loss: 0.2315, Train Steps/Sec: 
0.13, Epoch: 0.08235522736105713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:19:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4239, "loss": 0.2904354929924011, "memory_gb": 7.721559524536133, "step_time_ms": 7463.536024093628, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:19:22] (step=0004239) Train Loss: 0.2724, Train Steps/Sec: 0.13, Epoch: 0.08237465993004275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:19:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4240, "loss": 0.25678759813308716, "memory_gb": 7.721559524536133, "step_time_ms": 7408.11824798584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:19:30] (step=0004240) Train Loss: 0.2693, Train Steps/Sec: 0.13, Epoch: 0.08239409249902838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:19:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4241, "loss": 0.2920612096786499, "memory_gb": 7.721559524536133, "step_time_ms": 7481.911897659302, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:19:39] (step=0004241) Train Loss: 0.2419, Train Steps/Sec: 0.12, Epoch: 0.08241352506801398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:19:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4242, "loss": 0.31660282611846924, "memory_gb": 7.721559524536133, "step_time_ms": 7465.584993362427, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:19:46] (step=0004242) Train Loss: 0.2827, Train Steps/Sec: 0.13, Epoch: 0.08243295763699961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:19:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4243, "loss": 0.21131756901741028, "memory_gb": 7.721559524536133, "step_time_ms": 7423.441171646118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:19:54] (step=0004243) Train Loss: 0.2395, Train Steps/Sec: 0.13, Epoch: 0.08245239020598523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4244, "loss": 0.21167653799057007, 
"memory_gb": 7.721559524536133, "step_time_ms": 7416.611433029175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:02] (step=0004244) Train Loss: 0.2071, Train Steps/Sec: 0.13, Epoch: 0.08247182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4245, "loss": 0.17173753678798676, "memory_gb": 7.721559524536133, "step_time_ms": 7454.192638397217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:10] (step=0004245) Train Loss: 0.2005, Train Steps/Sec: 0.13, Epoch: 0.08249125534395647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4246, "loss": 0.20581594109535217, "memory_gb": 7.721559524536133, "step_time_ms": 7444.617033004761, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:18] (step=0004246) Train Loss: 0.2128, Train Steps/Sec: 0.13, Epoch: 0.0825106879129421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4247, "loss": 0.23625896871089935, "memory_gb": 7.721559524536133, "step_time_ms": 7455.894947052002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:26] (step=0004247) Train Loss: 0.2059, Train Steps/Sec: 0.12, Epoch: 0.0825301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4248, "loss": 0.26072514057159424, "memory_gb": 7.721559524536133, "step_time_ms": 7510.659217834473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:34] (step=0004248) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.08254955305091333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4249, "loss": 0.23381437361240387, "memory_gb": 7.721559524536133, "step_time_ms": 7463.767528533936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:42] (step=0004249) Train Loss: 0.2340, Train 
Steps/Sec: 0.13, Epoch: 0.08256898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4250, "loss": 0.2747531235218048, "memory_gb": 7.721559524536133, "step_time_ms": 7446.166276931763, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:50] (step=0004250) Train Loss: 0.2245, Train Steps/Sec: 0.12, Epoch: 0.08258841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:20:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4251, "loss": 0.12849056720733643, "memory_gb": 7.721559524536133, "step_time_ms": 7475.336074829102, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:20:59] (step=0004251) Train Loss: 0.1370, Train Steps/Sec: 0.12, Epoch: 0.0826078507578702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:21:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4252, "loss": 0.35882699489593506, "memory_gb": 7.721559524536133, "step_time_ms": 7590.136528015137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:21:07] (step=0004252) Train Loss: 0.2758, Train Steps/Sec: 0.12, Epoch: 0.08262728332685582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:21:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4253, "loss": 0.1842188686132431, "memory_gb": 7.721559524536133, "step_time_ms": 7425.523519515991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:21:15] (step=0004253) Train Loss: 0.2330, Train Steps/Sec: 0.13, Epoch: 0.08264671589584142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:21:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4254, "loss": 0.20758408308029175, "memory_gb": 7.721559524536133, "step_time_ms": 7481.139898300171, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:21:23] (step=0004254) Train Loss: 0.2250, Train Steps/Sec: 0.13, Epoch: 0.08266614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:21:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4255, "loss": 
0.3165339231491089, "memory_gb": 7.721559524536133, "step_time_ms": 7457.450151443481, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:21:31] (step=0004255) Train Loss: 0.3125, Train Steps/Sec: 0.13, Epoch: 0.08268558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:21:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4256, "loss": 0.14561639726161957, "memory_gb": 7.721559524536133, "step_time_ms": 7391.403436660767, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:21:39] (step=0004256) Train Loss: 0.1699, Train Steps/Sec: 0.12, Epoch: 0.08270501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:21:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4257, "loss": 0.25919532775878906, "memory_gb": 7.721559524536133, "step_time_ms": 7477.212429046631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:21:47] (step=0004257) Train Loss: 0.2752, Train Steps/Sec: 0.12, Epoch: 0.08272444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:21:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4258, "loss": 0.31023770570755005, "memory_gb": 7.721559524536133, "step_time_ms": 7474.096298217773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:21:55] (step=0004258) Train Loss: 0.2810, Train Steps/Sec: 0.13, Epoch: 0.08274387874076954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4259, "loss": 0.15191131830215454, "memory_gb": 7.721559524536133, "step_time_ms": 7425.527095794678, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:03] (step=0004259) Train Loss: 0.2271, Train Steps/Sec: 0.12, Epoch: 0.08276331130975514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4260, "loss": 0.23927175998687744, "memory_gb": 7.721559524536133, "step_time_ms": 7548.414945602417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:11] (step=0004260) 
Train Loss: 0.2224, Train Steps/Sec: 0.13, Epoch: 0.08278274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4261, "loss": 0.19010159373283386, "memory_gb": 7.721559524536133, "step_time_ms": 7505.286693572998, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:19] (step=0004261) Train Loss: 0.1913, Train Steps/Sec: 0.12, Epoch: 0.08280217644772639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4262, "loss": 0.28136754035949707, "memory_gb": 7.721559524536133, "step_time_ms": 5435.400485992432, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:25] (step=0004262) Train Loss: 0.2395, Train Steps/Sec: 0.16, Epoch: 0.08282160901671201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4263, "loss": 0.2150609791278839, "memory_gb": 7.721559524536133, "step_time_ms": 7510.5955600738525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:33] (step=0004263) Train Loss: 0.2642, Train Steps/Sec: 0.12, Epoch: 0.08284104158569763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4264, "loss": 0.14552731812000275, "memory_gb": 7.721559524536133, "step_time_ms": 7474.27225112915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:41] (step=0004264) Train Loss: 0.1637, Train Steps/Sec: 0.12, Epoch: 0.08286047415468326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4265, "loss": 0.1852792501449585, "memory_gb": 7.721559524536133, "step_time_ms": 7536.035537719727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:49] (step=0004265) Train Loss: 0.2133, Train Steps/Sec: 0.12, Epoch: 0.08287990672366886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:22:57] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 4266, "loss": 0.2751944065093994, "memory_gb": 7.721559524536133, "step_time_ms": 7537.86039352417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:22:57] (step=0004266) Train Loss: 0.2483, Train Steps/Sec: 0.12, Epoch: 0.08289933929265449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:23:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4267, "loss": 0.29398900270462036, "memory_gb": 7.721559524536133, "step_time_ms": 7535.689115524292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:23:05] (step=0004267) Train Loss: 0.2153, Train Steps/Sec: 0.12, Epoch: 0.08291877186164011, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:23:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4268, "loss": 0.13216646015644073, "memory_gb": 7.721559524536133, "step_time_ms": 7558.712959289551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:23:13] (step=0004268) Train Loss: 0.1351, Train Steps/Sec: 0.12, Epoch: 0.08293820443062573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:23:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4269, "loss": 0.2977356016635895, "memory_gb": 7.721559524536133, "step_time_ms": 7588.564157485962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:23:21] (step=0004269) Train Loss: 0.2669, Train Steps/Sec: 0.12, Epoch: 0.08295763699961135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:23:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4270, "loss": 0.16044478118419647, "memory_gb": 7.721559524536133, "step_time_ms": 7505.386590957642, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:23:29] (step=0004270) Train Loss: 0.2010, Train Steps/Sec: 0.12, Epoch: 0.08297706956859696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:23:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4271, "loss": 0.19214913249015808, "memory_gb": 7.721559524536133, "step_time_ms": 7569.945573806763, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
03:23:37] (step=0004271) Train Loss: 0.2468, Train Steps/Sec: 0.12, Epoch: 0.08299650213758258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:23:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4272, "loss": 0.2868971824645996, "memory_gb": 7.721559524536133, "step_time_ms": 7583.255767822266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:23:45] (step=0004272) Train Loss: 0.2803, Train Steps/Sec: 0.12, Epoch: 0.0830159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:23:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4273, "loss": 0.1146341860294342, "memory_gb": 7.721559524536133, "step_time_ms": 7449.288606643677, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:23:53] (step=0004273) Train Loss: 0.1427, Train Steps/Sec: 0.12, Epoch: 0.08303536727555383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4274, "loss": 0.26478826999664307, "memory_gb": 7.721559524536133, "step_time_ms": 7524.874210357666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:02] (step=0004274) Train Loss: 0.2251, Train Steps/Sec: 0.12, Epoch: 0.08305479984453945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4275, "loss": 0.1861729621887207, "memory_gb": 7.721559524536133, "step_time_ms": 7516.852617263794, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:10] (step=0004275) Train Loss: 0.2391, Train Steps/Sec: 0.12, Epoch: 0.08307423241352507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4276, "loss": 0.1941332221031189, "memory_gb": 7.721559524536133, "step_time_ms": 7444.268465042114, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:18] (step=0004276) Train Loss: 0.2123, Train Steps/Sec: 0.12, Epoch: 0.08309366498251068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:26] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 4277, "loss": 0.28148022294044495, "memory_gb": 7.721559524536133, "step_time_ms": 7493.705749511719, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:26] (step=0004277) Train Loss: 0.2973, Train Steps/Sec: 0.13, Epoch: 0.0831130975514963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4278, "loss": 0.3294147849082947, "memory_gb": 7.721559524536133, "step_time_ms": 7599.7583866119385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:34] (step=0004278) Train Loss: 0.3047, Train Steps/Sec: 0.12, Epoch: 0.08313253012048193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4279, "loss": 0.22245872020721436, "memory_gb": 7.721559524536133, "step_time_ms": 7250.67925453186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:42] (step=0004279) Train Loss: 0.2303, Train Steps/Sec: 0.13, Epoch: 0.08315196268946755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4280, "loss": 0.18322643637657166, "memory_gb": 7.721559524536133, "step_time_ms": 7448.637008666992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:50] (step=0004280) Train Loss: 0.2351, Train Steps/Sec: 0.13, Epoch: 0.08317139525845317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:24:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4281, "loss": 0.24076372385025024, "memory_gb": 7.715639114379883, "step_time_ms": 7345.718622207642, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:24:58] (step=0004281) Train Loss: 0.2276, Train Steps/Sec: 0.13, Epoch: 0.08319082782743879, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:25:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4282, "loss": 0.26255926489830017, "memory_gb": 7.721559524536133, "step_time_ms": 7297.987461090088, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 03:25:06] (step=0004282) Train Loss: 0.2904, Train Steps/Sec: 0.13, Epoch: 0.0832102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:25:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4283, "loss": 0.3337501585483551, "memory_gb": 7.721559524536133, "step_time_ms": 7389.265775680542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:25:14] (step=0004283) Train Loss: 0.3061, Train Steps/Sec: 0.13, Epoch: 0.08322969296541002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:25:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4284, "loss": 0.1693393588066101, "memory_gb": 7.721559524536133, "step_time_ms": 7437.67786026001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:25:22] (step=0004284) Train Loss: 0.1724, Train Steps/Sec: 0.13, Epoch: 0.08324912553439565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:25:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4285, "loss": 0.2931150197982788, "memory_gb": 7.721559524536133, "step_time_ms": 7386.337757110596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:25:29] (step=0004285) Train Loss: 0.2686, Train Steps/Sec: 0.13, Epoch: 0.08326855810338127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:25:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4286, "loss": 0.15139077603816986, "memory_gb": 7.721559524536133, "step_time_ms": 7429.002046585083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:25:37] (step=0004286) Train Loss: 0.1540, Train Steps/Sec: 0.13, Epoch: 0.08328799067236689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:25:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4287, "loss": 0.26746025681495667, "memory_gb": 7.721559524536133, "step_time_ms": 7501.617670059204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:25:45] (step=0004287) Train Loss: 0.2726, Train Steps/Sec: 0.12, Epoch: 0.08330742324135251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 03:25:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4288, "loss": 0.2625829875469208, "memory_gb": 7.721559524536133, "step_time_ms": 7411.950588226318, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:25:53] (step=0004288) Train Loss: 0.2730, Train Steps/Sec: 0.13, Epoch: 0.08332685581033812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4289, "loss": 0.24562999606132507, "memory_gb": 7.721559524536133, "step_time_ms": 7292.635202407837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:01] (step=0004289) Train Loss: 0.2522, Train Steps/Sec: 0.13, Epoch: 0.08334628837932374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4290, "loss": 0.18093250691890717, "memory_gb": 7.721559524536133, "step_time_ms": 7482.06639289856, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:09] (step=0004290) Train Loss: 0.2210, Train Steps/Sec: 0.12, Epoch: 0.08336572094830937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4291, "loss": 0.21475395560264587, "memory_gb": 7.721559524536133, "step_time_ms": 4794.474124908447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:15] (step=0004291) Train Loss: 0.2409, Train Steps/Sec: 0.18, Epoch: 0.08338515351729499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4292, "loss": 0.29237860441207886, "memory_gb": 7.721559524536133, "step_time_ms": 7469.113349914551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:23] (step=0004292) Train Loss: 0.2824, Train Steps/Sec: 0.13, Epoch: 0.08340458608628061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4293, "loss": 0.2165088653564453, "memory_gb": 7.721559524536133, "step_time_ms": 7429.267406463623, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:31] (step=0004293) Train Loss: 0.2158, Train Steps/Sec: 0.13, Epoch: 0.08342401865526623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4294, "loss": 0.22110804915428162, "memory_gb": 7.721559524536133, "step_time_ms": 7432.08384513855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:39] (step=0004294) Train Loss: 0.2209, Train Steps/Sec: 0.13, Epoch: 0.08344345122425184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4295, "loss": 0.2717069685459137, "memory_gb": 7.721559524536133, "step_time_ms": 7486.783266067505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:47] (step=0004295) Train Loss: 0.2507, Train Steps/Sec: 0.13, Epoch: 0.08346288379323746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:26:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4296, "loss": 0.29985305666923523, "memory_gb": 7.715639114379883, "step_time_ms": 7410.071611404419, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:26:55] (step=0004296) Train Loss: 0.3044, Train Steps/Sec: 0.13, Epoch: 0.08348231636222309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4297, "loss": 0.3660677969455719, "memory_gb": 7.721559524536133, "step_time_ms": 7426.713466644287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:27:03] (step=0004297) Train Loss: 0.2445, Train Steps/Sec: 0.13, Epoch: 0.08350174893120871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:27:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4298, "loss": 0.12065935134887695, "memory_gb": 7.721559524536133, "step_time_ms": 7461.534023284912, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:27:11] (step=0004298) Train Loss: 0.1717, Train Steps/Sec: 0.13, Epoch: 0.08352118150019433, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 03:27:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4299, "loss": 0.20880119502544403, "memory_gb": 7.721559524536133, "step_time_ms": 7550.415754318237, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:27:19] (step=0004299) Train Loss: 0.2747, Train Steps/Sec: 0.13, Epoch: 0.08354061406917994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:27:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4300, "loss": 0.29521942138671875, "memory_gb": 7.721559524536133, "step_time_ms": 7426.607131958008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:27:27] (step=0004300) Train Loss: 0.2554, Train Steps/Sec: 0.13, Epoch: 0.08356004663816556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:27:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4301, "loss": 0.19550752639770508, "memory_gb": 7.721559524536133, "step_time_ms": 7473.6387729644775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:27:35] (step=0004301) Train Loss: 0.1930, Train Steps/Sec: 0.13, Epoch: 0.08357947920715118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:27:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4302, "loss": 0.22323516011238098, "memory_gb": 7.721559524536133, "step_time_ms": 7479.507684707642, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:27:43] (step=0004302) Train Loss: 0.2366, Train Steps/Sec: 0.12, Epoch: 0.0835989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:27:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4303, "loss": 0.17615856230258942, "memory_gb": 7.721559524536133, "step_time_ms": 7423.410654067993, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:27:51] (step=0004303) Train Loss: 0.1513, Train Steps/Sec: 0.13, Epoch: 0.08361834434512243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:27:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4304, "loss": 0.20319877564907074, "memory_gb": 7.721559524536133, "step_time_ms": 
7486.76609992981, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:27:59] (step=0004304) Train Loss: 0.2405, Train Steps/Sec: 0.13, Epoch: 0.08363777691410805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:28:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4305, "loss": 0.17453983426094055, "memory_gb": 7.721559524536133, "step_time_ms": 7469.914197921753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:28:07] (step=0004305) Train Loss: 0.2142, Train Steps/Sec: 0.13, Epoch: 0.08365720948309366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:28:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4306, "loss": 0.24086037278175354, "memory_gb": 7.721559524536133, "step_time_ms": 7509.392261505127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:28:15] (step=0004306) Train Loss: 0.2335, Train Steps/Sec: 0.12, Epoch: 0.08367664205207928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:28:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4307, "loss": 0.23014238476753235, "memory_gb": 7.721559524536133, "step_time_ms": 7514.429807662964, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:28:23] (step=0004307) Train Loss: 0.2144, Train Steps/Sec: 0.12, Epoch: 0.0836960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:28:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4308, "loss": 0.2782800495624542, "memory_gb": 7.721559524536133, "step_time_ms": 7472.54753112793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:28:31] (step=0004308) Train Loss: 0.2738, Train Steps/Sec: 0.12, Epoch: 0.08371550719005053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:28:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4309, "loss": 0.18879222869873047, "memory_gb": 7.721559524536133, "step_time_ms": 7442.821979522705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:28:39] (step=0004309) Train Loss: 0.1675, Train Steps/Sec: 0.13, Epoch: 0.08373493975903615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:28:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4310, "loss": 0.22339025139808655, "memory_gb": 7.721559524536133, "step_time_ms": 7552.345991134644, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:28:47] (step=0004310) Train Loss: 0.2437, Train Steps/Sec: 0.12, Epoch: 0.08375437232802177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:28:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4311, "loss": 0.21609225869178772, "memory_gb": 7.721559524536133, "step_time_ms": 7497.343301773071, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:28:55] (step=0004311) Train Loss: 0.2130, Train Steps/Sec: 0.13, Epoch: 0.08377380489700738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4312, "loss": 0.12198049575090408, "memory_gb": 7.721559524536133, "step_time_ms": 7554.386138916016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:03] (step=0004312) Train Loss: 0.1842, Train Steps/Sec: 0.13, Epoch: 0.083793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4313, "loss": 0.26004159450531006, "memory_gb": 7.721559524536133, "step_time_ms": 7569.2338943481445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:11] (step=0004313) Train Loss: 0.2643, Train Steps/Sec: 0.12, Epoch: 0.08381267003497862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4314, "loss": 0.20474424958229065, "memory_gb": 7.721559524536133, "step_time_ms": 7569.115877151489, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:19] (step=0004314) Train Loss: 0.2302, Train Steps/Sec: 0.12, Epoch: 0.08383210260396425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4315, "loss": 0.18627628684043884, "memory_gb": 7.721559524536133, "step_time_ms": 7490.606307983398, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:27] (step=0004315) Train Loss: 0.1816, Train Steps/Sec: 0.12, Epoch: 0.08385153517294987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4316, "loss": 0.23433932662010193, "memory_gb": 7.721559524536133, "step_time_ms": 7566.122770309448, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:35] (step=0004316) Train Loss: 0.2403, Train Steps/Sec: 0.12, Epoch: 0.08387096774193549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4317, "loss": 0.3051583766937256, "memory_gb": 7.721559524536133, "step_time_ms": 7464.495420455933, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:43] (step=0004317) Train Loss: 0.2534, Train Steps/Sec: 0.12, Epoch: 0.0838904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4318, "loss": 0.26696112751960754, "memory_gb": 7.721559524536133, "step_time_ms": 7347.829580307007, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:51] (step=0004318) Train Loss: 0.2283, Train Steps/Sec: 0.13, Epoch: 0.08390983287990672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:29:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4319, "loss": 0.17188282310962677, "memory_gb": 7.721559524536133, "step_time_ms": 7555.968999862671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:29:59] (step=0004319) Train Loss: 0.2387, Train Steps/Sec: 0.12, Epoch: 0.08392926544889234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:30:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4320, "loss": 0.162461519241333, "memory_gb": 7.721559524536133, "step_time_ms": 5008.031845092773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:30:05] (step=0004320) Train Loss: 0.1789, Train Steps/Sec: 0.18, Epoch: 0.08394869801787797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:30:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4321, "loss": 0.1661396473646164, "memory_gb": 7.721559524536133, "step_time_ms": 7544.516086578369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:30:13] (step=0004321) Train Loss: 0.2339, Train Steps/Sec: 0.12, Epoch: 0.08396813058686359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:30:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4322, "loss": 0.12852223217487335, "memory_gb": 7.721559524536133, "step_time_ms": 7510.645389556885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:30:21] (step=0004322) Train Loss: 0.2409, Train Steps/Sec: 0.12, Epoch: 0.08398756315584921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:30:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4323, "loss": 0.3029714524745941, "memory_gb": 7.721559524536133, "step_time_ms": 7446.158170700073, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:30:29] (step=0004323) Train Loss: 0.3380, Train Steps/Sec: 0.13, Epoch: 0.08400699572483482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:30:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4324, "loss": 0.25570932030677795, "memory_gb": 7.721559524536133, "step_time_ms": 7539.320945739746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:30:37] (step=0004324) Train Loss: 0.2442, Train Steps/Sec: 0.12, Epoch: 0.08402642829382044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:30:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4325, "loss": 0.19121795892715454, "memory_gb": 7.721559524536133, "step_time_ms": 7492.865800857544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:30:45] (step=0004325) Train Loss: 0.2018, Train Steps/Sec: 0.12, Epoch: 0.08404586086280606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:30:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4326, "loss": 0.3434140980243683, "memory_gb": 7.721559524536133, "step_time_ms": 7463.865041732788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:30:53] (step=0004326) Train Loss: 0.2984, Train Steps/Sec: 0.13, Epoch: 0.08406529343179169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4327, "loss": 0.27271905541419983, "memory_gb": 7.721559524536133, "step_time_ms": 7517.295122146606, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:01] (step=0004327) Train Loss: 0.2227, Train Steps/Sec: 0.12, Epoch: 0.08408472600077731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4328, "loss": 0.22928643226623535, "memory_gb": 7.721559524536133, "step_time_ms": 7418.264389038086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:09] (step=0004328) Train Loss: 0.2078, Train Steps/Sec: 0.13, Epoch: 0.08410415856976293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4329, "loss": 0.1973077803850174, "memory_gb": 7.721559524536133, "step_time_ms": 7406.161785125732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:17] (step=0004329) Train Loss: 0.1973, Train Steps/Sec: 0.13, Epoch: 0.08412359113874854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4330, "loss": 0.2149951457977295, "memory_gb": 7.721559524536133, "step_time_ms": 7466.679811477661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:25] (step=0004330) Train Loss: 0.2107, Train Steps/Sec: 0.12, Epoch: 0.08414302370773416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4331, "loss": 0.110662080347538, "memory_gb": 7.721559524536133, "step_time_ms": 7412.9297733306885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:33] (step=0004331) Train Loss: 0.1608, Train Steps/Sec: 0.13, Epoch: 0.08416245627671978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4332, "loss": 0.19821065664291382, "memory_gb": 7.721559524536133, "step_time_ms": 7434.513568878174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:41] (step=0004332) Train Loss: 0.1927, Train Steps/Sec: 0.12, Epoch: 0.0841818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4333, "loss": 0.2299404889345169, "memory_gb": 7.721559524536133, "step_time_ms": 7488.925218582153, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:49] (step=0004333) Train Loss: 0.2564, Train Steps/Sec: 0.12, Epoch: 0.08420132141469103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:31:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4334, "loss": 0.2567078173160553, "memory_gb": 7.721559524536133, "step_time_ms": 7457.23032951355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:31:57] (step=0004334) Train Loss: 0.2369, Train Steps/Sec: 0.12, Epoch: 0.08422075398367664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:32:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4335, "loss": 0.22989550232887268, "memory_gb": 7.721559524536133, "step_time_ms": 7421.2682247161865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:32:05] (step=0004335) Train Loss: 0.2019, Train Steps/Sec: 0.12, Epoch: 0.08424018655266226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:32:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4336, "loss": 0.2621462941169739, "memory_gb": 7.721559524536133, "step_time_ms": 7456.290721893311, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:32:13] (step=0004336) Train Loss: 0.2208, Train Steps/Sec: 0.12, Epoch: 0.08425961912164788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:32:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4337, "loss": 0.27552980184555054, "memory_gb": 7.721559524536133, "step_time_ms": 7414.575099945068, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:32:21] (step=0004337) Train Loss: 0.2292, Train Steps/Sec: 0.13, Epoch: 0.0842790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:32:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4338, "loss": 0.2260512411594391, "memory_gb": 7.721559524536133, "step_time_ms": 7438.995838165283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:32:29] (step=0004338) Train Loss: 0.1832, Train Steps/Sec: 0.12, Epoch: 0.08429848425961912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:32:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4339, "loss": 0.1710967868566513, "memory_gb": 7.721559524536133, "step_time_ms": 7484.26365852356, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:32:37] (step=0004339) Train Loss: 0.1744, Train Steps/Sec: 0.12, Epoch: 0.08431791682860475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:32:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4340, "loss": 0.3013974726200104, "memory_gb": 7.721559524536133, "step_time_ms": 7527.8050899505615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:32:45] (step=0004340) Train Loss: 0.3228, Train Steps/Sec: 0.13, Epoch: 0.08433734939759036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:32:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4341, "loss": 0.25275886058807373, "memory_gb": 7.721559524536133, "step_time_ms": 7440.40846824646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:32:53] (step=0004341) Train Loss: 0.3112, Train Steps/Sec: 0.12, Epoch: 0.08435678196657598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4342, "loss": 0.13422268629074097, "memory_gb": 7.721559524536133, "step_time_ms": 7477.9393672943115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:02] (step=0004342) Train Loss: 0.1448, Train Steps/Sec: 0.12, Epoch: 0.0843762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4343, "loss": 0.24116668105125427, "memory_gb": 7.721559524536133, "step_time_ms": 7217.721939086914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:10] (step=0004343) Train Loss: 0.2281, Train Steps/Sec: 0.12, Epoch: 0.08439564710454722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4344, "loss": 0.3013441562652588, "memory_gb": 7.721559524536133, "step_time_ms": 7415.375709533691, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:18] (step=0004344) Train Loss: 0.2671, Train Steps/Sec: 0.12, Epoch: 0.08441507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4345, "loss": 0.2548444867134094, "memory_gb": 7.721559524536133, "step_time_ms": 7478.256940841675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:26] (step=0004345) Train Loss: 0.1937, Train Steps/Sec: 0.12, Epoch: 0.08443451224251847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4346, "loss": 0.2778986692428589, "memory_gb": 7.721559524536133, "step_time_ms": 7470.7019329071045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:34] (step=0004346) Train Loss: 0.2571, Train Steps/Sec: 0.12, Epoch: 0.08445394481150408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4347, "loss": 0.31668907403945923, "memory_gb": 7.721559524536133, "step_time_ms": 7259.958505630493, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:42] (step=0004347) Train Loss: 0.2706, Train Steps/Sec: 0.13, Epoch: 0.0844733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4348, "loss": 0.12101028859615326, "memory_gb": 7.721559524536133, "step_time_ms": 7514.046669006348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:49] (step=0004348) Train Loss: 0.2015, Train Steps/Sec: 0.13, Epoch: 0.08449280994947532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:33:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4349, "loss": 0.26851609349250793, "memory_gb": 7.721559524536133, "step_time_ms": 5450.08111000061, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:33:55] (step=0004349) Train Loss: 0.2952, Train Steps/Sec: 0.18, Epoch: 0.08451224251846094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4350, "loss": 0.18468280136585236, "memory_gb": 7.721559524536133, "step_time_ms": 7520.01953125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:03] (step=0004350) Train Loss: 0.2377, Train Steps/Sec: 0.12, Epoch: 0.08453167508744656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4351, "loss": 0.2689345180988312, "memory_gb": 7.721559524536133, "step_time_ms": 7590.039491653442, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:11] (step=0004351) Train Loss: 0.2697, Train Steps/Sec: 0.12, Epoch: 0.08455110765643219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4352, "loss": 0.26145607233047485, "memory_gb": 7.721559524536133, "step_time_ms": 7450.45280456543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:19] (step=0004352) Train Loss: 0.2340, Train Steps/Sec: 0.12, Epoch: 0.0845705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4353, "loss": 0.287514328956604, "memory_gb": 7.721559524536133, "step_time_ms": 7506.0060024261475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:27] (step=0004353) Train Loss: 0.3024, Train Steps/Sec: 0.12, Epoch: 0.08458997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4354, "loss": 0.2558392286300659, "memory_gb": 7.721559524536133, "step_time_ms": 7434.537649154663, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:35] (step=0004354) Train Loss: 0.2446, Train Steps/Sec: 0.13, Epoch: 0.08460940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4355, "loss": 0.32658639550209045, "memory_gb": 7.721559524536133, "step_time_ms": 7469.565153121948, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:43] (step=0004355) Train Loss: 0.3052, Train Steps/Sec: 0.13, Epoch: 0.08462883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4356, "loss": 0.26363056898117065, "memory_gb": 7.721559524536133, "step_time_ms": 7540.484428405762, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:51] (step=0004356) Train Loss: 0.2538, Train Steps/Sec: 0.12, Epoch: 0.08464827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:34:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4357, "loss": 0.2540764808654785, "memory_gb": 7.721559524536133, "step_time_ms": 7455.382823944092, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:34:59] (step=0004357) Train Loss: 0.2628, Train Steps/Sec: 0.13, Epoch: 0.0846677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:35:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4358, "loss": 0.1567111313343048, "memory_gb": 7.721559524536133, "step_time_ms": 7505.598545074463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:35:07] (step=0004358) Train Loss: 0.2431, Train Steps/Sec: 0.12, Epoch: 0.08468713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:35:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4359, "loss": 0.17208465933799744, "memory_gb": 7.721559524536133, "step_time_ms": 7579.227685928345, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:35:15] (step=0004359) Train Loss: 0.2137, Train Steps/Sec: 0.12, Epoch: 0.08470656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:35:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4360, "loss": 0.27151596546173096, "memory_gb": 7.721559524536133, "step_time_ms": 7463.863372802734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:35:23] (step=0004360) Train Loss: 0.2601, Train Steps/Sec: 0.12, Epoch: 0.08472600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:35:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4361, "loss": 0.34066706895828247, "memory_gb": 7.721559524536133, "step_time_ms": 7505.798101425171, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:35:31] (step=0004361) Train Loss: 0.2788, Train Steps/Sec: 0.13, Epoch: 0.08474543334628838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:35:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4362, "loss": 0.22214455902576447, "memory_gb": 7.721559524536133, "step_time_ms": 7617.19012260437, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:35:39] (step=0004362) Train Loss: 0.1984, Train Steps/Sec: 0.12, Epoch: 0.084764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4363, "loss": 0.14992322027683258, "memory_gb": 7.721559524536133, "step_time_ms": 7573.00329208374, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:35:47] (step=0004363) Train Loss: 0.1666, Train Steps/Sec: 0.12, Epoch: 0.08478429848425961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:35:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4364, "loss": 0.2617456912994385, "memory_gb": 7.721559524536133, "step_time_ms": 7485.2135181427, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:35:55] (step=0004364) Train Loss: 0.2055, Train Steps/Sec: 0.12, Epoch: 0.08480373105324523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:36:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4365, "loss": 0.18322297930717468, "memory_gb": 7.721559524536133, "step_time_ms": 7572.8795528411865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:36:03] (step=0004365) Train Loss: 0.1883, Train Steps/Sec: 0.12, Epoch: 0.08482316362223086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:36:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4366, "loss": 0.26923319697380066, "memory_gb": 7.721559524536133, "step_time_ms": 7490.985155105591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:36:11] (step=0004366) Train Loss: 0.2291, Train Steps/Sec: 0.13, Epoch: 0.08484259619121648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:36:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4367, "loss": 0.23300457000732422, "memory_gb": 7.721559524536133, "step_time_ms": 7489.42756652832, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:36:20] (step=0004367) Train Loss: 0.2462, Train Steps/Sec: 0.13, Epoch: 0.0848620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:36:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4368, "loss": 0.21698784828186035, "memory_gb": 7.721559524536133, "step_time_ms": 7575.857639312744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:36:28] (step=0004368) Train Loss: 0.2629, Train Steps/Sec: 0.12, Epoch: 0.08488146132918772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:36:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4369, "loss": 0.23301224410533905, "memory_gb": 7.721559524536133, "step_time_ms": 7481.484413146973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:36:36] (step=0004369) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.08490089389817333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:36:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4370, "loss": 0.2020343691110611, "memory_gb": 7.721559524536133, "step_time_ms": 7434.470176696777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:36:44] (step=0004370) Train Loss: 0.2132, Train Steps/Sec: 0.12, Epoch: 0.08492032646715895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:36:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4371, "loss": 0.295606791973114, "memory_gb": 7.721559524536133, "step_time_ms": 7505.834579467773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:36:52] (step=0004371) Train Loss: 0.2565, Train Steps/Sec: 0.12, Epoch: 0.08493975903614458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4372, "loss": 0.21095649898052216, "memory_gb": 7.721559524536133, "step_time_ms": 7459.299325942993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:00] (step=0004372) Train Loss: 0.2738, Train Steps/Sec: 0.13, Epoch: 0.0849591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4373, "loss": 0.2539830803871155, "memory_gb": 7.721559524536133, "step_time_ms": 7493.860244750977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:08] (step=0004373) Train Loss: 0.2812, Train Steps/Sec: 0.12, Epoch: 0.08497862417411582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4374, "loss": 0.1745908409357071, "memory_gb": 7.721559524536133, "step_time_ms": 7575.73938369751, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:16] (step=0004374) Train Loss: 0.1998, Train Steps/Sec: 0.12, Epoch: 0.08499805674310144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4375, "loss": 0.26990360021591187, "memory_gb": 7.721559524536133, "step_time_ms": 7462.296724319458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:24] (step=0004375) Train Loss: 0.2482, Train Steps/Sec: 0.12, Epoch: 0.08501748931208705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4376, "loss": 0.17491771280765533, "memory_gb": 7.721559524536133, "step_time_ms": 7279.636859893799, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:32] (step=0004376) Train Loss: 0.1864, Train Steps/Sec: 0.13, Epoch: 0.08503692188107267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4377, "loss": 0.265663743019104, "memory_gb": 7.721559524536133, "step_time_ms": 7474.934816360474, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:40] (step=0004377) Train Loss: 0.2275, Train Steps/Sec: 0.12, Epoch: 0.0850563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4378, "loss": 0.23452962934970856, "memory_gb": 7.721559524536133, "step_time_ms": 5055.86314201355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:45] (step=0004378) Train Loss: 0.2690, Train Steps/Sec: 0.19, Epoch: 0.08507578701904392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:37:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4379, "loss": 0.31728947162628174, "memory_gb": 7.721559524536133, "step_time_ms": 7510.523080825806, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:37:53] (step=0004379) Train Loss: 0.3004, Train Steps/Sec: 0.12, Epoch: 0.08509521958802954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4380, "loss": 0.2576407194137573, "memory_gb": 7.721559524536133, "step_time_ms": 7454.664468765259, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:01] (step=0004380) Train Loss: 0.2269, Train Steps/Sec: 0.13, Epoch: 0.08511465215701516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4381, "loss": 0.21120969951152802, "memory_gb": 7.721559524536133, "step_time_ms": 7412.922620773315, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:09] (step=0004381) Train Loss: 0.2695, Train Steps/Sec: 0.13, Epoch: 0.08513408472600077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4382, "loss": 0.20145469903945923, "memory_gb": 7.721559524536133, "step_time_ms": 7486.814975738525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:17] (step=0004382) Train Loss: 0.2482, Train Steps/Sec: 0.12, Epoch: 0.0851535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4383, "loss": 0.15168753266334534, "memory_gb": 7.721559524536133, "step_time_ms": 7456.19797706604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:25] (step=0004383) Train Loss: 0.1722, Train Steps/Sec: 0.13, Epoch: 0.08517294986397202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4384, "loss": 0.1354159712791443, "memory_gb": 7.721559524536133, "step_time_ms": 7453.238964080811, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:33] (step=0004384) Train Loss: 0.1836, Train Steps/Sec: 0.12, Epoch: 0.08519238243295764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4385, "loss": 0.26831504702568054, "memory_gb": 7.721559524536133, "step_time_ms": 7566.340684890747, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:41] (step=0004385) Train Loss: 0.2406, Train Steps/Sec: 0.12, Epoch: 0.08521181500194326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4386, "loss": 0.22738736867904663, "memory_gb": 7.721559524536133, "step_time_ms": 7498.754024505615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:49] (step=0004386) Train Loss: 0.2297, Train Steps/Sec: 0.12, Epoch: 0.08523124757092888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:38:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4387, "loss": 0.21042072772979736, "memory_gb": 7.721559524536133, "step_time_ms": 7422.888517379761, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:38:57] (step=0004387) Train Loss: 0.1970, Train Steps/Sec: 0.13, Epoch: 0.08525068013991449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:39:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4388, "loss": 0.22893871366977692, "memory_gb": 7.721559524536133, "step_time_ms": 7637.295246124268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:39:05] (step=0004388) Train Loss: 0.2466, Train Steps/Sec: 0.12, Epoch: 0.08527011270890011, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:39:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4389, "loss": 0.3017914593219757, "memory_gb": 7.721559524536133, "step_time_ms": 7423.334121704102, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:39:13] (step=0004389) Train Loss: 0.2573, Train Steps/Sec: 0.13, Epoch: 0.08528954527788574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4390, "loss": 0.2988453805446625, "memory_gb": 7.721559524536133, "step_time_ms": 7435.057878494263, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:39:21] (step=0004390) Train Loss: 0.2256, Train Steps/Sec: 0.12, Epoch: 0.08530897784687136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:39:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4391, "loss": 0.22746601700782776, "memory_gb": 7.721559524536133, "step_time_ms": 7544.248580932617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:39:29] (step=0004391) Train Loss: 0.2293, Train Steps/Sec: 0.12, Epoch: 0.08532841041585698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:39:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4392, "loss": 0.3030174970626831, "memory_gb": 7.721559524536133, "step_time_ms": 7454.8540115356445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:39:38] (step=0004392) Train Loss: 0.1998, Train Steps/Sec: 0.13, Epoch: 0.08534784298484259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:39:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4393, "loss": 0.22853873670101166, "memory_gb": 7.721559524536133, "step_time_ms": 7431.7896366119385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:39:45] (step=0004393) Train Loss: 0.2453, Train Steps/Sec: 0.13, Epoch: 0.08536727555382821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:39:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4394, "loss": 0.341155081987381, "memory_gb": 7.721559524536133, "step_time_ms": 7556.679248809814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:39:54] (step=0004394) Train Loss: 0.3195, Train Steps/Sec: 0.12, Epoch: 0.08538670812281383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4395, "loss": 0.20706117153167725, "memory_gb": 7.721559524536133, "step_time_ms": 7458.493947982788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:02] (step=0004395) Train Loss: 0.2562, Train Steps/Sec: 0.12, Epoch: 0.08540614069179946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4396, "loss": 0.16200119256973267, "memory_gb": 7.721559524536133, "step_time_ms": 7482.6459884643555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:10] (step=0004396) Train Loss: 0.2182, Train Steps/Sec: 0.12, Epoch: 0.08542557326078508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4397, "loss": 0.34363287687301636, "memory_gb": 7.721559524536133, "step_time_ms": 7537.894248962402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:18] (step=0004397) Train Loss: 0.3048, Train Steps/Sec: 0.12, Epoch: 0.0854450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4398, "loss": 0.21457359194755554, "memory_gb": 7.721559524536133, "step_time_ms": 7461.895942687988, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:26] (step=0004398) Train Loss: 0.2887, Train Steps/Sec: 0.12, Epoch: 0.08546443839875631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4399, "loss": 0.12607170641422272, "memory_gb": 7.721559524536133, "step_time_ms": 7462.427854537964, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:34] (step=0004399) Train Loss: 0.2295, Train Steps/Sec: 0.12, Epoch: 0.08548387096774193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4400, "loss": 0.2041381448507309, "memory_gb": 7.721559524536133, "step_time_ms": 7534.548997879028, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:42] (step=0004400) Train Loss: 0.2111, Train Steps/Sec: 0.12, Epoch: 0.08550330353672755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4401, "loss": 0.1710604429244995, "memory_gb": 7.721559524536133, "step_time_ms": 7496.117115020752, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:50] (step=0004401) Train Loss: 0.1693, Train Steps/Sec: 0.12, Epoch: 0.08552273610571318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:40:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4402, "loss": 0.2028113305568695, "memory_gb": 7.721559524536133, "step_time_ms": 7524.762392044067, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:40:58] (step=0004402) Train Loss: 0.2375, Train Steps/Sec: 0.12, Epoch: 0.0855421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:41:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4403, "loss": 0.27404385805130005, "memory_gb": 7.721559524536133, "step_time_ms": 7594.902276992798, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:41:06] (step=0004403) Train Loss: 0.2153, Train Steps/Sec: 0.12, Epoch: 0.08556160124368442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:41:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4404, "loss": 0.2643003463745117, "memory_gb": 7.721559524536133, "step_time_ms": 7563.319444656372, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:41:14] (step=0004404) Train Loss: 0.2080, Train Steps/Sec: 0.12, Epoch: 0.08558103381267003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:41:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4405, "loss": 0.29922419786453247, "memory_gb": 7.721559524536133, "step_time_ms": 7395.0300216674805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:41:22] (step=0004405) Train Loss: 0.2747, Train Steps/Sec: 0.13, Epoch: 0.08560046638165565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:41:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4406, "loss": 0.2945195436477661, "memory_gb": 7.721559524536133, "step_time_ms": 7580.584526062012, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:41:30] (step=0004406) Train Loss: 0.2299, Train Steps/Sec: 0.13, Epoch: 0.08561989895064127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:41:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4407, "loss": 0.22011512517929077, "memory_gb": 7.721559524536133, "step_time_ms": 4492.8460121154785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:41:36] (step=0004407) Train Loss: 0.2218, Train Steps/Sec: 0.17, Epoch: 0.0856393315196269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:41:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4408, "loss": 0.26970916986465454, "memory_gb": 7.721559524536133, "step_time_ms": 7579.373121261597, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:41:44] (step=0004408) Train Loss: 0.2177, Train Steps/Sec: 0.12, Epoch: 0.08565876408861252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:41:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4409, "loss": 0.21222937107086182, "memory_gb": 7.721559524536133, "step_time_ms": 7540.865898132324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:41:52] (step=0004409) Train Loss: 0.2550, Train Steps/Sec: 0.13, Epoch: 0.08567819665759814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:42:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4410, "loss": 0.18196804821491241, "memory_gb": 7.721559524536133, "step_time_ms": 7530.634880065918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:42:00] (step=0004410) Train Loss: 0.2280, Train Steps/Sec: 0.12, Epoch: 0.08569762922658375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:42:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4411, "loss": 0.24663758277893066, "memory_gb": 7.721559524536133, "step_time_ms": 7683.942079544067, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:42:08] (step=0004411) Train Loss: 0.2447, Train Steps/Sec: 0.12, Epoch: 0.08571706179556937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:42:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4412, "loss": 0.24451306462287903, "memory_gb": 7.721559524536133, "step_time_ms": 7626.548528671265, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:42:16] (step=0004412) Train Loss: 0.2607, Train Steps/Sec: 0.12, Epoch: 0.085736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 03:42:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4413, "loss": 0.1534491777420044, "memory_gb": 7.721559524536133, "step_time_ms": 7546.859979629517, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 03:42:24] (step=0004413) Train Loss: 0.1975, Train Steps/Sec: 0.13, Epoch: 0.08575592693354062, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 03:42:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4414, "loss": 0.1504104733467102, "memory_gb": 7.721559524536133, "step_time_ms": 7625.462055206299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:42:32] (step=0004414) Train Loss: 0.2041, Train Steps/Sec: 0.12, Epoch: 0.08577535950252624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:42:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4415, "loss": 0.16524738073349, "memory_gb": 7.721559524536133, "step_time_ms": 7548.736810684204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:42:40] (step=0004415) Train Loss: 0.2400, Train Steps/Sec: 0.13, Epoch: 0.08579479207151186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:42:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4416, "loss": 0.19740816950798035, "memory_gb": 7.721559524536133, "step_time_ms": 7523.470640182495, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:42:48] (step=0004416) Train Loss: 0.2357, Train Steps/Sec: 0.12, Epoch: 0.08581422464049747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:42:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4417, "loss": 0.3579871654510498, "memory_gb": 7.721559524536133, "step_time_ms": 7531.783103942871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:42:56] (step=0004417) Train Loss: 0.2995, Train Steps/Sec: 0.12, Epoch: 0.08583365720948309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:43:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4418, "loss": 0.2428249716758728, "memory_gb": 7.721559524536133, "step_time_ms": 7494.686603546143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:43:04] (step=0004418) Train Loss: 0.2671, Train Steps/Sec: 0.13, Epoch: 0.08585308977846871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:43:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4419, "loss": 0.19862836599349976, "memory_gb": 7.721559524536133, "step_time_ms": 
7476.218223571777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:43:12] (step=0004419) Train Loss: 0.2154, Train Steps/Sec: 0.13, Epoch: 0.08587252234745434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:43:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4420, "loss": 0.24053896963596344, "memory_gb": 7.721559524536133, "step_time_ms": 7593.771696090698, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:43:20] (step=0004420) Train Loss: 0.2164, Train Steps/Sec: 0.12, Epoch: 0.08589195491643996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:43:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4421, "loss": 0.24594587087631226, "memory_gb": 7.715639114379883, "step_time_ms": 7451.568365097046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:43:28] (step=0004421) Train Loss: 0.1978, Train Steps/Sec: 0.13, Epoch: 0.08591138748542557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:43:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4422, "loss": 0.21047407388687134, "memory_gb": 7.721559524536133, "step_time_ms": 7444.903612136841, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:43:36] (step=0004422) Train Loss: 0.2235, Train Steps/Sec: 0.13, Epoch: 0.08593082005441119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:43:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4423, "loss": 0.21764014661312103, "memory_gb": 7.721559524536133, "step_time_ms": 7507.423639297485, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:43:44] (step=0004423) Train Loss: 0.1916, Train Steps/Sec: 0.12, Epoch: 0.08595025262339681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:43:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4424, "loss": 0.2422042042016983, "memory_gb": 7.721559524536133, "step_time_ms": 7444.713830947876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:43:52] (step=0004424) Train Loss: 0.2402, Train Steps/Sec: 0.13, Epoch: 0.08596968519238243, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4425, "loss": 0.16441327333450317, "memory_gb": 7.721559524536133, "step_time_ms": 7488.785028457642, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:00] (step=0004425) Train Loss: 0.1541, Train Steps/Sec: 0.12, Epoch: 0.08598911776136806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4426, "loss": 0.29471367597579956, "memory_gb": 7.721559524536133, "step_time_ms": 7562.500715255737, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:08] (step=0004426) Train Loss: 0.2546, Train Steps/Sec: 0.12, Epoch: 0.08600855033035368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4427, "loss": 0.20737388730049133, "memory_gb": 7.721559524536133, "step_time_ms": 7422.323703765869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:16] (step=0004427) Train Loss: 0.2692, Train Steps/Sec: 0.12, Epoch: 0.08602798289933929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4428, "loss": 0.13502652943134308, "memory_gb": 7.721559524536133, "step_time_ms": 7501.067161560059, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:24] (step=0004428) Train Loss: 0.2547, Train Steps/Sec: 0.13, Epoch: 0.08604741546832491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4429, "loss": 0.3083791136741638, "memory_gb": 7.721559524536133, "step_time_ms": 7495.036840438843, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:32] (step=0004429) Train Loss: 0.2893, Train Steps/Sec: 0.12, Epoch: 0.08606684803731053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4430, "loss": 0.18270938098430634, "memory_gb": 7.721559524536133, 
"step_time_ms": 7430.289268493652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:40] (step=0004430) Train Loss: 0.1913, Train Steps/Sec: 0.13, Epoch: 0.08608628060629615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4431, "loss": 0.1956140697002411, "memory_gb": 7.721559524536133, "step_time_ms": 7406.152725219727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:48] (step=0004431) Train Loss: 0.1925, Train Steps/Sec: 0.13, Epoch: 0.08610571317528178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:44:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4432, "loss": 0.21594330668449402, "memory_gb": 7.721559524536133, "step_time_ms": 7495.11456489563, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:44:56] (step=0004432) Train Loss: 0.2880, Train Steps/Sec: 0.12, Epoch: 0.0861251457442674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4433, "loss": 0.2427166998386383, "memory_gb": 7.721559524536133, "step_time_ms": 7410.844802856445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:04] (step=0004433) Train Loss: 0.2250, Train Steps/Sec: 0.13, Epoch: 0.086144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4434, "loss": 0.18787723779678345, "memory_gb": 7.721559524536133, "step_time_ms": 7256.409168243408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:12] (step=0004434) Train Loss: 0.2645, Train Steps/Sec: 0.13, Epoch: 0.08616401088223863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4435, "loss": 0.20931801199913025, "memory_gb": 7.721559524536133, "step_time_ms": 7470.327138900757, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:20] (step=0004435) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 
0.08618344345122425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4436, "loss": 0.22681018710136414, "memory_gb": 7.721559524536133, "step_time_ms": 4916.8524742126465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:25] (step=0004436) Train Loss: 0.2549, Train Steps/Sec: 0.18, Epoch: 0.08620287602020987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4437, "loss": 0.1307874321937561, "memory_gb": 7.721559524536133, "step_time_ms": 7471.924781799316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:33] (step=0004437) Train Loss: 0.1598, Train Steps/Sec: 0.12, Epoch: 0.0862223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4438, "loss": 0.337959885597229, "memory_gb": 7.721559524536133, "step_time_ms": 7465.395450592041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:41] (step=0004438) Train Loss: 0.2692, Train Steps/Sec: 0.12, Epoch: 0.08624174115818112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4439, "loss": 0.35340988636016846, "memory_gb": 7.721559524536133, "step_time_ms": 7410.470247268677, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:49] (step=0004439) Train Loss: 0.2757, Train Steps/Sec: 0.13, Epoch: 0.08626117372716673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:45:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4440, "loss": 0.2769715189933777, "memory_gb": 7.721559524536133, "step_time_ms": 7475.418329238892, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:45:57] (step=0004440) Train Loss: 0.2327, Train Steps/Sec: 0.12, Epoch: 0.08628060629615235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4441, "loss": 0.17567656934261322, "memory_gb": 
7.721559524536133, "step_time_ms": 7459.939002990723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:46:05] (step=0004441) Train Loss: 0.1592, Train Steps/Sec: 0.13, Epoch: 0.08630003886513797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:46:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4442, "loss": 0.3319205641746521, "memory_gb": 7.721559524536133, "step_time_ms": 7440.061807632446, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:46:13] (step=0004442) Train Loss: 0.2552, Train Steps/Sec: 0.13, Epoch: 0.08631947143412359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:46:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4443, "loss": 0.21373671293258667, "memory_gb": 7.721559524536133, "step_time_ms": 7487.811326980591, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:46:21] (step=0004443) Train Loss: 0.2751, Train Steps/Sec: 0.12, Epoch: 0.08633890400310922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4444, "loss": 0.21376650035381317, "memory_gb": 7.721559524536133, "step_time_ms": 7477.14900970459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:46:30] (step=0004444) Train Loss: 0.2103, Train Steps/Sec: 0.12, Epoch: 0.08635833657209484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:46:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4445, "loss": 0.2916949987411499, "memory_gb": 7.721559524536133, "step_time_ms": 7399.100780487061, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:46:38] (step=0004445) Train Loss: 0.2191, Train Steps/Sec: 0.12, Epoch: 0.08637776914108045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:46:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4446, "loss": 0.34884870052337646, "memory_gb": 7.721559524536133, "step_time_ms": 7473.29568862915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:46:46] (step=0004446) Train Loss: 0.3042, Train Steps/Sec: 
0.12, Epoch: 0.08639720171006607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:46:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4447, "loss": 0.22381602227687836, "memory_gb": 7.721559524536133, "step_time_ms": 7412.383794784546, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:46:54] (step=0004447) Train Loss: 0.2965, Train Steps/Sec: 0.13, Epoch: 0.08641663427905169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4448, "loss": 0.25007379055023193, "memory_gb": 7.721559524536133, "step_time_ms": 7457.835674285889, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:02] (step=0004448) Train Loss: 0.2392, Train Steps/Sec: 0.13, Epoch: 0.08643606684803731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4449, "loss": 0.219772070646286, "memory_gb": 7.721559524536133, "step_time_ms": 7532.470464706421, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:10] (step=0004449) Train Loss: 0.2774, Train Steps/Sec: 0.12, Epoch: 0.08645549941702294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4450, "loss": 0.3013645112514496, "memory_gb": 7.721559524536133, "step_time_ms": 7541.0168170928955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:18] (step=0004450) Train Loss: 0.2776, Train Steps/Sec: 0.12, Epoch: 0.08647493198600854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4451, "loss": 0.3183831572532654, "memory_gb": 7.721559524536133, "step_time_ms": 7419.052362442017, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:26] (step=0004451) Train Loss: 0.2556, Train Steps/Sec: 0.13, Epoch: 0.08649436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4452, "loss": 0.2131151407957077, 
"memory_gb": 7.721559524536133, "step_time_ms": 7542.287588119507, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:34] (step=0004452) Train Loss: 0.2466, Train Steps/Sec: 0.12, Epoch: 0.08651379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4453, "loss": 0.1911378800868988, "memory_gb": 7.721559524536133, "step_time_ms": 7524.245977401733, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:42] (step=0004453) Train Loss: 0.2512, Train Steps/Sec: 0.12, Epoch: 0.08653322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4454, "loss": 0.25845110416412354, "memory_gb": 7.721559524536133, "step_time_ms": 7499.76372718811, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:50] (step=0004454) Train Loss: 0.2423, Train Steps/Sec: 0.12, Epoch: 0.08655266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:47:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4455, "loss": 0.24144382774829865, "memory_gb": 7.721559524536133, "step_time_ms": 7561.454057693481, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:47:58] (step=0004455) Train Loss: 0.2663, Train Steps/Sec: 0.12, Epoch: 0.08657209483093666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:48:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4456, "loss": 0.18953090906143188, "memory_gb": 7.721559524536133, "step_time_ms": 7590.690851211548, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:48:06] (step=0004456) Train Loss: 0.1618, Train Steps/Sec: 0.12, Epoch: 0.08659152739992226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:48:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4457, "loss": 0.270005464553833, "memory_gb": 7.721559524536133, "step_time_ms": 7599.5283126831055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:48:14] (step=0004457) Train Loss: 0.2399, Train 
Steps/Sec: 0.12, Epoch: 0.08661095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:48:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4458, "loss": 0.26110750436782837, "memory_gb": 7.721559524536133, "step_time_ms": 7632.360219955444, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:48:22] (step=0004458) Train Loss: 0.2834, Train Steps/Sec: 0.12, Epoch: 0.08663039253789351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:48:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4459, "loss": 0.2625512480735779, "memory_gb": 7.721559524536133, "step_time_ms": 7615.732431411743, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:48:30] (step=0004459) Train Loss: 0.2638, Train Steps/Sec: 0.13, Epoch: 0.08664982510687913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:48:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4460, "loss": 0.22672757506370544, "memory_gb": 7.721559524536133, "step_time_ms": 7552.8693199157715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:48:38] (step=0004460) Train Loss: 0.2342, Train Steps/Sec: 0.13, Epoch: 0.08666925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4461, "loss": 0.23730197548866272, "memory_gb": 7.721559524536133, "step_time_ms": 7613.339900970459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:48:46] (step=0004461) Train Loss: 0.3009, Train Steps/Sec: 0.12, Epoch: 0.08668869024485037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:48:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4462, "loss": 0.28964763879776, "memory_gb": 7.721559524536133, "step_time_ms": 7600.23307800293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:48:54] (step=0004462) Train Loss: 0.2857, Train Steps/Sec: 0.13, Epoch: 0.08670812281383598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4463, "loss": 
0.3652384877204895, "memory_gb": 7.721559524536133, "step_time_ms": 7411.802768707275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:02] (step=0004463) Train Loss: 0.2765, Train Steps/Sec: 0.13, Epoch: 0.0867275553828216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4464, "loss": 0.11186471581459045, "memory_gb": 7.721559524536133, "step_time_ms": 7650.245428085327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:10] (step=0004464) Train Loss: 0.1532, Train Steps/Sec: 0.12, Epoch: 0.08674698795180723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4465, "loss": 0.27212226390838623, "memory_gb": 7.721559524536133, "step_time_ms": 5165.7350063323975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:15] (step=0004465) Train Loss: 0.2450, Train Steps/Sec: 0.19, Epoch: 0.08676642052079285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4466, "loss": 0.2337748259305954, "memory_gb": 7.721559524536133, "step_time_ms": 7628.305912017822, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:23] (step=0004466) Train Loss: 0.2418, Train Steps/Sec: 0.12, Epoch: 0.08678585308977847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4467, "loss": 0.22288832068443298, "memory_gb": 7.721559524536133, "step_time_ms": 7569.775342941284, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:31] (step=0004467) Train Loss: 0.2795, Train Steps/Sec: 0.12, Epoch: 0.0868052856587641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4468, "loss": 0.2074560523033142, "memory_gb": 7.721559524536133, "step_time_ms": 7508.946657180786, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:40] (step=0004468) Train 
Loss: 0.1889, Train Steps/Sec: 0.12, Epoch: 0.0868247182277497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4469, "loss": 0.28574657440185547, "memory_gb": 7.721559524536133, "step_time_ms": 7584.201335906982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:48] (step=0004469) Train Loss: 0.2673, Train Steps/Sec: 0.12, Epoch: 0.08684415079673533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:49:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4470, "loss": 0.211391881108284, "memory_gb": 7.721559524536133, "step_time_ms": 7494.510650634766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:49:56] (step=0004470) Train Loss: 0.2102, Train Steps/Sec: 0.12, Epoch: 0.08686358336572095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:50:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4471, "loss": 0.27876147627830505, "memory_gb": 7.721559524536133, "step_time_ms": 7457.491159439087, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:50:04] (step=0004471) Train Loss: 0.2542, Train Steps/Sec: 0.13, Epoch: 0.08688301593470657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:50:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4472, "loss": 0.22009463608264923, "memory_gb": 7.721559524536133, "step_time_ms": 7557.865381240845, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:50:12] (step=0004472) Train Loss: 0.2497, Train Steps/Sec: 0.12, Epoch: 0.08690244850369219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:50:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4473, "loss": 0.2162466049194336, "memory_gb": 7.721559524536133, "step_time_ms": 7530.78031539917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:50:20] (step=0004473) Train Loss: 0.2152, Train Steps/Sec: 0.13, Epoch: 0.08692188107267781, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:50:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4474, 
"loss": 0.30076462030410767, "memory_gb": 7.721559524536133, "step_time_ms": 7438.1163120269775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:50:28] (step=0004474) Train Loss: 0.3306, Train Steps/Sec: 0.13, Epoch: 0.08694131364166342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:50:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4475, "loss": 0.1561325192451477, "memory_gb": 7.721559524536133, "step_time_ms": 7706.983327865601, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:50:36] (step=0004475) Train Loss: 0.1826, Train Steps/Sec: 0.12, Epoch: 0.08696074621064905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:50:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4476, "loss": 0.16309380531311035, "memory_gb": 7.721559524536133, "step_time_ms": 7508.754014968872, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:50:44] (step=0004476) Train Loss: 0.2019, Train Steps/Sec: 0.13, Epoch: 0.08698017877963467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:50:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4477, "loss": 0.23252028226852417, "memory_gb": 7.721559524536133, "step_time_ms": 7279.28900718689, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:50:52] (step=0004477) Train Loss: 0.2369, Train Steps/Sec: 0.12, Epoch: 0.08699961134862029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4478, "loss": 0.17699165642261505, "memory_gb": 7.721559524536133, "step_time_ms": 7495.5339431762695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:00] (step=0004478) Train Loss: 0.2086, Train Steps/Sec: 0.12, Epoch: 0.08701904391760591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4479, "loss": 0.24645251035690308, "memory_gb": 7.721559524536133, "step_time_ms": 7466.095209121704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:08] 
(step=0004479) Train Loss: 0.2549, Train Steps/Sec: 0.12, Epoch: 0.08703847648659152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4480, "loss": 0.18952801823616028, "memory_gb": 7.721559524536133, "step_time_ms": 7381.41942024231, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:16] (step=0004480) Train Loss: 0.1759, Train Steps/Sec: 0.13, Epoch: 0.08705790905557714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4481, "loss": 0.24938586354255676, "memory_gb": 7.721559524536133, "step_time_ms": 7469.372987747192, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:24] (step=0004481) Train Loss: 0.2338, Train Steps/Sec: 0.13, Epoch: 0.08707734162456277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4482, "loss": 0.30792880058288574, "memory_gb": 7.721559524536133, "step_time_ms": 7485.679626464844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:32] (step=0004482) Train Loss: 0.3140, Train Steps/Sec: 0.12, Epoch: 0.08709677419354839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4483, "loss": 0.1944715976715088, "memory_gb": 7.721559524536133, "step_time_ms": 7459.832668304443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:40] (step=0004483) Train Loss: 0.2904, Train Steps/Sec: 0.12, Epoch: 0.08711620676253401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4484, "loss": 0.3495529294013977, "memory_gb": 7.721559524536133, "step_time_ms": 7448.924541473389, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:48] (step=0004484) Train Loss: 0.3107, Train Steps/Sec: 0.13, Epoch: 0.08713563933151963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:51:56] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 4485, "loss": 0.3013876676559448, "memory_gb": 7.721559524536133, "step_time_ms": 7479.173898696899, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:51:56] (step=0004485) Train Loss: 0.2420, Train Steps/Sec: 0.12, Epoch: 0.08715507190050524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:52:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4486, "loss": 0.179831862449646, "memory_gb": 7.721559524536133, "step_time_ms": 7407.309055328369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:52:04] (step=0004486) Train Loss: 0.1838, Train Steps/Sec: 0.13, Epoch: 0.08717450446949086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:52:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4487, "loss": 0.2507564425468445, "memory_gb": 7.721559524536133, "step_time_ms": 7408.581733703613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:52:12] (step=0004487) Train Loss: 0.2513, Train Steps/Sec: 0.13, Epoch: 0.08719393703847649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:52:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4488, "loss": 0.2880013585090637, "memory_gb": 7.721559524536133, "step_time_ms": 7421.214580535889, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:52:20] (step=0004488) Train Loss: 0.2736, Train Steps/Sec: 0.13, Epoch: 0.08721336960746211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:52:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4489, "loss": 0.14201904833316803, "memory_gb": 7.721559524536133, "step_time_ms": 7387.5157833099365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:52:28] (step=0004489) Train Loss: 0.1900, Train Steps/Sec: 0.13, Epoch: 0.08723280217644773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:52:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4490, "loss": 0.31861066818237305, "memory_gb": 7.721559524536133, "step_time_ms": 7467.7698612213135, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 03:52:36] (step=0004490) Train Loss: 0.3552, Train Steps/Sec: 0.12, Epoch: 0.08725223474543335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:52:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4491, "loss": 0.21137571334838867, "memory_gb": 7.721559524536133, "step_time_ms": 7439.184188842773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:52:44] (step=0004491) Train Loss: 0.1847, Train Steps/Sec: 0.12, Epoch: 0.08727166731441896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:52:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4492, "loss": 0.22185109555721283, "memory_gb": 7.721559524536133, "step_time_ms": 7319.713592529297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:52:52] (step=0004492) Train Loss: 0.1830, Train Steps/Sec: 0.13, Epoch: 0.08729109988340458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4493, "loss": 0.09523417800664902, "memory_gb": 7.721559524536133, "step_time_ms": 7506.4191818237305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:00] (step=0004493) Train Loss: 0.1390, Train Steps/Sec: 0.13, Epoch: 0.0873105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4494, "loss": 0.3478810489177704, "memory_gb": 7.721559524536133, "step_time_ms": 5334.6099853515625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:05] (step=0004494) Train Loss: 0.3069, Train Steps/Sec: 0.18, Epoch: 0.08732996502137583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4495, "loss": 0.35983845591545105, "memory_gb": 7.721559524536133, "step_time_ms": 7519.220352172852, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:13] (step=0004495) Train Loss: 0.2724, Train Steps/Sec: 0.12, Epoch: 0.08734939759036145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:21] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 4496, "loss": 0.2224874496459961, "memory_gb": 7.721559524536133, "step_time_ms": 7418.707370758057, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:21] (step=0004496) Train Loss: 0.2588, Train Steps/Sec: 0.13, Epoch: 0.08736883015934707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4497, "loss": 0.24662475287914276, "memory_gb": 7.721559524536133, "step_time_ms": 7439.960956573486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:29] (step=0004497) Train Loss: 0.2345, Train Steps/Sec: 0.13, Epoch: 0.08738826272833268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4498, "loss": 0.22724851965904236, "memory_gb": 7.721559524536133, "step_time_ms": 7544.221878051758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:37] (step=0004498) Train Loss: 0.2198, Train Steps/Sec: 0.12, Epoch: 0.0874076952973183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4499, "loss": 0.298520028591156, "memory_gb": 7.721559524536133, "step_time_ms": 7482.424020767212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:45] (step=0004499) Train Loss: 0.3098, Train Steps/Sec: 0.12, Epoch: 0.08742712786630392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4500, "loss": 0.20715688169002533, "memory_gb": 7.721559524536133, "step_time_ms": 7450.642824172974, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:53:53] (step=0004500) Train Loss: 0.2137, Train Steps/Sec: 0.13, Epoch: 0.08744656043528955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4501, "loss": 0.20399364829063416, "memory_gb": 7.721559524536133, "step_time_ms": 7557.531833648682, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 03:54:01] (step=0004501) Train Loss: 0.2254, Train Steps/Sec: 0.12, Epoch: 0.08746599300427517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:54:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4502, "loss": 0.3669475018978119, "memory_gb": 7.721559524536133, "step_time_ms": 7529.122114181519, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:54:10] (step=0004502) Train Loss: 0.3618, Train Steps/Sec: 0.12, Epoch: 0.08748542557326079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:54:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4503, "loss": 0.19795551896095276, "memory_gb": 7.721559524536133, "step_time_ms": 7448.509931564331, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:54:18] (step=0004503) Train Loss: 0.2728, Train Steps/Sec: 0.12, Epoch: 0.0875048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:54:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4504, "loss": 0.30759668350219727, "memory_gb": 7.721559524536133, "step_time_ms": 7538.069009780884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:54:26] (step=0004504) Train Loss: 0.2916, Train Steps/Sec: 0.12, Epoch: 0.08752429071123202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:54:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4505, "loss": 0.26656192541122437, "memory_gb": 7.721559524536133, "step_time_ms": 7609.528303146362, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:54:34] (step=0004505) Train Loss: 0.2396, Train Steps/Sec: 0.12, Epoch: 0.08754372328021764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:54:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4506, "loss": 0.2211715131998062, "memory_gb": 7.721559524536133, "step_time_ms": 7510.021209716797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:54:42] (step=0004506) Train Loss: 0.2445, Train Steps/Sec: 0.12, Epoch: 0.08756315584920327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 03:54:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4507, "loss": 0.35771268606185913, "memory_gb": 7.721559524536133, "step_time_ms": 7574.746608734131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:54:50] (step=0004507) Train Loss: 0.2468, Train Steps/Sec: 0.12, Epoch: 0.08758258841818889, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:54:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4508, "loss": 0.2737607955932617, "memory_gb": 7.721559524536133, "step_time_ms": 7545.623540878296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:54:58] (step=0004508) Train Loss: 0.2994, Train Steps/Sec: 0.12, Epoch: 0.0876020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:55:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4509, "loss": 0.17971590161323547, "memory_gb": 7.721559524536133, "step_time_ms": 7462.926864624023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:55:06] (step=0004509) Train Loss: 0.1691, Train Steps/Sec: 0.13, Epoch: 0.08762145355616012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:55:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4510, "loss": 0.27159252762794495, "memory_gb": 7.721559524536133, "step_time_ms": 7544.193983078003, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:55:14] (step=0004510) Train Loss: 0.2409, Train Steps/Sec: 0.12, Epoch: 0.08764088612514574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:55:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4511, "loss": 0.30672985315322876, "memory_gb": 7.721559524536133, "step_time_ms": 7592.221975326538, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:55:22] (step=0004511) Train Loss: 0.2819, Train Steps/Sec: 0.12, Epoch: 0.08766031869413136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:55:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4512, "loss": 0.22366385161876678, "memory_gb": 7.721559524536133, "step_time_ms": 7502.554178237915, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 03:55:30] (step=0004512) Train Loss: 0.2194, Train Steps/Sec: 0.13, Epoch: 0.08767975126311699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:55:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4513, "loss": 0.21570807695388794, "memory_gb": 7.721559524536133, "step_time_ms": 7539.56937789917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:55:38] (step=0004513) Train Loss: 0.1862, Train Steps/Sec: 0.12, Epoch: 0.08769918383210261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:55:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4514, "loss": 0.16565142571926117, "memory_gb": 7.721559524536133, "step_time_ms": 7548.765420913696, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:55:46] (step=0004514) Train Loss: 0.1396, Train Steps/Sec: 0.12, Epoch: 0.08771861640108822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:55:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4515, "loss": 0.2313256710767746, "memory_gb": 7.721559524536133, "step_time_ms": 7474.626064300537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:55:54] (step=0004515) Train Loss: 0.2061, Train Steps/Sec: 0.12, Epoch: 0.08773804897007384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:56:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4516, "loss": 0.1813277304172516, "memory_gb": 7.721559524536133, "step_time_ms": 7638.276815414429, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:02] (step=0004516) Train Loss: 0.2596, Train Steps/Sec: 0.12, Epoch: 0.08775748153905946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:56:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4517, "loss": 0.29663795232772827, "memory_gb": 7.721559524536133, "step_time_ms": 7622.480392456055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:10] (step=0004517) Train Loss: 0.3245, Train Steps/Sec: 0.12, Epoch: 0.08777691410804508, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 03:56:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4518, "loss": 0.38086527585983276, "memory_gb": 7.721559524536133, "step_time_ms": 7536.616563796997, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:18] (step=0004518) Train Loss: 0.3785, Train Steps/Sec: 0.12, Epoch: 0.0877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:56:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4519, "loss": 0.23615244030952454, "memory_gb": 7.721559524536133, "step_time_ms": 7578.278303146362, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:26] (step=0004519) Train Loss: 0.2698, Train Steps/Sec: 0.12, Epoch: 0.08781577924601633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:56:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4520, "loss": 0.3101663589477539, "memory_gb": 7.721559524536133, "step_time_ms": 7524.707317352295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:34] (step=0004520) Train Loss: 0.2173, Train Steps/Sec: 0.12, Epoch: 0.08783521181500194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:56:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4521, "loss": 0.20730070769786835, "memory_gb": 7.721559524536133, "step_time_ms": 7343.315601348877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:42] (step=0004521) Train Loss: 0.2070, Train Steps/Sec: 0.13, Epoch: 0.08785464438398756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:56:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4522, "loss": 0.22011986374855042, "memory_gb": 7.721559524536133, "step_time_ms": 7289.21103477478, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:50] (step=0004522) Train Loss: 0.2535, Train Steps/Sec: 0.13, Epoch: 0.08787407695297318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:56:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4523, "loss": 0.2153330147266388, "memory_gb": 7.721559524536133, "step_time_ms": 
5811.811208724976, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:56:56] (step=0004523) Train Loss: 0.2373, Train Steps/Sec: 0.15, Epoch: 0.0878935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:57:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4524, "loss": 0.2613356113433838, "memory_gb": 7.721559524536133, "step_time_ms": 7517.194986343384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:57:04] (step=0004524) Train Loss: 0.2768, Train Steps/Sec: 0.12, Epoch: 0.08791294209094443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4525, "loss": 0.27764397859573364, "memory_gb": 7.721559524536133, "step_time_ms": 7483.550548553467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:57:12] (step=0004525) Train Loss: 0.3088, Train Steps/Sec: 0.12, Epoch: 0.08793237465993005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:57:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4526, "loss": 0.23986992239952087, "memory_gb": 7.721559524536133, "step_time_ms": 7471.760511398315, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:57:21] (step=0004526) Train Loss: 0.2153, Train Steps/Sec: 0.12, Epoch: 0.08795180722891566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:57:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4527, "loss": 0.3053604066371918, "memory_gb": 7.721559524536133, "step_time_ms": 7404.343128204346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:57:29] (step=0004527) Train Loss: 0.2564, Train Steps/Sec: 0.13, Epoch: 0.08797123979790128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4528, "loss": 0.1751684695482254, "memory_gb": 7.721559524536133, "step_time_ms": 7480.591535568237, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:57:37] (step=0004528) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.0879906723668869, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:57:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4529, "loss": 0.24879328906536102, "memory_gb": 7.721559524536133, "step_time_ms": 7384.352445602417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:57:45] (step=0004529) Train Loss: 0.2263, Train Steps/Sec: 0.13, Epoch: 0.08801010493587252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:57:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4530, "loss": 0.13114285469055176, "memory_gb": 7.721559524536133, "step_time_ms": 7437.873363494873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:57:53] (step=0004530) Train Loss: 0.1758, Train Steps/Sec: 0.13, Epoch: 0.08802953750485815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4531, "loss": 0.24612906575202942, "memory_gb": 7.721559524536133, "step_time_ms": 7499.0832805633545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:01] (step=0004531) Train Loss: 0.2491, Train Steps/Sec: 0.12, Epoch: 0.08804897007384377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4532, "loss": 0.2437373250722885, "memory_gb": 7.721559524536133, "step_time_ms": 7400.511264801025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:09] (step=0004532) Train Loss: 0.2399, Train Steps/Sec: 0.13, Epoch: 0.08806840264282938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4533, "loss": 0.19559353590011597, "memory_gb": 7.721559524536133, "step_time_ms": 7498.565435409546, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:17] (step=0004533) Train Loss: 0.1904, Train Steps/Sec: 0.12, Epoch: 0.088087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4534, "loss": 0.27741166949272156, "memory_gb": 7.721559524536133, 
"step_time_ms": 7477.541208267212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:25] (step=0004534) Train Loss: 0.2741, Train Steps/Sec: 0.12, Epoch: 0.08810726778080062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4535, "loss": 0.2525542676448822, "memory_gb": 7.721559524536133, "step_time_ms": 7424.778461456299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:33] (step=0004535) Train Loss: 0.1987, Train Steps/Sec: 0.13, Epoch: 0.08812670034978624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4536, "loss": 0.19436655938625336, "memory_gb": 7.721559524536133, "step_time_ms": 7411.937952041626, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:41] (step=0004536) Train Loss: 0.2656, Train Steps/Sec: 0.13, Epoch: 0.08814613291877187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4537, "loss": 0.25228115916252136, "memory_gb": 7.721559524536133, "step_time_ms": 7465.453863143921, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:49] (step=0004537) Train Loss: 0.2376, Train Steps/Sec: 0.12, Epoch: 0.08816556548775749, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:58:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4538, "loss": 0.19937632977962494, "memory_gb": 7.721559524536133, "step_time_ms": 7410.989999771118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:58:57] (step=0004538) Train Loss: 0.2676, Train Steps/Sec: 0.13, Epoch: 0.0881849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:59:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4539, "loss": 0.15104740858078003, "memory_gb": 7.721559524536133, "step_time_ms": 7401.905536651611, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:59:05] (step=0004539) Train Loss: 0.1952, Train Steps/Sec: 0.13, Epoch: 
0.08820443062572872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:59:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4540, "loss": 0.10889820754528046, "memory_gb": 7.721559524536133, "step_time_ms": 7480.945110321045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:59:13] (step=0004540) Train Loss: 0.1853, Train Steps/Sec: 0.13, Epoch: 0.08822386319471434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:59:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4541, "loss": 0.10899731516838074, "memory_gb": 7.721559524536133, "step_time_ms": 7440.8509731292725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:59:21] (step=0004541) Train Loss: 0.2001, Train Steps/Sec: 0.13, Epoch: 0.08824329576369996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:59:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4542, "loss": 0.19903577864170074, "memory_gb": 7.721559524536133, "step_time_ms": 7458.031415939331, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:59:29] (step=0004542) Train Loss: 0.2102, Train Steps/Sec: 0.13, Epoch: 0.08826272833268559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:59:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4543, "loss": 0.26227402687072754, "memory_gb": 7.721559524536133, "step_time_ms": 7474.5941162109375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:59:37] (step=0004543) Train Loss: 0.2791, Train Steps/Sec: 0.12, Epoch: 0.0882821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:59:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4544, "loss": 0.35084620118141174, "memory_gb": 7.721559524536133, "step_time_ms": 7216.822862625122, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:59:45] (step=0004544) Train Loss: 0.2776, Train Steps/Sec: 0.13, Epoch: 0.08830159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 03:59:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4545, "loss": 0.19171732664108276, 
"memory_gb": 7.721559524536133, "step_time_ms": 7490.414142608643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 03:59:53] (step=0004545) Train Loss: 0.2440, Train Steps/Sec: 0.12, Epoch: 0.08832102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4546, "loss": 0.19629806280136108, "memory_gb": 7.721559524536133, "step_time_ms": 7531.5752029418945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:01] (step=0004546) Train Loss: 0.1699, Train Steps/Sec: 0.12, Epoch: 0.08834045860862806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4547, "loss": 0.20693819224834442, "memory_gb": 7.721559524536133, "step_time_ms": 7493.414878845215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:09] (step=0004547) Train Loss: 0.2929, Train Steps/Sec: 0.12, Epoch: 0.08835989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4548, "loss": 0.20828017592430115, "memory_gb": 7.721559524536133, "step_time_ms": 7474.823951721191, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:17] (step=0004548) Train Loss: 0.2647, Train Steps/Sec: 0.13, Epoch: 0.0883793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4549, "loss": 0.3117937445640564, "memory_gb": 7.721559524536133, "step_time_ms": 7507.6775550842285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:25] (step=0004549) Train Loss: 0.2625, Train Steps/Sec: 0.12, Epoch: 0.08839875631558491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4550, "loss": 0.2974223494529724, "memory_gb": 7.721559524536133, "step_time_ms": 7324.971914291382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:33] (step=0004550) Train Loss: 0.2521, 
Train Steps/Sec: 0.13, Epoch: 0.08841818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4551, "loss": 0.26426005363464355, "memory_gb": 7.721559524536133, "step_time_ms": 6153.226375579834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:39] (step=0004551) Train Loss: 0.2820, Train Steps/Sec: 0.15, Epoch: 0.08843762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4552, "loss": 0.21321482956409454, "memory_gb": 7.721559524536133, "step_time_ms": 7535.528182983398, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:47] (step=0004552) Train Loss: 0.2193, Train Steps/Sec: 0.13, Epoch: 0.08845705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4553, "loss": 0.22445619106292725, "memory_gb": 7.721559524536133, "step_time_ms": 7528.208255767822, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:00:55] (step=0004553) Train Loss: 0.2308, Train Steps/Sec: 0.12, Epoch: 0.0884764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:01:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4554, "loss": 0.3448057770729065, "memory_gb": 7.721559524536133, "step_time_ms": 7556.902170181274, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:01:03] (step=0004554) Train Loss: 0.2439, Train Steps/Sec: 0.12, Epoch: 0.08849591916051303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:01:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4555, "loss": 0.2841191291809082, "memory_gb": 7.721559524536133, "step_time_ms": 7481.05788230896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:01:11] (step=0004555) Train Loss: 0.2408, Train Steps/Sec: 0.12, Epoch: 0.08851535172949863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:01:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4556, "loss": 
0.2727207541465759, "memory_gb": 7.721559524536133, "step_time_ms": 7526.070356369019, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:01:19] (step=0004556) Train Loss: 0.2997, Train Steps/Sec: 0.12, Epoch: 0.08853478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:01:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4557, "loss": 0.20136545598506927, "memory_gb": 7.721559524536133, "step_time_ms": 7526.863098144531, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:01:28] (step=0004557) Train Loss: 0.1894, Train Steps/Sec: 0.12, Epoch: 0.08855421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:01:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4558, "loss": 0.19044384360313416, "memory_gb": 7.721559524536133, "step_time_ms": 7550.039052963257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:01:36] (step=0004558) Train Loss: 0.2063, Train Steps/Sec: 0.12, Epoch: 0.0885736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4559, "loss": 0.3090547025203705, "memory_gb": 7.721559524536133, "step_time_ms": 7550.404071807861, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:01:44] (step=0004559) Train Loss: 0.2695, Train Steps/Sec: 0.13, Epoch: 0.08859308200544112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:01:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4560, "loss": 0.2598935663700104, "memory_gb": 7.721559524536133, "step_time_ms": 7654.5984745025635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:01:52] (step=0004560) Train Loss: 0.2608, Train Steps/Sec: 0.12, Epoch: 0.08861251457442675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4561, "loss": 0.2304852306842804, "memory_gb": 7.721559524536133, "step_time_ms": 7587.827682495117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:00] (step=0004561) Train 
Loss: 0.2175, Train Steps/Sec: 0.12, Epoch: 0.08863194714341235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4562, "loss": 0.2310703545808792, "memory_gb": 7.721559524536133, "step_time_ms": 7460.099220275879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:08] (step=0004562) Train Loss: 0.2444, Train Steps/Sec: 0.12, Epoch: 0.08865137971239798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4563, "loss": 0.27944931387901306, "memory_gb": 7.721559524536133, "step_time_ms": 7556.652784347534, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:16] (step=0004563) Train Loss: 0.2830, Train Steps/Sec: 0.12, Epoch: 0.0886708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4564, "loss": 0.1514683961868286, "memory_gb": 7.721559524536133, "step_time_ms": 7700.268983840942, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:24] (step=0004564) Train Loss: 0.1835, Train Steps/Sec: 0.12, Epoch: 0.08869024485036922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4565, "loss": 0.31687185168266296, "memory_gb": 7.721559524536133, "step_time_ms": 7531.710863113403, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:32] (step=0004565) Train Loss: 0.2907, Train Steps/Sec: 0.12, Epoch: 0.08870967741935484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4566, "loss": 0.2521807551383972, "memory_gb": 7.721559524536133, "step_time_ms": 7576.6613483428955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:40] (step=0004566) Train Loss: 0.2403, Train Steps/Sec: 0.12, Epoch: 0.08872910998834047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 
4567, "loss": 0.17111708223819733, "memory_gb": 7.721559524536133, "step_time_ms": 7548.108816146851, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:48] (step=0004567) Train Loss: 0.2164, Train Steps/Sec: 0.12, Epoch: 0.08874854255732607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:02:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4568, "loss": 0.2709445357322693, "memory_gb": 7.721559524536133, "step_time_ms": 7516.27516746521, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:02:56] (step=0004568) Train Loss: 0.2611, Train Steps/Sec: 0.12, Epoch: 0.0887679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:03:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4569, "loss": 0.2146347612142563, "memory_gb": 7.721559524536133, "step_time_ms": 7566.911458969116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:03:04] (step=0004569) Train Loss: 0.2338, Train Steps/Sec: 0.12, Epoch: 0.08878740769529732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:03:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4570, "loss": 0.30588167905807495, "memory_gb": 7.721559524536133, "step_time_ms": 7556.455612182617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:03:12] (step=0004570) Train Loss: 0.2929, Train Steps/Sec: 0.12, Epoch: 0.08880684026428294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:03:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4571, "loss": 0.3122051954269409, "memory_gb": 7.721559524536133, "step_time_ms": 7514.511823654175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:03:20] (step=0004571) Train Loss: 0.2490, Train Steps/Sec: 0.12, Epoch: 0.08882627283326856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4572, "loss": 0.3060111403465271, "memory_gb": 7.721559524536133, "step_time_ms": 7535.583257675171, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:03:28] 
(step=0004572) Train Loss: 0.2286, Train Steps/Sec: 0.12, Epoch: 0.08884570540225417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:03:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4573, "loss": 0.3165794909000397, "memory_gb": 7.721559524536133, "step_time_ms": 7529.896020889282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:03:36] (step=0004573) Train Loss: 0.2616, Train Steps/Sec: 0.12, Epoch: 0.0888651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:03:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4574, "loss": 0.27504560351371765, "memory_gb": 7.721559524536133, "step_time_ms": 7475.583553314209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:03:44] (step=0004574) Train Loss: 0.2110, Train Steps/Sec: 0.13, Epoch: 0.08888457054022542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:03:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4575, "loss": 0.23409581184387207, "memory_gb": 7.721559524536133, "step_time_ms": 7528.68127822876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:03:52] (step=0004575) Train Loss: 0.2340, Train Steps/Sec: 0.12, Epoch: 0.08890400310921104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4576, "loss": 0.33770322799682617, "memory_gb": 7.721559524536133, "step_time_ms": 7466.304779052734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:04:00] (step=0004576) Train Loss: 0.2625, Train Steps/Sec: 0.12, Epoch: 0.08892343567819666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4577, "loss": 0.28364527225494385, "memory_gb": 7.721559524536133, "step_time_ms": 7415.076971054077, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:04:08] (step=0004577) Train Loss: 0.2482, Train Steps/Sec: 0.13, Epoch: 0.08894286824718228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:16] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 4578, "loss": 0.2878182530403137, "memory_gb": 7.721559524536133, "step_time_ms": 7453.271389007568, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:04:16] (step=0004578) Train Loss: 0.2252, Train Steps/Sec: 0.13, Epoch: 0.08896230081616789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4579, "loss": 0.29846101999282837, "memory_gb": 7.721559524536133, "step_time_ms": 7478.0988693237305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:04:24] (step=0004579) Train Loss: 0.2361, Train Steps/Sec: 0.13, Epoch: 0.08898173338515351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4580, "loss": 0.24508816003799438, "memory_gb": 7.721559524536133, "step_time_ms": 5529.219627380371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:04:30] (step=0004580) Train Loss: 0.2624, Train Steps/Sec: 0.17, Epoch: 0.08900116595413914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4581, "loss": 0.10936616361141205, "memory_gb": 7.721559524536133, "step_time_ms": 7491.733551025391, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:04:38] (step=0004581) Train Loss: 0.1616, Train Steps/Sec: 0.13, Epoch: 0.08902059852312476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4582, "loss": 0.19696977734565735, "memory_gb": 7.721559524536133, "step_time_ms": 7432.761669158936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:04:46] (step=0004582) Train Loss: 0.2361, Train Steps/Sec: 0.13, Epoch: 0.08904003109211038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:04:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4583, "loss": 0.22230762243270874, "memory_gb": 7.721559524536133, "step_time_ms": 7500.304222106934, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 04:04:54] (step=0004583) Train Loss: 0.2598, Train Steps/Sec: 0.12, Epoch: 0.089059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4584, "loss": 0.1560574769973755, "memory_gb": 7.721559524536133, "step_time_ms": 7430.418014526367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:02] (step=0004584) Train Loss: 0.1619, Train Steps/Sec: 0.13, Epoch: 0.08907889623008161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4585, "loss": 0.23478010296821594, "memory_gb": 7.715639114379883, "step_time_ms": 7424.9937534332275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:10] (step=0004585) Train Loss: 0.1968, Train Steps/Sec: 0.12, Epoch: 0.08909832879906723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4586, "loss": 0.22525301575660706, "memory_gb": 7.721559524536133, "step_time_ms": 7473.6597537994385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:18] (step=0004586) Train Loss: 0.2144, Train Steps/Sec: 0.12, Epoch: 0.08911776136805286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4587, "loss": 0.23835688829421997, "memory_gb": 7.721559524536133, "step_time_ms": 7427.681922912598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:26] (step=0004587) Train Loss: 0.2555, Train Steps/Sec: 0.12, Epoch: 0.08913719393703848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4588, "loss": 0.3025481402873993, "memory_gb": 7.721559524536133, "step_time_ms": 7374.639272689819, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:34] (step=0004588) Train Loss: 0.2840, Train Steps/Sec: 0.13, Epoch: 0.0891566265060241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4589, "loss": 0.11895811557769775, "memory_gb": 7.721559524536133, "step_time_ms": 7432.961225509644, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:42] (step=0004589) Train Loss: 0.1917, Train Steps/Sec: 0.13, Epoch: 0.08917605907500972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4590, "loss": 0.21652626991271973, "memory_gb": 7.721559524536133, "step_time_ms": 7458.782911300659, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:50] (step=0004590) Train Loss: 0.1665, Train Steps/Sec: 0.12, Epoch: 0.08919549164399533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4591, "loss": 0.23217028379440308, "memory_gb": 7.721559524536133, "step_time_ms": 7370.447635650635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:05:58] (step=0004591) Train Loss: 0.2338, Train Steps/Sec: 0.13, Epoch: 0.08921492421298095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:06:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4592, "loss": 0.18161392211914062, "memory_gb": 7.721559524536133, "step_time_ms": 7404.283046722412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:06:06] (step=0004592) Train Loss: 0.2342, Train Steps/Sec: 0.13, Epoch: 0.08923435678196658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:06:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4593, "loss": 0.25904884934425354, "memory_gb": 7.721559524536133, "step_time_ms": 7465.237617492676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:06:14] (step=0004593) Train Loss: 0.2636, Train Steps/Sec: 0.12, Epoch: 0.0892537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:06:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4594, "loss": 0.2470114380121231, "memory_gb": 7.721559524536133, "step_time_ms": 7394.989490509033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:06:22] (step=0004594) Train Loss: 0.2511, Train Steps/Sec: 0.12, Epoch: 0.08927322191993782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4595, "loss": 0.30148839950561523, "memory_gb": 7.721559524536133, "step_time_ms": 7465.120315551758, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:06:30] (step=0004595) Train Loss: 0.2831, Train Steps/Sec: 0.12, Epoch: 0.08929265448892344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:06:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4596, "loss": 0.26769179105758667, "memory_gb": 7.721559524536133, "step_time_ms": 7482.864856719971, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:06:38] (step=0004596) Train Loss: 0.2371, Train Steps/Sec: 0.12, Epoch: 0.08931208705790905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:06:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4597, "loss": 0.19783975183963776, "memory_gb": 7.721559524536133, "step_time_ms": 7421.414375305176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:06:46] (step=0004597) Train Loss: 0.1822, Train Steps/Sec: 0.12, Epoch: 0.08933151962689467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:06:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4598, "loss": 0.2611168622970581, "memory_gb": 7.721559524536133, "step_time_ms": 7515.383958816528, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:06:54] (step=0004598) Train Loss: 0.2731, Train Steps/Sec: 0.12, Epoch: 0.0893509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4599, "loss": 0.20974618196487427, "memory_gb": 7.721559524536133, "step_time_ms": 7508.606910705566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:02] (step=0004599) Train Loss: 0.1851, Train Steps/Sec: 0.12, Epoch: 0.08937038476486592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4600, "loss": 0.30233603715896606, "memory_gb": 7.721559524536133, "step_time_ms": 7416.443824768066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:10] (step=0004600) Train Loss: 0.2481, Train Steps/Sec: 0.13, Epoch: 0.08938981733385154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4601, "loss": 0.31762051582336426, "memory_gb": 7.721559524536133, "step_time_ms": 7433.730363845825, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:18] (step=0004601) Train Loss: 0.2624, Train Steps/Sec: 0.12, Epoch: 0.08940924990283715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4602, "loss": 0.1759553700685501, "memory_gb": 7.721559524536133, "step_time_ms": 7481.237888336182, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:27] (step=0004602) Train Loss: 0.2398, Train Steps/Sec: 0.12, Epoch: 0.08942868247182277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4603, "loss": 0.17298197746276855, "memory_gb": 7.721559524536133, "step_time_ms": 7456.39705657959, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:35] (step=0004603) Train Loss: 0.1727, Train Steps/Sec: 0.12, Epoch: 0.08944811504080839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4604, "loss": 0.2976754307746887, "memory_gb": 7.721559524536133, "step_time_ms": 7567.2173500061035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:43] (step=0004604) Train Loss: 0.2360, Train Steps/Sec: 0.13, Epoch: 0.08946754760979402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4605, "loss": 0.21532469987869263, "memory_gb": 7.721559524536133, "step_time_ms": 7536.055564880371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:51] (step=0004605) Train Loss: 0.2226, Train Steps/Sec: 0.12, Epoch: 0.08948698017877964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:07:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4606, "loss": 0.2750800848007202, "memory_gb": 7.721559524536133, "step_time_ms": 7419.437885284424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:07:59] (step=0004606) Train Loss: 0.2739, Train Steps/Sec: 0.12, Epoch: 0.08950641274776526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:08:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4607, "loss": 0.1914292275905609, "memory_gb": 7.721559524536133, "step_time_ms": 7374.743223190308, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:08:07] (step=0004607) Train Loss: 0.1990, Train Steps/Sec: 0.13, Epoch: 0.08952584531675087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:08:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4608, "loss": 0.23127800226211548, "memory_gb": 7.721559524536133, "step_time_ms": 7555.774211883545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:08:15] (step=0004608) Train Loss: 0.2752, Train Steps/Sec: 0.12, Epoch: 0.08954527788573649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:08:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4609, "loss": 0.23654863238334656, "memory_gb": 7.721559524536133, "step_time_ms": 5141.487121582031, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:08:21] (step=0004609) Train Loss: 0.2094, Train Steps/Sec: 0.17, Epoch: 0.08956471045472211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4610, "loss": 0.24703805148601532, "memory_gb": 7.721559524536133, "step_time_ms": 7544.136047363281, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:08:29] (step=0004610) Train Loss: 0.2049, Train Steps/Sec: 0.13, Epoch: 0.08958414302370774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:08:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4611, "loss": 0.18788111209869385, "memory_gb": 7.721559524536133, "step_time_ms": 7296.274423599243, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:08:36] (step=0004611) Train Loss: 0.1976, Train Steps/Sec: 0.13, Epoch: 0.08960357559269336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:08:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4612, "loss": 0.33167916536331177, "memory_gb": 7.721559524536133, "step_time_ms": 7548.316478729248, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:08:44] (step=0004612) Train Loss: 0.3063, Train Steps/Sec: 0.13, Epoch: 0.08962300816167898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:08:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4613, "loss": 0.2118397355079651, "memory_gb": 7.721559524536133, "step_time_ms": 7617.698669433594, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:08:52] (step=0004613) Train Loss: 0.1941, Train Steps/Sec: 0.12, Epoch: 0.08964244073066459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4614, "loss": 0.19825699925422668, "memory_gb": 7.721559524536133, "step_time_ms": 7541.723728179932, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:00] (step=0004614) Train Loss: 0.2154, Train Steps/Sec: 0.12, Epoch: 0.08966187329965021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4615, "loss": 0.19523940980434418, "memory_gb": 7.721559524536133, "step_time_ms": 7603.537321090698, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:09] (step=0004615) Train Loss: 0.2273, Train Steps/Sec: 0.12, Epoch: 0.08968130586863583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4616, "loss": 0.2561923861503601, "memory_gb": 7.721559524536133, "step_time_ms": 7620.053052902222, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:17] (step=0004616) Train Loss: 0.2432, Train Steps/Sec: 0.12, Epoch: 0.08970073843762146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4617, "loss": 0.3438510000705719, "memory_gb": 7.721559524536133, "step_time_ms": 7549.368381500244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:24] (step=0004617) Train Loss: 0.2966, Train Steps/Sec: 0.13, Epoch: 0.08972017100660708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4618, "loss": 0.20847749710083008, "memory_gb": 7.721559524536133, "step_time_ms": 7563.844680786133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:32] (step=0004618) Train Loss: 0.2113, Train Steps/Sec: 0.12, Epoch: 0.0897396035755927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4619, "loss": 0.22773294150829315, "memory_gb": 7.721559524536133, "step_time_ms": 7566.57862663269, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:41] (step=0004619) Train Loss: 0.2953, Train Steps/Sec: 0.12, Epoch: 0.08975903614457831, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4620, "loss": 0.2995166480541229, "memory_gb": 7.721559524536133, "step_time_ms": 7484.350681304932, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:49] (step=0004620) Train Loss: 0.2238, Train Steps/Sec: 0.13, Epoch: 0.08977846871356393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:09:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4621, "loss": 0.19720306992530823, "memory_gb": 7.721559524536133, "step_time_ms": 7469.891309738159, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:09:57] (step=0004621) Train Loss: 0.2544, Train Steps/Sec: 0.12, Epoch: 0.08979790128254955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:10:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4622, "loss": 0.25711023807525635, "memory_gb": 7.721559524536133, "step_time_ms": 7513.707160949707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:10:05] (step=0004622) Train Loss: 0.2209, Train Steps/Sec: 0.12, Epoch: 0.08981733385153517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:10:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4623, "loss": 0.279541552066803, "memory_gb": 7.721559524536133, "step_time_ms": 7453.213930130005, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:10:13] (step=0004623) Train Loss: 0.3074, Train Steps/Sec: 0.12, Epoch: 0.0898367664205208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:10:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4624, "loss": 0.19665777683258057, "memory_gb": 7.721559524536133, "step_time_ms": 7456.951141357422, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:10:21] (step=0004624) Train Loss: 0.2243, Train Steps/Sec: 0.12, Epoch: 0.08985619898950642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:10:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4625, "loss": 0.2784780263900757, "memory_gb": 7.721559524536133, "step_time_ms": 7476.715564727783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:10:29] (step=0004625) Train Loss: 0.2559, Train Steps/Sec: 0.12, Epoch: 0.08987563155849203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:10:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4626, "loss": 0.1607464998960495, "memory_gb": 7.721559524536133, "step_time_ms": 7420.735597610474, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:10:37] (step=0004626) Train Loss: 0.1832, Train Steps/Sec: 0.13, Epoch: 0.08989506412747765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:10:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4627, "loss": 0.2846781313419342, "memory_gb": 7.721559524536133, "step_time_ms": 7482.436656951904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:10:45] (step=0004627) Train Loss: 0.2913, Train Steps/Sec: 0.12, Epoch: 0.08991449669646327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:10:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4628, "loss": 0.22965416312217712, "memory_gb": 7.721559524536133, "step_time_ms": 7525.2954959869385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:10:53] (step=0004628) Train Loss: 0.2581, Train Steps/Sec: 0.12, Epoch: 0.0899339292654489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4629, "loss": 0.2920863628387451, "memory_gb": 7.721559524536133, "step_time_ms": 7408.78438949585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:01] (step=0004629) Train Loss: 0.2536, Train Steps/Sec: 0.13, Epoch: 0.08995336183443452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4630, "loss": 0.2348749041557312, "memory_gb": 7.721559524536133, "step_time_ms": 7428.110837936401, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:09] (step=0004630) Train Loss: 0.2022, Train Steps/Sec: 0.12, Epoch: 0.08997279440342013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4631, "loss": 0.29789286851882935, "memory_gb": 7.721559524536133, "step_time_ms": 7489.03226852417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:17] (step=0004631) Train Loss: 0.2851, Train Steps/Sec: 0.13, Epoch: 0.08999222697240575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4632, "loss": 0.24235033988952637, "memory_gb": 7.721559524536133, "step_time_ms": 7402.444839477539, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:25] (step=0004632) Train Loss: 0.2134, Train Steps/Sec: 0.13, Epoch: 0.09001165954139137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4633, "loss": 0.3262293338775635, "memory_gb": 7.721559524536133, "step_time_ms": 7411.920547485352, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:33] (step=0004633) Train Loss: 0.2611, Train Steps/Sec: 0.13, Epoch: 0.09003109211037699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4634, "loss": 0.22311414778232574, "memory_gb": 7.721559524536133, "step_time_ms": 7479.893922805786, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:41] (step=0004634) Train Loss: 0.2221, Train Steps/Sec: 0.12, Epoch: 0.09005052467936261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4635, "loss": 0.10832321643829346, "memory_gb": 7.721559524536133, "step_time_ms": 7440.166711807251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:49] (step=0004635) Train Loss: 0.1595, Train Steps/Sec: 0.13, Epoch: 0.09006995724834824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:11:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4636, "loss": 0.26290416717529297, "memory_gb": 7.721559524536133, "step_time_ms": 7372.6911544799805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:11:57] (step=0004636) Train Loss: 0.2745, Train Steps/Sec: 0.13, Epoch: 0.09008938981733385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4637, "loss": 0.3195801377296448, "memory_gb": 7.721559524536133, "step_time_ms": 7522.292375564575, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:05] (step=0004637) Train Loss: 0.2480, Train Steps/Sec: 0.13, Epoch: 0.09010882238631947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4638, "loss": 0.24822619557380676, "memory_gb": 7.721559524536133, "step_time_ms": 5176.054239273071, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:10] (step=0004638) Train Loss: 0.2402, Train Steps/Sec: 0.18, Epoch: 0.09012825495530509, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4639, "loss": 0.15205024182796478, "memory_gb": 7.721559524536133, "step_time_ms": 7485.131025314331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:18] (step=0004639) Train Loss: 0.1923, Train Steps/Sec: 0.12, Epoch: 0.09014768752429071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4640, "loss": 0.27271419763565063, "memory_gb": 7.721559524536133, "step_time_ms": 7445.517539978027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:26] (step=0004640) Train Loss: 0.2466, Train Steps/Sec: 0.13, Epoch: 0.09016712009327633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4641, "loss": 0.15398555994033813, "memory_gb": 7.721559524536133, "step_time_ms": 7433.91227722168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:34] (step=0004641) Train Loss: 0.2423, Train Steps/Sec: 0.13, Epoch: 0.09018655266226196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4642, "loss": 0.3136484622955322, "memory_gb": 7.721559524536133, "step_time_ms": 7558.114767074585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:42] (step=0004642) Train Loss: 0.2893, Train Steps/Sec: 0.12, Epoch: 0.09020598523124757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4643, "loss": 0.33552199602127075, "memory_gb": 7.721559524536133, "step_time_ms": 7492.962837219238, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:51] (step=0004643) Train Loss: 0.3008, Train Steps/Sec: 0.12, Epoch: 0.09022541780023319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:12:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4644, "loss": 0.262858122587204, "memory_gb": 7.721559524536133, "step_time_ms": 7402.8565883636475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:12:59] (step=0004644) Train Loss: 0.2555, Train Steps/Sec: 0.13, Epoch: 0.09024485036921881, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:13:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4645, "loss": 0.26300227642059326, "memory_gb": 7.721559524536133, "step_time_ms": 7321.0508823394775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:13:07] (step=0004645) Train Loss: 0.2497, Train Steps/Sec: 0.12, Epoch: 0.09026428293820443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:13:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4646, "loss": 0.23889054358005524, "memory_gb": 7.721559524536133, "step_time_ms": 7317.7173137664795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:13:15] (step=0004646) Train Loss: 0.2548, Train Steps/Sec: 0.13, Epoch: 0.09028371550719005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:13:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4647, "loss": 0.3092602491378784, "memory_gb": 7.721559524536133, "step_time_ms": 7394.031286239624, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:13:23] (step=0004647) Train Loss: 0.2760, Train Steps/Sec: 0.13, Epoch: 0.09030314807617568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:13:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4648, "loss": 0.290847510099411, "memory_gb": 7.715639114379883, "step_time_ms": 7443.464517593384, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:13:31] (step=0004648) Train Loss: 0.2472, Train Steps/Sec: 0.13, Epoch: 0.09032258064516129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:13:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4649, "loss": 0.2535509467124939, "memory_gb": 7.721559524536133, "step_time_ms": 7418.604135513306, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:13:39] (step=0004649) Train Loss: 0.2021, Train Steps/Sec: 0.13, Epoch: 0.09034201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:13:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4650, "loss": 0.35107773542404175, "memory_gb": 7.721559524536133, "step_time_ms": 7537.7771854400635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:13:47] (step=0004650) Train Loss: 0.2700, Train Steps/Sec: 0.12, Epoch: 0.09036144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:13:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4651, "loss": 0.15300381183624268, "memory_gb": 7.721559524536133, "step_time_ms": 7681.271076202393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:13:55] (step=0004651) Train Loss: 0.2350, Train Steps/Sec: 0.13, Epoch: 0.09038087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4652, "loss": 0.310310423374176, "memory_gb": 7.721559524536133, "step_time_ms": 7480.333089828491, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:03] (step=0004652) Train Loss: 0.2828, Train Steps/Sec: 0.13, Epoch: 0.09040031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4653, "loss": 0.3439043164253235, "memory_gb": 7.721559524536133, "step_time_ms": 7519.976615905762, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:11] (step=0004653) Train Loss: 0.3125, Train Steps/Sec: 0.12, Epoch: 0.0904197434900894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4654, "loss": 0.1950472891330719, "memory_gb": 7.721559524536133, "step_time_ms": 7612.234115600586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:19] (step=0004654) Train Loss: 0.1990, Train Steps/Sec: 0.12, Epoch: 0.090439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4655, "loss": 0.20577101409435272, "memory_gb": 7.721559524536133, "step_time_ms": 7610.345363616943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:27] (step=0004655) Train Loss: 0.2084, Train Steps/Sec: 0.12, Epoch: 0.09045860862806063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4656, "loss": 0.36740705370903015, "memory_gb": 7.721559524536133, "step_time_ms": 7506.519317626953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:35] (step=0004656) Train Loss: 0.3807, Train Steps/Sec: 0.12, Epoch: 0.09047804119704625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4657, "loss": 0.13792863488197327, "memory_gb": 7.721559524536133, "step_time_ms": 7599.869012832642, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:43] (step=0004657) Train Loss: 0.1728, Train Steps/Sec: 0.12, Epoch: 0.09049747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4658, "loss": 0.300476610660553, "memory_gb": 7.721559524536133, "step_time_ms": 7545.432090759277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:51] (step=0004658) Train Loss: 0.2510, Train Steps/Sec: 0.12, Epoch: 0.0905169063350175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:14:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4659, "loss": 0.29826590418815613, "memory_gb": 7.721559524536133, "step_time_ms": 7513.352632522583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:14:59] (step=0004659) Train Loss: 0.3024, Train Steps/Sec: 0.13, Epoch: 0.0905363389040031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:15:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4660, "loss": 0.27494657039642334, "memory_gb": 7.721559524536133, "step_time_ms": 7579.323768615723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:15:07] (step=0004660) Train Loss: 0.2682, Train Steps/Sec: 0.12, Epoch: 0.09055577147298872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:15:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4661, "loss": 0.2605985403060913, "memory_gb": 7.721559524536133, "step_time_ms": 7497.21360206604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:15:15] (step=0004661) Train Loss: 0.2673, Train Steps/Sec: 0.13, Epoch: 0.09057520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:15:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4662, "loss": 0.18723537027835846, "memory_gb": 7.721559524536133, "step_time_ms": 7537.373304367065, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:15:23] (step=0004662) Train Loss: 0.2263, Train Steps/Sec: 0.12, Epoch: 0.09059463661095997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:15:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4663, "loss": 0.22394207119941711, "memory_gb": 7.721559524536133, "step_time_ms": 7615.267038345337, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:15:31] (step=0004663) Train Loss: 0.2433, Train Steps/Sec: 0.12, Epoch: 0.09061406917994559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:15:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4664, "loss": 0.33180731534957886, "memory_gb": 7.721559524536133, "step_time_ms": 7615.581512451172, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:15:39] (step=0004664) Train Loss: 0.2527, Train Steps/Sec: 0.12, Epoch: 0.09063350174893121, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4665, "loss": 0.315697580575943, "memory_gb": 7.721559524536133, "step_time_ms": 7407.47594833374, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:15:47] (step=0004665) Train Loss: 0.3006, Train Steps/Sec: 0.13, Epoch: 0.09065293431791682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:15:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4666, "loss": 0.28035563230514526, "memory_gb": 7.721559524536133, "step_time_ms": 7586.814403533936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:15:55] (step=0004666) Train Loss: 0.2814, Train Steps/Sec: 0.12, Epoch: 0.09067236688690244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4667, "loss": 0.24349801242351532, "memory_gb": 7.721559524536133, "step_time_ms": 5364.884376525879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:01] (step=0004667) Train Loss: 0.2198, Train Steps/Sec: 0.17, Epoch: 0.09069179945588807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4668, "loss": 0.2920083701610565, "memory_gb": 7.721559524536133, "step_time_ms": 7623.903751373291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:09] (step=0004668) Train Loss: 0.2852, Train Steps/Sec: 0.12, Epoch: 0.09071123202487369, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4669, "loss": 0.1512002944946289, "memory_gb": 7.721559524536133, "step_time_ms": 7560.911178588867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:17] (step=0004669) Train Loss: 0.2004, Train Steps/Sec: 0.12, Epoch: 0.09073066459385931, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4670, "loss": 0.2546907663345337, "memory_gb": 7.721559524536133, "step_time_ms": 7524.145841598511, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:25] (step=0004670) Train Loss: 0.2238, Train Steps/Sec: 0.13, Epoch: 0.09075009716284493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4671, "loss": 0.2774333953857422, "memory_gb": 7.721559524536133, "step_time_ms": 7664.416074752808, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:33] (step=0004671) Train Loss: 0.2700, Train Steps/Sec: 0.12, Epoch: 0.09076952973183054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4672, "loss": 0.2317517101764679, "memory_gb": 7.721559524536133, "step_time_ms": 7578.643083572388, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:41] (step=0004672) Train Loss: 0.2105, Train Steps/Sec: 0.12, Epoch: 0.09078896230081616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4673, "loss": 0.20810666680335999, "memory_gb": 7.721559524536133, "step_time_ms": 7503.02529335022, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:49] (step=0004673) Train Loss: 0.2414, Train Steps/Sec: 0.12, Epoch: 0.09080839486980179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:16:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4674, "loss": 0.23468998074531555, "memory_gb": 7.721559524536133, "step_time_ms": 7572.356224060059, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:16:57] (step=0004674) Train Loss: 0.1853, Train Steps/Sec: 0.12, Epoch: 0.09082782743878741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:17:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4675, "loss": 0.22701875865459442, "memory_gb": 7.721559524536133, "step_time_ms": 7553.954601287842, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:17:05] (step=0004675) Train Loss: 0.2644, Train Steps/Sec: 0.13, Epoch: 0.09084726000777303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4676, "loss": 0.28423404693603516, "memory_gb": 7.721559524536133, "step_time_ms": 7487.9724979400635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:17:13] (step=0004676) Train Loss: 0.2333, Train Steps/Sec: 0.13, Epoch: 0.09086669257675865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:17:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4677, "loss": 0.29250067472457886, "memory_gb": 7.721559524536133, "step_time_ms": 7599.937677383423, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:17:21] (step=0004677) Train Loss: 0.3083, Train Steps/Sec: 0.12, Epoch: 0.09088612514574426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:17:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4678, "loss": 0.23037905991077423, "memory_gb": 7.721559524536133, "step_time_ms": 7317.604541778564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:17:29] (step=0004678) Train Loss: 0.2342, Train Steps/Sec: 0.12, Epoch: 0.09090555771472988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:17:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4679, "loss": 0.2840878367424011, "memory_gb": 7.721559524536133, "step_time_ms": 7507.728338241577, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:17:37] (step=0004679) Train Loss: 0.2402, Train Steps/Sec: 0.13, Epoch: 0.0909249902837155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4680, "loss": 0.1870957314968109, "memory_gb": 7.721559524536133, "step_time_ms": 7609.503269195557, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:17:45] (step=0004680) Train Loss: 0.2386, Train Steps/Sec: 0.12, Epoch: 0.09094442285270113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:17:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4681, "loss": 0.3035341501235962, "memory_gb": 7.721559524536133, "step_time_ms": 7524.3189334869385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:17:53] (step=0004681) Train Loss: 0.2922, Train Steps/Sec: 0.12, Epoch: 0.09096385542168675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4682, "loss": 0.22162896394729614, "memory_gb": 7.721559524536133, "step_time_ms": 7495.377540588379, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:01] (step=0004682) Train Loss: 0.2113, Train Steps/Sec: 0.13, Epoch: 0.09098328799067237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4683, "loss": 0.21484291553497314, "memory_gb": 7.721559524536133, "step_time_ms": 7545.1977252960205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:09] (step=0004683) Train Loss: 0.2231, Train Steps/Sec: 0.12, Epoch: 0.09100272055965798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4684, "loss": 0.23394200205802917, "memory_gb": 7.721559524536133, "step_time_ms": 7500.355005264282, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:17] (step=0004684) Train Loss: 0.2917, Train Steps/Sec: 0.13, Epoch: 0.0910221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4685, "loss": 0.3002367913722992, "memory_gb": 7.721559524536133, "step_time_ms": 7422.624588012695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:25] (step=0004685) Train Loss: 0.2157, Train Steps/Sec: 0.13, Epoch: 0.09104158569762923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4686, "loss": 0.3209039568901062, "memory_gb": 7.721559524536133, "step_time_ms": 7480.9887409210205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:33] (step=0004686) Train Loss: 0.2947, Train Steps/Sec: 0.12, Epoch: 0.09106101826661485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4687, "loss": 0.1611958146095276, "memory_gb": 7.721559524536133, "step_time_ms": 7442.027807235718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:41] (step=0004687) Train Loss: 0.1664, Train Steps/Sec: 0.13, Epoch: 0.09108045083560047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4688, "loss": 0.31088370084762573, "memory_gb": 7.721559524536133, "step_time_ms": 7411.882638931274, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:49] (step=0004688) Train Loss: 0.3090, Train Steps/Sec: 0.13, Epoch: 0.09109988340458608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4689, "loss": 0.28331273794174194, "memory_gb": 7.721559524536133, "step_time_ms": 7496.223926544189, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:18:57] (step=0004689) Train Loss: 0.2590, Train Steps/Sec: 0.12, Epoch: 0.0911193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:19:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4690, "loss": 0.14646658301353455, "memory_gb": 7.721559524536133, "step_time_ms": 7436.394453048706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:19:05] (step=0004690) Train Loss: 0.1515, Train Steps/Sec: 0.12, Epoch: 0.09113874854255732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:19:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4691, "loss": 0.1286790817975998, "memory_gb": 7.721559524536133, "step_time_ms": 7405.0750732421875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:19:13] (step=0004691) Train Loss: 0.1675, Train Steps/Sec: 0.13, Epoch: 0.09115818111154295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:19:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4692, "loss": 0.29506629705429077, "memory_gb": 7.721559524536133, "step_time_ms": 7617.702960968018, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:19:21] (step=0004692) Train Loss: 0.2677, Train Steps/Sec: 0.12, Epoch: 0.09117761368052857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:19:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4693, "loss": 0.15823091566562653, "memory_gb": 7.721559524536133, "step_time_ms": 7418.602705001831, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:19:29] (step=0004693) Train Loss: 0.1786, Train Steps/Sec: 0.13, Epoch: 0.09119704624951419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:19:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4694, "loss": 0.2527780532836914, "memory_gb": 7.721559524536133, "step_time_ms": 7319.183826446533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:19:37] (step=0004694) Train Loss: 0.2322, Train Steps/Sec: 0.13, Epoch: 0.0912164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:19:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4695, "loss": 0.2858237326145172, "memory_gb": 7.721559524536133, "step_time_ms": 7512.485980987549, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:19:45] (step=0004695) Train Loss: 0.3086, Train Steps/Sec: 0.12, Epoch: 0.09123591138748542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:19:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4696, "loss": 0.20920462906360626, "memory_gb": 7.721559524536133, "step_time_ms": 5042.874336242676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:19:51] (step=0004696) Train Loss: 0.2119, Train Steps/Sec: 0.18, Epoch: 0.09125534395647104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:19:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4697, "loss": 0.22211876511573792, "memory_gb": 7.721559524536133, "step_time_ms": 7563.89856338501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:19:59] (step=0004697) Train Loss: 0.2243, Train Steps/Sec: 0.12, Epoch: 0.09127477652545667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:20:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4698, "loss": 0.2403380423784256, "memory_gb": 7.721559524536133, "step_time_ms": 7528.5351276397705, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 04:20:07] (step=0004698) Train Loss: 0.2125, Train Steps/Sec: 0.13, Epoch: 0.09129420909444229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:20:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4699, "loss": 0.22307443618774414, "memory_gb": 7.721559524536133, "step_time_ms": 7524.121284484863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:20:15] (step=0004699) Train Loss: 0.2345, Train Steps/Sec: 0.13, Epoch: 0.09131364166342791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:20:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4700, "loss": 0.212785542011261, "memory_gb": 7.721559524536133, "step_time_ms": 7650.920391082764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:20:23] (step=0004700) Train Loss: 0.1997, Train Steps/Sec: 0.12, Epoch: 0.09133307423241352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:20:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4701, "loss": 0.2877947688102722, "memory_gb": 7.721559524536133, "step_time_ms": 7624.537944793701, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:20:31] (step=0004701) Train Loss: 0.2969, Train Steps/Sec: 0.12, Epoch: 0.09135250680139914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:20:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4702, "loss": 0.31368327140808105, "memory_gb": 7.721559524536133, "step_time_ms": 7506.964206695557, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:20:39] (step=0004702) Train Loss: 0.2405, Train Steps/Sec: 0.13, Epoch: 0.09137193937038476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:20:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4703, "loss": 0.1837637722492218, "memory_gb": 7.721559524536133, "step_time_ms": 7578.199625015259, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:20:47] (step=0004703) Train Loss: 0.2312, Train Steps/Sec: 0.12, Epoch: 0.09139137193937039, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 04:20:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4704, "loss": 0.34409213066101074, "memory_gb": 7.721559524536133, "step_time_ms": 7521.999359130859, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:20:55] (step=0004704) Train Loss: 0.2672, Train Steps/Sec: 0.12, Epoch: 0.09141080450835601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4705, "loss": 0.2002219557762146, "memory_gb": 7.721559524536133, "step_time_ms": 7434.410095214844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:03] (step=0004705) Train Loss: 0.2520, Train Steps/Sec: 0.13, Epoch: 0.09143023707734163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4706, "loss": 0.2810768485069275, "memory_gb": 7.721559524536133, "step_time_ms": 7561.302185058594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:11] (step=0004706) Train Loss: 0.2463, Train Steps/Sec: 0.12, Epoch: 0.09144966964632724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4707, "loss": 0.21957901120185852, "memory_gb": 7.721559524536133, "step_time_ms": 7428.608655929565, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:19] (step=0004707) Train Loss: 0.2193, Train Steps/Sec: 0.12, Epoch: 0.09146910221531286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4708, "loss": 0.24089287221431732, "memory_gb": 7.721559524536133, "step_time_ms": 7489.7801876068115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:27] (step=0004708) Train Loss: 0.1834, Train Steps/Sec: 0.12, Epoch: 0.09148853478429848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4709, "loss": 0.267269492149353, "memory_gb": 7.721559524536133, "step_time_ms": 
7602.834939956665, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:35] (step=0004709) Train Loss: 0.2943, Train Steps/Sec: 0.12, Epoch: 0.0915079673532841, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4710, "loss": 0.15605643391609192, "memory_gb": 7.721559524536133, "step_time_ms": 7587.624549865723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:43] (step=0004710) Train Loss: 0.2058, Train Steps/Sec: 0.13, Epoch: 0.09152739992226973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4711, "loss": 0.1929345428943634, "memory_gb": 7.721559524536133, "step_time_ms": 7507.439374923706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:51] (step=0004711) Train Loss: 0.1768, Train Steps/Sec: 0.12, Epoch: 0.09154683249125535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:21:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4712, "loss": 0.2737172245979309, "memory_gb": 7.721559524536133, "step_time_ms": 7303.227663040161, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:21:59] (step=0004712) Train Loss: 0.2694, Train Steps/Sec: 0.12, Epoch: 0.09156626506024096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:22:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4713, "loss": 0.23331373929977417, "memory_gb": 7.721559524536133, "step_time_ms": 7438.397407531738, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:22:07] (step=0004713) Train Loss: 0.2714, Train Steps/Sec: 0.13, Epoch: 0.09158569762922658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:22:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4714, "loss": 0.22107437252998352, "memory_gb": 7.721559524536133, "step_time_ms": 7417.708158493042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:22:15] (step=0004714) Train Loss: 0.2614, Train Steps/Sec: 0.12, Epoch: 0.0916051301982122, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:22:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4715, "loss": 0.23362091183662415, "memory_gb": 7.721559524536133, "step_time_ms": 7485.18967628479, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:22:23] (step=0004715) Train Loss: 0.2399, Train Steps/Sec: 0.12, Epoch: 0.09162456276719783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:22:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4716, "loss": 0.18236903846263885, "memory_gb": 7.721559524536133, "step_time_ms": 7478.450536727905, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:22:31] (step=0004716) Train Loss: 0.2494, Train Steps/Sec: 0.12, Epoch: 0.09164399533618345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:22:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4717, "loss": 0.16897742450237274, "memory_gb": 7.721559524536133, "step_time_ms": 7508.227586746216, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:22:39] (step=0004717) Train Loss: 0.2240, Train Steps/Sec: 0.12, Epoch: 0.09166342790516906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:22:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4718, "loss": 0.3185311555862427, "memory_gb": 7.721559524536133, "step_time_ms": 7630.75852394104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:22:47] (step=0004718) Train Loss: 0.2725, Train Steps/Sec: 0.12, Epoch: 0.09168286047415468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:22:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4719, "loss": 0.2799544334411621, "memory_gb": 7.721559524536133, "step_time_ms": 7545.155763626099, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:22:55] (step=0004719) Train Loss: 0.2694, Train Steps/Sec: 0.12, Epoch: 0.0917022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4720, "loss": 0.3162059485912323, "memory_gb": 7.721559524536133, 
"step_time_ms": 7480.751991271973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:03] (step=0004720) Train Loss: 0.2454, Train Steps/Sec: 0.13, Epoch: 0.09172172561212592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4721, "loss": 0.2919052541255951, "memory_gb": 7.721559524536133, "step_time_ms": 7531.827926635742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:11] (step=0004721) Train Loss: 0.2683, Train Steps/Sec: 0.13, Epoch: 0.09174115818111155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4722, "loss": 0.11309076100587845, "memory_gb": 7.721559524536133, "step_time_ms": 7542.413234710693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:19] (step=0004722) Train Loss: 0.1465, Train Steps/Sec: 0.12, Epoch: 0.09176059075009717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4723, "loss": 0.2765998840332031, "memory_gb": 7.721559524536133, "step_time_ms": 7370.43833732605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:27] (step=0004723) Train Loss: 0.2254, Train Steps/Sec: 0.13, Epoch: 0.09178002331908278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4724, "loss": 0.21931037306785583, "memory_gb": 7.721559524536133, "step_time_ms": 7541.592359542847, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:35] (step=0004724) Train Loss: 0.2120, Train Steps/Sec: 0.13, Epoch: 0.0917994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4725, "loss": 0.2705540657043457, "memory_gb": 7.721559524536133, "step_time_ms": 5293.666124343872, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:42] (step=0004725) Train Loss: 0.2857, Train Steps/Sec: 0.16, Epoch: 
0.09181888845705402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4726, "loss": 0.23995935916900635, "memory_gb": 7.721559524536133, "step_time_ms": 7494.638681411743, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:50] (step=0004726) Train Loss: 0.2305, Train Steps/Sec: 0.12, Epoch: 0.09183832102603964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:23:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4727, "loss": 0.29727208614349365, "memory_gb": 7.721559524536133, "step_time_ms": 7495.5761432647705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:23:58] (step=0004727) Train Loss: 0.3155, Train Steps/Sec: 0.12, Epoch: 0.09185775359502527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:24:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4728, "loss": 0.1825007051229477, "memory_gb": 7.721559524536133, "step_time_ms": 7503.8182735443115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:24:06] (step=0004728) Train Loss: 0.2642, Train Steps/Sec: 0.12, Epoch: 0.09187718616401089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:24:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4729, "loss": 0.2557745575904846, "memory_gb": 7.721559524536133, "step_time_ms": 7482.353687286377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:24:14] (step=0004729) Train Loss: 0.1922, Train Steps/Sec: 0.12, Epoch: 0.0918966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:24:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4730, "loss": 0.2386418879032135, "memory_gb": 7.721559524536133, "step_time_ms": 7541.255950927734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:24:22] (step=0004730) Train Loss: 0.2599, Train Steps/Sec: 0.12, Epoch: 0.09191605130198212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:24:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4731, "loss": 0.3429974317550659, "memory_gb": 
7.721559524536133, "step_time_ms": 7456.340074539185, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:24:30] (step=0004731) Train Loss: 0.2656, Train Steps/Sec: 0.13, Epoch: 0.09193548387096774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:24:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4732, "loss": 0.2710955739021301, "memory_gb": 7.721559524536133, "step_time_ms": 7466.049671173096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:24:38] (step=0004732) Train Loss: 0.2759, Train Steps/Sec: 0.13, Epoch: 0.09195491643995336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:24:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4733, "loss": 0.24015100300312042, "memory_gb": 7.721559524536133, "step_time_ms": 7530.990123748779, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:24:46] (step=0004733) Train Loss: 0.2474, Train Steps/Sec: 0.13, Epoch: 0.09197434900893899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:24:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4734, "loss": 0.2640347480773926, "memory_gb": 7.721559524536133, "step_time_ms": 7413.785934448242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:24:54] (step=0004734) Train Loss: 0.2708, Train Steps/Sec: 0.13, Epoch: 0.09199378157792461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4735, "loss": 0.3115886449813843, "memory_gb": 7.721559524536133, "step_time_ms": 7503.1256675720215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:02] (step=0004735) Train Loss: 0.3132, Train Steps/Sec: 0.12, Epoch: 0.09201321414691022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4736, "loss": 0.24030762910842896, "memory_gb": 7.721559524536133, "step_time_ms": 7480.478763580322, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:10] (step=0004736) Train Loss: 0.2556, Train Steps/Sec: 
0.12, Epoch: 0.09203264671589584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4737, "loss": 0.14930498600006104, "memory_gb": 7.721559524536133, "step_time_ms": 7475.851535797119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:18] (step=0004737) Train Loss: 0.1907, Train Steps/Sec: 0.12, Epoch: 0.09205207928488146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4738, "loss": 0.2583393156528473, "memory_gb": 7.721559524536133, "step_time_ms": 7458.6029052734375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:26] (step=0004738) Train Loss: 0.2477, Train Steps/Sec: 0.12, Epoch: 0.09207151185386708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4739, "loss": 0.29808756709098816, "memory_gb": 7.721559524536133, "step_time_ms": 7440.690994262695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:34] (step=0004739) Train Loss: 0.2104, Train Steps/Sec: 0.12, Epoch: 0.0920909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4740, "loss": 0.3284115195274353, "memory_gb": 7.721559524536133, "step_time_ms": 7492.282867431641, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:42] (step=0004740) Train Loss: 0.3227, Train Steps/Sec: 0.13, Epoch: 0.09211037699183833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4741, "loss": 0.22552180290222168, "memory_gb": 7.721559524536133, "step_time_ms": 7442.3253536224365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:50] (step=0004741) Train Loss: 0.2277, Train Steps/Sec: 0.12, Epoch: 0.09212980956082394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:25:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4742, "loss": 
0.21518613398075104, "memory_gb": 7.721559524536133, "step_time_ms": 7443.920373916626, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:25:58] (step=0004742) Train Loss: 0.1950, Train Steps/Sec: 0.12, Epoch: 0.09214924212980956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:26:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4743, "loss": 0.20453786849975586, "memory_gb": 7.721559524536133, "step_time_ms": 7390.429496765137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:26:06] (step=0004743) Train Loss: 0.2024, Train Steps/Sec: 0.13, Epoch: 0.09216867469879518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:26:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4744, "loss": 0.23826749622821808, "memory_gb": 7.721559524536133, "step_time_ms": 7437.392473220825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:26:14] (step=0004744) Train Loss: 0.2771, Train Steps/Sec: 0.12, Epoch: 0.0921881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:26:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4745, "loss": 0.2444002330303192, "memory_gb": 7.721559524536133, "step_time_ms": 7463.312387466431, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:26:22] (step=0004745) Train Loss: 0.1959, Train Steps/Sec: 0.12, Epoch: 0.09220753983676643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:26:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4746, "loss": 0.21310114860534668, "memory_gb": 7.721559524536133, "step_time_ms": 7245.692729949951, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:26:30] (step=0004746) Train Loss: 0.2228, Train Steps/Sec: 0.12, Epoch: 0.09222697240575205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:26:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4747, "loss": 0.25688815116882324, "memory_gb": 7.721559524536133, "step_time_ms": 7408.137798309326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:26:38] (step=0004747) 
Train Loss: 0.2706, Train Steps/Sec: 0.13, Epoch: 0.09224640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:26:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4748, "loss": 0.19567978382110596, "memory_gb": 7.721559524536133, "step_time_ms": 7467.352628707886, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:26:46] (step=0004748) Train Loss: 0.2081, Train Steps/Sec: 0.12, Epoch: 0.09226583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:26:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4749, "loss": 0.1834906041622162, "memory_gb": 7.721559524536133, "step_time_ms": 7401.532411575317, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:26:54] (step=0004749) Train Loss: 0.2052, Train Steps/Sec: 0.13, Epoch: 0.0922852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4750, "loss": 0.23144008219242096, "memory_gb": 7.721559524536133, "step_time_ms": 7443.825721740723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:02] (step=0004750) Train Loss: 0.2434, Train Steps/Sec: 0.12, Epoch: 0.09230470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4751, "loss": 0.29731422662734985, "memory_gb": 7.721559524536133, "step_time_ms": 7533.990144729614, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:10] (step=0004751) Train Loss: 0.2176, Train Steps/Sec: 0.12, Epoch: 0.09232413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4752, "loss": 0.18232451379299164, "memory_gb": 7.721559524536133, "step_time_ms": 7303.740978240967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:18] (step=0004752) Train Loss: 0.2058, Train Steps/Sec: 0.13, Epoch: 0.09234356781966575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:25] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 4753, "loss": 0.20550651848316193, "memory_gb": 7.721559524536133, "step_time_ms": 6898.216485977173, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:25] (step=0004753) Train Loss: 0.1940, Train Steps/Sec: 0.14, Epoch: 0.09236300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4754, "loss": 0.20317402482032776, "memory_gb": 7.721559524536133, "step_time_ms": 6215.427875518799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:32] (step=0004754) Train Loss: 0.2629, Train Steps/Sec: 0.15, Epoch: 0.092382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4755, "loss": 0.3044053912162781, "memory_gb": 7.721559524536133, "step_time_ms": 7515.458583831787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:40] (step=0004755) Train Loss: 0.2485, Train Steps/Sec: 0.12, Epoch: 0.09240186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4756, "loss": 0.2563973069190979, "memory_gb": 7.721559524536133, "step_time_ms": 7440.439224243164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:48] (step=0004756) Train Loss: 0.2568, Train Steps/Sec: 0.13, Epoch: 0.09242129809560824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:27:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4757, "loss": 0.14222878217697144, "memory_gb": 7.721559524536133, "step_time_ms": 7400.1617431640625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:27:56] (step=0004757) Train Loss: 0.1386, Train Steps/Sec: 0.12, Epoch: 0.09244073066459386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4758, "loss": 0.29158419370651245, "memory_gb": 7.721559524536133, "step_time_ms": 7508.110761642456, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
04:28:04] (step=0004758) Train Loss: 0.2722, Train Steps/Sec: 0.12, Epoch: 0.09246016323357947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:28:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4759, "loss": 0.2490694373846054, "memory_gb": 7.721559524536133, "step_time_ms": 7569.896697998047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:28:12] (step=0004759) Train Loss: 0.1981, Train Steps/Sec: 0.12, Epoch: 0.0924795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:28:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4760, "loss": 0.27895790338516235, "memory_gb": 7.721559524536133, "step_time_ms": 7554.917097091675, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:28:20] (step=0004760) Train Loss: 0.2554, Train Steps/Sec: 0.12, Epoch: 0.09249902837155072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:28:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4761, "loss": 0.18578875064849854, "memory_gb": 7.721559524536133, "step_time_ms": 7583.16707611084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:28:28] (step=0004761) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.09251846094053634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:28:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4762, "loss": 0.36054861545562744, "memory_gb": 7.721559524536133, "step_time_ms": 7645.700454711914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:28:36] (step=0004762) Train Loss: 0.3075, Train Steps/Sec: 0.12, Epoch: 0.09253789350952196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:28:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4763, "loss": 0.22052785754203796, "memory_gb": 7.721559524536133, "step_time_ms": 7555.117845535278, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:28:44] (step=0004763) Train Loss: 0.2255, Train Steps/Sec: 0.12, Epoch: 0.09255732607850758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:28:52] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 4764, "loss": 0.2382076382637024, "memory_gb": 7.721559524536133, "step_time_ms": 7584.65576171875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:28:52] (step=0004764) Train Loss: 0.2263, Train Steps/Sec: 0.12, Epoch: 0.09257675864749319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4765, "loss": 0.29258376359939575, "memory_gb": 7.721559524536133, "step_time_ms": 7575.942754745483, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:29:00] (step=0004765) Train Loss: 0.2657, Train Steps/Sec: 0.12, Epoch: 0.09259619121647882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4766, "loss": 0.18170320987701416, "memory_gb": 7.721559524536133, "step_time_ms": 7571.019411087036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:29:08] (step=0004766) Train Loss: 0.2408, Train Steps/Sec: 0.12, Epoch: 0.09261562378546444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4767, "loss": 0.303289532661438, "memory_gb": 7.721559524536133, "step_time_ms": 7599.396705627441, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:29:16] (step=0004767) Train Loss: 0.2774, Train Steps/Sec: 0.12, Epoch: 0.09263505635445006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4768, "loss": 0.30728796124458313, "memory_gb": 7.721559524536133, "step_time_ms": 7593.827486038208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:29:24] (step=0004768) Train Loss: 0.3010, Train Steps/Sec: 0.12, Epoch: 0.09265448892343568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4769, "loss": 0.2465173304080963, "memory_gb": 7.721559524536133, "step_time_ms": 7626.842260360718, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 04:29:32] (step=0004769) Train Loss: 0.2525, Train Steps/Sec: 0.12, Epoch: 0.0926739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4770, "loss": 0.34228813648223877, "memory_gb": 7.721559524536133, "step_time_ms": 7553.649425506592, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:29:40] (step=0004770) Train Loss: 0.2929, Train Steps/Sec: 0.13, Epoch: 0.09269335406140691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4771, "loss": 0.2889194190502167, "memory_gb": 7.721559524536133, "step_time_ms": 7561.755180358887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:29:48] (step=0004771) Train Loss: 0.2825, Train Steps/Sec: 0.12, Epoch: 0.09271278663039254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:29:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4772, "loss": 0.21273937821388245, "memory_gb": 7.721559524536133, "step_time_ms": 7495.4986572265625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:29:57] (step=0004772) Train Loss: 0.1950, Train Steps/Sec: 0.13, Epoch: 0.09273221919937816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:30:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4773, "loss": 0.18386723101139069, "memory_gb": 7.721559524536133, "step_time_ms": 7589.364290237427, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:30:05] (step=0004773) Train Loss: 0.2071, Train Steps/Sec: 0.12, Epoch: 0.09275165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:30:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4774, "loss": 0.27541661262512207, "memory_gb": 7.721559524536133, "step_time_ms": 7612.2145652771, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:30:13] (step=0004774) Train Loss: 0.2755, Train Steps/Sec: 0.12, Epoch: 0.0927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 04:30:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4775, "loss": 0.23560330271720886, "memory_gb": 7.721559524536133, "step_time_ms": 7487.440586090088, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:30:21] (step=0004775) Train Loss: 0.2635, Train Steps/Sec: 0.12, Epoch: 0.09279051690633502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:30:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4776, "loss": 0.21361315250396729, "memory_gb": 7.721559524536133, "step_time_ms": 7481.035947799683, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:30:29] (step=0004776) Train Loss: 0.2368, Train Steps/Sec: 0.13, Epoch: 0.09280994947532063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:30:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4777, "loss": 0.2758287191390991, "memory_gb": 7.721559524536133, "step_time_ms": 7559.788942337036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:30:37] (step=0004777) Train Loss: 0.2985, Train Steps/Sec: 0.12, Epoch: 0.09282938204430626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:30:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4778, "loss": 0.19884485006332397, "memory_gb": 7.721559524536133, "step_time_ms": 7443.79734992981, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:30:45] (step=0004778) Train Loss: 0.2096, Train Steps/Sec: 0.12, Epoch: 0.09284881461329188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:30:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4779, "loss": 0.27683258056640625, "memory_gb": 7.721559524536133, "step_time_ms": 7499.961853027344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:30:53] (step=0004779) Train Loss: 0.3178, Train Steps/Sec: 0.12, Epoch: 0.0928682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:31:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4780, "loss": 0.25842905044555664, "memory_gb": 7.721559524536133, "step_time_ms": 7638.545036315918, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:01] (step=0004780) Train Loss: 0.2212, Train Steps/Sec: 0.12, Epoch: 0.09288767975126312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:31:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4781, "loss": 0.18867889046669006, "memory_gb": 7.721559524536133, "step_time_ms": 7323.467493057251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:09] (step=0004781) Train Loss: 0.2068, Train Steps/Sec: 0.13, Epoch: 0.09290711232024873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:31:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4782, "loss": 0.17153383791446686, "memory_gb": 7.721559524536133, "step_time_ms": 6351.329803466797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:16] (step=0004782) Train Loss: 0.2003, Train Steps/Sec: 0.15, Epoch: 0.09292654488923435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:31:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4783, "loss": 0.2225753664970398, "memory_gb": 7.721559524536133, "step_time_ms": 6466.2981033325195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:23] (step=0004783) Train Loss: 0.2272, Train Steps/Sec: 0.14, Epoch: 0.09294597745821997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:31:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4784, "loss": 0.2708609700202942, "memory_gb": 7.721559524536133, "step_time_ms": 7391.955852508545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:31] (step=0004784) Train Loss: 0.1806, Train Steps/Sec: 0.13, Epoch: 0.0929654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:31:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4785, "loss": 0.1779940128326416, "memory_gb": 7.721559524536133, "step_time_ms": 7497.926950454712, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:39] (step=0004785) Train Loss: 0.2344, Train Steps/Sec: 0.12, Epoch: 0.09298484259619122, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 04:31:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4786, "loss": 0.2363288849592209, "memory_gb": 7.721559524536133, "step_time_ms": 7448.309659957886, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:47] (step=0004786) Train Loss: 0.2291, Train Steps/Sec: 0.12, Epoch: 0.09300427516517684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:31:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4787, "loss": 0.32436203956604004, "memory_gb": 7.721559524536133, "step_time_ms": 7413.707733154297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:31:55] (step=0004787) Train Loss: 0.3240, Train Steps/Sec: 0.13, Epoch: 0.09302370773416245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4788, "loss": 0.16858620941638947, "memory_gb": 7.721559524536133, "step_time_ms": 7498.456239700317, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:03] (step=0004788) Train Loss: 0.2136, Train Steps/Sec: 0.12, Epoch: 0.09304314030314807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 4789, "loss": 0.2997804284095764, "memory_gb": 7.721559524536133, "step_time_ms": 7456.73131942749, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:11] (step=0004789) Train Loss: 0.2983, Train Steps/Sec: 0.12, Epoch: 0.0930625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4790, "loss": 0.28361546993255615, "memory_gb": 7.721559524536133, "step_time_ms": 7428.707838058472, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:19] (step=0004790) Train Loss: 0.2741, Train Steps/Sec: 0.13, Epoch: 0.09308200544111932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4791, "loss": 0.3259332776069641, "memory_gb": 7.715639114379883, "step_time_ms": 
7490.456104278564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:27] (step=0004791) Train Loss: 0.2712, Train Steps/Sec: 0.13, Epoch: 0.09310143801010494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4792, "loss": 0.30277368426322937, "memory_gb": 7.721559524536133, "step_time_ms": 7412.396192550659, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:35] (step=0004792) Train Loss: 0.2397, Train Steps/Sec: 0.13, Epoch: 0.09312087057909056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4793, "loss": 0.21552319824695587, "memory_gb": 7.721559524536133, "step_time_ms": 7403.1336307525635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:43] (step=0004793) Train Loss: 0.2283, Train Steps/Sec: 0.13, Epoch: 0.09314030314807617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4794, "loss": 0.24002328515052795, "memory_gb": 7.721559524536133, "step_time_ms": 7485.105752944946, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:51] (step=0004794) Train Loss: 0.2475, Train Steps/Sec: 0.13, Epoch: 0.09315973571706179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:32:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4795, "loss": 0.21707598865032196, "memory_gb": 7.721559524536133, "step_time_ms": 7415.019750595093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:32:59] (step=0004795) Train Loss: 0.2389, Train Steps/Sec: 0.13, Epoch: 0.09317916828604741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:33:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4796, "loss": 0.1736665666103363, "memory_gb": 7.721559524536133, "step_time_ms": 7441.060304641724, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:33:07] (step=0004796) Train Loss: 0.2815, Train Steps/Sec: 0.13, Epoch: 0.09319860085503304, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:33:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4797, "loss": 0.260427325963974, "memory_gb": 7.721559524536133, "step_time_ms": 7480.9675216674805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:33:15] (step=0004797) Train Loss: 0.2644, Train Steps/Sec: 0.13, Epoch: 0.09321803342401866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:33:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4798, "loss": 0.1799948364496231, "memory_gb": 7.721559524536133, "step_time_ms": 7456.883430480957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:33:23] (step=0004798) Train Loss: 0.2351, Train Steps/Sec: 0.13, Epoch: 0.09323746599300428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:33:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4799, "loss": 0.15676239132881165, "memory_gb": 7.721559524536133, "step_time_ms": 7443.031549453735, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:33:30] (step=0004799) Train Loss: 0.1889, Train Steps/Sec: 0.13, Epoch: 0.09325689856198989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:33:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4800, "loss": 0.26581212878227234, "memory_gb": 7.721559524536133, "step_time_ms": 7528.174877166748, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:33:38] (step=0004800) Train Loss: 0.2400, Train Steps/Sec: 0.13, Epoch: 0.09327633113097551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:33:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4801, "loss": 0.2916426360607147, "memory_gb": 7.721559524536133, "step_time_ms": 7482.287406921387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:33:46] (step=0004801) Train Loss: 0.3078, Train Steps/Sec: 0.13, Epoch: 0.09329576369996113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:33:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4802, "loss": 0.15271443128585815, "memory_gb": 7.721559524536133, 
"step_time_ms": 7433.389663696289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:33:54] (step=0004802) Train Loss: 0.2756, Train Steps/Sec: 0.13, Epoch: 0.09331519626894676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4803, "loss": 0.25700709223747253, "memory_gb": 7.721559524536133, "step_time_ms": 7502.143383026123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:02] (step=0004803) Train Loss: 0.2708, Train Steps/Sec: 0.12, Epoch: 0.09333462883793238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4804, "loss": 0.2947603464126587, "memory_gb": 7.721559524536133, "step_time_ms": 7444.393873214722, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:10] (step=0004804) Train Loss: 0.2859, Train Steps/Sec: 0.13, Epoch: 0.093354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4805, "loss": 0.18683725595474243, "memory_gb": 7.721559524536133, "step_time_ms": 7479.773759841919, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:18] (step=0004805) Train Loss: 0.2090, Train Steps/Sec: 0.12, Epoch: 0.09337349397590361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4806, "loss": 0.18541279435157776, "memory_gb": 7.721559524536133, "step_time_ms": 7479.794263839722, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:26] (step=0004806) Train Loss: 0.2371, Train Steps/Sec: 0.12, Epoch: 0.09339292654488923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4807, "loss": 0.23374533653259277, "memory_gb": 7.721559524536133, "step_time_ms": 7451.146841049194, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:34] (step=0004807) Train Loss: 0.1988, Train Steps/Sec: 0.12, Epoch: 
0.09341235911387485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4808, "loss": 0.2972662150859833, "memory_gb": 7.721559524536133, "step_time_ms": 7505.187034606934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:42] (step=0004808) Train Loss: 0.2929, Train Steps/Sec: 0.12, Epoch: 0.09343179168286048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4809, "loss": 0.23863303661346436, "memory_gb": 7.721559524536133, "step_time_ms": 7542.3455238342285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:50] (step=0004809) Train Loss: 0.2411, Train Steps/Sec: 0.12, Epoch: 0.0934512242518461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:34:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4810, "loss": 0.1725204586982727, "memory_gb": 7.721559524536133, "step_time_ms": 7241.576671600342, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:34:58] (step=0004810) Train Loss: 0.1602, Train Steps/Sec: 0.13, Epoch: 0.09347065682083171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:35:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4811, "loss": 0.2730792164802551, "memory_gb": 7.721559524536133, "step_time_ms": 5714.110851287842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:35:04] (step=0004811) Train Loss: 0.2563, Train Steps/Sec: 0.17, Epoch: 0.09349008938981733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:35:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4812, "loss": 0.23532244563102722, "memory_gb": 7.721559524536133, "step_time_ms": 7327.530860900879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:35:12] (step=0004812) Train Loss: 0.2282, Train Steps/Sec: 0.13, Epoch: 0.09350952195880295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:35:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4813, "loss": 0.3103492856025696, "memory_gb": 
7.721559524536133, "step_time_ms": 7488.459348678589, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:35:20] (step=0004813) Train Loss: 0.2764, Train Steps/Sec: 0.13, Epoch: 0.09352895452778857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:35:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4814, "loss": 0.22305357456207275, "memory_gb": 7.721559524536133, "step_time_ms": 7596.308708190918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:35:28] (step=0004814) Train Loss: 0.2612, Train Steps/Sec: 0.12, Epoch: 0.0935483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:35:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4815, "loss": 0.261535108089447, "memory_gb": 7.721559524536133, "step_time_ms": 7532.512187957764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:35:36] (step=0004815) Train Loss: 0.2593, Train Steps/Sec: 0.13, Epoch: 0.09356781966575982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:35:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4816, "loss": 0.2476157695055008, "memory_gb": 7.721559524536133, "step_time_ms": 7470.260143280029, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:35:44] (step=0004816) Train Loss: 0.2393, Train Steps/Sec: 0.12, Epoch: 0.09358725223474543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:35:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4817, "loss": 0.18021813035011292, "memory_gb": 7.721559524536133, "step_time_ms": 7556.219816207886, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:35:52] (step=0004817) Train Loss: 0.1990, Train Steps/Sec: 0.12, Epoch: 0.09360668480373105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4818, "loss": 0.15732848644256592, "memory_gb": 7.721559524536133, "step_time_ms": 7520.5979347229, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:00] (step=0004818) Train Loss: 0.1929, Train Steps/Sec: 0.13, 
Epoch: 0.09362611737271667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4819, "loss": 0.289206862449646, "memory_gb": 7.721559524536133, "step_time_ms": 7515.410900115967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:08] (step=0004819) Train Loss: 0.2877, Train Steps/Sec: 0.12, Epoch: 0.0936455499417023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4820, "loss": 0.2236396223306656, "memory_gb": 7.721559524536133, "step_time_ms": 7605.245351791382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:16] (step=0004820) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.09366498251068792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4821, "loss": 0.2293499857187271, "memory_gb": 7.721559524536133, "step_time_ms": 7525.959491729736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:24] (step=0004821) Train Loss: 0.2369, Train Steps/Sec: 0.13, Epoch: 0.09368441507967354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4822, "loss": 0.24261386692523956, "memory_gb": 7.721559524536133, "step_time_ms": 7430.82857131958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:32] (step=0004822) Train Loss: 0.2415, Train Steps/Sec: 0.13, Epoch: 0.09370384764865915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4823, "loss": 0.24636796116828918, "memory_gb": 7.721559524536133, "step_time_ms": 7606.1975955963135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:40] (step=0004823) Train Loss: 0.1908, Train Steps/Sec: 0.12, Epoch: 0.09372328021764477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4824, "loss": 0.21661332249641418, 
"memory_gb": 7.721559524536133, "step_time_ms": 7477.272748947144, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:48] (step=0004824) Train Loss: 0.2555, Train Steps/Sec: 0.12, Epoch: 0.09374271278663039, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:36:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4825, "loss": 0.24687540531158447, "memory_gb": 7.721559524536133, "step_time_ms": 7451.274871826172, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:36:56] (step=0004825) Train Loss: 0.2406, Train Steps/Sec: 0.13, Epoch: 0.09376214535561601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:37:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4826, "loss": 0.2380678653717041, "memory_gb": 7.721559524536133, "step_time_ms": 7502.97474861145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:37:04] (step=0004826) Train Loss: 0.2337, Train Steps/Sec: 0.12, Epoch: 0.09378157792460164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:37:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4827, "loss": 0.15194761753082275, "memory_gb": 7.721559524536133, "step_time_ms": 7588.395118713379, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:37:12] (step=0004827) Train Loss: 0.1630, Train Steps/Sec: 0.13, Epoch: 0.09380101049358726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:37:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4828, "loss": 0.19192460179328918, "memory_gb": 7.721559524536133, "step_time_ms": 7483.740568161011, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:37:20] (step=0004828) Train Loss: 0.2221, Train Steps/Sec: 0.12, Epoch: 0.09382044306257287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:37:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4829, "loss": 0.2998592257499695, "memory_gb": 7.721559524536133, "step_time_ms": 7555.577754974365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:37:28] (step=0004829) Train Loss: 0.3030, Train 
Steps/Sec: 0.12, Epoch: 0.09383987563155849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:37:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4830, "loss": 0.2447901964187622, "memory_gb": 7.715639114379883, "step_time_ms": 7452.155590057373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:37:36] (step=0004830) Train Loss: 0.2206, Train Steps/Sec: 0.13, Epoch: 0.09385930820054411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:37:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4831, "loss": 0.21290373802185059, "memory_gb": 7.721559524536133, "step_time_ms": 7411.9391441345215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:37:44] (step=0004831) Train Loss: 0.2107, Train Steps/Sec: 0.13, Epoch: 0.09387874076952973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:37:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4832, "loss": 0.20944002270698547, "memory_gb": 7.721559524536133, "step_time_ms": 7509.98854637146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:37:52] (step=0004832) Train Loss: 0.2733, Train Steps/Sec: 0.12, Epoch: 0.09389817333851536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4833, "loss": 0.25023141503334045, "memory_gb": 7.721559524536133, "step_time_ms": 7449.509382247925, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:00] (step=0004833) Train Loss: 0.2708, Train Steps/Sec: 0.12, Epoch: 0.09391760590750098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4834, "loss": 0.16946366429328918, "memory_gb": 7.721559524536133, "step_time_ms": 7470.37410736084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:08] (step=0004834) Train Loss: 0.1585, Train Steps/Sec: 0.13, Epoch: 0.09393703847648659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4835, "loss": 
0.29159820079803467, "memory_gb": 7.721559524536133, "step_time_ms": 7465.895175933838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:16] (step=0004835) Train Loss: 0.2746, Train Steps/Sec: 0.13, Epoch: 0.09395647104547221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4836, "loss": 0.16260001063346863, "memory_gb": 7.721559524536133, "step_time_ms": 7534.44504737854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:24] (step=0004836) Train Loss: 0.2480, Train Steps/Sec: 0.12, Epoch: 0.09397590361445783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4837, "loss": 0.16057738661766052, "memory_gb": 7.721559524536133, "step_time_ms": 7459.1639041900635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:32] (step=0004837) Train Loss: 0.1992, Train Steps/Sec: 0.12, Epoch: 0.09399533618344345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4838, "loss": 0.19790877401828766, "memory_gb": 7.721559524536133, "step_time_ms": 7501.369953155518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:40] (step=0004838) Train Loss: 0.2020, Train Steps/Sec: 0.13, Epoch: 0.09401476875242908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4839, "loss": 0.29257816076278687, "memory_gb": 7.721559524536133, "step_time_ms": 7499.268054962158, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:48] (step=0004839) Train Loss: 0.2433, Train Steps/Sec: 0.12, Epoch: 0.09403420132141468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:38:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4840, "loss": 0.20267638564109802, "memory_gb": 7.721559524536133, "step_time_ms": 5431.950807571411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:38:54] (step=0004840) 
Train Loss: 0.2235, Train Steps/Sec: 0.17, Epoch: 0.0940536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4841, "loss": 0.24457964301109314, "memory_gb": 7.721559524536133, "step_time_ms": 7493.422985076904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:02] (step=0004841) Train Loss: 0.2214, Train Steps/Sec: 0.12, Epoch: 0.09407306645938593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4842, "loss": 0.14639565348625183, "memory_gb": 7.721559524536133, "step_time_ms": 7416.4886474609375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:10] (step=0004842) Train Loss: 0.1510, Train Steps/Sec: 0.13, Epoch: 0.09409249902837155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4843, "loss": 0.2323886752128601, "memory_gb": 7.715639114379883, "step_time_ms": 7457.900285720825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:18] (step=0004843) Train Loss: 0.2361, Train Steps/Sec: 0.12, Epoch: 0.09411193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4844, "loss": 0.27155300974845886, "memory_gb": 7.721559524536133, "step_time_ms": 7486.268520355225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:26] (step=0004844) Train Loss: 0.2422, Train Steps/Sec: 0.12, Epoch: 0.0941313641663428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4845, "loss": 0.21665707230567932, "memory_gb": 7.721559524536133, "step_time_ms": 7412.217140197754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:34] (step=0004845) Train Loss: 0.2594, Train Steps/Sec: 0.13, Epoch: 0.0941507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:42] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 4846, "loss": 0.21397347748279572, "memory_gb": 7.721559524536133, "step_time_ms": 7447.88384437561, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:42] (step=0004846) Train Loss: 0.1839, Train Steps/Sec: 0.13, Epoch: 0.09417022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4847, "loss": 0.24060046672821045, "memory_gb": 7.721559524536133, "step_time_ms": 7449.2506980896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:50] (step=0004847) Train Loss: 0.2453, Train Steps/Sec: 0.13, Epoch: 0.09418966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:39:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4848, "loss": 0.17770297825336456, "memory_gb": 7.721559524536133, "step_time_ms": 7457.682371139526, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:39:58] (step=0004848) Train Loss: 0.1551, Train Steps/Sec: 0.12, Epoch: 0.09420909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:40:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4849, "loss": 0.21201804280281067, "memory_gb": 7.721559524536133, "step_time_ms": 7411.515235900879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:40:06] (step=0004849) Train Loss: 0.2287, Train Steps/Sec: 0.13, Epoch: 0.0942285270112709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:40:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4850, "loss": 0.27528780698776245, "memory_gb": 7.721559524536133, "step_time_ms": 7465.890884399414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:40:14] (step=0004850) Train Loss: 0.2546, Train Steps/Sec: 0.13, Epoch: 0.09424795958025652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:40:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4851, "loss": 0.2557299733161926, "memory_gb": 7.721559524536133, "step_time_ms": 7405.991315841675, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:40:22] 
(step=0004851) Train Loss: 0.2299, Train Steps/Sec: 0.13, Epoch: 0.09426739214924212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4852, "loss": 0.24013158679008484, "memory_gb": 7.721559524536133, "step_time_ms": 7437.274694442749, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:40:30] (step=0004852) Train Loss: 0.2588, Train Steps/Sec: 0.13, Epoch: 0.09428682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:40:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4853, "loss": 0.28784388303756714, "memory_gb": 7.721559524536133, "step_time_ms": 7480.136156082153, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:40:38] (step=0004853) Train Loss: 0.2690, Train Steps/Sec: 0.12, Epoch: 0.09430625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:40:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4854, "loss": 0.3237234055995941, "memory_gb": 7.721559524536133, "step_time_ms": 7427.532196044922, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:40:46] (step=0004854) Train Loss: 0.2467, Train Steps/Sec: 0.13, Epoch: 0.09432568985619899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:40:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4855, "loss": 0.21327170729637146, "memory_gb": 7.721559524536133, "step_time_ms": 7439.523935317993, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:40:54] (step=0004855) Train Loss: 0.2162, Train Steps/Sec: 0.13, Epoch: 0.09434512242518461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4856, "loss": 0.2921049892902374, "memory_gb": 7.721559524536133, "step_time_ms": 7471.043109893799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:41:02] (step=0004856) Train Loss: 0.2762, Train Steps/Sec: 0.12, Epoch: 0.09436455499417024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:41:11] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 4857, "loss": 0.22011983394622803, "memory_gb": 7.721559524536133, "step_time_ms": 7482.633590698242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:41:11] (step=0004857) Train Loss: 0.2086, Train Steps/Sec: 0.12, Epoch: 0.09438398756315584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:41:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 4858, "loss": 0.191685751080513, "memory_gb": 7.721559524536133, "step_time_ms": 7465.87347984314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:41:19] (step=0004858) Train Loss: 0.2038, Train Steps/Sec: 0.12, Epoch: 0.09440342013214147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:41:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 4859, "loss": 0.33507171273231506, "memory_gb": 7.721559524536133, "step_time_ms": 7472.210884094238, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:41:27] (step=0004859) Train Loss: 0.2618, Train Steps/Sec: 0.12, Epoch: 0.09442285270112709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:41:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4860, "loss": 0.2239028662443161, "memory_gb": 7.721559524536133, "step_time_ms": 7431.556701660156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:41:35] (step=0004860) Train Loss: 0.2467, Train Steps/Sec: 0.13, Epoch: 0.09444228527011271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:41:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4861, "loss": 0.2268926501274109, "memory_gb": 7.721559524536133, "step_time_ms": 7480.147361755371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:41:43] (step=0004861) Train Loss: 0.2181, Train Steps/Sec: 0.12, Epoch: 0.09446171783909833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:41:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4862, "loss": 0.27210062742233276, "memory_gb": 7.721559524536133, "step_time_ms": 7479.772090911865, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 04:41:51] (step=0004862) Train Loss: 0.2591, Train Steps/Sec: 0.12, Epoch: 0.09448115040808396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:41:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4863, "loss": 0.21587219834327698, "memory_gb": 7.721559524536133, "step_time_ms": 7455.991268157959, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:41:59] (step=0004863) Train Loss: 0.2027, Train Steps/Sec: 0.13, Epoch: 0.09450058297706956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:42:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4864, "loss": 0.25423985719680786, "memory_gb": 7.721559524536133, "step_time_ms": 7516.923189163208, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:42:07] (step=0004864) Train Loss: 0.2421, Train Steps/Sec: 0.12, Epoch: 0.09452001554605519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:42:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4865, "loss": 0.28992974758148193, "memory_gb": 7.721559524536133, "step_time_ms": 7528.428316116333, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:42:15] (step=0004865) Train Loss: 0.2850, Train Steps/Sec: 0.12, Epoch: 0.09453944811504081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:42:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4866, "loss": 0.18238942325115204, "memory_gb": 7.721559524536133, "step_time_ms": 7460.400342941284, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:42:23] (step=0004866) Train Loss: 0.2221, Train Steps/Sec: 0.13, Epoch: 0.09455888068402643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:42:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4867, "loss": 0.2739986181259155, "memory_gb": 7.721559524536133, "step_time_ms": 7375.366687774658, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:42:31] (step=0004867) Train Loss: 0.2400, Train Steps/Sec: 0.12, Epoch: 0.09457831325301205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:42:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4868, "loss": 0.25612372159957886, "memory_gb": 7.721559524536133, "step_time_ms": 7655.29203414917, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:42:39] (step=0004868) Train Loss: 0.2663, Train Steps/Sec: 0.12, Epoch: 0.09459774582199766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:42:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4869, "loss": 0.30011045932769775, "memory_gb": 7.721559524536133, "step_time_ms": 5217.427015304565, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:42:45] (step=0004869) Train Loss: 0.2858, Train Steps/Sec: 0.17, Epoch: 0.09461717839098328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:42:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4870, "loss": 0.22629496455192566, "memory_gb": 7.721559524536133, "step_time_ms": 7580.930233001709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:42:53] (step=0004870) Train Loss: 0.2503, Train Steps/Sec: 0.12, Epoch: 0.0946366109599689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4871, "loss": 0.24974438548088074, "memory_gb": 7.721559524536133, "step_time_ms": 7522.918224334717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:01] (step=0004871) Train Loss: 0.2207, Train Steps/Sec: 0.12, Epoch: 0.09465604352895453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4872, "loss": 0.2929658591747284, "memory_gb": 7.721559524536133, "step_time_ms": 7512.733697891235, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:09] (step=0004872) Train Loss: 0.2447, Train Steps/Sec: 0.12, Epoch: 0.09467547609794015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4873, "loss": 0.1689426600933075, "memory_gb": 7.721559524536133, "step_time_ms": 7493.32594871521, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:17] (step=0004873) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.09469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4874, "loss": 0.19136327505111694, "memory_gb": 7.721559524536133, "step_time_ms": 7474.348068237305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:25] (step=0004874) Train Loss: 0.2101, Train Steps/Sec: 0.12, Epoch: 0.09471434123591138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4875, "loss": 0.2637510299682617, "memory_gb": 7.715639114379883, "step_time_ms": 7445.26481628418, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:33] (step=0004875) Train Loss: 0.3002, Train Steps/Sec: 0.13, Epoch: 0.094733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4876, "loss": 0.2898566424846649, "memory_gb": 7.721559524536133, "step_time_ms": 7591.399669647217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:41] (step=0004876) Train Loss: 0.2684, Train Steps/Sec: 0.12, Epoch: 0.09475320637388263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4877, "loss": 0.36861059069633484, "memory_gb": 7.721559524536133, "step_time_ms": 7493.751764297485, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:49] (step=0004877) Train Loss: 0.3305, Train Steps/Sec: 0.13, Epoch: 0.09477263894286825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:43:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4878, "loss": 0.19456440210342407, "memory_gb": 7.721559524536133, "step_time_ms": 7538.902282714844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:43:57] (step=0004878) Train Loss: 0.1910, Train Steps/Sec: 0.12, Epoch: 0.09479207151185387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:44:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4879, "loss": 0.24401403963565826, "memory_gb": 7.721559524536133, "step_time_ms": 7591.939926147461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:44:05] (step=0004879) Train Loss: 0.2222, Train Steps/Sec: 0.12, Epoch: 0.09481150408083949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:44:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4880, "loss": 0.19184821844100952, "memory_gb": 7.721559524536133, "step_time_ms": 7466.273784637451, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:44:13] (step=0004880) Train Loss: 0.1816, Train Steps/Sec: 0.13, Epoch: 0.0948309366498251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:44:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4881, "loss": 0.17495471239089966, "memory_gb": 7.721559524536133, "step_time_ms": 7477.224349975586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:44:21] (step=0004881) Train Loss: 0.2380, Train Steps/Sec: 0.13, Epoch: 0.09485036921881072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:44:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4882, "loss": 0.24704419076442719, "memory_gb": 7.721559524536133, "step_time_ms": 7305.45449256897, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:44:29] (step=0004882) Train Loss: 0.2165, Train Steps/Sec: 0.12, Epoch: 0.09486980178779635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:44:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4883, "loss": 0.21735301613807678, "memory_gb": 7.721559524536133, "step_time_ms": 7452.137470245361, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:44:37] (step=0004883) Train Loss: 0.2513, Train Steps/Sec: 0.13, Epoch: 0.09488923435678197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:44:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4884, "loss": 0.16926297545433044, "memory_gb": 7.721559524536133, "step_time_ms": 7519.8915004730225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:44:45] (step=0004884) Train Loss: 0.1629, Train Steps/Sec: 0.12, Epoch: 0.09490866692576759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:44:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4885, "loss": 0.35361742973327637, "memory_gb": 7.721559524536133, "step_time_ms": 7560.187101364136, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:44:53] (step=0004885) Train Loss: 0.2647, Train Steps/Sec: 0.12, Epoch: 0.09492809949475321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4886, "loss": 0.2469320148229599, "memory_gb": 7.721559524536133, "step_time_ms": 7459.263563156128, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:01] (step=0004886) Train Loss: 0.2822, Train Steps/Sec: 0.13, Epoch: 0.09494753206373882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4887, "loss": 0.32579901814460754, "memory_gb": 7.721559524536133, "step_time_ms": 7463.680028915405, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:09] (step=0004887) Train Loss: 0.2346, Train Steps/Sec: 0.13, Epoch: 0.09496696463272444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4888, "loss": 0.17513374984264374, "memory_gb": 7.721559524536133, "step_time_ms": 7554.43263053894, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:17] (step=0004888) Train Loss: 0.1903, Train Steps/Sec: 0.13, Epoch: 0.09498639720171007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4889, "loss": 0.21084600687026978, "memory_gb": 7.721559524536133, "step_time_ms": 7496.669769287109, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:25] (step=0004889) Train Loss: 0.2146, Train Steps/Sec: 0.12, Epoch: 0.09500582977069569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4890, "loss": 0.20769725739955902, "memory_gb": 7.721559524536133, "step_time_ms": 7535.299301147461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:33] (step=0004890) Train Loss: 0.2594, Train Steps/Sec: 0.12, Epoch: 0.09502526233968131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4891, "loss": 0.2369016408920288, "memory_gb": 7.721559524536133, "step_time_ms": 7549.8175621032715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:41] (step=0004891) Train Loss: 0.2233, Train Steps/Sec: 0.12, Epoch: 0.09504469490866693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4892, "loss": 0.20314139127731323, "memory_gb": 7.721559524536133, "step_time_ms": 7464.13779258728, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:49] (step=0004892) Train Loss: 0.1871, Train Steps/Sec: 0.12, Epoch: 0.09506412747765254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:45:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4893, "loss": 0.3215608298778534, "memory_gb": 7.721559524536133, "step_time_ms": 7403.185129165649, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:45:57] (step=0004893) Train Loss: 0.2719, Train Steps/Sec: 0.13, Epoch: 0.09508356004663816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4894, "loss": 0.30046671628952026, "memory_gb": 7.715639114379883, "step_time_ms": 7442.60573387146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:05] (step=0004894) Train Loss: 0.2393, Train Steps/Sec: 0.12, Epoch: 0.09510299261562379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4895, "loss": 0.26033392548561096, "memory_gb": 7.721559524536133, "step_time_ms": 7457.251787185669, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:13] (step=0004895) Train Loss: 0.2111, Train Steps/Sec: 0.13, Epoch: 0.09512242518460941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4896, "loss": 0.33208057284355164, "memory_gb": 7.721559524536133, "step_time_ms": 7266.748666763306, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:21] (step=0004896) Train Loss: 0.3369, Train Steps/Sec: 0.13, Epoch: 0.09514185775359503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4897, "loss": 0.2655092775821686, "memory_gb": 7.721559524536133, "step_time_ms": 7513.2176876068115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:29] (step=0004897) Train Loss: 0.2530, Train Steps/Sec: 0.12, Epoch: 0.09516129032258064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 4898, "loss": 0.30648553371429443, "memory_gb": 7.721559524536133, "step_time_ms": 4950.4234790802, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:35] (step=0004898) Train Loss: 0.2352, Train Steps/Sec: 0.18, Epoch: 0.09518072289156626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 4899, "loss": 0.2743435800075531, "memory_gb": 7.721559524536133, "step_time_ms": 7512.77494430542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:43] (step=0004899) Train Loss: 0.2146, Train Steps/Sec: 0.12, Epoch: 0.09520015546055188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 4900, "loss": 0.2501351833343506, "memory_gb": 7.721559524536133, "step_time_ms": 7467.4999713897705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:51] (step=0004900) Train Loss: 0.2756, Train Steps/Sec: 0.12, Epoch: 0.0952195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 4901, "loss": 0.37291407585144043, "memory_gb": 7.721559524536133, "step_time_ms": 7458.917140960693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:46:59] (step=0004901) Train Loss: 0.2910, Train Steps/Sec: 0.12, Epoch: 0.09523902059852313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:47:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 4902, "loss": 0.3441164493560791, "memory_gb": 7.721559524536133, "step_time_ms": 7560.35852432251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:47:07] (step=0004902) Train Loss: 0.2609, Train Steps/Sec: 0.12, Epoch: 0.09525845316750875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:47:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 4903, "loss": 0.31054651737213135, "memory_gb": 7.721559524536133, "step_time_ms": 7465.161561965942, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:47:15] (step=0004903) Train Loss: 0.3173, Train Steps/Sec: 0.13, Epoch: 0.09527788573649436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:47:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 4904, "loss": 0.2171282172203064, "memory_gb": 7.721559524536133, "step_time_ms": 7442.1985149383545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:47:23] (step=0004904) Train Loss: 0.2101, Train Steps/Sec: 0.13, Epoch: 0.09529731830547998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 4905, "loss": 0.2343175858259201, "memory_gb": 7.721559524536133, "step_time_ms": 7582.1850299835205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:47:31] (step=0004905) Train Loss: 0.2194, Train Steps/Sec: 0.12, Epoch: 0.0953167508744656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:47:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 4906, "loss": 0.27551037073135376, "memory_gb": 7.721559524536133, "step_time_ms": 7447.750091552734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:47:39] (step=0004906) Train Loss: 0.2790, Train Steps/Sec: 0.13, Epoch: 0.09533618344345123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:47:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 4907, "loss": 0.2447129338979721, "memory_gb": 7.721559524536133, "step_time_ms": 7505.953073501587, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:47:47] (step=0004907) Train Loss: 0.2519, Train Steps/Sec: 0.12, Epoch: 0.09535561601243685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:47:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 4908, "loss": 0.20100530982017517, "memory_gb": 7.721559524536133, "step_time_ms": 7568.295478820801, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:47:55] (step=0004908) Train Loss: 0.2156, Train Steps/Sec: 0.12, Epoch: 0.09537504858142247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:48:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 4909, "loss": 0.2800006568431854, "memory_gb": 7.721559524536133, "step_time_ms": 7589.33162689209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:48:04] (step=0004909) Train Loss: 0.2741, Train Steps/Sec: 0.12, Epoch: 0.09539448115040808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:48:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4910, "loss": 0.2397298961877823, "memory_gb": 7.721559524536133, "step_time_ms": 7557.401657104492, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:48:12] (step=0004910) Train Loss: 0.2590, Train Steps/Sec: 0.12, Epoch: 0.0954139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:48:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4911, "loss": 0.30410414934158325, "memory_gb": 7.721559524536133, "step_time_ms": 7605.286121368408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:48:20] (step=0004911) Train Loss: 0.3167, Train Steps/Sec: 0.12, Epoch: 0.09543334628837932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4912, "loss": 0.21930821239948273, "memory_gb": 7.721559524536133, "step_time_ms": 7552.582263946533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:48:28] (step=0004912) Train Loss: 0.2492, Train Steps/Sec: 0.12, Epoch: 0.09545277885736494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:48:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4913, "loss": 0.37429121136665344, "memory_gb": 7.721559524536133, "step_time_ms": 7568.389415740967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:48:36] (step=0004913) Train Loss: 0.3141, Train Steps/Sec: 0.12, Epoch: 0.09547221142635057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:48:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4914, "loss": 0.23538559675216675, "memory_gb": 7.721559524536133, "step_time_ms": 7592.536449432373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:48:44] (step=0004914) Train Loss: 0.2572, Train Steps/Sec: 0.12, Epoch: 0.09549164399533619, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:48:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4915, "loss": 0.19365113973617554, "memory_gb": 7.721559524536133, "step_time_ms": 7598.634958267212, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:48:52] (step=0004915) Train Loss: 0.2250, Train Steps/Sec: 0.12, Epoch: 0.0955110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4916, "loss": 0.23454272747039795, "memory_gb": 7.721559524536133, "step_time_ms": 7715.1758670806885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:00] (step=0004916) Train Loss: 0.2297, Train Steps/Sec: 0.12, Epoch: 0.09553050913330742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4917, "loss": 0.3236720561981201, "memory_gb": 7.721559524536133, "step_time_ms": 7617.790460586548, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:08] (step=0004917) Train Loss: 0.2274, Train Steps/Sec: 0.12, Epoch: 0.09554994170229304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4918, "loss": 0.1750921756029129, "memory_gb": 7.721559524536133, "step_time_ms": 7592.1595096588135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:16] (step=0004918) Train Loss: 0.2151, Train Steps/Sec: 0.12, Epoch: 0.09556937427127866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4919, "loss": 0.16074764728546143, "memory_gb": 7.721559524536133, "step_time_ms": 7550.7471561431885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:24] (step=0004919) Train Loss: 0.2464, Train Steps/Sec: 0.13, Epoch: 0.09558880684026429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4920, "loss": 0.23253537714481354, "memory_gb": 7.721559524536133, "step_time_ms": 7687.095642089844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:32] (step=0004920) Train Loss: 0.2523, Train Steps/Sec: 0.12, Epoch: 0.09560823940924991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4921, "loss": 0.2796754240989685, "memory_gb": 7.721559524536133, "step_time_ms": 7535.186052322388, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:40] (step=0004921) Train Loss: 0.2483, Train Steps/Sec: 0.12, Epoch: 0.09562767197823552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4922, "loss": 0.15683512389659882, "memory_gb": 7.721559524536133, "step_time_ms": 7567.673206329346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:48] (step=0004922) Train Loss: 0.1752, Train Steps/Sec: 0.12, Epoch: 0.09564710454722114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:49:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4923, "loss": 0.1435803472995758, "memory_gb": 7.721559524536133, "step_time_ms": 7682.285308837891, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:49:56] (step=0004923) Train Loss: 0.1662, Train Steps/Sec: 0.12, Epoch: 0.09566653711620676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4924, "loss": 0.151179239153862, "memory_gb": 7.721559524536133, "step_time_ms": 7570.907115936279, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:04] (step=0004924) Train Loss: 0.1731, Train Steps/Sec: 0.13, Epoch: 0.09568596968519238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4925, "loss": 0.16087615489959717, "memory_gb": 7.721559524536133, "step_time_ms": 7461.674928665161, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:12] (step=0004925) Train Loss: 0.1928, Train Steps/Sec: 0.13, Epoch: 0.09570540225417801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4926, "loss": 0.27537471055984497, "memory_gb": 7.721559524536133, "step_time_ms": 7668.645620346069, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:20] (step=0004926) Train Loss: 0.2243, Train Steps/Sec: 0.12, Epoch: 0.09572483482316362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4927, "loss": 0.24594534933567047, "memory_gb": 7.721559524536133, "step_time_ms": 5267.317056655884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:26] (step=0004927) Train Loss: 0.2238, Train Steps/Sec: 0.18, Epoch: 0.09574426739214924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4928, "loss": 0.2055319994688034, "memory_gb": 7.721559524536133, "step_time_ms": 7566.713094711304, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:34] (step=0004928) Train Loss: 0.2646, Train Steps/Sec: 0.12, Epoch: 0.09576369996113486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4929, "loss": 0.16489127278327942, "memory_gb": 7.721559524536133, "step_time_ms": 7434.127330780029, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:42] (step=0004929) Train Loss: 0.1824, Train Steps/Sec: 0.13, Epoch: 0.09578313253012048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4930, "loss": 0.313956081867218, "memory_gb": 7.721559524536133, "step_time_ms": 7442.3394203186035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:50] (step=0004930) Train Loss: 0.2869, Train Steps/Sec: 0.12, Epoch: 0.0958025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:50:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4931, "loss": 0.20359161496162415, "memory_gb": 7.721559524536133, "step_time_ms": 7470.383882522583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:50:58] (step=0004931) Train Loss: 0.2206, Train Steps/Sec: 0.12, Epoch: 0.09582199766809173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:51:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4932, "loss": 0.22963812947273254, "memory_gb": 7.721559524536133, "step_time_ms": 7419.708251953125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:51:06] (step=0004932) Train Loss: 0.2592, Train Steps/Sec: 0.13, Epoch: 0.09584143023707734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:51:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4933, "loss": 0.2769782543182373, "memory_gb": 7.721559524536133, "step_time_ms": 7398.458003997803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:51:14] (step=0004933) Train Loss: 0.2474, Train Steps/Sec: 0.13, Epoch: 0.09586086280606296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:51:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4934, "loss": 0.3304310441017151, "memory_gb": 7.721559524536133, "step_time_ms": 7511.633634567261, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:51:22] (step=0004934) Train Loss: 0.2968, Train Steps/Sec: 0.12, Epoch: 0.09588029537504858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:51:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4935, "loss": 0.3079749047756195, "memory_gb": 7.721559524536133, "step_time_ms": 7470.560312271118, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:51:30] (step=0004935) Train Loss: 0.2269, Train Steps/Sec: 0.12, Epoch: 0.0958997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:51:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4936, "loss": 0.23514887690544128, "memory_gb": 7.721559524536133, "step_time_ms": 7440.568208694458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:51:38] (step=0004936) Train Loss: 0.2588, Train Steps/Sec: 0.12, Epoch: 0.09591916051301982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:51:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4937, "loss": 0.25008895993232727, "memory_gb": 7.721559524536133, "step_time_ms": 7488.835096359253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:51:46] (step=0004937) Train Loss: 0.2530, Train Steps/Sec: 0.12, Epoch: 0.09593859308200545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:51:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4938, "loss": 0.26477938890457153, "memory_gb": 7.721559524536133, "step_time_ms": 7469.928741455078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:51:54] (step=0004938) Train Loss: 0.2596, Train Steps/Sec: 0.12, Epoch: 0.09595802565099106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4939, "loss": 0.21803651750087738, "memory_gb": 7.721559524536133, "step_time_ms": 7504.569530487061, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:02] (step=0004939) Train Loss: 0.2531, Train Steps/Sec: 0.12, Epoch: 0.09597745821997668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4940, "loss": 0.2752767503261566, "memory_gb": 7.721559524536133, "step_time_ms": 7534.144878387451, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:10] (step=0004940) Train Loss: 0.2674, Train Steps/Sec: 0.12, Epoch: 0.0959968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 4941, "loss": 0.14784656465053558, "memory_gb": 7.721559524536133, "step_time_ms": 7406.585216522217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:18] (step=0004941) Train Loss: 0.1692, Train Steps/Sec: 0.13, Epoch: 0.09601632335794792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 4942, "loss": 0.25384438037872314, "memory_gb": 7.721559524536133, "step_time_ms": 7357.871294021606, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:26] (step=0004942) Train Loss: 0.2648, Train Steps/Sec: 0.13, Epoch: 0.09603575592693354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 4943, "loss": 0.18959160149097443, "memory_gb": 7.721559524536133, "step_time_ms": 7470.955610275269, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:34] (step=0004943) Train Loss: 0.2261, Train Steps/Sec: 0.12, Epoch: 0.09605518849591917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4944, "loss": 0.1881043016910553, "memory_gb": 7.721559524536133, "step_time_ms": 7468.258619308472, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:42] (step=0004944) Train Loss: 0.2405, Train Steps/Sec: 0.13, Epoch: 0.09607462106490477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4945, "loss": 0.22643914818763733, "memory_gb": 7.715639114379883, "step_time_ms": 7377.26616859436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:50] (step=0004945) Train Loss: 0.2294, Train Steps/Sec: 0.13, Epoch: 0.0960940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:52:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4946, "loss": 0.3406417965888977, "memory_gb": 7.721559524536133, "step_time_ms": 7489.694595336914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:52:58] (step=0004946) Train Loss: 0.2803, Train Steps/Sec: 0.12, Epoch: 0.09611348620287602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4947, "loss": 0.20959395170211792, "memory_gb": 7.721559524536133, "step_time_ms": 7504.477262496948, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:53:06] (step=0004947) Train Loss: 0.2453, Train Steps/Sec: 0.12, Epoch: 0.09613291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4948, "loss": 0.1552160680294037, "memory_gb": 7.721559524536133, "step_time_ms": 7263.822793960571, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:53:14] (step=0004948) Train Loss: 0.2201, Train Steps/Sec: 0.12, Epoch: 0.09615235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:53:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4949, "loss": 0.25757351517677307, "memory_gb": 7.721559524536133, "step_time_ms": 7519.658803939819, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:53:22] (step=0004949) Train Loss: 0.2270, Train Steps/Sec: 0.12, Epoch: 0.09617178390983289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:53:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4950, "loss": 0.18818524479866028, "memory_gb": 7.721559524536133, "step_time_ms": 7504.595518112183, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:53:30] (step=0004950) Train Loss: 0.1800, Train Steps/Sec: 0.12, Epoch: 0.0961912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:53:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4951, "loss": 0.2925337255001068, "memory_gb": 7.721559524536133, "step_time_ms": 7469.653129577637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:53:38] (step=0004951) Train Loss: 0.2631, Train Steps/Sec: 0.12, Epoch: 0.09621064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:53:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4952, "loss": 0.1931881606578827, "memory_gb": 7.721559524536133, "step_time_ms": 7546.167850494385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:53:46] (step=0004952) Train Loss: 0.1966, Train Steps/Sec: 0.12, Epoch: 0.09623008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:53:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4953, "loss": 0.2156454175710678, "memory_gb": 7.721559524536133, "step_time_ms": 7540.719270706177, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:53:54] (step=0004953) Train Loss: 0.2662, Train Steps/Sec: 0.12, Epoch: 0.09624951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 4954, "loss": 0.22535490989685059, "memory_gb": 7.721559524536133, "step_time_ms": 7330.337762832642, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:02] (step=0004954) Train Loss: 0.2428, Train Steps/Sec: 0.13, Epoch: 0.09626894675476098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 4955, "loss": 0.29919326305389404, "memory_gb": 7.721559524536133, "step_time_ms": 7567.319393157959, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:10] (step=0004955) Train Loss: 0.2785, Train Steps/Sec: 0.13, Epoch: 0.09628837932374659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4956, "loss": 0.2611677944660187, "memory_gb": 7.721559524536133, "step_time_ms": 6356.419801712036, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:17] (step=0004956) Train Loss: 0.2295, Train Steps/Sec: 0.15, Epoch: 0.09630781189273221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4957, "loss": 0.20822295546531677, "memory_gb": 7.715639114379883, "step_time_ms": 7489.148378372192, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:25] (step=0004957) Train Loss: 0.2481, Train Steps/Sec: 0.12, Epoch: 0.09632724446171784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4958, "loss": 0.2685042917728424, "memory_gb": 7.721559524536133, "step_time_ms": 7493.854522705078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:33] (step=0004958) Train Loss: 0.2148, Train Steps/Sec: 0.12, Epoch: 0.09634667703070346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 4959, "loss": 0.12015298753976822, "memory_gb": 7.721559524536133, "step_time_ms": 7385.591506958008, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:41] (step=0004959) Train Loss: 0.1875, Train Steps/Sec: 0.12, Epoch: 0.09636610959968908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 4960, "loss": 0.2400825023651123, "memory_gb": 7.721559524536133, "step_time_ms": 7459.513187408447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:49] (step=0004960) Train Loss: 0.2704, Train Steps/Sec: 0.12, Epoch: 0.0963855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:54:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 4961, "loss": 0.1745966672897339, "memory_gb": 7.721559524536133, "step_time_ms": 7513.519048690796, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:54:57] (step=0004961) Train Loss: 0.2036, Train Steps/Sec: 0.12, Epoch: 0.09640497473766031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:55:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 4962, "loss": 0.2615559697151184, "memory_gb": 7.721559524536133, "step_time_ms": 7562.039852142334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:55:05] (step=0004962) Train Loss: 0.2809, Train Steps/Sec: 0.12, Epoch: 0.09642440730664593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:55:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 4963, "loss": 0.3663368225097656, "memory_gb": 7.721559524536133, "step_time_ms": 7550.507545471191, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:55:13] (step=0004963) Train Loss: 0.3361, Train Steps/Sec: 0.12, Epoch: 0.09644383987563156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:55:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 4964, "loss": 0.20787155628204346, "memory_gb": 7.721559524536133, "step_time_ms": 7552.446603775024, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:55:21] (step=0004964) Train Loss: 0.2847, Train Steps/Sec: 0.12, Epoch: 0.09646327244461718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:55:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 4965, "loss": 0.20450937747955322, "memory_gb": 7.721559524536133, "step_time_ms": 7488.760709762573, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:55:29] (step=0004965) Train Loss: 0.2484, Train Steps/Sec: 0.12, Epoch: 0.0964827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:55:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 4966, "loss": 0.21119466423988342, "memory_gb": 7.721559524536133, "step_time_ms": 7553.105592727661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:55:37] (step=0004966) Train Loss: 0.2236, Train Steps/Sec: 0.12, Epoch: 0.09650213758258842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:55:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 4967, "loss": 0.14789071679115295, "memory_gb": 7.721559524536133, "step_time_ms": 7565.28115272522, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:55:45] (step=0004967) Train Loss: 0.1887, Train Steps/Sec: 0.12, Epoch: 0.09652157015157403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:55:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 4968, "loss": 0.19198527932167053, "memory_gb": 7.721559524536133, "step_time_ms": 7487.387180328369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:55:53] (step=0004968) Train Loss: 0.2645, Train Steps/Sec: 0.13, Epoch: 0.09654100272055965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:56:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4969, "loss": 0.1607196033000946, "memory_gb": 7.721559524536133, "step_time_ms": 7529.837608337402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:56:01] (step=0004969) Train Loss: 0.2011, Train Steps/Sec: 0.12, Epoch: 0.09656043528954528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:56:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 4970, "loss": 0.1944693624973297, "memory_gb": 7.721559524536133, "step_time_ms": 7614.747762680054, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:56:09] (step=0004970) Train Loss: 0.2244, Train Steps/Sec: 0.12, Epoch: 0.0965798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:56:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 4971, "loss": 0.25476497411727905, "memory_gb": 7.721559524536133, "step_time_ms": 7521.241664886475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 04:56:17] (step=0004971) Train Loss: 0.2388, Train Steps/Sec: 0.12, Epoch: 0.09659930042751652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 04:56:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 4972, "loss": 0.33498257398605347, "memory_gb": 7.721559524536133, "step_time_ms": 7660.310745239258, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:56:25] (step=0004972) Train Loss: 0.2931, Train Steps/Sec: 0.12, Epoch: 0.09661873299650214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:56:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 4973, "loss": 0.3077661991119385, "memory_gb": 7.721559524536133, "step_time_ms": 7600.980758666992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:56:33] (step=0004973) Train Loss: 0.2589, Train Steps/Sec: 0.12, Epoch: 0.09663816556548775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:56:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 4974, "loss": 0.36041057109832764, "memory_gb": 7.715639114379883, "step_time_ms": 7526.501417160034, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:56:42] (step=0004974) Train Loss: 0.2999, Train Steps/Sec: 0.12, Epoch: 0.09665759813447337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:56:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 4975, "loss": 0.3045775890350342, "memory_gb": 7.721559524536133, "step_time_ms": 7518.810033798218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:56:50] (step=0004975) Train Loss: 0.2904, Train Steps/Sec: 0.12, Epoch: 0.096677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:56:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 4976, "loss": 0.24116528034210205, "memory_gb": 7.721559524536133, "step_time_ms": 7521.239519119263, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:56:58] (step=0004976) Train Loss: 0.2455, Train Steps/Sec: 0.12, Epoch: 0.09669646327244462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:57:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 4977, "loss": 0.20545640587806702, "memory_gb": 7.721559524536133, "step_time_ms": 7472.188949584961, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 04:57:06] (step=0004977) Train Loss: 0.2478, Train Steps/Sec: 0.12, Epoch: 0.09671589584143024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:57:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 4978, "loss": 0.24781803786754608, "memory_gb": 7.721559524536133, "step_time_ms": 7473.253965377808, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:57:14] (step=0004978) Train Loss: 0.2640, Train Steps/Sec: 0.13, Epoch: 0.09673532841041586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:57:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 4979, "loss": 0.16502639651298523, "memory_gb": 7.721559524536133, "step_time_ms": 7558.725118637085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:57:22] (step=0004979) Train Loss: 0.2202, Train Steps/Sec: 0.12, Epoch: 0.09675476097940147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:57:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 4980, "loss": 0.20393116772174835, "memory_gb": 7.721559524536133, "step_time_ms": 7462.9786014556885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:57:30] (step=0004980) Train Loss: 0.2394, Train Steps/Sec: 0.12, Epoch: 0.0967741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:57:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 4981, "loss": 0.16705811023712158, "memory_gb": 7.721559524536133, "step_time_ms": 7494.290113449097, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:57:38] (step=0004981) Train Loss: 0.2618, Train Steps/Sec: 0.12, Epoch: 0.09679362611737272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:57:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 4982, "loss": 0.2284354269504547, "memory_gb": 7.721559524536133, "step_time_ms": 7539.527177810669, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:57:46] (step=0004982) Train Loss: 0.2282, Train Steps/Sec: 0.12, Epoch: 0.09681305868635834, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 04:57:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 4983, "loss": 0.23569101095199585, "memory_gb": 7.721559524536133, "step_time_ms": 7342.735767364502, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:57:54] (step=0004983) Train Loss: 0.2094, Train Steps/Sec: 0.13, Epoch: 0.09683249125534396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 4984, "loss": 0.2066667377948761, "memory_gb": 7.721559524536133, "step_time_ms": 6957.57269859314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:01] (step=0004984) Train Loss: 0.2071, Train Steps/Sec: 0.14, Epoch: 0.09685192382432958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 4985, "loss": 0.2487722933292389, "memory_gb": 7.721559524536133, "step_time_ms": 6016.394376754761, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:08] (step=0004985) Train Loss: 0.2632, Train Steps/Sec: 0.15, Epoch: 0.09687135639331519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 4986, "loss": 0.1851242631673813, "memory_gb": 7.721559524536133, "step_time_ms": 7471.887826919556, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:16] (step=0004986) Train Loss: 0.1948, Train Steps/Sec: 0.13, Epoch: 0.09689078896230081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 4987, "loss": 0.2268228530883789, "memory_gb": 7.721559524536133, "step_time_ms": 7557.508707046509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:24] (step=0004987) Train Loss: 0.2028, Train Steps/Sec: 0.12, Epoch: 0.09691022153128644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 4988, "loss": 0.1972106695175171, "memory_gb": 7.721559524536133, "step_time_ms": 
7445.672035217285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:32] (step=0004988) Train Loss: 0.1954, Train Steps/Sec: 0.13, Epoch: 0.09692965410027206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 4989, "loss": 0.3364315927028656, "memory_gb": 7.721559524536133, "step_time_ms": 7495.940685272217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:40] (step=0004989) Train Loss: 0.3115, Train Steps/Sec: 0.13, Epoch: 0.09694908666925768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 4990, "loss": 0.20590762794017792, "memory_gb": 7.721559524536133, "step_time_ms": 7576.886415481567, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:48] (step=0004990) Train Loss: 0.2022, Train Steps/Sec: 0.12, Epoch: 0.09696851923824329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:58:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 4991, "loss": 0.18865680694580078, "memory_gb": 7.721559524536133, "step_time_ms": 7428.779363632202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:58:56] (step=0004991) Train Loss: 0.2758, Train Steps/Sec: 0.13, Epoch: 0.09698795180722891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:59:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 4992, "loss": 0.22758613526821136, "memory_gb": 7.721559524536133, "step_time_ms": 7448.42791557312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:59:04] (step=0004992) Train Loss: 0.2601, Train Steps/Sec: 0.13, Epoch: 0.09700738437621453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:59:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 4993, "loss": 0.19615167379379272, "memory_gb": 7.721559524536133, "step_time_ms": 7489.052772521973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:59:12] (step=0004993) Train Loss: 0.1635, Train Steps/Sec: 0.12, Epoch: 0.09702681694520016, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:59:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 4994, "loss": 0.22968244552612305, "memory_gb": 7.721559524536133, "step_time_ms": 7390.42592048645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:59:20] (step=0004994) Train Loss: 0.2393, Train Steps/Sec: 0.13, Epoch: 0.09704624951418578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:59:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 4995, "loss": 0.24283364415168762, "memory_gb": 7.721559524536133, "step_time_ms": 7462.581634521484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:59:28] (step=0004995) Train Loss: 0.2940, Train Steps/Sec: 0.12, Epoch: 0.0970656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:59:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 4996, "loss": 0.3070419132709503, "memory_gb": 7.721559524536133, "step_time_ms": 7549.324035644531, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:59:36] (step=0004996) Train Loss: 0.2271, Train Steps/Sec: 0.12, Epoch: 0.09708511465215701, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:59:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 4997, "loss": 0.27252161502838135, "memory_gb": 7.721559524536133, "step_time_ms": 7402.861595153809, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:59:44] (step=0004997) Train Loss: 0.2125, Train Steps/Sec: 0.12, Epoch: 0.09710454722114263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 04:59:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 4998, "loss": 0.26984506845474243, "memory_gb": 7.721559524536133, "step_time_ms": 7443.915843963623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 04:59:52] (step=0004998) Train Loss: 0.2887, Train Steps/Sec: 0.13, Epoch: 0.09712397979012825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 4999, "loss": 0.1666492521762848, "memory_gb": 7.721559524536133, 
"step_time_ms": 7523.29158782959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:00] (step=0004999) Train Loss: 0.1996, Train Steps/Sec: 0.12, Epoch: 0.09714341235911388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5000, "loss": 0.2063213586807251, "memory_gb": 7.721559524536133, "step_time_ms": 7436.035394668579, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:08] (step=0005000) Train Loss: 0.1855, Train Steps/Sec: 0.13, Epoch: 0.0971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5001, "loss": 0.23018184304237366, "memory_gb": 7.721559524536133, "step_time_ms": 7443.893194198608, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:16] (step=0005001) Train Loss: 0.1951, Train Steps/Sec: 0.13, Epoch: 0.09718227749708512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5002, "loss": 0.24718475341796875, "memory_gb": 7.721559524536133, "step_time_ms": 7535.851240158081, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:24] (step=0005002) Train Loss: 0.2599, Train Steps/Sec: 0.12, Epoch: 0.09720171006607073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5003, "loss": 0.2768796682357788, "memory_gb": 7.721559524536133, "step_time_ms": 7580.974578857422, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:32] (step=0005003) Train Loss: 0.2993, Train Steps/Sec: 0.12, Epoch: 0.09722114263505635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5004, "loss": 0.24416756629943848, "memory_gb": 7.721559524536133, "step_time_ms": 7440.572738647461, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:40] (step=0005004) Train Loss: 0.2896, Train Steps/Sec: 0.12, Epoch: 
0.09724057520404197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5005, "loss": 0.2795889377593994, "memory_gb": 7.721559524536133, "step_time_ms": 7531.4836502075195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:48] (step=0005005) Train Loss: 0.2482, Train Steps/Sec: 0.12, Epoch: 0.0972600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:00:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5006, "loss": 0.30126267671585083, "memory_gb": 7.721559524536133, "step_time_ms": 7439.856290817261, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:00:56] (step=0005006) Train Loss: 0.2364, Train Steps/Sec: 0.13, Epoch: 0.09727944034201322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5007, "loss": 0.30345645546913147, "memory_gb": 7.721559524536133, "step_time_ms": 7504.655838012695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:04] (step=0005007) Train Loss: 0.2564, Train Steps/Sec: 0.12, Epoch: 0.09729887291099884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5008, "loss": 0.24585622549057007, "memory_gb": 7.721559524536133, "step_time_ms": 7593.18733215332, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:12] (step=0005008) Train Loss: 0.2019, Train Steps/Sec: 0.12, Epoch: 0.09731830547998445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5009, "loss": 0.24323153495788574, "memory_gb": 7.721559524536133, "step_time_ms": 7493.782043457031, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:20] (step=0005009) Train Loss: 0.2552, Train Steps/Sec: 0.13, Epoch: 0.09733773804897007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5010, "loss": 0.2323654592037201, "memory_gb": 
7.721559524536133, "step_time_ms": 7494.796514511108, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:28] (step=0005010) Train Loss: 0.2340, Train Steps/Sec: 0.13, Epoch: 0.0973571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5011, "loss": 0.30033203959465027, "memory_gb": 7.721559524536133, "step_time_ms": 7572.000026702881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:36] (step=0005011) Train Loss: 0.2846, Train Steps/Sec: 0.12, Epoch: 0.09737660318694132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5012, "loss": 0.25021645426750183, "memory_gb": 7.721559524536133, "step_time_ms": 7312.116861343384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:44] (step=0005012) Train Loss: 0.2436, Train Steps/Sec: 0.13, Epoch: 0.09739603575592694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5013, "loss": 0.1140558198094368, "memory_gb": 7.721559524536133, "step_time_ms": 6509.759187698364, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:51] (step=0005013) Train Loss: 0.1463, Train Steps/Sec: 0.15, Epoch: 0.09741546832491256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:01:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5014, "loss": 0.20295678079128265, "memory_gb": 7.721559524536133, "step_time_ms": 6730.595827102661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:01:58] (step=0005014) Train Loss: 0.1818, Train Steps/Sec: 0.14, Epoch: 0.09743490089389817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5015, "loss": 0.2157633751630783, "memory_gb": 7.721559524536133, "step_time_ms": 7558.8812828063965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:06] (step=0005015) Train Loss: 0.1904, Train Steps/Sec: 
0.12, Epoch: 0.09745433346288379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5016, "loss": 0.1670965999364853, "memory_gb": 7.721559524536133, "step_time_ms": 7387.843370437622, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:14] (step=0005016) Train Loss: 0.2034, Train Steps/Sec: 0.12, Epoch: 0.09747376603186941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5017, "loss": 0.3259962797164917, "memory_gb": 7.721559524536133, "step_time_ms": 3500.8950233459473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:18] (step=0005017) Train Loss: 0.3115, Train Steps/Sec: 0.27, Epoch: 0.09749319860085504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5018, "loss": 0.26743143796920776, "memory_gb": 7.721559524536133, "step_time_ms": 3444.422960281372, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:21] (step=0005018) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.09751263116984066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5019, "loss": 0.1869240552186966, "memory_gb": 7.721559524536133, "step_time_ms": 3433.666944503784, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:25] (step=0005019) Train Loss: 0.1737, Train Steps/Sec: 0.27, Epoch: 0.09753206373882627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5020, "loss": 0.320414662361145, "memory_gb": 7.721559524536133, "step_time_ms": 3438.4357929229736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:29] (step=0005020) Train Loss: 0.2855, Train Steps/Sec: 0.28, Epoch: 0.09755149630781189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5021, "loss": 0.22595958411693573, 
"memory_gb": 7.721559524536133, "step_time_ms": 3445.023775100708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:32] (step=0005021) Train Loss: 0.2712, Train Steps/Sec: 0.28, Epoch: 0.09757092887679751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5022, "loss": 0.28533339500427246, "memory_gb": 7.721559524536133, "step_time_ms": 3448.7271308898926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:36] (step=0005022) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.09759036144578313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5023, "loss": 0.20530271530151367, "memory_gb": 7.721559524536133, "step_time_ms": 3443.0201053619385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:40] (step=0005023) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.09760979401476876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5024, "loss": 0.24630749225616455, "memory_gb": 7.721559524536133, "step_time_ms": 3439.5854473114014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:43] (step=0005024) Train Loss: 0.3373, Train Steps/Sec: 0.28, Epoch: 0.09762922658375438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5025, "loss": 0.379472017288208, "memory_gb": 7.721559524536133, "step_time_ms": 3430.0453662872314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:47] (step=0005025) Train Loss: 0.3749, Train Steps/Sec: 0.28, Epoch: 0.09764865915273999, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5026, "loss": 0.2608602046966553, "memory_gb": 7.721559524536133, "step_time_ms": 3427.85906791687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:51] (step=0005026) Train Loss: 0.2137, 
Train Steps/Sec: 0.28, Epoch: 0.09766809172172561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5027, "loss": 0.15608720481395721, "memory_gb": 7.721559524536133, "step_time_ms": 3420.643091201782, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:54] (step=0005027) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.09768752429071123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:02:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5028, "loss": 0.21932189166545868, "memory_gb": 7.721559524536133, "step_time_ms": 3399.4266986846924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:02:58] (step=0005028) Train Loss: 0.2494, Train Steps/Sec: 0.28, Epoch: 0.09770695685969685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5029, "loss": 0.24568386375904083, "memory_gb": 7.721559524536133, "step_time_ms": 3423.173189163208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:01] (step=0005029) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.09772638942868248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5030, "loss": 0.29212892055511475, "memory_gb": 7.721559524536133, "step_time_ms": 3412.8966331481934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:05] (step=0005030) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.0977458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5031, "loss": 0.21904785931110382, "memory_gb": 7.721559524536133, "step_time_ms": 3409.40260887146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:09] (step=0005031) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.0977652545666537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5032, "loss": 
0.15477325022220612, "memory_gb": 7.721559524536133, "step_time_ms": 3400.611639022827, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:12] (step=0005032) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.09778468713563933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5033, "loss": 0.2367437183856964, "memory_gb": 7.721559524536133, "step_time_ms": 3401.7746448516846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:16] (step=0005033) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.09780411970462495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5034, "loss": 0.26810675859451294, "memory_gb": 7.721559524536133, "step_time_ms": 3396.550178527832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:20] (step=0005034) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.09782355227361057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5035, "loss": 0.16217955946922302, "memory_gb": 7.721559524536133, "step_time_ms": 3396.292209625244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:23] (step=0005035) Train Loss: 0.1454, Train Steps/Sec: 0.28, Epoch: 0.0978429848425962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5036, "loss": 0.19364413619041443, "memory_gb": 7.721559524536133, "step_time_ms": 3392.4176692962646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:27] (step=0005036) Train Loss: 0.1843, Train Steps/Sec: 0.27, Epoch: 0.09786241741158182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5037, "loss": 0.21778327226638794, "memory_gb": 7.721559524536133, "step_time_ms": 3394.782066345215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:30] (step=0005037) 
Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.09788184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5038, "loss": 0.2975318133831024, "memory_gb": 7.721559524536133, "step_time_ms": 3389.857769012451, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:34] (step=0005038) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.09790128254955305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5039, "loss": 0.2220529019832611, "memory_gb": 7.721559524536133, "step_time_ms": 3440.4804706573486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:38] (step=0005039) Train Loss: 0.2946, Train Steps/Sec: 0.27, Epoch: 0.09792071511853867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5040, "loss": 0.22842849791049957, "memory_gb": 7.721559524536133, "step_time_ms": 3394.0038681030273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:41] (step=0005040) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.09794014768752429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5041, "loss": 0.23735670745372772, "memory_gb": 7.721559524536133, "step_time_ms": 3386.794328689575, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:45] (step=0005041) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.09795958025650991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5042, "loss": 0.2897869944572449, "memory_gb": 7.721559524536133, "step_time_ms": 3401.0326862335205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:48] (step=0005042) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.09797901282549554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:52] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 5043, "loss": 0.24864928424358368, "memory_gb": 7.721559524536133, "step_time_ms": 3375.2431869506836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:52] (step=0005043) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.09799844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5044, "loss": 0.25339123606681824, "memory_gb": 7.721559524536133, "step_time_ms": 3518.3961391448975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:56] (step=0005044) Train Loss: 0.2358, Train Steps/Sec: 0.28, Epoch: 0.09801787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:03:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5045, "loss": 0.3001525402069092, "memory_gb": 7.721559524536133, "step_time_ms": 3371.291399002075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:03:59] (step=0005045) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.09803731053245239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5046, "loss": 0.22499100863933563, "memory_gb": 7.721559524536133, "step_time_ms": 3372.105836868286, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:03] (step=0005046) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.09805674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5047, "loss": 0.1966443508863449, "memory_gb": 7.721559524536133, "step_time_ms": 3373.007297515869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:06] (step=0005047) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.09807617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5048, "loss": 0.16630281507968903, "memory_gb": 7.721559524536133, "step_time_ms": 3368.812322616577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
05:04:10] (step=0005048) Train Loss: 0.2136, Train Steps/Sec: 0.28, Epoch: 0.09809560823940924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5049, "loss": 0.2724454998970032, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2452182769775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:13] (step=0005049) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.09811504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5050, "loss": 0.1798848956823349, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2961463928223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:17] (step=0005050) Train Loss: 0.2002, Train Steps/Sec: 0.28, Epoch: 0.09813447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5051, "loss": 0.2657700777053833, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6390647888184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:20] (step=0005051) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.09815390594636611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5052, "loss": 0.2521625757217407, "memory_gb": 7.721559524536133, "step_time_ms": 3369.9066638946533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:24] (step=0005052) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.09817333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5053, "loss": 0.25935760140419006, "memory_gb": 7.721559524536133, "step_time_ms": 3371.572494506836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:28] (step=0005053) Train Loss: 0.2931, Train Steps/Sec: 0.28, Epoch: 0.09819277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:31] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 5054, "loss": 0.29945576190948486, "memory_gb": 7.721559524536133, "step_time_ms": 3365.839719772339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:31] (step=0005054) Train Loss: 0.3100, Train Steps/Sec: 0.28, Epoch: 0.09821220365332296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5055, "loss": 0.22863706946372986, "memory_gb": 7.721559524536133, "step_time_ms": 3366.496801376343, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:35] (step=0005055) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.09823163622230859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5056, "loss": 0.3042760491371155, "memory_gb": 7.721559524536133, "step_time_ms": 3366.090774536133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:38] (step=0005056) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.09825106879129421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 5057, "loss": 0.30657386779785156, "memory_gb": 7.721559524536133, "step_time_ms": 3361.567497253418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:42] (step=0005057) Train Loss: 0.3014, Train Steps/Sec: 0.28, Epoch: 0.09827050136027983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5058, "loss": 0.27372270822525024, "memory_gb": 7.721559524536133, "step_time_ms": 3364.396095275879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:46] (step=0005058) Train Loss: 0.3231, Train Steps/Sec: 0.28, Epoch: 0.09828993392926545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 5059, "loss": 0.21910212934017181, "memory_gb": 7.721559524536133, "step_time_ms": 3361.182451248169, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 05:04:49] (step=0005059) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.09830936649825107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5060, "loss": 0.22051839530467987, "memory_gb": 7.721559524536133, "step_time_ms": 3363.26003074646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:53] (step=0005060) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.09832879906723668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5061, "loss": 0.22040323913097382, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6836280822754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:04:57] (step=0005061) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.0983482316362223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5062, "loss": 0.18390560150146484, "memory_gb": 7.715639114379883, "step_time_ms": 3327.9647827148438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:00] (step=0005062) Train Loss: 0.2229, Train Steps/Sec: 0.28, Epoch: 0.09836766420520793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5063, "loss": 0.15709230303764343, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2938861846924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:04] (step=0005063) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.09838709677419355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5064, "loss": 0.28874874114990234, "memory_gb": 7.721559524536133, "step_time_ms": 3368.777275085449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:07] (step=0005064) Train Loss: 0.2439, Train Steps/Sec: 0.28, Epoch: 0.09840652934317917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 05:05:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5065, "loss": 0.1354525089263916, "memory_gb": 7.721559524536133, "step_time_ms": 3366.420269012451, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:11] (step=0005065) Train Loss: 0.1299, Train Steps/Sec: 0.28, Epoch: 0.0984259619121648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5066, "loss": 0.18938647210597992, "memory_gb": 7.715639114379883, "step_time_ms": 3323.7059116363525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:15] (step=0005066) Train Loss: 0.2042, Train Steps/Sec: 0.28, Epoch: 0.0984453944811504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5067, "loss": 0.34688660502433777, "memory_gb": 7.721559524536133, "step_time_ms": 3363.507032394409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:18] (step=0005067) Train Loss: 0.2999, Train Steps/Sec: 0.28, Epoch: 0.09846482705013603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 5068, "loss": 0.280891090631485, "memory_gb": 7.721559524536133, "step_time_ms": 3363.575220108032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:22] (step=0005068) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.09848425961912165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5069, "loss": 0.21762174367904663, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9770278930664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:26] (step=0005069) Train Loss: 0.1798, Train Steps/Sec: 0.28, Epoch: 0.09850369218810727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5070, "loss": 0.27998924255371094, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4557209014893, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:29] (step=0005070) Train Loss: 0.2867, Train Steps/Sec: 0.28, Epoch: 0.09852312475709289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5071, "loss": 0.3223515748977661, "memory_gb": 7.721559524536133, "step_time_ms": 3365.330934524536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:33] (step=0005071) Train Loss: 0.3279, Train Steps/Sec: 0.28, Epoch: 0.09854255732607851, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5072, "loss": 0.19184237718582153, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0576553344727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:37] (step=0005072) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.09856198989506412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5073, "loss": 0.2252173125743866, "memory_gb": 7.721559524536133, "step_time_ms": 3368.938446044922, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:40] (step=0005073) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.09858142246404974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5074, "loss": 0.2336270809173584, "memory_gb": 7.721559524536133, "step_time_ms": 3363.276958465576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:44] (step=0005074) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.09860085503303537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5075, "loss": 0.24429233372211456, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9840869903564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:47] (step=0005075) Train Loss: 0.1918, Train Steps/Sec: 0.28, Epoch: 0.09862028760202099, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 05:05:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5076, "loss": 0.16322143375873566, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6439151763916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:51] (step=0005076) Train Loss: 0.1740, Train Steps/Sec: 0.28, Epoch: 0.09863972017100661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5077, "loss": 0.2932937741279602, "memory_gb": 7.721559524536133, "step_time_ms": 3393.566370010376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:55] (step=0005077) Train Loss: 0.2406, Train Steps/Sec: 0.27, Epoch: 0.09865915273999222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5078, "loss": 0.27978044748306274, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0434741973877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:05:58] (step=0005078) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.09867858530897784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 5079, "loss": 0.23110602796077728, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1375045776367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:02] (step=0005079) Train Loss: 0.2672, Train Steps/Sec: 0.27, Epoch: 0.09869801787796346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5080, "loss": 0.16879239678382874, "memory_gb": 7.721559524536133, "step_time_ms": 3394.148826599121, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:06] (step=0005080) Train Loss: 0.1642, Train Steps/Sec: 0.27, Epoch: 0.09871745044694909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5081, "loss": 0.16132208704948425, "memory_gb": 7.721559524536133, "step_time_ms": 
3367.8202629089355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:09] (step=0005081) Train Loss: 0.1826, Train Steps/Sec: 0.27, Epoch: 0.09873688301593471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5082, "loss": 0.23976801335811615, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3057765960693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:13] (step=0005082) Train Loss: 0.2320, Train Steps/Sec: 0.27, Epoch: 0.09875631558492033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5083, "loss": 0.23978278040885925, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7753047943115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:17] (step=0005083) Train Loss: 0.2607, Train Steps/Sec: 0.26, Epoch: 0.09877574815390594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5084, "loss": 0.19997075200080872, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1890964508057, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:20] (step=0005084) Train Loss: 0.1813, Train Steps/Sec: 0.28, Epoch: 0.09879518072289156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5085, "loss": 0.18584227561950684, "memory_gb": 7.721559524536133, "step_time_ms": 3364.811897277832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:24] (step=0005085) Train Loss: 0.1860, Train Steps/Sec: 0.28, Epoch: 0.09881461329187718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5086, "loss": 0.34039580821990967, "memory_gb": 7.721559524536133, "step_time_ms": 3362.104654312134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:28] (step=0005086) Train Loss: 0.2618, Train Steps/Sec: 0.28, Epoch: 
0.09883404586086281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 5087, "loss": 0.22489041090011597, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0339584350586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:31] (step=0005087) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.09885347842984843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5088, "loss": 0.20881778001785278, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8571033477783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:35] (step=0005088) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.09887291099883405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5089, "loss": 0.2815398573875427, "memory_gb": 7.721559524536133, "step_time_ms": 3365.49711227417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:39] (step=0005089) Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.09889234356781966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 5090, "loss": 0.15276163816452026, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2564544677734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:42] (step=0005090) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.09891177613680528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5091, "loss": 0.2881771922111511, "memory_gb": 7.715639114379883, "step_time_ms": 3326.167106628418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:46] (step=0005091) Train Loss: 0.2949, Train Steps/Sec: 0.28, Epoch: 0.0989312087057909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 5092, "loss": 0.25303950905799866, 
"memory_gb": 7.721559524536133, "step_time_ms": 3512.9072666168213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:49] (step=0005092) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.09895064127477653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5093, "loss": 0.27503782510757446, "memory_gb": 7.721559524536133, "step_time_ms": 3364.513874053955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:53] (step=0005093) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.09897007384376215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:06:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5094, "loss": 0.2654913067817688, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9543266296387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:06:57] (step=0005094) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.09898950641274777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5095, "loss": 0.29913365840911865, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6670532226562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:00] (step=0005095) Train Loss: 0.2484, Train Steps/Sec: 0.27, Epoch: 0.09900893898173338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5096, "loss": 0.3344005048274994, "memory_gb": 7.715639114379883, "step_time_ms": 3333.1573009490967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:04] (step=0005096) Train Loss: 0.3092, Train Steps/Sec: 0.28, Epoch: 0.099028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5097, "loss": 0.17901334166526794, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6371574401855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:08] (step=0005097) Train Loss: 0.2030, 
Train Steps/Sec: 0.28, Epoch: 0.09904780411970462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5098, "loss": 0.30642837285995483, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7005767822266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:11] (step=0005098) Train Loss: 0.2599, Train Steps/Sec: 0.28, Epoch: 0.09906723668869025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5099, "loss": 0.3101208209991455, "memory_gb": 7.721559524536133, "step_time_ms": 3366.304874420166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:15] (step=0005099) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.09908666925767587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5100, "loss": 0.14988833665847778, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1856231689453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:19] (step=0005100) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.09910610182666149, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 5101, "loss": 0.30897676944732666, "memory_gb": 7.721559524536133, "step_time_ms": 3367.917776107788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:22] (step=0005101) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.0991255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5102, "loss": 0.18137602508068085, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5667819976807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:26] (step=0005102) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.09914496696463272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5103, "loss": 
0.20136910676956177, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8525009155273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:29] (step=0005103) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.09916439953361834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5104, "loss": 0.21854902803897858, "memory_gb": 7.721559524536133, "step_time_ms": 3361.330986022949, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:33] (step=0005104) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.09918383210260397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5105, "loss": 0.3164442777633667, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2051639556885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:37] (step=0005105) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.09920326467158959, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5106, "loss": 0.19190678000450134, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9912090301514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:40] (step=0005106) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.0992226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5107, "loss": 0.20384854078292847, "memory_gb": 7.721559524536133, "step_time_ms": 3363.696336746216, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:44] (step=0005107) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.09924212980956082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5108, "loss": 0.41249871253967285, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0829067230225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:48] (step=0005108) 
Train Loss: 0.2852, Train Steps/Sec: 0.28, Epoch: 0.09926156237854644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5109, "loss": 0.25492918491363525, "memory_gb": 7.721559524536133, "step_time_ms": 3395.1361179351807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:51] (step=0005109) Train Loss: 0.1890, Train Steps/Sec: 0.27, Epoch: 0.09928099494753206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5110, "loss": 0.18381989002227783, "memory_gb": 7.721559524536133, "step_time_ms": 3396.812915802002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:55] (step=0005110) Train Loss: 0.1763, Train Steps/Sec: 0.27, Epoch: 0.09930042751651769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:07:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5111, "loss": 0.1769176870584488, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7897968292236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:07:59] (step=0005111) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.09931986008550331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 5112, "loss": 0.2734246850013733, "memory_gb": 7.721559524536133, "step_time_ms": 3381.77490234375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:02] (step=0005112) Train Loss: 0.2649, Train Steps/Sec: 0.27, Epoch: 0.09933929265448892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5113, "loss": 0.27017155289649963, "memory_gb": 7.721559524536133, "step_time_ms": 3373.514413833618, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:06] (step=0005113) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.09935872522347454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:10] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 5114, "loss": 0.14871060848236084, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4705028533936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:10] (step=0005114) Train Loss: 0.2593, Train Steps/Sec: 0.28, Epoch: 0.09937815779246016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5115, "loss": 0.268766850233078, "memory_gb": 7.721559524536133, "step_time_ms": 3371.1864948272705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:13] (step=0005115) Train Loss: 0.2796, Train Steps/Sec: 0.28, Epoch: 0.09939759036144578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5116, "loss": 0.24796319007873535, "memory_gb": 7.721559524536133, "step_time_ms": 3370.656967163086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:17] (step=0005116) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.0994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5117, "loss": 0.2682674527168274, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4468994140625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:20] (step=0005117) Train Loss: 0.1955, Train Steps/Sec: 0.28, Epoch: 0.09943645549941703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5118, "loss": 0.26657921075820923, "memory_gb": 7.721559524536133, "step_time_ms": 3367.38920211792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:24] (step=0005118) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.09945588806840264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5119, "loss": 0.2630118727684021, "memory_gb": 7.721559524536133, "step_time_ms": 3367.405891418457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
05:08:28] (step=0005119) Train Loss: 0.2288, Train Steps/Sec: 0.28, Epoch: 0.09947532063738826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 5120, "loss": 0.23521234095096588, "memory_gb": 7.721559524536133, "step_time_ms": 3346.2607860565186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:31] (step=0005120) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.09949475320637388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5121, "loss": 0.2254902720451355, "memory_gb": 7.721559524536133, "step_time_ms": 3362.332820892334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:35] (step=0005121) Train Loss: 0.2221, Train Steps/Sec: 0.28, Epoch: 0.0995141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5122, "loss": 0.22209495306015015, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0834560394287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:39] (step=0005122) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.09953361834434513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 5123, "loss": 0.22571833431720734, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0052967071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:42] (step=0005123) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.09955305091333075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5124, "loss": 0.2939686179161072, "memory_gb": 7.721559524536133, "step_time_ms": 3366.377353668213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:46] (step=0005124) Train Loss: 0.2820, Train Steps/Sec: 0.26, Epoch: 0.09957248348231636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:50] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 5125, "loss": 0.1632383167743683, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1421279907227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:50] (step=0005125) Train Loss: 0.1740, Train Steps/Sec: 0.28, Epoch: 0.09959191605130198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5126, "loss": 0.23553971946239471, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3830242156982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:53] (step=0005126) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.0996113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:08:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5127, "loss": 0.16041205823421478, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3014430999756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:08:57] (step=0005127) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.09963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5128, "loss": 0.264479398727417, "memory_gb": 7.721559524536133, "step_time_ms": 3370.373010635376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:00] (step=0005128) Train Loss: 0.3183, Train Steps/Sec: 0.28, Epoch: 0.09965021375825885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5129, "loss": 0.22055740654468536, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7941093444824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:04] (step=0005129) Train Loss: 0.1946, Train Steps/Sec: 0.28, Epoch: 0.09966964632724447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5130, "loss": 0.1398206353187561, "memory_gb": 7.721559524536133, "step_time_ms": 3372.173070907593, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 05:09:08] (step=0005130) Train Loss: 0.1526, Train Steps/Sec: 0.28, Epoch: 0.09968907889623008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5131, "loss": 0.20288926362991333, "memory_gb": 7.721559524536133, "step_time_ms": 3371.870756149292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:11] (step=0005131) Train Loss: 0.2161, Train Steps/Sec: 0.28, Epoch: 0.0997085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5132, "loss": 0.27590423822402954, "memory_gb": 7.721559524536133, "step_time_ms": 3515.4130458831787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:15] (step=0005132) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.09972794403420132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5133, "loss": 0.19204461574554443, "memory_gb": 7.721559524536133, "step_time_ms": 3365.39626121521, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:19] (step=0005133) Train Loss: 0.2109, Train Steps/Sec: 0.28, Epoch: 0.09974737660318694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 5134, "loss": 0.22992193698883057, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6889457702637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:22] (step=0005134) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.09976680917217257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5135, "loss": 0.25408533215522766, "memory_gb": 7.721559524536133, "step_time_ms": 3366.910934448242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:26] (step=0005135) Train Loss: 0.2489, Train Steps/Sec: 0.28, Epoch: 0.09978624174115817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 05:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5136, "loss": 0.2593010663986206, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4514503479004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:30] (step=0005136) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.0998056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5137, "loss": 0.29191070795059204, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9284114837646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:33] (step=0005137) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.09982510687912942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5138, "loss": 0.2691429555416107, "memory_gb": 7.721559524536133, "step_time_ms": 3372.406482696533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:37] (step=0005138) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.09984453944811504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5139, "loss": 0.21599513292312622, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2421913146973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:40] (step=0005139) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.09986397201710066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5140, "loss": 0.27302035689353943, "memory_gb": 7.721559524536133, "step_time_ms": 3372.478485107422, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:44] (step=0005140) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.09988340458608629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5141, "loss": 0.14102813601493835, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2218322753906, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:48] (step=0005141) Train Loss: 0.2406, Train Steps/Sec: 0.27, Epoch: 0.0999028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5142, "loss": 0.27006995677948, "memory_gb": 7.721559524536133, "step_time_ms": 3372.659921646118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:51] (step=0005142) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.09992226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5143, "loss": 0.2367534041404724, "memory_gb": 7.721559524536133, "step_time_ms": 3371.110677719116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:55] (step=0005143) Train Loss: 0.2778, Train Steps/Sec: 0.28, Epoch: 0.09994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:09:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5144, "loss": 0.2299100160598755, "memory_gb": 7.721559524536133, "step_time_ms": 3367.509603500366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:09:59] (step=0005144) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.09996113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 5145, "loss": 0.17149876058101654, "memory_gb": 7.721559524536133, "step_time_ms": 3362.497329711914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:02] (step=0005145) Train Loss: 0.2374, Train Steps/Sec: 0.28, Epoch: 0.09998056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5146, "loss": 0.1938135027885437, "memory_gb": 7.721559524536133, "step_time_ms": 3368.351459503174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:06] (step=0005146) Train Loss: 0.1601, Train Steps/Sec: 0.28, Epoch: 0.1, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 05:10:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5147, "loss": 0.18606729805469513, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3041381835938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:10] (step=0005147) Train Loss: 0.1950, Train Steps/Sec: 0.28, Epoch: 0.10001943256898561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5148, "loss": 0.2179054319858551, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1062507629395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:13] (step=0005148) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.10003886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5149, "loss": 0.18843260407447815, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1976890563965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:17] (step=0005149) Train Loss: 0.2316, Train Steps/Sec: 0.28, Epoch: 0.10005829770695686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5150, "loss": 0.14903846383094788, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7430458068848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:20] (step=0005150) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.10007773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5151, "loss": 0.18522652983665466, "memory_gb": 7.721559524536133, "step_time_ms": 3360.57448387146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:24] (step=0005151) Train Loss: 0.1711, Train Steps/Sec: 0.28, Epoch: 0.1000971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5152, "loss": 0.28074944019317627, "memory_gb": 7.721559524536133, "step_time_ms": 
3366.8360710144043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:28] (step=0005152) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.10011659541391373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 5153, "loss": 0.20682597160339355, "memory_gb": 7.721559524536133, "step_time_ms": 3367.817163467407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:31] (step=0005153) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.10013602798289933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5154, "loss": 0.24866902828216553, "memory_gb": 7.721559524536133, "step_time_ms": 3365.84734916687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:35] (step=0005154) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.10015546055188496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5155, "loss": 0.3384591341018677, "memory_gb": 7.721559524536133, "step_time_ms": 3363.723039627075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:39] (step=0005155) Train Loss: 0.3041, Train Steps/Sec: 0.28, Epoch: 0.10017489312087058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 5156, "loss": 0.2402365356683731, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9377613067627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:42] (step=0005156) Train Loss: 0.2571, Train Steps/Sec: 0.28, Epoch: 0.1001943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5157, "loss": 0.3002757430076599, "memory_gb": 7.721559524536133, "step_time_ms": 3368.222236633301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:46] (step=0005157) Train Loss: 0.2897, Train Steps/Sec: 0.27, Epoch: 0.10021375825884182, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5158, "loss": 0.22224341332912445, "memory_gb": 7.721559524536133, "step_time_ms": 3367.95711517334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:50] (step=0005158) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.10023319082782745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5159, "loss": 0.29072219133377075, "memory_gb": 7.721559524536133, "step_time_ms": 3363.679885864258, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:53] (step=0005159) Train Loss: 0.2644, Train Steps/Sec: 0.28, Epoch: 0.10025262339681305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:10:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5160, "loss": 0.18168902397155762, "memory_gb": 7.721559524536133, "step_time_ms": 3367.366313934326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:10:57] (step=0005160) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.10027205596579868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5161, "loss": 0.24309740960597992, "memory_gb": 7.721559524536133, "step_time_ms": 3368.535280227661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:00] (step=0005161) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.1002914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5162, "loss": 0.2953259348869324, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5056228637695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:04] (step=0005162) Train Loss: 0.3120, Train Steps/Sec: 0.28, Epoch: 0.10031092110376992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5163, "loss": 0.16071249544620514, "memory_gb": 7.721559524536133, 
"step_time_ms": 3362.598180770874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:08] (step=0005163) Train Loss: 0.1861, Train Steps/Sec: 0.28, Epoch: 0.10033035367275554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5164, "loss": 0.1275734007358551, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8770065307617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:11] (step=0005164) Train Loss: 0.1489, Train Steps/Sec: 0.28, Epoch: 0.10034978624174115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5165, "loss": 0.2587790787220001, "memory_gb": 7.721559524536133, "step_time_ms": 3369.9843883514404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:15] (step=0005165) Train Loss: 0.3082, Train Steps/Sec: 0.28, Epoch: 0.10036921881072677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5166, "loss": 0.3260995149612427, "memory_gb": 7.721559524536133, "step_time_ms": 3366.910457611084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:19] (step=0005166) Train Loss: 0.2962, Train Steps/Sec: 0.28, Epoch: 0.1003886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 5167, "loss": 0.1791810691356659, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9772148132324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:22] (step=0005167) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.10040808394869802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5168, "loss": 0.2950459122657776, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1910037994385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:26] (step=0005168) Train Loss: 0.3209, Train Steps/Sec: 0.28, Epoch: 
0.10042751651768364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5169, "loss": 0.33696436882019043, "memory_gb": 7.721559524536133, "step_time_ms": 3371.9730377197266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:29] (step=0005169) Train Loss: 0.2775, Train Steps/Sec: 0.28, Epoch: 0.10044694908666926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5170, "loss": 0.33533918857574463, "memory_gb": 7.721559524536133, "step_time_ms": 3365.920305252075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:33] (step=0005170) Train Loss: 0.2968, Train Steps/Sec: 0.28, Epoch: 0.10046638165565487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5171, "loss": 0.2692028284072876, "memory_gb": 7.721559524536133, "step_time_ms": 3369.4007396698, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:37] (step=0005171) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.1004858142246405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5172, "loss": 0.11025208979845047, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9256439208984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:41] (step=0005172) Train Loss: 0.1732, Train Steps/Sec: 0.27, Epoch: 0.10050524679362612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5173, "loss": 0.23149165511131287, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6534423828125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:44] (step=0005173) Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.10052467936261174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5174, "loss": 0.19004344940185547, 
"memory_gb": 7.721559524536133, "step_time_ms": 3369.123935699463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:48] (step=0005174) Train Loss: 0.1646, Train Steps/Sec: 0.28, Epoch: 0.10054411193159736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5175, "loss": 0.17734721302986145, "memory_gb": 7.721559524536133, "step_time_ms": 3368.443489074707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:51] (step=0005175) Train Loss: 0.1728, Train Steps/Sec: 0.28, Epoch: 0.10056354450058298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5176, "loss": 0.2223799079656601, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0856227874756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:55] (step=0005176) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.10058297706956859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:11:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5177, "loss": 0.33658507466316223, "memory_gb": 7.721559524536133, "step_time_ms": 3369.276762008667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:11:59] (step=0005177) Train Loss: 0.3577, Train Steps/Sec: 0.28, Epoch: 0.10060240963855421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 5178, "loss": 0.2846772372722626, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0269718170166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:02] (step=0005178) Train Loss: 0.2960, Train Steps/Sec: 0.28, Epoch: 0.10062184220753984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5179, "loss": 0.20563122630119324, "memory_gb": 7.721559524536133, "step_time_ms": 3509.6452236175537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:06] (step=0005179) Train Loss: 0.2456, 
Train Steps/Sec: 0.28, Epoch: 0.10064127477652546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5180, "loss": 0.18406429886817932, "memory_gb": 7.721559524536133, "step_time_ms": 3366.522789001465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:10] (step=0005180) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.10066070734551108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5181, "loss": 0.3066211938858032, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8693771362305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:13] (step=0005181) Train Loss: 0.3129, Train Steps/Sec: 0.28, Epoch: 0.1006801399144967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5182, "loss": 0.24329009652137756, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9544925689697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:17] (step=0005182) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.10069957248348231, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5183, "loss": 0.27454376220703125, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8577461242676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:21] (step=0005183) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.10071900505246793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5184, "loss": 0.3071638345718384, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1701469421387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:24] (step=0005184) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.10073843762145356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5185, "loss": 
0.12208248674869537, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6249771118164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:28] (step=0005185) Train Loss: 0.1714, Train Steps/Sec: 0.28, Epoch: 0.10075787019043918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 5186, "loss": 0.24329161643981934, "memory_gb": 7.721559524536133, "step_time_ms": 3366.823434829712, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:31] (step=0005186) Train Loss: 0.2133, Train Steps/Sec: 0.28, Epoch: 0.1007773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5187, "loss": 0.2294805645942688, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0728931427, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:35] (step=0005187) Train Loss: 0.2554, Train Steps/Sec: 0.28, Epoch: 0.10079673532841042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5188, "loss": 0.21535301208496094, "memory_gb": 7.721559524536133, "step_time_ms": 3368.621587753296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:39] (step=0005188) Train Loss: 0.2771, Train Steps/Sec: 0.27, Epoch: 0.10081616789739603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 5189, "loss": 0.31778809428215027, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7119483947754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:42] (step=0005189) Train Loss: 0.2692, Train Steps/Sec: 0.28, Epoch: 0.10083560046638165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5190, "loss": 0.13238896429538727, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1491146087646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:46] (step=0005190) 
Train Loss: 0.1888, Train Steps/Sec: 0.28, Epoch: 0.10085503303536728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5191, "loss": 0.23353073000907898, "memory_gb": 7.721559524536133, "step_time_ms": 3368.654489517212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:50] (step=0005191) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.1008744656043529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5192, "loss": 0.2933686375617981, "memory_gb": 7.721559524536133, "step_time_ms": 3368.499755859375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:53] (step=0005192) Train Loss: 0.2723, Train Steps/Sec: 0.28, Epoch: 0.10089389817333852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:12:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5193, "loss": 0.3305717706680298, "memory_gb": 7.721559524536133, "step_time_ms": 3345.625877380371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:12:57] (step=0005193) Train Loss: 0.2992, Train Steps/Sec: 0.28, Epoch: 0.10091333074232414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5194, "loss": 0.30431532859802246, "memory_gb": 7.721559524536133, "step_time_ms": 3363.56782913208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:00] (step=0005194) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.10093276331130975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5195, "loss": 0.29331427812576294, "memory_gb": 7.721559524536133, "step_time_ms": 3367.971897125244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:04] (step=0005195) Train Loss: 0.3002, Train Steps/Sec: 0.28, Epoch: 0.10095219588029537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 
5196, "loss": 0.3763972818851471, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8195266723633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:08] (step=0005196) Train Loss: 0.3395, Train Steps/Sec: 0.28, Epoch: 0.100971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5197, "loss": 0.24732638895511627, "memory_gb": 7.721559524536133, "step_time_ms": 3370.844841003418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:11] (step=0005197) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.10099106101826662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5198, "loss": 0.1921929121017456, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3454780578613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:15] (step=0005198) Train Loss: 0.2051, Train Steps/Sec: 0.28, Epoch: 0.10101049358725224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5199, "loss": 0.3657938838005066, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8863563537598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:19] (step=0005199) Train Loss: 0.3344, Train Steps/Sec: 0.28, Epoch: 0.10102992615623785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 5200, "loss": 0.3248180150985718, "memory_gb": 7.721559524536133, "step_time_ms": 3366.452693939209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:22] (step=0005200) Train Loss: 0.3433, Train Steps/Sec: 0.28, Epoch: 0.10104935872522347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5201, "loss": 0.1728736162185669, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6959533691406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:26] 
(step=0005201) Train Loss: 0.1792, Train Steps/Sec: 0.28, Epoch: 0.10106879129420909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5202, "loss": 0.23381654918193817, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4935359954834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:30] (step=0005202) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.10108822386319471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5203, "loss": 0.25130513310432434, "memory_gb": 7.721559524536133, "step_time_ms": 3362.028121948242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:33] (step=0005203) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.10110765643218034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5204, "loss": 0.18635293841362, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6110553741455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:37] (step=0005204) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.10112708900116596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5205, "loss": 0.37321168184280396, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1059398651123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:40] (step=0005205) Train Loss: 0.3052, Train Steps/Sec: 0.28, Epoch: 0.10114652157015157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5206, "loss": 0.24511197209358215, "memory_gb": 7.721559524536133, "step_time_ms": 3363.788604736328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:44] (step=0005206) Train Loss: 0.2296, Train Steps/Sec: 0.28, Epoch: 0.10116595413913719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:48] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 5207, "loss": 0.26331013441085815, "memory_gb": 7.721559524536133, "step_time_ms": 3363.178491592407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:48] (step=0005207) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.10118538670812281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5208, "loss": 0.13501450419425964, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0406131744385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:51] (step=0005208) Train Loss: 0.1891, Train Steps/Sec: 0.28, Epoch: 0.10120481927710843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5209, "loss": 0.25004449486732483, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6205196380615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:55] (step=0005209) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.10122425184609406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:13:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5210, "loss": 0.20990347862243652, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0500564575195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:13:59] (step=0005210) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.10124368441507968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 5211, "loss": 0.17991018295288086, "memory_gb": 7.721559524536133, "step_time_ms": 3358.44087600708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:02] (step=0005211) Train Loss: 0.1777, Train Steps/Sec: 0.28, Epoch: 0.10126311698406529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5212, "loss": 0.2712957262992859, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1202449798584, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 05:14:06] (step=0005212) Train Loss: 0.2542, Train Steps/Sec: 0.27, Epoch: 0.10128254955305091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5213, "loss": 0.24743042886257172, "memory_gb": 7.721559524536133, "step_time_ms": 3358.396530151367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:09] (step=0005213) Train Loss: 0.2174, Train Steps/Sec: 0.28, Epoch: 0.10130198212203653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5214, "loss": 0.23333962261676788, "memory_gb": 7.721559524536133, "step_time_ms": 3359.537363052368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:13] (step=0005214) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.10132141469102215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5215, "loss": 0.1590700000524521, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6982021331787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:17] (step=0005215) Train Loss: 0.1728, Train Steps/Sec: 0.28, Epoch: 0.10134084726000778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5216, "loss": 0.19323569536209106, "memory_gb": 7.721559524536133, "step_time_ms": 3361.924648284912, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:20] (step=0005216) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.1013602798289934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5217, "loss": 0.25472497940063477, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5149002075195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:24] (step=0005217) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.10137971239797901, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:27] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 5218, "loss": 0.31732526421546936, "memory_gb": 7.721559524536133, "step_time_ms": 3364.886999130249, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:27] (step=0005218) Train Loss: 0.3026, Train Steps/Sec: 0.28, Epoch: 0.10139914496696463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 5219, "loss": 0.26393043994903564, "memory_gb": 7.721559524536133, "step_time_ms": 3364.055633544922, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:31] (step=0005219) Train Loss: 0.2965, Train Steps/Sec: 0.28, Epoch: 0.10141857753595025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5220, "loss": 0.22486010193824768, "memory_gb": 7.721559524536133, "step_time_ms": 3511.730909347534, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:35] (step=0005220) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.10143801010493587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5221, "loss": 0.19007116556167603, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0568981170654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:38] (step=0005221) Train Loss: 0.2011, Train Steps/Sec: 0.28, Epoch: 0.1014574426739215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 5222, "loss": 0.2763884365558624, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3734970092773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:42] (step=0005222) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.10147687524290712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5223, "loss": 0.2778598368167877, "memory_gb": 7.721559524536133, "step_time_ms": 3359.485149383545, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 05:14:46] (step=0005223) Train Loss: 0.2667, Train Steps/Sec: 0.28, Epoch: 0.10149630781189273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 5224, "loss": 0.2417692095041275, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1838817596436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:49] (step=0005224) Train Loss: 0.2662, Train Steps/Sec: 0.28, Epoch: 0.10151574038087835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5225, "loss": 0.2576919198036194, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0271892547607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:53] (step=0005225) Train Loss: 0.2791, Train Steps/Sec: 0.28, Epoch: 0.10153517294986397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:14:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5226, "loss": 0.2756425738334656, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7845726013184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:14:56] (step=0005226) Train Loss: 0.2967, Train Steps/Sec: 0.28, Epoch: 0.1015546055188496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:15:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5227, "loss": 0.2436847984790802, "memory_gb": 7.721559524536133, "step_time_ms": 3362.116575241089, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:15:00] (step=0005227) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.10157403808783522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:15:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5228, "loss": 0.24153247475624084, "memory_gb": 7.721559524536133, "step_time_ms": 3361.699104309082, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:15:04] (step=0005228) Train Loss: 0.2207, Train Steps/Sec: 0.28, Epoch: 0.10159347065682083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 05:15:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5229, "loss": 0.40084969997406006, "memory_gb": 7.721559524536133, "step_time_ms": 3361.513137817383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:07] (step=0005229) Train Loss: 0.3574, Train Steps/Sec: 0.28, Epoch: 0.10161290322580645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5230, "loss": 0.2958681881427765, "memory_gb": 7.721559524536133, "step_time_ms": 3366.617202758789, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:11] (step=0005230) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.10163233579479207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5231, "loss": 0.20661401748657227, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1460666656494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:14] (step=0005231) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.10165176836377769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5232, "loss": 0.2702552080154419, "memory_gb": 7.721559524536133, "step_time_ms": 3365.082263946533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:18] (step=0005232) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.10167120093276331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 5233, "loss": 0.28340715169906616, "memory_gb": 7.715639114379883, "step_time_ms": 3327.3065090179443, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:22] (step=0005233) Train Loss: 0.3204, Train Steps/Sec: 0.28, Epoch: 0.10169063350174894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5234, "loss": 0.3457227349281311, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8107776641846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:25] (step=0005234) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.10171006607073454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5235, "loss": 0.27573806047439575, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7254962921143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:29] (step=0005235) Train Loss: 0.3100, Train Steps/Sec: 0.28, Epoch: 0.10172949863972017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5236, "loss": 0.19553226232528687, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9137020111084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:32] (step=0005236) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.10174893120870579, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5237, "loss": 0.1851820945739746, "memory_gb": 7.721559524536133, "step_time_ms": 3362.131118774414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:36] (step=0005237) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.10176836377769141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5238, "loss": 0.16224738955497742, "memory_gb": 7.721559524536133, "step_time_ms": 3364.102363586426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:40] (step=0005238) Train Loss: 0.1976, Train Steps/Sec: 0.28, Epoch: 0.10178779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5239, "loss": 0.20226135849952698, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5569553375244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:43] (step=0005239) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.10180722891566266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5240, "loss": 0.2991126775741577, "memory_gb": 7.721559524536133, "step_time_ms": 3370.408535003662, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:47] (step=0005240) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.10182666148464826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5241, "loss": 0.2313375025987625, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9683723449707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:51] (step=0005241) Train Loss: 0.2262, Train Steps/Sec: 0.28, Epoch: 0.10184609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5242, "loss": 0.159765362739563, "memory_gb": 7.721559524536133, "step_time_ms": 3366.90092086792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:54] (step=0005242) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.10186552662261951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:15:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5243, "loss": 0.22294707596302032, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1602993011475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:15:58] (step=0005243) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.10188495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5244, "loss": 0.2442278265953064, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2573776245117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:01] (step=0005244) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.10190439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5245, "loss": 0.28814437985420227, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6274337768555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:05] (step=0005245) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.10192382432957638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5246, "loss": 0.14346641302108765, "memory_gb": 7.721559524536133, "step_time_ms": 3374.5293617248535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:09] (step=0005246) Train Loss: 0.1624, Train Steps/Sec: 0.28, Epoch: 0.10194325689856198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5247, "loss": 0.21166905760765076, "memory_gb": 7.721559524536133, "step_time_ms": 3374.030113220215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:12] (step=0005247) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.10196268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5248, "loss": 0.20222169160842896, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5895671844482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:16] (step=0005248) Train Loss: 0.2445, Train Steps/Sec: 0.28, Epoch: 0.10198212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5249, "loss": 0.2411971539258957, "memory_gb": 7.721559524536133, "step_time_ms": 3368.849277496338, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:20] (step=0005249) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.10200155460551885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5250, "loss": 0.31349462270736694, "memory_gb": 7.721559524536133, "step_time_ms": 3376.5993118286133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:23] (step=0005250) Train Loss: 0.2858, Train Steps/Sec: 0.28, Epoch: 0.10202098717450447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5251, "loss": 0.32583558559417725, "memory_gb": 7.721559524536133, "step_time_ms": 3376.0859966278076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:27] (step=0005251) Train Loss: 0.2722, Train Steps/Sec: 0.28, Epoch: 0.1020404197434901, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5252, "loss": 0.22796402871608734, "memory_gb": 7.721559524536133, "step_time_ms": 3372.2822666168213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:30] (step=0005252) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.1020598523124757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5253, "loss": 0.2599824368953705, "memory_gb": 7.721559524536133, "step_time_ms": 3379.0786266326904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:34] (step=0005253) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.10207928488146133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5254, "loss": 0.22914792597293854, "memory_gb": 7.721559524536133, "step_time_ms": 3378.934621810913, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:38] (step=0005254) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.10209871745044695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5255, "loss": 0.359561562538147, "memory_gb": 7.721559524536133, "step_time_ms": 3365.654945373535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:41] (step=0005255) Train Loss: 0.3250, Train Steps/Sec: 0.28, Epoch: 0.10211815001943257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5256, "loss": 0.20285239815711975, "memory_gb": 7.721559524536133, "step_time_ms": 3381.603717803955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:45] (step=0005256) Train Loss: 0.1977, Train Steps/Sec: 0.28, Epoch: 0.1021375825884182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 5257, "loss": 0.2633807063102722, "memory_gb": 7.715639114379883, "step_time_ms": 3355.144739151001, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:49] (step=0005257) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.1021570151574038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5258, "loss": 0.334197461605072, "memory_gb": 7.721559524536133, "step_time_ms": 3383.5391998291016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:52] (step=0005258) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.10217644772638942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:16:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5259, "loss": 0.2425345778465271, "memory_gb": 7.721559524536133, "step_time_ms": 3382.7197551727295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:16:56] (step=0005259) Train Loss: 0.2281, Train Steps/Sec: 0.26, Epoch: 0.10219588029537505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5260, "loss": 0.1250397264957428, "memory_gb": 7.721559524536133, "step_time_ms": 3378.573417663574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:00] (step=0005260) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.10221531286436067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5261, "loss": 0.11771728098392487, "memory_gb": 7.721559524536133, "step_time_ms": 3379.187822341919, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:03] (step=0005261) Train Loss: 0.1483, Train Steps/Sec: 0.28, Epoch: 0.10223474543334629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5262, "loss": 0.1884247362613678, "memory_gb": 7.721559524536133, "step_time_ms": 3381.805658340454, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:07] (step=0005262) Train Loss: 0.1940, Train Steps/Sec: 0.28, Epoch: 0.10225417800233191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5263, "loss": 0.24070331454277039, "memory_gb": 7.721559524536133, "step_time_ms": 3385.1492404937744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:11] (step=0005263) Train Loss: 0.2368, Train Steps/Sec: 0.27, Epoch: 0.10227361057131752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5264, "loss": 0.1930660605430603, "memory_gb": 7.721559524536133, "step_time_ms": 3375.9915828704834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:14] (step=0005264) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.10229304314030314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5265, "loss": 0.31741660833358765, "memory_gb": 7.721559524536133, "step_time_ms": 3378.2567977905273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:18] (step=0005265) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.10231247570928877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5266, "loss": 0.23341596126556396, "memory_gb": 7.721559524536133, "step_time_ms": 3380.8674812316895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:21] (step=0005266) Train Loss: 0.1882, Train Steps/Sec: 0.28, Epoch: 0.10233190827827439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5267, "loss": 0.25153547525405884, "memory_gb": 7.721559524536133, "step_time_ms": 3383.293628692627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:25] (step=0005267) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.10235134084726001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5268, "loss": 0.2766263484954834, "memory_gb": 7.721559524536133, "step_time_ms": 3517.3420906066895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:29] (step=0005268) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.10237077341624563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5269, "loss": 0.26991182565689087, "memory_gb": 7.721559524536133, "step_time_ms": 3387.049436569214, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:32] (step=0005269) Train Loss: 0.2475, Train Steps/Sec: 0.27, Epoch: 0.10239020598523124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5270, "loss": 0.2594338655471802, "memory_gb": 7.721559524536133, "step_time_ms": 3385.3824138641357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:36] (step=0005270) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.10240963855421686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5271, "loss": 0.24645107984542847, "memory_gb": 7.721559524536133, "step_time_ms": 3384.587287902832, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:40] (step=0005271) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.10242907112320249, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5272, "loss": 0.2534545958042145, "memory_gb": 7.721559524536133, "step_time_ms": 3383.906841278076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:43] (step=0005272) Train Loss: 0.2407, Train Steps/Sec: 0.27, Epoch: 0.10244850369218811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5273, "loss": 0.3032216429710388, "memory_gb": 7.721559524536133, "step_time_ms": 3383.937120437622, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:47] (step=0005273) Train Loss: 0.2289, Train Steps/Sec: 0.28, Epoch: 0.10246793626117373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 5274, "loss": 0.19884942471981049, "memory_gb": 7.721559524536133, "step_time_ms": 3377.7620792388916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:51] (step=0005274) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.10248736883015935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5275, "loss": 0.21042566001415253, "memory_gb": 7.721559524536133, "step_time_ms": 3383.6586475372314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:54] (step=0005275) Train Loss: 0.2306, Train Steps/Sec: 0.27, Epoch: 0.10250680139914496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:17:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5276, "loss": 0.28529563546180725, "memory_gb": 7.721559524536133, "step_time_ms": 3377.190113067627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:17:58] (step=0005276) Train Loss: 0.3031, Train Steps/Sec: 0.28, Epoch: 0.10252623396813058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5277, "loss": 0.2443368136882782, "memory_gb": 7.721559524536133, "step_time_ms": 3382.9166889190674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:01] (step=0005277) Train Loss: 0.2769, Train Steps/Sec: 0.27, Epoch: 0.1025456665371162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5278, "loss": 0.20532268285751343, "memory_gb": 7.721559524536133, "step_time_ms": 3381.8726539611816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:05] (step=0005278) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.10256509910610183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5279, "loss": 0.1921515166759491, "memory_gb": 7.721559524536133, "step_time_ms": 3374.950408935547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:09] (step=0005279) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.10258453167508745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5280, "loss": 0.27423134446144104, "memory_gb": 7.715639114379883, "step_time_ms": 3350.508689880371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:12] (step=0005280) Train Loss: 0.2471, Train Steps/Sec: 0.28, Epoch: 0.10260396424407307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5281, "loss": 0.23805978894233704, "memory_gb": 7.721559524536133, "step_time_ms": 3383.0997943878174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:16] (step=0005281) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.10262339681305868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5282, "loss": 0.31129372119903564, "memory_gb": 7.721559524536133, "step_time_ms": 3377.6068687438965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:20] (step=0005282) Train Loss: 0.3237, Train Steps/Sec: 0.28, Epoch: 0.1026428293820443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5283, "loss": 0.37674587965011597, "memory_gb": 7.721559524536133, "step_time_ms": 3375.7405281066895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:23] (step=0005283) Train Loss: 0.3346, Train Steps/Sec: 0.28, Epoch: 0.10266226195102993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5284, "loss": 0.2875227928161621, "memory_gb": 7.721559524536133, "step_time_ms": 3372.152328491211, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:27] (step=0005284) Train Loss: 0.2527, Train Steps/Sec: 0.28, Epoch: 0.10268169452001555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 5285, "loss": 0.16554006934165955, "memory_gb": 7.721559524536133, "step_time_ms": 3374.842882156372, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:31] (step=0005285) Train Loss: 0.1573, Train Steps/Sec: 0.28, Epoch: 0.10270112708900117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5286, "loss": 0.2629818916320801, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5288639068604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:34] (step=0005286) Train Loss: 0.2715, Train Steps/Sec: 0.28, Epoch: 0.10272055965798678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5287, "loss": 0.19127705693244934, "memory_gb": 7.721559524536133, "step_time_ms": 3373.8138675689697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:38] (step=0005287) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.1027399922269724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5288, "loss": 0.21959134936332703, "memory_gb": 7.721559524536133, "step_time_ms": 3371.770143508911, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:41] (step=0005288) Train Loss: 0.2125, Train Steps/Sec: 0.28, Epoch: 0.10275942479595802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5289, "loss": 0.2938977777957916, "memory_gb": 7.721559524536133, "step_time_ms": 3376.5501976013184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:45] (step=0005289) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.10277885736494365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 5290, "loss": 0.28164762258529663, "memory_gb": 7.721559524536133, "step_time_ms": 3370.701789855957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:49] (step=0005290) Train Loss: 0.2876, Train Steps/Sec: 0.28, Epoch: 0.10279828993392927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5291, "loss": 0.25464245676994324, "memory_gb": 7.721559524536133, "step_time_ms": 3371.532678604126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:52] (step=0005291) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.10281772250291489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:18:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5292, "loss": 0.16172301769256592, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4839782714844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:18:56] (step=0005292) Train Loss: 0.1695, Train Steps/Sec: 0.28, Epoch: 0.1028371550719005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 5293, "loss": 0.20385605096817017, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0264015197754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:00] (step=0005293) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.10285658764088612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5294, "loss": 0.18510374426841736, "memory_gb": 7.721559524536133, "step_time_ms": 3367.666244506836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:03] (step=0005294) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.10287602020987174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5295, "loss": 0.20656245946884155, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1095581054688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:07] (step=0005295) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.10289545277885737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 5296, "loss": 0.1556856632232666, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0755157470703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:11] (step=0005296) Train Loss: 0.1549, Train Steps/Sec: 0.28, Epoch: 0.10291488534784299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5297, "loss": 0.13085396587848663, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9420738220215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:14] (step=0005297) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.10293431791682861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5298, "loss": 0.18179918825626373, "memory_gb": 7.721559524536133, "step_time_ms": 3366.086006164551, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:18] (step=0005298) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.10295375048581422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5299, "loss": 0.15331849455833435, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8688583374023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:21] (step=0005299) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.10297318305479984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5300, "loss": 0.24326440691947937, "memory_gb": 7.721559524536133, "step_time_ms": 3346.4033603668213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:25] (step=0005300) Train Loss: 0.2791, Train Steps/Sec: 0.27, Epoch: 0.10299261562378546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5301, "loss": 0.20531730353832245, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8500232696533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:29] (step=0005301) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.10301204819277109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5302, "loss": 0.24579200148582458, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2868061065674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:32] (step=0005302) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.10303148076175671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5303, "loss": 0.3336105942726135, "memory_gb": 7.721559524536133, "step_time_ms": 3367.049217224121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:36] (step=0005303) Train Loss: 0.3108, Train Steps/Sec: 0.28, Epoch: 0.10305091333074233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5304, "loss": 0.29411181807518005, "memory_gb": 7.721559524536133, "step_time_ms": 3366.065740585327, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:40] (step=0005304) Train Loss: 0.2551, Train Steps/Sec: 0.28, Epoch: 0.10307034589972794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5305, "loss": 0.27341923117637634, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9228343963623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:43] (step=0005305) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.10308977846871356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5306, "loss": 0.21564209461212158, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0436611175537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:47] (step=0005306) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.10310921103769918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5307, "loss": 0.33422034978866577, "memory_gb": 7.721559524536133, "step_time_ms": 3368.596315383911, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:50] (step=0005307) Train Loss: 0.3050, Train Steps/Sec: 0.28, Epoch: 0.1031286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5308, "loss": 0.2855478525161743, "memory_gb": 7.721559524536133, "step_time_ms": 3511.415958404541, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:54] (step=0005308) Train Loss: 0.2592, Train Steps/Sec: 0.28, Epoch: 0.10314807617567043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:19:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5309, "loss": 0.26728737354278564, "memory_gb": 7.721559524536133, "step_time_ms": 3363.539457321167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:19:58] (step=0005309) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.10316750874465605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5310, "loss": 0.18462145328521729, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9607639312744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:01] (step=0005310) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.10318694131364166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5311, "loss": 0.155053973197937, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8511638641357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:05] (step=0005311) Train Loss: 0.1633, Train Steps/Sec: 0.28, Epoch: 0.10320637388262728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5312, "loss": 0.3239196240901947, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8165607452393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:09] (step=0005312) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.1032258064516129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5313, "loss": 0.30655407905578613, "memory_gb": 7.721559524536133, "step_time_ms": 3364.957809448242, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:12] (step=0005313) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.10324523902059853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5314, "loss": 0.23841574788093567, "memory_gb": 7.721559524536133, "step_time_ms": 3364.56036567688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:16] (step=0005314) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.10326467158958415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5315, "loss": 0.2770654261112213, "memory_gb": 7.721559524536133, "step_time_ms": 3367.457866668701, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:19] (step=0005315) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.10328410415856976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5316, "loss": 0.26568901538848877, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8200550079346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:23] (step=0005316) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.10330353672755538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5317, "loss": 0.18417774140834808, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9209995269775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:27] (step=0005317) Train Loss: 0.1671, Train Steps/Sec: 0.28, Epoch: 0.103322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5318, "loss": 0.256614089012146, "memory_gb": 7.721559524536133, "step_time_ms": 3363.826274871826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:30] (step=0005318) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.10334240186552662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5319, "loss": 0.2173808217048645, "memory_gb": 7.721559524536133, "step_time_ms": 3369.208812713623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:34] (step=0005319) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.10336183443451225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5320, "loss": 0.2964719235897064, "memory_gb": 7.721559524536133, "step_time_ms": 3367.208242416382, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:38] (step=0005320) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.10338126700349787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5321, "loss": 0.2620737850666046, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4748668670654, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:41] (step=0005321) Train Loss: 0.2754, Train Steps/Sec: 0.28, Epoch: 0.10340069957248348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5322, "loss": 0.26825305819511414, "memory_gb": 7.721559524536133, "step_time_ms": 3371.1750507354736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:45] (step=0005322) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.1034201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5323, "loss": 0.1478608250617981, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7010021209717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:48] (step=0005323) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.10343956471045472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5324, "loss": 0.16565585136413574, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7371788024902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:52] (step=0005324) Train Loss: 0.1938, Train Steps/Sec: 0.28, Epoch: 0.10345899727944034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5325, "loss": 0.2414713352918625, "memory_gb": 7.721559524536133, "step_time_ms": 3364.47811126709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:56] (step=0005325) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.10347842984842597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:20:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5326, "loss": 0.24583065509796143, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8652305603027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:20:59] (step=0005326) Train Loss: 0.2059, Train Steps/Sec: 0.28, Epoch: 0.10349786241741159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5327, "loss": 0.25265824794769287, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0034198760986, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:03] (step=0005327) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.1035172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5328, "loss": 0.1854875683784485, "memory_gb": 7.721559524536133, "step_time_ms": 3367.427349090576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:07] (step=0005328) Train Loss: 0.2670, Train Steps/Sec: 0.28, Epoch: 0.10353672755538282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5329, "loss": 0.2576773166656494, "memory_gb": 7.721559524536133, "step_time_ms": 3365.537643432617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:10] (step=0005329) Train Loss: 0.2389, Train Steps/Sec: 0.28, Epoch: 0.10355616012436844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5330, "loss": 0.22887936234474182, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5623455047607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:14] (step=0005330) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.10357559269335406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5331, "loss": 0.22485363483428955, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1748847961426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:18] (step=0005331) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.10359502526233969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5332, "loss": 0.24622057378292084, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8741035461426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:21] (step=0005332) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.10361445783132531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5333, "loss": 0.23205749690532684, "memory_gb": 7.721559524536133, "step_time_ms": 3366.215705871582, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:25] (step=0005333) Train Loss: 0.2880, Train Steps/Sec: 0.28, Epoch: 0.10363389040031092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5334, "loss": 0.24112817645072937, "memory_gb": 7.721559524536133, "step_time_ms": 3369.4167137145996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:28] (step=0005334) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.10365332296929654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5335, "loss": 0.24649834632873535, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2942390441895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:32] (step=0005335) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.10367275553828216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5336, "loss": 0.2883073091506958, "memory_gb": 7.721559524536133, "step_time_ms": 3366.410255432129, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:36] (step=0005336) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.10369218810726778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5337, "loss": 0.19794140756130219, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8808937072754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:39] (step=0005337) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.1037116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5338, "loss": 0.235444575548172, "memory_gb": 7.721559524536133, "step_time_ms": 3367.164134979248, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:43] (step=0005338) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.10373105324523903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5339, "loss": 0.24062508344650269, "memory_gb": 7.721559524536133, "step_time_ms": 3372.8950023651123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:47] (step=0005339) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.10375048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5340, "loss": 0.275736927986145, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4184551239014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:50] (step=0005340) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.10376991838321026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5341, "loss": 0.24639829993247986, "memory_gb": 7.721559524536133, "step_time_ms": 3372.7259635925293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:54] (step=0005341) Train Loss: 0.1842, Train Steps/Sec: 0.28, Epoch: 0.10378935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:21:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5342, "loss": 0.25532570481300354, "memory_gb": 7.721559524536133, "step_time_ms": 3368.227481842041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:21:57] (step=0005342) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.1038087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:22:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5343, "loss": 0.275300532579422, "memory_gb": 7.721559524536133, "step_time_ms": 3377.774238586426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:22:01] (step=0005343) Train Loss: 0.2414, Train
Steps/Sec: 0.27, Epoch: 0.10382821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5344, "loss": 0.2638220191001892, "memory_gb": 7.721559524536133, "step_time_ms": 3370.300054550171, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:05] (step=0005344) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.10384764865915273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5345, "loss": 0.19962404668331146, "memory_gb": 7.721559524536133, "step_time_ms": 3368.647336959839, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:08] (step=0005345) Train Loss: 0.2034, Train Steps/Sec: 0.28, Epoch: 0.10386708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5346, "loss": 0.186461940407753, "memory_gb": 7.721559524536133, "step_time_ms": 3372.472047805786, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:12] (step=0005346) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.10388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5347, "loss": 0.2397485375404358, "memory_gb": 7.721559524536133, "step_time_ms": 3371.584177017212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:16] (step=0005347) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.1039059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5348, "loss": 0.3599085509777069, "memory_gb": 7.715639114379883, "step_time_ms": 3353.1200885772705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:19] (step=0005348) Train Loss: 0.2768, Train Steps/Sec: 0.26, Epoch: 0.10392537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5349, "loss": 
0.27762138843536377, "memory_gb": 7.721559524536133, "step_time_ms": 3367.792844772339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:23] (step=0005349) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.10394481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5350, "loss": 0.2583597004413605, "memory_gb": 7.721559524536133, "step_time_ms": 3371.104955673218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:27] (step=0005350) Train Loss: 0.2928, Train Steps/Sec: 0.28, Epoch: 0.10396424407306645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5351, "loss": 0.2295164167881012, "memory_gb": 7.721559524536133, "step_time_ms": 3369.154453277588, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:30] (step=0005351) Train Loss: 0.2555, Train Steps/Sec: 0.27, Epoch: 0.10398367664205208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5352, "loss": 0.18745329976081848, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6089515686035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:34] (step=0005352) Train Loss: 0.1787, Train Steps/Sec: 0.28, Epoch: 0.1040031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5353, "loss": 0.1503879427909851, "memory_gb": 7.721559524536133, "step_time_ms": 3368.976354598999, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:38] (step=0005353) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.10402254178002332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5354, "loss": 0.2717442214488983, "memory_gb": 7.721559524536133, "step_time_ms": 3372.676372528076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:41] (step=0005354) Train 
Loss: 0.2288, Train Steps/Sec: 0.28, Epoch: 0.10404197434900894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5355, "loss": 0.29934900999069214, "memory_gb": 7.721559524536133, "step_time_ms": 3520.2696323394775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:45] (step=0005355) Train Loss: 0.2932, Train Steps/Sec: 0.28, Epoch: 0.10406140691799456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5356, "loss": 0.29725757241249084, "memory_gb": 7.721559524536133, "step_time_ms": 3376.934766769409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:48] (step=0005356) Train Loss: 0.2818, Train Steps/Sec: 0.28, Epoch: 0.10408083948698017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5357, "loss": 0.34005624055862427, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0028800964355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:52] (step=0005357) Train Loss: 0.3067, Train Steps/Sec: 0.28, Epoch: 0.1041002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5358, "loss": 0.32934603095054626, "memory_gb": 7.721559524536133, "step_time_ms": 3380.3224563598633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:56] (step=0005358) Train Loss: 0.3157, Train Steps/Sec: 0.28, Epoch: 0.10411970462495142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:22:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5359, "loss": 0.19632911682128906, "memory_gb": 7.721559524536133, "step_time_ms": 3382.075071334839, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:22:59] (step=0005359) Train Loss: 0.1783, Train Steps/Sec: 0.28, Epoch: 0.10413913719393704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 
5360, "loss": 0.20431174337863922, "memory_gb": 7.721559524536133, "step_time_ms": 3374.885320663452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:03] (step=0005360) Train Loss: 0.2452, Train Steps/Sec: 0.28, Epoch: 0.10415856976292266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5361, "loss": 0.2864285111427307, "memory_gb": 7.721559524536133, "step_time_ms": 3378.6840438842773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:07] (step=0005361) Train Loss: 0.2960, Train Steps/Sec: 0.28, Epoch: 0.10417800233190828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5362, "loss": 0.1766381859779358, "memory_gb": 7.721559524536133, "step_time_ms": 3375.3795623779297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:10] (step=0005362) Train Loss: 0.2944, Train Steps/Sec: 0.28, Epoch: 0.10419743490089389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5363, "loss": 0.15754801034927368, "memory_gb": 7.721559524536133, "step_time_ms": 3377.5060176849365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:14] (step=0005363) Train Loss: 0.2053, Train Steps/Sec: 0.28, Epoch: 0.10421686746987951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5364, "loss": 0.25322282314300537, "memory_gb": 7.721559524536133, "step_time_ms": 3373.5034465789795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:18] (step=0005364) Train Loss: 0.2174, Train Steps/Sec: 0.28, Epoch: 0.10423630003886514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5365, "loss": 0.2021368443965912, "memory_gb": 7.721559524536133, "step_time_ms": 3374.8974800109863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:21] 
(step=0005365) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.10425573260785076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5366, "loss": 0.20156992971897125, "memory_gb": 7.721559524536133, "step_time_ms": 3369.811534881592, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:25] (step=0005366) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.10427516517683638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5367, "loss": 0.17698697745800018, "memory_gb": 7.721559524536133, "step_time_ms": 3357.163906097412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:28] (step=0005367) Train Loss: 0.2051, Train Steps/Sec: 0.28, Epoch: 0.104294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5368, "loss": 0.2887357473373413, "memory_gb": 7.721559524536133, "step_time_ms": 3378.1402111053467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:32] (step=0005368) Train Loss: 0.3126, Train Steps/Sec: 0.28, Epoch: 0.10431403031480761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5369, "loss": 0.3263985514640808, "memory_gb": 7.721559524536133, "step_time_ms": 3376.40643119812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:36] (step=0005369) Train Loss: 0.2989, Train Steps/Sec: 0.28, Epoch: 0.10433346288379323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5370, "loss": 0.3636062741279602, "memory_gb": 7.721559524536133, "step_time_ms": 3383.636474609375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:39] (step=0005370) Train Loss: 0.3276, Train Steps/Sec: 0.27, Epoch: 0.10435289545277886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:43] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 5371, "loss": 0.2244994342327118, "memory_gb": 7.721559524536133, "step_time_ms": 3381.6051483154297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:43] (step=0005371) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.10437232802176448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5372, "loss": 0.30375412106513977, "memory_gb": 7.721559524536133, "step_time_ms": 3382.805585861206, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:47] (step=0005372) Train Loss: 0.2902, Train Steps/Sec: 0.28, Epoch: 0.1043917605907501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5373, "loss": 0.2906946539878845, "memory_gb": 7.721559524536133, "step_time_ms": 3383.2998275756836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:50] (step=0005373) Train Loss: 0.2607, Train Steps/Sec: 0.28, Epoch: 0.10441119315973571, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5374, "loss": 0.3284130394458771, "memory_gb": 7.721559524536133, "step_time_ms": 3379.258155822754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:54] (step=0005374) Train Loss: 0.2984, Train Steps/Sec: 0.28, Epoch: 0.10443062572872133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:23:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5375, "loss": 0.2673877477645874, "memory_gb": 7.721559524536133, "step_time_ms": 3377.133369445801, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:23:57] (step=0005375) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.10445005829770695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5376, "loss": 0.22044654190540314, "memory_gb": 7.721559524536133, "step_time_ms": 3378.7403106689453, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 05:24:01] (step=0005376) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.10446949086669258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5377, "loss": 0.17210936546325684, "memory_gb": 7.721559524536133, "step_time_ms": 3373.371362686157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:05] (step=0005377) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.1044889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5378, "loss": 0.16415837407112122, "memory_gb": 7.721559524536133, "step_time_ms": 3381.4284801483154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:08] (step=0005378) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.10450835600466382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5379, "loss": 0.33589354157447815, "memory_gb": 7.721559524536133, "step_time_ms": 3381.5975189208984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:12] (step=0005379) Train Loss: 0.3220, Train Steps/Sec: 0.28, Epoch: 0.10452778857364943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5380, "loss": 0.3751217722892761, "memory_gb": 7.721559524536133, "step_time_ms": 3379.426956176758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:16] (step=0005380) Train Loss: 0.3144, Train Steps/Sec: 0.28, Epoch: 0.10454722114263505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5381, "loss": 0.19029882550239563, "memory_gb": 7.721559524536133, "step_time_ms": 3372.3063468933105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:19] (step=0005381) Train Loss: 0.2695, Train Steps/Sec: 0.28, Epoch: 0.10456665371162067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
05:24:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5382, "loss": 0.21066050231456757, "memory_gb": 7.721559524536133, "step_time_ms": 3381.6943168640137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:23] (step=0005382) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.1045860862806063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5383, "loss": 0.29369333386421204, "memory_gb": 7.721559524536133, "step_time_ms": 3375.5083084106445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:27] (step=0005383) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.10460551884959192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5384, "loss": 0.17204387485980988, "memory_gb": 7.721559524536133, "step_time_ms": 3373.447895050049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:30] (step=0005384) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.10462495141857754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5385, "loss": 0.21280798316001892, "memory_gb": 7.721559524536133, "step_time_ms": 3375.4146099090576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:34] (step=0005385) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.10464438398756315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5386, "loss": 0.17721670866012573, "memory_gb": 7.721559524536133, "step_time_ms": 3370.5506324768066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:37] (step=0005386) Train Loss: 0.1814, Train Steps/Sec: 0.28, Epoch: 0.10466381655654877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5387, "loss": 0.18524116277694702, "memory_gb": 7.721559524536133, "step_time_ms": 3373.711347579956, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:41] (step=0005387) Train Loss: 0.1656, Train Steps/Sec: 0.28, Epoch: 0.1046832491255344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5388, "loss": 0.18709373474121094, "memory_gb": 7.721559524536133, "step_time_ms": 3373.3112812042236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:45] (step=0005388) Train Loss: 0.2299, Train Steps/Sec: 0.27, Epoch: 0.10470268169452002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5389, "loss": 0.294572114944458, "memory_gb": 7.721559524536133, "step_time_ms": 3374.0265369415283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:48] (step=0005389) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 0.10472211426350564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5390, "loss": 0.2896232604980469, "memory_gb": 7.721559524536133, "step_time_ms": 3370.090961456299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:52] (step=0005390) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.10474154683249126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5391, "loss": 0.29792892932891846, "memory_gb": 7.721559524536133, "step_time_ms": 3375.3762245178223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:56] (step=0005391) Train Loss: 0.2829, Train Steps/Sec: 0.28, Epoch: 0.10476097940147687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:24:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5392, "loss": 0.1613529920578003, "memory_gb": 7.721559524536133, "step_time_ms": 3368.619918823242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:24:59] (step=0005392) Train Loss: 0.1886, Train Steps/Sec: 0.28, Epoch: 0.10478041197046249, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 05:25:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5393, "loss": 0.14717473089694977, "memory_gb": 7.721559524536133, "step_time_ms": 3372.912883758545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:03] (step=0005393) Train Loss: 0.1792, Train Steps/Sec: 0.28, Epoch: 0.10479984453944811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5394, "loss": 0.3748796284198761, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1207637786865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:07] (step=0005394) Train Loss: 0.3039, Train Steps/Sec: 0.28, Epoch: 0.10481927710843374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5395, "loss": 0.158977672457695, "memory_gb": 7.721559524536133, "step_time_ms": 3369.011402130127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:10] (step=0005395) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.10483870967741936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5396, "loss": 0.1425754278898239, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4722442626953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:14] (step=0005396) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.10485814224640498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5397, "loss": 0.19519701600074768, "memory_gb": 7.721559524536133, "step_time_ms": 3363.272190093994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:17] (step=0005397) Train Loss: 0.1921, Train Steps/Sec: 0.28, Epoch: 0.10487757481539059, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5398, "loss": 0.17616905272006989, "memory_gb": 7.721559524536133, "step_time_ms": 
3369.0381050109863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:21] (step=0005398) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.10489700738437621, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5399, "loss": 0.1740005910396576, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6259727478027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:25] (step=0005399) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.10491643995336183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5400, "loss": 0.1678900122642517, "memory_gb": 7.721559524536133, "step_time_ms": 3370.682954788208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:28] (step=0005400) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.10493587252234746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5401, "loss": 0.28302323818206787, "memory_gb": 7.721559524536133, "step_time_ms": 3369.1556453704834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:32] (step=0005401) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.10495530509133308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5402, "loss": 0.16215001046657562, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7431812286377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:36] (step=0005402) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.1049747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5403, "loss": 0.28918519616127014, "memory_gb": 7.721559524536133, "step_time_ms": 3519.9670791625977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:39] (step=0005403) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 
0.10499417022930431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5404, "loss": 0.10460422188043594, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5788383483887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:43] (step=0005404) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.10501360279828993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5405, "loss": 0.30018699169158936, "memory_gb": 7.721559524536133, "step_time_ms": 3371.872663497925, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:47] (step=0005405) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.10503303536727555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5406, "loss": 0.1536804735660553, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4469203948975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:50] (step=0005406) Train Loss: 0.2105, Train Steps/Sec: 0.28, Epoch: 0.10505246793626118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5407, "loss": 0.22934788465499878, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9962158203125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:54] (step=0005407) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.1050719005052468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:25:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5408, "loss": 0.25484123826026917, "memory_gb": 7.721559524536133, "step_time_ms": 3374.267101287842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:25:57] (step=0005408) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.10509133307423241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5409, "loss": 0.1501920074224472, 
"memory_gb": 7.721559524536133, "step_time_ms": 3373.196840286255, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:01] (step=0005409) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.10511076564321803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5410, "loss": 0.1548619270324707, "memory_gb": 7.721559524536133, "step_time_ms": 3371.596336364746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:05] (step=0005410) Train Loss: 0.1496, Train Steps/Sec: 0.28, Epoch: 0.10513019821220365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5411, "loss": 0.3221951127052307, "memory_gb": 7.721559524536133, "step_time_ms": 3372.204542160034, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:08] (step=0005411) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.10514963078118927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5412, "loss": 0.3061332702636719, "memory_gb": 7.721559524536133, "step_time_ms": 3369.246006011963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:12] (step=0005412) Train Loss: 0.2852, Train Steps/Sec: 0.28, Epoch: 0.1051690633501749, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5413, "loss": 0.11326495558023453, "memory_gb": 7.721559524536133, "step_time_ms": 3368.907690048218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:16] (step=0005413) Train Loss: 0.1548, Train Steps/Sec: 0.28, Epoch: 0.10518849591916052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5414, "loss": 0.2700197696685791, "memory_gb": 7.721559524536133, "step_time_ms": 3373.835563659668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:19] (step=0005414) Train Loss: 0.2581, Train 
Steps/Sec: 0.28, Epoch: 0.10520792848814613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5415, "loss": 0.21581313014030457, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8279132843018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:23] (step=0005415) Train Loss: 0.2815, Train Steps/Sec: 0.27, Epoch: 0.10522736105713175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5416, "loss": 0.28005683422088623, "memory_gb": 7.721559524536133, "step_time_ms": 3370.5952167510986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:27] (step=0005416) Train Loss: 0.3059, Train Steps/Sec: 0.28, Epoch: 0.10524679362611737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5417, "loss": 0.32501229643821716, "memory_gb": 7.721559524536133, "step_time_ms": 3373.3155727386475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:30] (step=0005417) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.105266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5418, "loss": 0.18266268074512482, "memory_gb": 7.721559524536133, "step_time_ms": 3371.016502380371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:34] (step=0005418) Train Loss: 0.1827, Train Steps/Sec: 0.28, Epoch: 0.10528565876408862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5419, "loss": 0.24109449982643127, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1464920043945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:37] (step=0005419) Train Loss: 0.2676, Train Steps/Sec: 0.28, Epoch: 0.10530509133307424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5420, "loss": 
0.242405965924263, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7798976898193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:41] (step=0005420) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.10532452390205985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5421, "loss": 0.32855480909347534, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0848350524902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:45] (step=0005421) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.10534395647104547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5422, "loss": 0.17597901821136475, "memory_gb": 7.721559524536133, "step_time_ms": 3372.021198272705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:48] (step=0005422) Train Loss: 0.1905, Train Steps/Sec: 0.28, Epoch: 0.10536338904003109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5423, "loss": 0.24004778265953064, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0907230377197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:52] (step=0005423) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.10538282160901671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5424, "loss": 0.2584249675273895, "memory_gb": 7.721559524536133, "step_time_ms": 3368.50905418396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:56] (step=0005424) Train Loss: 0.2036, Train Steps/Sec: 0.28, Epoch: 0.10540225417800234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5425, "loss": 0.23656004667282104, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0616664886475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:26:59] (step=0005425) 
Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.10542168674698796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5426, "loss": 0.30906128883361816, "memory_gb": 7.721559524536133, "step_time_ms": 3348.39129447937, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:03] (step=0005426) Train Loss: 0.2956, Train Steps/Sec: 0.28, Epoch: 0.10544111931597357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5427, "loss": 0.29125428199768066, "memory_gb": 7.721559524536133, "step_time_ms": 3364.219903945923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:06] (step=0005427) Train Loss: 0.2792, Train Steps/Sec: 0.28, Epoch: 0.10546055188495919, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5428, "loss": 0.26049861311912537, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2285079956055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:10] (step=0005428) Train Loss: 0.2200, Train Steps/Sec: 0.28, Epoch: 0.10547998445394481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5429, "loss": 0.2665352523326874, "memory_gb": 7.721559524536133, "step_time_ms": 3372.666120529175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:14] (step=0005429) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.10549941702293043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5430, "loss": 0.17939026653766632, "memory_gb": 7.721559524536133, "step_time_ms": 3368.472099304199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:17] (step=0005430) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.10551884959191606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:21] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 5431, "loss": 0.14671745896339417, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9398555755615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:21] (step=0005431) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.10553828216090168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5432, "loss": 0.11815890669822693, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6628341674805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:25] (step=0005432) Train Loss: 0.1728, Train Steps/Sec: 0.28, Epoch: 0.10555771472988729, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5433, "loss": 0.17075082659721375, "memory_gb": 7.721559524536133, "step_time_ms": 3370.433807373047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:28] (step=0005433) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.10557714729887291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5434, "loss": 0.24337901175022125, "memory_gb": 7.721559524536133, "step_time_ms": 3375.6988048553467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:32] (step=0005434) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.10559657986785853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5435, "loss": 0.3382711410522461, "memory_gb": 7.721559524536133, "step_time_ms": 3371.570348739624, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:36] (step=0005435) Train Loss: 0.3467, Train Steps/Sec: 0.26, Epoch: 0.10561601243684415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5436, "loss": 0.18772241473197937, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6367321014404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
05:27:39] (step=0005436) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.10563544500582978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5437, "loss": 0.18378575146198273, "memory_gb": 7.721559524536133, "step_time_ms": 3375.4396438598633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:43] (step=0005437) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.10565487757481538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5438, "loss": 0.24698711931705475, "memory_gb": 7.721559524536133, "step_time_ms": 3366.72043800354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:47] (step=0005438) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.105674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5439, "loss": 0.21969757974147797, "memory_gb": 7.721559524536133, "step_time_ms": 3373.8811016082764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:50] (step=0005439) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.10569374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5440, "loss": 0.2414858192205429, "memory_gb": 7.715639114379883, "step_time_ms": 3333.4851264953613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:54] (step=0005440) Train Loss: 0.2175, Train Steps/Sec: 0.28, Epoch: 0.10571317528177225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:27:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5441, "loss": 0.2852247357368469, "memory_gb": 7.721559524536133, "step_time_ms": 3372.6351261138916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:27:57] (step=0005441) Train Loss: 0.2933, Train Steps/Sec: 0.28, Epoch: 0.10573260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:01] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 5442, "loss": 0.3268283009529114, "memory_gb": 7.721559524536133, "step_time_ms": 3368.760585784912, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:01] (step=0005442) Train Loss: 0.2474, Train Steps/Sec: 0.28, Epoch: 0.1057520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5443, "loss": 0.20610564947128296, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3742027282715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:05] (step=0005443) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.1057714729887291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5444, "loss": 0.19779559969902039, "memory_gb": 7.721559524536133, "step_time_ms": 3512.36629486084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:08] (step=0005444) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.10579090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5445, "loss": 0.2842217981815338, "memory_gb": 7.721559524536133, "step_time_ms": 3372.69926071167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:12] (step=0005445) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.10581033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5446, "loss": 0.17168909311294556, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8903789520264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:16] (step=0005446) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.10582977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5447, "loss": 0.22832515835762024, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8691596984863, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 05:28:19] (step=0005447) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.10584920326467159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5448, "loss": 0.2578168511390686, "memory_gb": 7.721559524536133, "step_time_ms": 3370.399236679077, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:23] (step=0005448) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.10586863583365722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5449, "loss": 0.18928897380828857, "memory_gb": 7.721559524536133, "step_time_ms": 3373.737096786499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:27] (step=0005449) Train Loss: 0.2527, Train Steps/Sec: 0.28, Epoch: 0.10588806840264282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5450, "loss": 0.2561454474925995, "memory_gb": 7.721559524536133, "step_time_ms": 3373.2714653015137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:30] (step=0005450) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.10590750097162845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5451, "loss": 0.18279367685317993, "memory_gb": 7.721559524536133, "step_time_ms": 3374.3457794189453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:34] (step=0005451) Train Loss: 0.2055, Train Steps/Sec: 0.28, Epoch: 0.10592693354061407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5452, "loss": 0.2307530641555786, "memory_gb": 7.721559524536133, "step_time_ms": 3373.7990856170654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:37] (step=0005452) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.10594636610959969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 05:28:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5453, "loss": 0.20001035928726196, "memory_gb": 7.721559524536133, "step_time_ms": 3373.3315467834473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:41] (step=0005453) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.10596579867858531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5454, "loss": 0.18362541496753693, "memory_gb": 7.721559524536133, "step_time_ms": 3372.0710277557373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:45] (step=0005454) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.10598523124757094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5455, "loss": 0.3330690264701843, "memory_gb": 7.721559524536133, "step_time_ms": 3374.8955726623535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:48] (step=0005455) Train Loss: 0.2654, Train Steps/Sec: 0.28, Epoch: 0.10600466381655654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5456, "loss": 0.26640114188194275, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1741695404053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:52] (step=0005456) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.10602409638554217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5457, "loss": 0.23382428288459778, "memory_gb": 7.721559524536133, "step_time_ms": 3373.7499713897705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:56] (step=0005457) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.10604352895452779, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:28:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5458, "loss": 0.24788348376750946, "memory_gb": 7.721559524536133, "step_time_ms": 3376.1556148529053, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 05:28:59] (step=0005458) Train Loss: 0.2854, Train Steps/Sec: 0.28, Epoch: 0.10606296152351341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5459, "loss": 0.32329535484313965, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9984760284424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:03] (step=0005459) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.10608239409249903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5460, "loss": 0.3168012499809265, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4120178222656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:07] (step=0005460) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.10610182666148466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5461, "loss": 0.19316032528877258, "memory_gb": 7.721559524536133, "step_time_ms": 3373.868703842163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:10] (step=0005461) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.10612125923047026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5462, "loss": 0.25404781103134155, "memory_gb": 7.721559524536133, "step_time_ms": 3373.3441829681396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:14] (step=0005462) Train Loss: 0.2353, Train Steps/Sec: 0.28, Epoch: 0.10614069179945589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5463, "loss": 0.16020719707012177, "memory_gb": 7.721559524536133, "step_time_ms": 3374.366521835327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:17] (step=0005463) Train Loss: 0.2050, Train Steps/Sec: 0.28, Epoch: 0.10616012436844151, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5464, "loss": 0.305698037147522, "memory_gb": 7.721559524536133, "step_time_ms": 3375.9357929229736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:21] (step=0005464) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.10617955693742713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5465, "loss": 0.1638527512550354, "memory_gb": 7.721559524536133, "step_time_ms": 3377.216100692749, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:25] (step=0005465) Train Loss: 0.1877, Train Steps/Sec: 0.28, Epoch: 0.10619898950641275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5466, "loss": 0.22178222239017487, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9593753814697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:28] (step=0005466) Train Loss: 0.2862, Train Steps/Sec: 0.28, Epoch: 0.10621842207539836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5467, "loss": 0.24185532331466675, "memory_gb": 7.721559524536133, "step_time_ms": 3379.7006607055664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:32] (step=0005467) Train Loss: 0.2515, Train Steps/Sec: 0.27, Epoch: 0.10623785464438398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5468, "loss": 0.16426917910575867, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9121685028076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:36] (step=0005468) Train Loss: 0.1753, Train Steps/Sec: 0.28, Epoch: 0.1062572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5469, "loss": 0.27765679359436035, "memory_gb": 7.721559524536133, 
"step_time_ms": 3381.558895111084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:39] (step=0005469) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.10627671978235523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5470, "loss": 0.18547296524047852, "memory_gb": 7.721559524536133, "step_time_ms": 3379.6870708465576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:43] (step=0005470) Train Loss: 0.1606, Train Steps/Sec: 0.28, Epoch: 0.10629615235134085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5471, "loss": 0.19312971830368042, "memory_gb": 7.721559524536133, "step_time_ms": 3379.865884780884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:47] (step=0005471) Train Loss: 0.2134, Train Steps/Sec: 0.28, Epoch: 0.10631558492032647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5472, "loss": 0.29221245646476746, "memory_gb": 7.721559524536133, "step_time_ms": 3374.497413635254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:50] (step=0005472) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.10633501748931208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5473, "loss": 0.2342788577079773, "memory_gb": 7.721559524536133, "step_time_ms": 3381.1120986938477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:54] (step=0005473) Train Loss: 0.2164, Train Steps/Sec: 0.28, Epoch: 0.1063544500582977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:29:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5474, "loss": 0.18628208339214325, "memory_gb": 7.721559524536133, "step_time_ms": 3380.06854057312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:29:57] (step=0005474) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 
0.10637388262728333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5475, "loss": 0.22814813256263733, "memory_gb": 7.721559524536133, "step_time_ms": 3379.3845176696777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:01] (step=0005475) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 0.10639331519626895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5476, "loss": 0.20705276727676392, "memory_gb": 7.721559524536133, "step_time_ms": 3373.6987113952637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:05] (step=0005476) Train Loss: 0.2044, Train Steps/Sec: 0.27, Epoch: 0.10641274776525457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5477, "loss": 0.2839716076850891, "memory_gb": 7.721559524536133, "step_time_ms": 3372.251033782959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:08] (step=0005477) Train Loss: 0.2776, Train Steps/Sec: 0.28, Epoch: 0.10643218033424019, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5478, "loss": 0.24839690327644348, "memory_gb": 7.721559524536133, "step_time_ms": 3376.78599357605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:12] (step=0005478) Train Loss: 0.2731, Train Steps/Sec: 0.28, Epoch: 0.1064516129032258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5479, "loss": 0.18004527688026428, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1068840026855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:16] (step=0005479) Train Loss: 0.2246, Train Steps/Sec: 0.28, Epoch: 0.10647104547221142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5480, "loss": 0.24889260530471802, 
"memory_gb": 7.721559524536133, "step_time_ms": 3372.5438117980957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:19] (step=0005480) Train Loss: 0.2801, Train Steps/Sec: 0.28, Epoch: 0.10649047804119705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5481, "loss": 0.17454726994037628, "memory_gb": 7.721559524536133, "step_time_ms": 3371.802568435669, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:23] (step=0005481) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.10650991061018267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5482, "loss": 0.2750312089920044, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4657554626465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:27] (step=0005482) Train Loss: 0.2689, Train Steps/Sec: 0.28, Epoch: 0.10652934317916829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5483, "loss": 0.21211780607700348, "memory_gb": 7.721559524536133, "step_time_ms": 3372.0755577087402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:30] (step=0005483) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.10654877574815391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5484, "loss": 0.11401063948869705, "memory_gb": 7.721559524536133, "step_time_ms": 3374.063491821289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:34] (step=0005484) Train Loss: 0.1498, Train Steps/Sec: 0.28, Epoch: 0.10656820831713952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5485, "loss": 0.19406366348266602, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9306926727295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:37] (step=0005485) Train Loss: 0.2063, 
Train Steps/Sec: 0.28, Epoch: 0.10658764088612514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5486, "loss": 0.16990146040916443, "memory_gb": 7.721559524536133, "step_time_ms": 3372.7197647094727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:41] (step=0005486) Train Loss: 0.1860, Train Steps/Sec: 0.27, Epoch: 0.10660707345511077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5487, "loss": 0.28514474630355835, "memory_gb": 7.721559524536133, "step_time_ms": 3374.34458732605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:45] (step=0005487) Train Loss: 0.2700, Train Steps/Sec: 0.28, Epoch: 0.10662650602409639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5488, "loss": 0.2715345025062561, "memory_gb": 7.721559524536133, "step_time_ms": 3370.607852935791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:48] (step=0005488) Train Loss: 0.2120, Train Steps/Sec: 0.28, Epoch: 0.10664593859308201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5489, "loss": 0.16525760293006897, "memory_gb": 7.721559524536133, "step_time_ms": 3374.9349117279053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:52] (step=0005489) Train Loss: 0.1624, Train Steps/Sec: 0.28, Epoch: 0.10666537116206763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5490, "loss": 0.15905718505382538, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9517974853516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:56] (step=0005490) Train Loss: 0.1743, Train Steps/Sec: 0.28, Epoch: 0.10668480373105324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:30:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5491, "loss": 
0.18642722070217133, "memory_gb": 7.721559524536133, "step_time_ms": 3373.324155807495, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:30:59] (step=0005491) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.10670423630003886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5492, "loss": 0.30514684319496155, "memory_gb": 7.721559524536133, "step_time_ms": 3513.699769973755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:03] (step=0005492) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.10672366886902449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5493, "loss": 0.16788659989833832, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4512119293213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:07] (step=0005493) Train Loss: 0.2306, Train Steps/Sec: 0.27, Epoch: 0.10674310143801011, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5494, "loss": 0.3096010982990265, "memory_gb": 7.721559524536133, "step_time_ms": 3366.39142036438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:10] (step=0005494) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.10676253400699573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5495, "loss": 0.32981863617897034, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1349544525146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:14] (step=0005495) Train Loss: 0.3088, Train Steps/Sec: 0.28, Epoch: 0.10678196657598134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5496, "loss": 0.2522212862968445, "memory_gb": 7.721559524536133, "step_time_ms": 3361.71293258667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:17] (step=0005496) 
Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.10680139914496696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5497, "loss": 0.33095675706863403, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0004348754883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:21] (step=0005497) Train Loss: 0.3226, Train Steps/Sec: 0.28, Epoch: 0.10682083171395258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5498, "loss": 0.20465537905693054, "memory_gb": 7.721559524536133, "step_time_ms": 3367.969751358032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:25] (step=0005498) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.1068402642829382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5499, "loss": 0.21579892933368683, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2025089263916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:28] (step=0005499) Train Loss: 0.2497, Train Steps/Sec: 0.28, Epoch: 0.10685969685192383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5500, "loss": 0.20299893617630005, "memory_gb": 7.721559524536133, "step_time_ms": 3372.4822998046875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:32] (step=0005500) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.10687912942090945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5501, "loss": 0.24740517139434814, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2424297332764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:36] (step=0005501) Train Loss: 0.2826, Train Steps/Sec: 0.28, Epoch: 0.10689856198989506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:39] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 5502, "loss": 0.2566414177417755, "memory_gb": 7.721559524536133, "step_time_ms": 3368.445634841919, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:39] (step=0005502) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.10691799455888068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5503, "loss": 0.18422020971775055, "memory_gb": 7.721559524536133, "step_time_ms": 3367.176294326782, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:43] (step=0005503) Train Loss: 0.1820, Train Steps/Sec: 0.28, Epoch: 0.1069374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5504, "loss": 0.15546056628227234, "memory_gb": 7.721559524536133, "step_time_ms": 3363.595962524414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:46] (step=0005504) Train Loss: 0.1995, Train Steps/Sec: 0.28, Epoch: 0.10695685969685192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5505, "loss": 0.26652759313583374, "memory_gb": 7.721559524536133, "step_time_ms": 3367.238998413086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:50] (step=0005505) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.10697629226583755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5506, "loss": 0.15639425814151764, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2917613983154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:31:54] (step=0005506) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.10699572483482317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:31:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5507, "loss": 0.3405720591545105, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7666912078857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
05:31:57] (step=0005507) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.10701515740380878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5508, "loss": 0.32505956292152405, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7055835723877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:01] (step=0005508) Train Loss: 0.2603, Train Steps/Sec: 0.28, Epoch: 0.1070345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5509, "loss": 0.14989890158176422, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5930500030518, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:05] (step=0005509) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.10705402254178002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5510, "loss": 0.17535197734832764, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0825958251953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:08] (step=0005510) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.10707345511076564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5511, "loss": 0.2577337622642517, "memory_gb": 7.721559524536133, "step_time_ms": 3367.570638656616, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:12] (step=0005511) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.10709288767975127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5512, "loss": 0.2339494228363037, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3759956359863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:16] (step=0005512) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.10711232024873689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5513, "loss": 0.2953619956970215, "memory_gb": 7.721559524536133, "step_time_ms": 3365.123987197876, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:19] (step=0005513) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.1071317528177225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5514, "loss": 0.2788737416267395, "memory_gb": 7.721559524536133, "step_time_ms": 3366.424083709717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:23] (step=0005514) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.10715118538670812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5515, "loss": 0.24206919968128204, "memory_gb": 7.721559524536133, "step_time_ms": 3367.058515548706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:26] (step=0005515) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.10717061795569374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5516, "loss": 0.1927632838487625, "memory_gb": 7.721559524536133, "step_time_ms": 3365.938901901245, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:30] (step=0005516) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.10719005052467936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5517, "loss": 0.26573121547698975, "memory_gb": 7.721559524536133, "step_time_ms": 3365.000009536743, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:34] (step=0005517) Train Loss: 0.2674, Train Steps/Sec: 0.28, Epoch: 0.10720948309366499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5518, "loss": 0.27993184328079224, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4958782196045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:37] (step=0005518) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.10722891566265061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5519, "loss": 0.2623636722564697, "memory_gb": 7.721559524536133, "step_time_ms": 3361.767053604126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:41] (step=0005519) Train Loss: 0.2603, Train Steps/Sec: 0.28, Epoch: 0.10724834823163622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5520, "loss": 0.27179843187332153, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7471714019775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:45] (step=0005520) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.10726778080062184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5521, "loss": 0.25263267755508423, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6528301239014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:48] (step=0005521) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.10728721336960746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5522, "loss": 0.26027318835258484, "memory_gb": 7.721559524536133, "step_time_ms": 3360.307216644287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:52] (step=0005522) Train Loss: 0.2768, Train Steps/Sec: 0.28, Epoch: 0.10730664593859308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5523, "loss": 0.16292768716812134, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7142486572266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:55] (step=0005523) Train Loss: 0.1420, Train Steps/Sec: 0.28, Epoch: 0.1073260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:32:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5524, "loss": 0.25835713744163513, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2352352142334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:32:59] (step=0005524) Train Loss: 0.2682, Train Steps/Sec: 0.26, Epoch: 0.10734551107656432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5525, "loss": 0.295387864112854, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4964485168457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:03] (step=0005525) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.10736494364554994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5526, "loss": 0.23600146174430847, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5687408447266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:07] (step=0005526) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.10738437621453556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5527, "loss": 0.22553972899913788, "memory_gb": 7.721559524536133, "step_time_ms": 3357.914447784424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:10] (step=0005527) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.10740380878352118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5528, "loss": 0.23518376052379608, "memory_gb": 7.721559524536133, "step_time_ms": 3359.419822692871, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:14] (step=0005528) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.1074232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5529, "loss": 0.3048330545425415, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8532581329346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:17] (step=0005529) Train Loss: 0.2331, Train Steps/Sec: 0.27, Epoch: 0.10744267392149243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5530, "loss": 0.2344646155834198, "memory_gb": 7.721559524536133, "step_time_ms": 3380.2385330200195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:21] (step=0005530) Train Loss: 0.2107, Train Steps/Sec: 0.27, Epoch: 0.10746210649047803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5531, "loss": 0.1683061420917511, "memory_gb": 7.721559524536133, "step_time_ms": 3363.584041595459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:25] (step=0005531) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.10748153905946366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5532, "loss": 0.21062129735946655, "memory_gb": 7.721559524536133, "step_time_ms": 3507.371425628662, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:28] (step=0005532) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.10750097162844928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5533, "loss": 0.15214109420776367, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9588565826416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:32] (step=0005533) Train Loss: 0.1625, Train Steps/Sec: 0.27, Epoch: 0.1075204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5534, "loss": 0.2233816683292389, "memory_gb": 7.721559524536133, "step_time_ms": 3359.341859817505, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:36] (step=0005534) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.10753983676642052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5535, "loss": 0.26336929202079773, "memory_gb": 7.721559524536133, "step_time_ms": 3362.532615661621, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:39] (step=0005535) Train Loss: 0.2421, Train Steps/Sec: 0.27, Epoch: 0.10755926933540615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5536, "loss": 0.21469272673130035, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8571453094482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:43] (step=0005536) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.10757870190439175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5537, "loss": 0.29036182165145874, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7519607543945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:47] (step=0005537) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.10759813447337738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5538, "loss": 0.20887646079063416, "memory_gb": 7.721559524536133, "step_time_ms": 3344.2602157592773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:50] (step=0005538) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.107617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5539, "loss": 0.2741239666938782, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1484298706055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:54] (step=0005539) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.10763699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5540, "loss": 0.22413107752799988, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5663566589355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:33:57] (step=0005540) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.10765643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5541, "loss": 0.21502640843391418, "memory_gb": 7.721559524536133, "step_time_ms": 3362.75315284729, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:01] (step=0005541) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.10767586474931987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5542, "loss": 0.2685968279838562, "memory_gb": 7.721559524536133, "step_time_ms": 3364.677667617798, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:05] (step=0005542) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.10769529731830547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5543, "loss": 0.21371307969093323, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8249378204346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:08] (step=0005543) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.1077147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5544, "loss": 0.30821362137794495, "memory_gb": 7.721559524536133, "step_time_ms": 3363.309621810913, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:12] (step=0005544) Train Loss: 0.2813, Train Steps/Sec: 0.28, Epoch: 0.10773416245627672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5545, "loss": 0.34411218762397766, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4704513549805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:16] (step=0005545) Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.10775359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5546, "loss": 0.23624512553215027, "memory_gb": 7.721559524536133, "step_time_ms": 3367.856502532959, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:19] (step=0005546) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.10777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5547, "loss": 0.13597184419631958, "memory_gb": 7.721559524536133, "step_time_ms": 3367.154121398926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:23] (step=0005547) Train Loss: 0.1531, Train Steps/Sec: 0.28, Epoch: 0.10779246016323359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5548, "loss": 0.19530725479125977, "memory_gb": 7.721559524536133, "step_time_ms": 3366.696834564209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:26] (step=0005548) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.1078118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5549, "loss": 0.19535928964614868, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0202960968018, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:30] (step=0005549) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.10783132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5550, "loss": 0.23658375442028046, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1715259552, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:34] (step=0005550) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.10785075787019044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5551, "loss": 0.2220664620399475, "memory_gb": 7.721559524536133, "step_time_ms": 3367.640256881714, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:37] (step=0005551) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.10787019043917606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5552, "loss": 0.23226934671401978, "memory_gb": 7.721559524536133, "step_time_ms": 3371.9475269317627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:41] (step=0005552) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.10788962300816168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5553, "loss": 0.3016108274459839, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2872524261475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:45] (step=0005553) Train Loss: 0.2890, Train Steps/Sec: 0.28, Epoch: 0.10790905557714729, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5554, "loss": 0.2210082709789276, "memory_gb": 7.715639114379883, "step_time_ms": 3344.5041179656982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:48] (step=0005554) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.10792848814613291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5555, "loss": 0.28242024779319763, "memory_gb": 7.721559524536133, "step_time_ms": 3373.4936714172363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:52] (step=0005555) Train Loss: 0.2517, Train Steps/Sec: 0.27, Epoch: 0.10794792071511854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5556, "loss": 0.23512253165245056, "memory_gb": 7.721559524536133, "step_time_ms": 3374.3274211883545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:56] (step=0005556) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.10796735328410416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:34:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5557, "loss": 0.24889308214187622, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9870834350586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:34:59] (step=0005557) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.10798678585308978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5558, "loss": 0.1696091592311859, "memory_gb": 7.721559524536133, "step_time_ms": 3372.6611137390137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:03] (step=0005558) Train Loss: 0.1583, Train Steps/Sec: 0.28, Epoch: 0.1080062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5559, "loss": 0.34150633215904236, "memory_gb": 7.721559524536133, "step_time_ms": 3373.0039596557617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:06] (step=0005559) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.10802565099106101, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5560, "loss": 0.24881663918495178, "memory_gb": 7.721559524536133, "step_time_ms": 3373.6467361450195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:10] (step=0005560) Train Loss: 0.2229, Train Steps/Sec: 0.28, Epoch: 0.10804508356004663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5561, "loss": 0.1542813777923584, "memory_gb": 7.721559524536133, "step_time_ms": 3371.8855381011963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:14] (step=0005561) Train Loss: 0.1667, Train Steps/Sec: 0.28, Epoch: 0.10806451612903226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5562, "loss": 0.23454298079013824, "memory_gb": 7.721559524536133, "step_time_ms": 3373.741388320923, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:17] (step=0005562) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.10808394869801788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5563, "loss": 0.28102946281433105, "memory_gb": 7.721559524536133, "step_time_ms": 3371.387004852295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:21] (step=0005563) Train Loss: 0.2649, Train Steps/Sec: 0.28, Epoch: 0.1081033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5564, "loss": 0.29358699917793274, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2001571655273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:25] (step=0005564) Train Loss: 0.2597, Train Steps/Sec: 0.27, Epoch: 0.10812281383598912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5565, "loss": 0.28297942876815796, "memory_gb": 7.721559524536133, "step_time_ms": 3368.119716644287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:28] (step=0005565) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.10814224640497473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5566, "loss": 0.1965356022119522, "memory_gb": 7.721559524536133, "step_time_ms": 3374.236583709717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:32] (step=0005566) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.10816167897396035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5567, "loss": 0.29197296500205994, "memory_gb": 7.721559524536133, "step_time_ms": 3375.741958618164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:36] (step=0005567) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.10818111154294598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5568, "loss": 0.30696627497673035, "memory_gb": 7.721559524536133, "step_time_ms": 3376.4874935150146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:39] (step=0005568) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.1082005441119316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5569, "loss": 0.2496759295463562, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7093467712402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:43] (step=0005569) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.10821997668091722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5570, "loss": 0.16407020390033722, "memory_gb": 7.721559524536133, "step_time_ms": 3378.2365322113037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:47] (step=0005570) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.10823940924990284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5571, "loss": 0.25246521830558777, "memory_gb": 7.721559524536133, "step_time_ms": 3371.460437774658, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:50] (step=0005571) Train Loss: 0.2946, Train Steps/Sec: 0.28, Epoch: 0.10825884181888845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5572, "loss": 0.2563573718070984, "memory_gb": 7.721559524536133, "step_time_ms": 3373.0926513671875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:54] (step=0005572) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.10827827438787407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:35:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5573, "loss": 0.3407706022262573, "memory_gb": 7.721559524536133, "step_time_ms": 3376.002550125122, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:35:57] (step=0005573) Train Loss: 0.2585, Train Steps/Sec: 0.28, Epoch: 0.1082977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5574, "loss": 0.3880198001861572, "memory_gb": 7.721559524536133, "step_time_ms": 3374.0193843841553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:01] (step=0005574) Train Loss: 0.2770, Train Steps/Sec: 0.28, Epoch: 0.10831713952584532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5575, "loss": 0.111243337392807, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5310096740723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:05] (step=0005575) Train Loss: 0.1775, Train Steps/Sec: 0.28, Epoch: 0.10833657209483094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5576, "loss": 0.2402651458978653, "memory_gb": 7.721559524536133, "step_time_ms": 3380.744457244873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:08] (step=0005576) Train Loss: 0.1979, Train Steps/Sec: 0.28, Epoch: 0.10835600466381656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5577, "loss": 0.1509823352098465, "memory_gb": 7.721559524536133, "step_time_ms": 3374.8204708099365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:12] (step=0005577) Train Loss: 0.1761, Train Steps/Sec: 0.28, Epoch: 0.10837543723280217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5578, "loss": 0.23516815900802612, "memory_gb": 7.721559524536133, "step_time_ms": 3379.3773651123047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:16] (step=0005578) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.1083948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5579, "loss": 0.15382477641105652, "memory_gb": 7.721559524536133, "step_time_ms": 3505.08451461792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:19] (step=0005579) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.10841430237077342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5580, "loss": 0.25076207518577576, "memory_gb": 7.721559524536133, "step_time_ms": 3372.832775115967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:23] (step=0005580) Train Loss: 0.2728, Train Steps/Sec: 0.28, Epoch: 0.10843373493975904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5581, "loss": 0.2403155416250229, "memory_gb": 7.721559524536133, "step_time_ms": 3375.6930828094482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:26] (step=0005581) Train Loss: 0.2315, Train Steps/Sec: 0.28, Epoch: 0.10845316750874466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5582, "loss": 0.18566815555095673, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9235401153564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:30] (step=0005582) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.10847260007773027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5583, "loss": 0.2202332317829132, "memory_gb": 7.721559524536133, "step_time_ms": 3376.5501976013184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:34] (step=0005583) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.10849203264671589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5584, "loss": 0.2196565717458725, "memory_gb": 7.721559524536133, "step_time_ms": 3377.9237270355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:37] (step=0005584) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.10851146521570151, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5585, "loss": 0.25669893622398376, "memory_gb": 7.715639114379883, "step_time_ms": 3352.8671264648438, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:41] (step=0005585) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.10853089778468714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5586, "loss": 0.20542067289352417, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0919151306152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:45] (step=0005586) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.10855033035367276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5587, "loss": 0.1134149432182312, "memory_gb": 7.721559524536133, "step_time_ms": 3373.382568359375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:48] (step=0005587) Train Loss: 0.1494, Train Steps/Sec: 0.28, Epoch: 0.10856976292265838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5588, "loss": 0.27511730790138245, "memory_gb": 7.721559524536133, "step_time_ms": 3375.4284381866455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:52] (step=0005588) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.10858919549164399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5589, "loss": 0.30313897132873535, "memory_gb": 7.721559524536133, "step_time_ms": 3373.533010482788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:56] (step=0005589) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.10860862806062961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:36:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5590, "loss": 0.2587319016456604, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9052543640137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:36:59] (step=0005590) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.10862806062961523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5591, "loss": 0.20070481300354004, "memory_gb": 7.721559524536133, "step_time_ms": 3375.396966934204, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:03] (step=0005591) Train Loss: 0.2818, Train Steps/Sec: 0.28, Epoch: 0.10864749319860086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5592, "loss": 0.1861121654510498, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4966773986816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:06] (step=0005592) Train Loss: 0.1820, Train Steps/Sec: 0.28, Epoch: 0.10866692576758648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5593, "loss": 0.2143448442220688, "memory_gb": 7.721559524536133, "step_time_ms": 3370.5971240997314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:10] (step=0005593) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.1086863583365721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5594, "loss": 0.3089675307273865, "memory_gb": 7.721559524536133, "step_time_ms": 3370.9044456481934, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:14] (step=0005594) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.10870579090555771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5595, "loss": 0.3013850450515747, "memory_gb": 7.721559524536133, "step_time_ms": 3368.366003036499, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:17] (step=0005595) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.10872522347454333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5596, "loss": 0.2174052894115448, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1116104125977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:21] (step=0005596) Train Loss: 0.1917, Train Steps/Sec: 0.28, Epoch: 0.10874465604352895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5597, "loss": 0.26566082239151, "memory_gb": 7.721559524536133, "step_time_ms": 3369.776487350464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:25] (step=0005597) Train Loss: 0.1988, Train Steps/Sec: 0.28, Epoch: 0.10876408861251458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5598, "loss": 0.33759498596191406, "memory_gb": 7.715639114379883, "step_time_ms": 3351.4785766601562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:28] (step=0005598) Train Loss: 0.2629, Train Steps/Sec: 0.27, Epoch: 0.1087835211815002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5599, "loss": 0.2738686203956604, "memory_gb": 7.721559524536133, "step_time_ms": 3357.888698577881, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:32] (step=0005599) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.10880295375048582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5600, "loss": 0.28157490491867065, "memory_gb": 7.721559524536133, "step_time_ms": 3375.852108001709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:35] (step=0005600) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.10882238631947143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5601, "loss": 0.1997886300086975, "memory_gb": 7.721559524536133, "step_time_ms": 3371.1631298065186, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:39] (step=0005601) Train Loss: 0.2452, Train Steps/Sec: 0.28, Epoch: 0.10884181888845705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5602, "loss": 0.2037312090396881, "memory_gb": 7.721559524536133, "step_time_ms": 3366.921901702881, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:43] (step=0005602) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.10886125145744267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5603, "loss": 0.19970376789569855, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0636882781982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:46] (step=0005603) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.1088806840264283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5604, "loss": 0.29400497674942017, "memory_gb": 7.721559524536133, "step_time_ms": 3369.635820388794, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:50] (step=0005604) Train Loss: 0.2968, Train Steps/Sec: 0.28, Epoch: 0.10890011659541392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5605, "loss": 0.22205369174480438, "memory_gb": 7.721559524536133, "step_time_ms": 3375.551462173462, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:54] (step=0005605) Train Loss: 0.2025, Train Steps/Sec: 0.28, Epoch: 0.10891954916439954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:37:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5606, "loss": 0.2684968113899231, "memory_gb": 7.721559524536133, "step_time_ms": 3372.8134632110596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:37:57] (step=0005606) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.10893898173338515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5607, "loss": 0.27239659428596497, "memory_gb": 7.721559524536133, "step_time_ms": 3372.203826904297, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:01] (step=0005607) Train Loss: 0.2886, Train Steps/Sec: 0.28, Epoch: 0.10895841430237077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5608, "loss": 0.22733332216739655, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8716678619385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:05] (step=0005608) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.10897784687135639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5609, "loss": 0.2668198049068451, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5410232543945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:08] (step=0005609) Train Loss: 0.1912, Train Steps/Sec: 0.28, Epoch: 0.10899727944034202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5610, "loss": 0.2365088313817978, "memory_gb": 7.721559524536133, "step_time_ms": 3370.7828521728516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:12] (step=0005610) Train Loss: 0.1952, Train Steps/Sec: 0.28, Epoch: 0.10901671200932764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5611, "loss": 0.09079095721244812, "memory_gb": 7.721559524536133, "step_time_ms": 3371.3788986206055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:16] (step=0005611) Train Loss: 0.1651, Train Steps/Sec: 0.26, Epoch: 0.10903614457831326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5612, "loss": 0.2541651129722595, "memory_gb": 7.721559524536133, "step_time_ms": 3367.133140563965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:19] (step=0005612) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.10905557714729887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5613, "loss": 0.1452043056488037, "memory_gb": 7.721559524536133, "step_time_ms": 3370.967388153076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:23] (step=0005613) Train Loss: 0.1562, Train Steps/Sec: 0.28, Epoch: 0.10907500971628449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5614, "loss": 0.22201687097549438, "memory_gb": 7.721559524536133, "step_time_ms": 3368.593454360962, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:26] (step=0005614) Train Loss: 0.2382, Train Steps/Sec: 0.28, Epoch: 0.10909444228527011, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5615, "loss": 0.16923893988132477, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3629970550537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:30] (step=0005615) Train Loss: 0.1931, Train Steps/Sec: 0.27, Epoch: 0.10911387485425574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5616, "loss": 0.3155539631843567, "memory_gb": 7.721559524536133, "step_time_ms": 3364.426851272583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:34] (step=0005616) Train Loss: 0.2758, Train Steps/Sec: 0.28, Epoch: 0.10913330742324136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5617, "loss": 0.2521442770957947, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0478591918945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:37] (step=0005617) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.10915273999222697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5618, "loss": 0.14544259011745453, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0080642700195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:41] (step=0005618) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.10917217256121259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5619, "loss": 0.1643790900707245, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8503551483154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:45] (step=0005619) Train Loss: 0.1543, Train Steps/Sec: 0.28, Epoch: 0.10919160513019821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5620, "loss": 0.21080142259597778, "memory_gb": 7.721559524536133, "step_time_ms": 3510.5159282684326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:48] (step=0005620) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.10921103769918383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5621, "loss": 0.2665003538131714, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9660816192627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:52] (step=0005621) Train Loss: 0.2000, Train Steps/Sec: 0.28, Epoch: 0.10923047026816946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5622, "loss": 0.2789563536643982, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7659549713135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:56] (step=0005622) Train Loss: 0.3055, Train Steps/Sec: 0.28, Epoch: 0.10924990283715508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:38:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5623, "loss": 0.345508337020874, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6137924194336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:38:59] (step=0005623) Train Loss: 0.2844, Train Steps/Sec: 0.28, Epoch: 0.10926933540614069, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5624, "loss": 0.18103915452957153, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8859519958496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:03] (step=0005624) Train Loss: 0.1718, Train Steps/Sec: 0.28, Epoch: 0.10928876797512631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5625, "loss": 0.26258185505867004, "memory_gb": 7.721559524536133, "step_time_ms": 3370.185375213623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:06] (step=0005625) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.10930820054411193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5626, "loss": 0.20419268310070038, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3602809906006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:10] (step=0005626) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.10932763311309755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5627, "loss": 0.296697735786438, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2175617218018, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:14] (step=0005627) Train Loss: 0.3133, Train Steps/Sec: 0.28, Epoch: 0.10934706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5628, "loss": 0.17649638652801514, "memory_gb": 7.721559524536133, "step_time_ms": 3364.487409591675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:17] (step=0005628) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.1093664982510688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5629, "loss": 0.27070197463035583, "memory_gb": 7.721559524536133, "step_time_ms": 3367.521047592163, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:21] (step=0005629) Train Loss: 0.2903, Train Steps/Sec: 0.28, Epoch: 0.1093859308200544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5630, "loss": 0.14523030817508698, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1797046661377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:25] (step=0005630) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.10940536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5631, "loss": 0.27669450640678406, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2306537628174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:28] (step=0005631) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.10942479595802565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5632, "loss": 0.20662221312522888, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9209480285645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:32] (step=0005632) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.10944422852701127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5633, "loss": 0.22097620368003845, "memory_gb": 7.721559524536133, "step_time_ms": 3368.109941482544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:36] (step=0005633) Train Loss: 0.1665, Train Steps/Sec: 0.28, Epoch: 0.1094636610959969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5634, "loss": 0.2997286915779114, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1720027923584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:39] (step=0005634) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.10948309366498252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5635, "loss": 0.18910479545593262, "memory_gb": 7.721559524536133, "step_time_ms": 3368.305444717407, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:43] (step=0005635) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.10950252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5636, "loss": 0.20593325793743134, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0012950897217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:46] (step=0005636) Train Loss: 0.1782, Train Steps/Sec: 0.28, Epoch: 0.10952195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5637, "loss": 0.24800370633602142, "memory_gb": 7.721559524536133, "step_time_ms": 3366.272449493408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:50] (step=0005637) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.10954139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5638, "loss": 0.15994128584861755, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8765296936035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:54] (step=0005638) Train Loss: 0.1940, Train Steps/Sec: 0.28, Epoch: 0.10956082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5639, "loss": 0.22129113972187042, "memory_gb": 7.721559524536133, "step_time_ms": 3365.550756454468, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:39:57] (step=0005639) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.10958025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5640, "loss": 0.21187612414360046, "memory_gb": 7.721559524536133, "step_time_ms": 3367.952346801758, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:01] (step=0005640) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.10959968907889624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5641, "loss": 0.327546089887619, "memory_gb": 7.721559524536133, "step_time_ms": 3359.788179397583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:05] (step=0005641) Train Loss: 0.2786, Train Steps/Sec: 0.28, Epoch: 0.10961912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5642, "loss": 0.2162405550479889, "memory_gb": 7.721559524536133, "step_time_ms": 3371.965169906616, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:08] (step=0005642) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.10963855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5643, "loss": 0.2028498351573944, "memory_gb": 7.721559524536133, "step_time_ms": 3372.3275661468506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:12] (step=0005643) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.10965798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5644, "loss": 0.20791316032409668, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2995567321777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:15] (step=0005644) Train Loss: 0.2101, Train Steps/Sec: 0.28, Epoch: 0.10967741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5645, "loss": 0.27705591917037964, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5925216674805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:19] (step=0005645) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.10969685192382433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5646, "loss": 0.2687864899635315, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6862506866455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:23] (step=0005646) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.10971628449280994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5647, "loss": 0.3166084885597229, "memory_gb": 7.721559524536133, "step_time_ms": 3369.3230152130127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:26] (step=0005647) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.10973571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5648, "loss": 0.32503196597099304, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1046752929688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:30] (step=0005648) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.10975514963078119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5649, "loss": 0.20603632926940918, "memory_gb": 7.721559524536133, "step_time_ms": 3370.860815048218, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:34] (step=0005649) Train Loss: 0.1896, Train Steps/Sec: 0.27, Epoch: 0.10977458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5650, "loss": 0.24680769443511963, "memory_gb": 7.721559524536133, "step_time_ms": 3366.215944290161, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:37] (step=0005650) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.10979401476875243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5651, "loss": 0.3031434416770935, "memory_gb": 7.721559524536133, "step_time_ms": 3374.3886947631836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:41] (step=0005651) Train Loss: 0.2512, Train Steps/Sec: 0.27, Epoch: 0.10981344733773805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5652, "loss": 0.29086732864379883, "memory_gb": 7.721559524536133, "step_time_ms": 3449.1777420043945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:45] (step=0005652) Train Loss: 0.3150, Train Steps/Sec: 0.25, Epoch: 0.10983287990672366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 5653, "loss": 0.15563195943832397, "memory_gb": 7.721559524536133, "step_time_ms": 3395.9908485412598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:49] (step=0005653) Train Loss: 0.2215, Train Steps/Sec: 0.27, Epoch: 0.10985231247570929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5654, "loss": 0.17265664041042328, "memory_gb": 7.721559524536133, "step_time_ms": 3371.3629245758057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:52] (step=0005654) Train Loss: 0.2796, Train Steps/Sec: 0.27, Epoch: 0.10987174504469491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5655, "loss": 0.274090051651001, "memory_gb": 7.721559524536133, "step_time_ms": 3367.338180541992, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:40:56] (step=0005655) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.10989117761368053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:40:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5656, "loss": 0.18317991495132446, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6865100860596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:00] (step=0005656) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.10991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5657, "loss": 0.3353038728237152, "memory_gb": 7.721559524536133, "step_time_ms": 3367.520570755005, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:03] (step=0005657) Train Loss: 0.3107, Train Steps/Sec: 0.28, Epoch: 0.10993004275165177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5658, "loss": 0.2725277543067932, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8506355285645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:07] (step=0005658) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.10994947532063738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5659, "loss": 0.19301632046699524, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7449016571045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:10] (step=0005659) Train Loss: 0.1742, Train Steps/Sec: 0.28, Epoch: 0.109968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5660, "loss": 0.17910724878311157, "memory_gb": 7.721559524536133, "step_time_ms": 3368.906259536743, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:14] (step=0005660) Train Loss: 0.1735, Train Steps/Sec: 0.28, Epoch: 0.10998834045860863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5661, "loss": 0.2418871521949768, "memory_gb": 7.721559524536133, "step_time_ms": 3365.43607711792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:18] (step=0005661) Train Loss: 0.1701, Train Steps/Sec: 0.28, Epoch: 0.11000777302759425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5662, "loss": 0.23016656935214996, "memory_gb": 7.721559524536133, "step_time_ms": 3372.1389770507812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:21] (step=0005662) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.11002720559657987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5663, "loss": 0.19720327854156494, "memory_gb": 7.721559524536133, "step_time_ms": 3371.710777282715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:25] (step=0005663) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.1100466381655655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 5664, "loss": 0.21494266390800476, "memory_gb": 7.721559524536133, "step_time_ms": 3370.943307876587, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:29] (step=0005664) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.1100660707345511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5665, "loss": 0.18975123763084412, "memory_gb": 7.721559524536133, "step_time_ms": 3364.382028579712, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:32] (step=0005665) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.11008550330353672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5666, "loss": 0.24231523275375366, "memory_gb": 7.721559524536133, "step_time_ms": 3374.218225479126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:36] (step=0005666) Train Loss: 0.2836, Train Steps/Sec: 0.28, Epoch: 0.11010493587252235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5667, "loss": 0.3103644847869873, "memory_gb": 7.721559524536133, "step_time_ms": 3372.21360206604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:39] (step=0005667) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.11012436844150797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5668, "loss": 0.24659399688243866, "memory_gb": 7.721559524536133, "step_time_ms": 3513.028621673584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:43] (step=0005668) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.11014380101049359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5669, "loss": 0.2529284656047821, "memory_gb": 7.721559524536133, "step_time_ms": 3371.100664138794, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:47] (step=0005669) Train Loss: 0.2529, Train Steps/Sec: 0.28, Epoch: 0.11016323357947921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5670, "loss": 0.2744441032409668, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1114654541016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:50] (step=0005670) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.11018266614846482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5671, "loss": 0.25010132789611816, "memory_gb": 7.721559524536133, "step_time_ms": 3368.302822113037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:54] (step=0005671) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.11020209871745044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:41:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5672, "loss": 0.285348504781723, "memory_gb": 7.721559524536133, "step_time_ms": 3381.2263011932373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:41:58] (step=0005672) Train Loss: 0.2531, Train Steps/Sec: 0.27, Epoch: 0.11022153128643607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5673, "loss": 0.3086891174316406, "memory_gb": 7.721559524536133, "step_time_ms": 3379.4620037078857, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:01] (step=0005673) Train Loss: 0.2553, Train Steps/Sec: 0.27, Epoch: 0.11024096385542169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5674, "loss": 0.25394874811172485, "memory_gb": 7.721559524536133, "step_time_ms": 3376.2760162353516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:05] (step=0005674) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.11026039642440731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5675, "loss": 0.24933737516403198, "memory_gb": 7.721559524536133, "step_time_ms": 3371.6795444488525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:09] (step=0005675) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.11027982899339292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5676, "loss": 0.21664921939373016, "memory_gb": 7.721559524536133, "step_time_ms": 3374.156713485718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:12] (step=0005676) Train Loss: 0.2677, Train Steps/Sec: 0.28, Epoch: 0.11029926156237854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5677, "loss": 0.270478755235672, "memory_gb": 7.721559524536133, "step_time_ms": 3371.638298034668, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:16] (step=0005677) Train Loss: 0.2653, Train Steps/Sec: 0.28, Epoch: 0.11031869413136416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5678, "loss": 0.16205711662769318, "memory_gb": 7.721559524536133, "step_time_ms": 3376.253843307495, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:19] (step=0005678) Train Loss: 0.1521, Train Steps/Sec: 0.28, Epoch: 0.11033812670034979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5679, "loss": 0.2799238860607147, "memory_gb": 7.721559524536133, "step_time_ms": 3376.0790824890137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:23] (step=0005679) Train Loss: 0.2915, Train Steps/Sec: 0.27, Epoch: 0.11035755926933541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5680, "loss": 0.16100092232227325, "memory_gb": 7.721559524536133, "step_time_ms": 3379.8089027404785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:27] (step=0005680) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.11037699183832103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5681, "loss": 0.18558332324028015, "memory_gb": 7.721559524536133, "step_time_ms": 3372.028112411499, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:30] (step=0005681) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.11039642440730664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5682, "loss": 0.245525062084198, "memory_gb": 7.721559524536133, "step_time_ms": 3379.8933029174805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:34] (step=0005682) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.11041585697629226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5683, "loss": 0.14690828323364258, "memory_gb": 7.721559524536133, "step_time_ms": 3379.0674209594727, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:38] (step=0005683) Train Loss: 0.1969, Train Steps/Sec: 0.28, Epoch: 0.11043528954527788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5684, "loss": 0.27813616394996643, "memory_gb": 7.721559524536133, "step_time_ms": 3378.403425216675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:41] (step=0005684) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.1104547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5685, "loss": 0.2523577809333801, "memory_gb": 7.721559524536133, "step_time_ms": 3374.7689723968506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:45] (step=0005685) Train Loss: 0.2903, Train Steps/Sec: 0.28, Epoch: 0.11047415468324913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 5686, "loss": 0.2225244641304016, "memory_gb": 7.721559524536133, "step_time_ms": 3380.9070587158203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:49] (step=0005686) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.11049358725223475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5687, "loss": 0.19025355577468872, "memory_gb": 7.721559524536133, "step_time_ms": 3379.5251846313477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:52] (step=0005687) Train Loss: 0.2525, Train Steps/Sec: 0.28, Epoch: 0.11051301982122036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5688, "loss": 0.25476837158203125, "memory_gb": 7.721559524536133, "step_time_ms": 3376.9400119781494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:56] (step=0005688) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.11053245239020598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:42:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5689, "loss": 0.15925316512584686, "memory_gb": 7.721559524536133, "step_time_ms": 3381.915330886841, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:42:59] (step=0005689) Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.1105518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5690, "loss": 0.18362924456596375, "memory_gb": 7.721559524536133, "step_time_ms": 3388.209342956543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:03] (step=0005690) Train Loss: 0.1748, Train Steps/Sec: 0.28, Epoch: 0.11057131752817723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5691, "loss": 0.20284482836723328, "memory_gb": 7.721559524536133, "step_time_ms": 3381.6423416137695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:07] (step=0005691) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.11059075009716285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5692, "loss": 0.1558472216129303, "memory_gb": 7.721559524536133, "step_time_ms": 3381.20698928833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:10] (step=0005692) Train Loss: 0.2407, Train Steps/Sec: 0.28, Epoch: 0.11061018266614847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5693, "loss": 0.23581936955451965, "memory_gb": 7.721559524536133, "step_time_ms": 3379.3439865112305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:14] (step=0005693) Train Loss: 0.2885, Train Steps/Sec: 0.28, Epoch: 0.11062961523513408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 5694, "loss": 0.2876504063606262, "memory_gb": 7.721559524536133, "step_time_ms": 3381.385564804077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:18] (step=0005694) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.1106490478041197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5695, "loss": 0.24378040432929993, "memory_gb": 7.721559524536133, "step_time_ms": 3375.5815029144287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:21] (step=0005695) Train Loss: 0.2917, Train Steps/Sec: 0.28, Epoch: 0.11066848037310532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5696, "loss": 0.25014787912368774, "memory_gb": 7.721559524536133, "step_time_ms": 3375.44846534729, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:25] (step=0005696) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.11068791294209095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5697, "loss": 0.17446419596672058, "memory_gb": 7.721559524536133, "step_time_ms": 3376.3692378997803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:29] (step=0005697) Train Loss: 0.1657, Train Steps/Sec: 0.28, Epoch: 0.11070734551107657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5698, "loss": 0.19307540357112885, "memory_gb": 7.721559524536133, "step_time_ms": 3374.387741088867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:32] (step=0005698) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.11072677808006219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5699, "loss": 0.30946090817451477, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5078830718994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:36] (step=0005699) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.1107462106490478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5700, "loss": 0.23067417740821838, "memory_gb": 7.721559524536133, "step_time_ms": 3376.9285678863525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:40] (step=0005700) Train Loss: 0.2618, Train Steps/Sec: 0.27, Epoch: 0.11076564321803342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5701, "loss": 0.1254914402961731, "memory_gb": 7.721559524536133, "step_time_ms": 3374.9966621398926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:43] (step=0005701) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.11078507578701904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 5702, "loss": 0.30803781747817993, "memory_gb": 7.721559524536133, "step_time_ms": 3374.5687007904053, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:47] (step=0005702) Train Loss: 0.3083, Train Steps/Sec: 0.28, Epoch: 0.11080450835600467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5703, "loss": 0.21308177709579468, "memory_gb": 7.721559524536133, "step_time_ms": 3367.873191833496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:50] (step=0005703) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.11082394092499029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5704, "loss": 0.16240689158439636, "memory_gb": 7.721559524536133, "step_time_ms": 3364.928960800171, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:54] (step=0005704) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.1108433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:43:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 5705, "loss": 0.1750199943780899, "memory_gb": 7.721559524536133, "step_time_ms": 3365.428686141968, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:43:58] (step=0005705) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.11086280606296152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:44:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5706, "loss": 0.27520525455474854, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7283267974854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:44:01] (step=0005706) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.11088223863194714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:44:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5707, "loss": 0.15862995386123657, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7750358581543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:44:05] (step=0005707) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.11090167120093276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:44:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 5708, "loss": 0.30688273906707764, "memory_gb": 7.721559524536133, "step_time_ms": 3506.896495819092, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:44:09] (step=0005708) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.11092110376991839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:44:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5709, "loss": 0.28109437227249146, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5806522369385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:44:12] (step=0005709) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.11094053633890401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:44:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5710, "loss": 0.23583339154720306, "memory_gb": 7.721559524536133, "step_time_ms": 3365.302085876465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:16] (step=0005710) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.11095996890788962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5711, "loss": 0.20110508799552917, "memory_gb": 7.721559524536133, "step_time_ms": 3367.756128311157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:19] (step=0005711) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.11097940147687524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5712, "loss": 0.15049751102924347, "memory_gb": 7.721559524536133, "step_time_ms": 3365.394115447998, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:23] (step=0005712) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.11099883404586086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 5713, "loss": 0.3321521580219269, "memory_gb": 7.721559524536133, "step_time_ms": 3368.299722671509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:27] (step=0005713) Train Loss: 0.2536, Train Steps/Sec: 0.28, Epoch: 0.11101826661484648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5714, "loss": 0.22378955781459808, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7782077789307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:30] (step=0005714) Train Loss: 0.2109, Train Steps/Sec: 0.28, Epoch: 0.1110376991838321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5715, "loss": 0.3114137351512909, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8696460723877, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:34] (step=0005715) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.11105713175281773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5716, "loss": 0.2629340887069702, "memory_gb": 7.721559524536133, "step_time_ms": 3362.313985824585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:38] (step=0005716) Train Loss: 0.2914, Train Steps/Sec: 0.28, Epoch: 0.11107656432180334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5717, "loss": 0.19314363598823547, "memory_gb": 7.721559524536133, "step_time_ms": 3367.260694503784, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:41] (step=0005717) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.11109599689078896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5718, "loss": 0.23244579136371613, "memory_gb": 7.721559524536133, "step_time_ms": 3359.711170196533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:45] (step=0005718) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.11111542945977458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5719, "loss": 0.1542106568813324, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2537384033203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:48] (step=0005719) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.1111348620287602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5720, "loss": 0.27157554030418396, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6835346221924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:52] (step=0005720) Train Loss: 0.2441, Train Steps/Sec: 0.28, Epoch: 0.11115429459774583, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 05:44:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5721, "loss": 0.23732240498065948, "memory_gb": 7.721559524536133, "step_time_ms": 3362.166404724121, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:56] (step=0005721) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.11117372716673145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:44:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5722, "loss": 0.216114342212677, "memory_gb": 7.721559524536133, "step_time_ms": 3360.653877258301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:44:59] (step=0005722) Train Loss: 0.1841, Train Steps/Sec: 0.28, Epoch: 0.11119315973571706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5723, "loss": 0.26950594782829285, "memory_gb": 7.721559524536133, "step_time_ms": 3361.398220062256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:03] (step=0005723) Train Loss: 0.2857, Train Steps/Sec: 0.28, Epoch: 0.11121259230470268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 5724, "loss": 0.23718008399009705, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8553409576416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:07] (step=0005724) Train Loss: 0.2450, Train Steps/Sec: 0.28, Epoch: 0.1112320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5725, "loss": 0.19000615179538727, "memory_gb": 7.721559524536133, "step_time_ms": 3361.708641052246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:10] (step=0005725) Train Loss: 0.1870, Train Steps/Sec: 0.28, Epoch: 0.11125145744267392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5726, "loss": 0.3271724581718445, "memory_gb": 7.721559524536133, "step_time_ms": 
3359.4133853912354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:14] (step=0005726) Train Loss: 0.2731, Train Steps/Sec: 0.28, Epoch: 0.11127089001165955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5727, "loss": 0.22295725345611572, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3693504333496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:17] (step=0005727) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.11129032258064517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5728, "loss": 0.2541981339454651, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8246994018555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:21] (step=0005728) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.11130975514963078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5729, "loss": 0.23659634590148926, "memory_gb": 7.721559524536133, "step_time_ms": 3360.018014907837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:25] (step=0005729) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.1113291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5730, "loss": 0.14446884393692017, "memory_gb": 7.721559524536133, "step_time_ms": 3369.13800239563, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:28] (step=0005730) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.11134862028760202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5731, "loss": 0.21119236946105957, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9330863952637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:32] (step=0005731) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.11136805285658764, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5732, "loss": 0.22431448101997375, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9324645996094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:36] (step=0005732) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.11138748542557327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5733, "loss": 0.19722725450992584, "memory_gb": 7.721559524536133, "step_time_ms": 3367.020606994629, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:39] (step=0005733) Train Loss: 0.2198, Train Steps/Sec: 0.28, Epoch: 0.11140691799455887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5734, "loss": 0.18886563181877136, "memory_gb": 7.721559524536133, "step_time_ms": 3359.335422515869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:43] (step=0005734) Train Loss: 0.1947, Train Steps/Sec: 0.28, Epoch: 0.1114263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5735, "loss": 0.2157924771308899, "memory_gb": 7.721559524536133, "step_time_ms": 3359.835624694824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:46] (step=0005735) Train Loss: 0.2925, Train Steps/Sec: 0.28, Epoch: 0.11144578313253012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5736, "loss": 0.242008775472641, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7849349975586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:50] (step=0005736) Train Loss: 0.2759, Train Steps/Sec: 0.28, Epoch: 0.11146521570151574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5737, "loss": 0.2371601164340973, "memory_gb": 7.721559524536133, 
"step_time_ms": 3356.151342391968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:54] (step=0005737) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.11148464827050136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:45:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5738, "loss": 0.2547267973423004, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9608154296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:45:57] (step=0005738) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.11150408083948699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5739, "loss": 0.3303740620613098, "memory_gb": 7.721559524536133, "step_time_ms": 3364.229917526245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:01] (step=0005739) Train Loss: 0.3002, Train Steps/Sec: 0.28, Epoch: 0.1115235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5740, "loss": 0.240987166762352, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3289337158203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:05] (step=0005740) Train Loss: 0.2541, Train Steps/Sec: 0.26, Epoch: 0.11154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5741, "loss": 0.17045092582702637, "memory_gb": 7.721559524536133, "step_time_ms": 3360.422134399414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:08] (step=0005741) Train Loss: 0.1962, Train Steps/Sec: 0.28, Epoch: 0.11156237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5742, "loss": 0.11343984305858612, "memory_gb": 7.721559524536133, "step_time_ms": 3363.034725189209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:12] (step=0005742) Train Loss: 0.1254, Train Steps/Sec: 0.28, Epoch: 
0.11158181111542946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 5743, "loss": 0.1953764408826828, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0871982574463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:16] (step=0005743) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.11160124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5744, "loss": 0.2251097708940506, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6723289489746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:19] (step=0005744) Train Loss: 0.2477, Train Steps/Sec: 0.28, Epoch: 0.1116206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5745, "loss": 0.2452254593372345, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1745319366455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:23] (step=0005745) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.11164010882238631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5746, "loss": 0.22120073437690735, "memory_gb": 7.721559524536133, "step_time_ms": 3364.668369293213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:26] (step=0005746) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.11165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5747, "loss": 0.21353402733802795, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2009239196777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:30] (step=0005747) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.11167897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5748, "loss": 0.21998150646686554, 
"memory_gb": 7.721559524536133, "step_time_ms": 3367.2566413879395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:34] (step=0005748) Train Loss: 0.3083, Train Steps/Sec: 0.28, Epoch: 0.11169840652934318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5749, "loss": 0.21690067648887634, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9888038635254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:37] (step=0005749) Train Loss: 0.2669, Train Steps/Sec: 0.27, Epoch: 0.1117178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5750, "loss": 0.17635421454906464, "memory_gb": 7.721559524536133, "step_time_ms": 3360.166072845459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:41] (step=0005750) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.11173727166731443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5751, "loss": 0.19273746013641357, "memory_gb": 7.721559524536133, "step_time_ms": 3365.022659301758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:45] (step=0005751) Train Loss: 0.1916, Train Steps/Sec: 0.28, Epoch: 0.11175670423630003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5752, "loss": 0.2755221426486969, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6044216156006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:48] (step=0005752) Train Loss: 0.2896, Train Steps/Sec: 0.28, Epoch: 0.11177613680528566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5753, "loss": 0.1668590009212494, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9001331329346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:52] (step=0005753) Train Loss: 0.2007, 
Train Steps/Sec: 0.28, Epoch: 0.11179556937427128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5754, "loss": 0.2087590992450714, "memory_gb": 7.721559524536133, "step_time_ms": 3363.996744155884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:56] (step=0005754) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.1118150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5755, "loss": 0.1932276487350464, "memory_gb": 7.721559524536133, "step_time_ms": 3511.349678039551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:46:59] (step=0005755) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.11183443451224252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5756, "loss": 0.24078156054019928, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2918548583984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:03] (step=0005756) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.11185386708122814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5757, "loss": 0.21306729316711426, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2750930786133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:06] (step=0005757) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.11187329965021375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5758, "loss": 0.24024786055088043, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5568103790283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:10] (step=0005758) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.11189273221919938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5759, "loss": 
0.19432877004146576, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0860271453857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:14] (step=0005759) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.111912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5760, "loss": 0.2739325165748596, "memory_gb": 7.721559524536133, "step_time_ms": 3372.4300861358643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:17] (step=0005760) Train Loss: 0.2932, Train Steps/Sec: 0.28, Epoch: 0.11193159735717062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5761, "loss": 0.15218597650527954, "memory_gb": 7.721559524536133, "step_time_ms": 3372.448444366455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:21] (step=0005761) Train Loss: 0.1947, Train Steps/Sec: 0.28, Epoch: 0.11195102992615624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5762, "loss": 0.18880930542945862, "memory_gb": 7.721559524536133, "step_time_ms": 3374.18794631958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:25] (step=0005762) Train Loss: 0.2617, Train Steps/Sec: 0.27, Epoch: 0.11197046249514185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5763, "loss": 0.17929355800151825, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9306926727295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:28] (step=0005763) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.11198989506412747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5764, "loss": 0.26618456840515137, "memory_gb": 7.721559524536133, "step_time_ms": 3369.1816329956055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:32] (step=0005764) 
Train Loss: 0.2894, Train Steps/Sec: 0.28, Epoch: 0.1120093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5765, "loss": 0.3312668800354004, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5262413024902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:36] (step=0005765) Train Loss: 0.2718, Train Steps/Sec: 0.27, Epoch: 0.11202876020209872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5766, "loss": 0.3565148711204529, "memory_gb": 7.721559524536133, "step_time_ms": 3366.438150405884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:39] (step=0005766) Train Loss: 0.2830, Train Steps/Sec: 0.28, Epoch: 0.11204819277108434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5767, "loss": 0.1733788698911667, "memory_gb": 7.721559524536133, "step_time_ms": 3370.617151260376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:43] (step=0005767) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.11206762534006996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5768, "loss": 0.1463290899991989, "memory_gb": 7.721559524536133, "step_time_ms": 3370.9702491760254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:46] (step=0005768) Train Loss: 0.1709, Train Steps/Sec: 0.28, Epoch: 0.11208705790905557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5769, "loss": 0.1968560665845871, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4340648651123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:50] (step=0005769) Train Loss: 0.1926, Train Steps/Sec: 0.28, Epoch: 0.11210649047804119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:54] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 5770, "loss": 0.3153402507305145, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5660572052, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:54] (step=0005770) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.11212592304702682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:47:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5771, "loss": 0.30310124158859253, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7196502685547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:47:57] (step=0005771) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.11214535561601244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5772, "loss": 0.2092593014240265, "memory_gb": 7.721559524536133, "step_time_ms": 3370.9869384765625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:01] (step=0005772) Train Loss: 0.1666, Train Steps/Sec: 0.28, Epoch: 0.11216478818499806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5773, "loss": 0.27542853355407715, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4630393981934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:05] (step=0005773) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.11218422075398368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5774, "loss": 0.1894914209842682, "memory_gb": 7.721559524536133, "step_time_ms": 3372.1139430999756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:08] (step=0005774) Train Loss: 0.2100, Train Steps/Sec: 0.28, Epoch: 0.11220365332296929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5775, "loss": 0.18728503584861755, "memory_gb": 7.721559524536133, "step_time_ms": 3371.208906173706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
05:48:12] (step=0005775) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.11222308589195491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5776, "loss": 0.20967170596122742, "memory_gb": 7.721559524536133, "step_time_ms": 3371.8883991241455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:15] (step=0005776) Train Loss: 0.2048, Train Steps/Sec: 0.28, Epoch: 0.11224251846094054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5777, "loss": 0.21592144668102264, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7351474761963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:19] (step=0005777) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.11226195102992616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5778, "loss": 0.2019737809896469, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4394569396973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:23] (step=0005778) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.11228138359891178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5779, "loss": 0.2715694308280945, "memory_gb": 7.721559524536133, "step_time_ms": 3377.512216567993, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:26] (step=0005779) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.1123008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5780, "loss": 0.38996076583862305, "memory_gb": 7.721559524536133, "step_time_ms": 3376.255512237549, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:30] (step=0005780) Train Loss: 0.2941, Train Steps/Sec: 0.28, Epoch: 0.11232024873688301, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:34] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 5781, "loss": 0.3490028977394104, "memory_gb": 7.721559524536133, "step_time_ms": 3357.179880142212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:34] (step=0005781) Train Loss: 0.3293, Train Steps/Sec: 0.28, Epoch: 0.11233968130586863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5782, "loss": 0.18949498236179352, "memory_gb": 7.721559524536133, "step_time_ms": 3375.8881092071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:37] (step=0005782) Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.11235911387485426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5783, "loss": 0.20077037811279297, "memory_gb": 7.715639114379883, "step_time_ms": 3354.6528816223145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:41] (step=0005783) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.11237854644383988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5784, "loss": 0.2685410976409912, "memory_gb": 7.721559524536133, "step_time_ms": 3373.8009929656982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:45] (step=0005784) Train Loss: 0.2854, Train Steps/Sec: 0.28, Epoch: 0.1123979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5785, "loss": 0.315471887588501, "memory_gb": 7.721559524536133, "step_time_ms": 3376.3179779052734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:48] (step=0005785) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.11241741158181112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5786, "loss": 0.15803097188472748, "memory_gb": 7.721559524536133, "step_time_ms": 3374.537944793701, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 05:48:52] (step=0005786) Train Loss: 0.1595, Train Steps/Sec: 0.28, Epoch: 0.11243684415079673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 5787, "loss": 0.2744187116622925, "memory_gb": 7.721559524536133, "step_time_ms": 3376.3837814331055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:56] (step=0005787) Train Loss: 0.2585, Train Steps/Sec: 0.26, Epoch: 0.11245627671978235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:48:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5788, "loss": 0.1838526725769043, "memory_gb": 7.721559524536133, "step_time_ms": 3376.033067703247, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:48:59] (step=0005788) Train Loss: 0.2198, Train Steps/Sec: 0.28, Epoch: 0.11247570928876797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:49:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5789, "loss": 0.19443461298942566, "memory_gb": 7.721559524536133, "step_time_ms": 3376.2025833129883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:49:03] (step=0005789) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.1124951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:49:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5790, "loss": 0.13604052364826202, "memory_gb": 7.721559524536133, "step_time_ms": 3376.2340545654297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:49:06] (step=0005790) Train Loss: 0.1755, Train Steps/Sec: 0.28, Epoch: 0.11251457442673922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:49:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5791, "loss": 0.2090882807970047, "memory_gb": 7.721559524536133, "step_time_ms": 3377.847194671631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:49:10] (step=0005791) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.11253400699572483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
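Each training step above emits a machine-readable EFFICIENCY_METRICS record (a JSON object following the `EFFICIENCY_METRICS:` marker) alongside the human-readable summary line. A minimal sketch of how such records could be parsed for offline analysis is shown below; this is not part of the training script. The two sample lines are copied from the log, and the field names (`loss`, `step_time_ms`, `memory_gb`, `trainable_params`) match the records shown.

```python
import json
import re

# Two EFFICIENCY_METRICS lines copied verbatim from the log above.
log_text = (
    '[2025-07-29 05:43:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5699, '
    '"loss": 0.30946090817451477, "memory_gb": 7.721559524536133, '
    '"step_time_ms": 3371.5078830718994, "trainable_params": 4718592, "method": "lora"}\n'
    '[2025-07-29 05:43:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 5700, '
    '"loss": 0.23067417740821838, "memory_gb": 7.721559524536133, '
    '"step_time_ms": 3376.9285678863525, "trainable_params": 4718592, "method": "lora"}\n'
)

# Each record is a flat JSON object after the marker; non-greedy match is
# safe because the objects contain no nested braces.
pattern = re.compile(r'EFFICIENCY_METRICS: (\{.*?\})')
records = [json.loads(blob) for blob in pattern.findall(log_text)]

# Aggregate per-step timing into throughput, which should roughly match
# the "Train Steps/Sec: 0.28" figures in the summary lines.
avg_step_ms = sum(r["step_time_ms"] for r in records) / len(records)
steps_per_sec = 1000.0 / avg_step_ms
print(len(records), round(avg_step_ms, 1), round(steps_per_sec, 2))
```

Applied to the full log file (one record per step), the same loop would recover loss curves, memory usage, and throughput without re-parsing the free-form summary lines.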
[2025-07-29 05:49:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5792, "loss": 0.3584054708480835, "memory_gb": 7.721559524536133, "step_time_ms": 3381.5770149230957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:14] (step=0005792) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.11255343956471045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5793, "loss": 0.21801921725273132, "memory_gb": 7.721559524536133, "step_time_ms": 3379.101514816284, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:17] (step=0005793) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.11257287213369607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5794, "loss": 0.2721403241157532, "memory_gb": 7.721559524536133, "step_time_ms": 3376.408100128174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:21] (step=0005794) Train Loss: 0.2561, Train Steps/Sec: 0.28, Epoch: 0.1125923047026817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5795, "loss": 0.1513763666152954, "memory_gb": 7.721559524536133, "step_time_ms": 3382.324695587158, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:25] (step=0005795) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.11261173727166732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5796, "loss": 0.19210538268089294, "memory_gb": 7.721559524536133, "step_time_ms": 3516.570806503296, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:28] (step=0005796) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.11263116984065294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5797, "loss": 0.16541680693626404, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9640712738037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:32] (step=0005797) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.11265060240963855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 5798, "loss": 0.3173728585243225, "memory_gb": 7.721559524536133, "step_time_ms": 3378.8461685180664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:36] (step=0005798) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.11267003497862417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5799, "loss": 0.27830713987350464, "memory_gb": 7.721559524536133, "step_time_ms": 3378.465175628662, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:39] (step=0005799) Train Loss: 0.2394, Train Steps/Sec: 0.28, Epoch: 0.11268946754760979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5800, "loss": 0.22274982929229736, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7683486938477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:43] (step=0005800) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.11270890011659541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5801, "loss": 0.21238218247890472, "memory_gb": 7.721559524536133, "step_time_ms": 3376.1534690856934, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:46] (step=0005801) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.11272833268558104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 05:49:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5802, "loss": 0.2976834177970886, "memory_gb": 7.721559524536133, "step_time_ms": 3369.69256401062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 05:49:50] (step=0005802) Train Loss: 0.3070, Train Steps/Sec: 0.28, Epoch: 0.11274776525456666, LR: 0.001, Memory:
7.72GB, Params: 4,718,592 [2025-07-29 05:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5803, "loss": 0.1528874635696411, "memory_gb": 7.721559524536133, "step_time_ms": 3377.5267601013184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:49:54] (step=0005803) Train Loss: 0.1509, Train Steps/Sec: 0.28, Epoch: 0.11276719782355227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:49:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5804, "loss": 0.3091517686843872, "memory_gb": 7.721559524536133, "step_time_ms": 3373.5809326171875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:49:57] (step=0005804) Train Loss: 0.3104, Train Steps/Sec: 0.28, Epoch: 0.11278663039253789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5805, "loss": 0.3784840703010559, "memory_gb": 7.721559524536133, "step_time_ms": 3372.762680053711, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:01] (step=0005805) Train Loss: 0.3178, Train Steps/Sec: 0.28, Epoch: 0.11280606296152351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5806, "loss": 0.31001919507980347, "memory_gb": 7.721559524536133, "step_time_ms": 3372.6720809936523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:05] (step=0005806) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.11282549553050913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5807, "loss": 0.17023006081581116, "memory_gb": 7.721559524536133, "step_time_ms": 3371.940851211548, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:08] (step=0005807) Train Loss: 0.1804, Train Steps/Sec: 0.28, Epoch: 0.11284492809949476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5808, "loss": 0.3115156292915344, "memory_gb": 7.721559524536133, "step_time_ms": 
3373.760938644409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:12] (step=0005808) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.11286436066848038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5809, "loss": 0.2700890302658081, "memory_gb": 7.721559524536133, "step_time_ms": 3374.2880821228027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:15] (step=0005809) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.11288379323746599, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5810, "loss": 0.25758591294288635, "memory_gb": 7.721559524536133, "step_time_ms": 3366.88494682312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:19] (step=0005810) Train Loss: 0.2776, Train Steps/Sec: 0.28, Epoch: 0.11290322580645161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5811, "loss": 0.25409623980522156, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6601390838623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:23] (step=0005811) Train Loss: 0.1970, Train Steps/Sec: 0.28, Epoch: 0.11292265837543723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5812, "loss": 0.1919279396533966, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9849891662598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:26] (step=0005812) Train Loss: 0.1975, Train Steps/Sec: 0.28, Epoch: 0.11294209094442285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5813, "loss": 0.29326367378234863, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2898540496826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:30] (step=0005813) Train Loss: 0.2820, Train Steps/Sec: 0.28, Epoch: 0.11296152351340848, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5814, "loss": 0.23286521434783936, "memory_gb": 7.721559524536133, "step_time_ms": 3373.90398979187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:34] (step=0005814) Train Loss: 0.2263, Train Steps/Sec: 0.28, Epoch: 0.1129809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5815, "loss": 0.33863234519958496, "memory_gb": 7.721559524536133, "step_time_ms": 3372.248649597168, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:37] (step=0005815) Train Loss: 0.3062, Train Steps/Sec: 0.28, Epoch: 0.11300038865137971, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5816, "loss": 0.25218790769577026, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6723709106445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:41] (step=0005816) Train Loss: 0.2864, Train Steps/Sec: 0.28, Epoch: 0.11301982122036533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5817, "loss": 0.1460125595331192, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5826110839844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:45] (step=0005817) Train Loss: 0.1728, Train Steps/Sec: 0.28, Epoch: 0.11303925378935095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5818, "loss": 0.12548984587192535, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0334300994873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:48] (step=0005818) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.11305868635833657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5819, "loss": 0.32629552483558655, "memory_gb": 7.721559524536133, 
"step_time_ms": 3373.2879161834717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:52] (step=0005819) Train Loss: 0.3361, Train Steps/Sec: 0.28, Epoch: 0.1130781189273222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5820, "loss": 0.24736283719539642, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2995777130127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:55] (step=0005820) Train Loss: 0.2445, Train Steps/Sec: 0.28, Epoch: 0.11309755149630782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:50:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5821, "loss": 0.10741744190454483, "memory_gb": 7.721559524536133, "step_time_ms": 3365.135431289673, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:50:59] (step=0005821) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.11311698406529343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5822, "loss": 0.2906286120414734, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4098720550537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:03] (step=0005822) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.11313641663427905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5823, "loss": 0.14627856016159058, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9685592651367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:06] (step=0005823) Train Loss: 0.1726, Train Steps/Sec: 0.28, Epoch: 0.11315584920326467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5824, "loss": 0.24508440494537354, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0508136749268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:10] (step=0005824) Train Loss: 0.1986, Train Steps/Sec: 0.28, Epoch: 
0.1131752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5825, "loss": 0.15951767563819885, "memory_gb": 7.721559524536133, "step_time_ms": 3362.450361251831, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:14] (step=0005825) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.11319471434123592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5826, "loss": 0.1173870712518692, "memory_gb": 7.721559524536133, "step_time_ms": 3364.299774169922, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:17] (step=0005826) Train Loss: 0.1740, Train Steps/Sec: 0.28, Epoch: 0.11321414691022152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5827, "loss": 0.37703800201416016, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7735118865967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:21] (step=0005827) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.11323357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 5828, "loss": 0.19891853630542755, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4027919769287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:25] (step=0005828) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.11325301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5829, "loss": 0.234217569231987, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5285625457764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:28] (step=0005829) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 0.11327244461717839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5830, "loss": 0.3541795313358307, "memory_gb": 
7.721559524536133, "step_time_ms": 3365.6656742095947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:32] (step=0005830) Train Loss: 0.2613, Train Steps/Sec: 0.28, Epoch: 0.11329187718616401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5831, "loss": 0.10921812057495117, "memory_gb": 7.721559524536133, "step_time_ms": 3366.556406021118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:35] (step=0005831) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.11331130975514964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5832, "loss": 0.20776695013046265, "memory_gb": 7.721559524536133, "step_time_ms": 3368.196487426758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:39] (step=0005832) Train Loss: 0.1858, Train Steps/Sec: 0.28, Epoch: 0.11333074232413524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5833, "loss": 0.36866989731788635, "memory_gb": 7.715639114379883, "step_time_ms": 3333.3675861358643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:43] (step=0005833) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.11335017489312087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5834, "loss": 0.1822071075439453, "memory_gb": 7.721559524536133, "step_time_ms": 3356.318712234497, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:46] (step=0005834) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.11336960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5835, "loss": 0.28390711545944214, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5385761260986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:50] (step=0005835) Train Loss: 0.2691, Train 
Steps/Sec: 0.27, Epoch: 0.11338904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5836, "loss": 0.21803592145442963, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0690574645996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:54] (step=0005836) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.11340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:51:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5837, "loss": 0.22060194611549377, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9400939941406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:51:57] (step=0005837) Train Loss: 0.1936, Train Steps/Sec: 0.28, Epoch: 0.11342790516906336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5838, "loss": 0.30343571305274963, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8132343292236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:01] (step=0005838) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.11344733773804896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 5839, "loss": 0.14086781442165375, "memory_gb": 7.721559524536133, "step_time_ms": 3362.673759460449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:05] (step=0005839) Train Loss: 0.1578, Train Steps/Sec: 0.28, Epoch: 0.11346677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5840, "loss": 0.28156203031539917, "memory_gb": 7.721559524536133, "step_time_ms": 3360.968589782715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:08] (step=0005840) Train Loss: 0.2725, Train Steps/Sec: 0.28, Epoch: 0.11348620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5841, "loss": 
0.26842525601387024, "memory_gb": 7.721559524536133, "step_time_ms": 3366.713047027588, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:12] (step=0005841) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.11350563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5842, "loss": 0.22764797508716583, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2628498077393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:15] (step=0005842) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.11352506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5843, "loss": 0.17232713103294373, "memory_gb": 7.721559524536133, "step_time_ms": 3361.828088760376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:19] (step=0005843) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.11354450058297708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5844, "loss": 0.24118581414222717, "memory_gb": 7.721559524536133, "step_time_ms": 3505.5012702941895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:23] (step=0005844) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.11356393315196268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5845, "loss": 0.2673858404159546, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4441833496094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:26] (step=0005845) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.1135833657209483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5846, "loss": 0.29430603981018066, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2987480163574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:30] (step=0005846) 
Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.11360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5847, "loss": 0.2459920197725296, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5240840911865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:34] (step=0005847) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.11362223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5848, "loss": 0.3603275716304779, "memory_gb": 7.721559524536133, "step_time_ms": 3359.780788421631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:37] (step=0005848) Train Loss: 0.2910, Train Steps/Sec: 0.28, Epoch: 0.11364166342790517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5849, "loss": 0.27660810947418213, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5692386627197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:41] (step=0005849) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.1136610959968908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5850, "loss": 0.22912058234214783, "memory_gb": 7.721559524536133, "step_time_ms": 3360.966444015503, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:44] (step=0005850) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.1136805285658764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5851, "loss": 0.24778887629508972, "memory_gb": 7.721559524536133, "step_time_ms": 3362.459897994995, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:48] (step=0005851) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.11369996113486203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:52] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 5852, "loss": 0.22224947810173035, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7798042297363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:52] (step=0005852) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.11371939370384765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5853, "loss": 0.13415202498435974, "memory_gb": 7.721559524536133, "step_time_ms": 3360.85844039917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:55] (step=0005853) Train Loss: 0.1863, Train Steps/Sec: 0.28, Epoch: 0.11373882627283327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:52:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5854, "loss": 0.2852841913700104, "memory_gb": 7.721559524536133, "step_time_ms": 3361.898899078369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:52:59] (step=0005854) Train Loss: 0.2918, Train Steps/Sec: 0.28, Epoch: 0.1137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5855, "loss": 0.24527375400066376, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2024993896484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:03] (step=0005855) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.1137776914108045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5856, "loss": 0.29748356342315674, "memory_gb": 7.721559524536133, "step_time_ms": 3359.78627204895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:06] (step=0005856) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.11379712397979012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5857, "loss": 0.36951521039009094, "memory_gb": 7.715639114379883, "step_time_ms": 3333.401679992676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
05:53:10] (step=0005857) Train Loss: 0.2893, Train Steps/Sec: 0.28, Epoch: 0.11381655654877575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5858, "loss": 0.16635000705718994, "memory_gb": 7.721559524536133, "step_time_ms": 3363.827705383301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:14] (step=0005858) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.11383598911776137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5859, "loss": 0.2548399865627289, "memory_gb": 7.721559524536133, "step_time_ms": 3366.760730743408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:17] (step=0005859) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.11385542168674699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5860, "loss": 0.22528204321861267, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5499687194824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:21] (step=0005860) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.11387485425573261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5861, "loss": 0.1902824193239212, "memory_gb": 7.721559524536133, "step_time_ms": 3366.426467895508, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:24] (step=0005861) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.11389428682471822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5862, "loss": 0.18755969405174255, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2379302978516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:28] (step=0005862) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.11391371939370384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:32] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 5863, "loss": 0.29817771911621094, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8765811920166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:32] (step=0005863) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.11393315196268947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5864, "loss": 0.31366366147994995, "memory_gb": 7.721559524536133, "step_time_ms": 3369.797706604004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:35] (step=0005864) Train Loss: 0.2935, Train Steps/Sec: 0.28, Epoch: 0.11395258453167509, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5865, "loss": 0.24967090785503387, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5666370391846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:39] (step=0005865) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.11397201710066071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5866, "loss": 0.23478640615940094, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5716857910156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:43] (step=0005866) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.11399144966964633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5867, "loss": 0.32203906774520874, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2848472595215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:46] (step=0005867) Train Loss: 0.3372, Train Steps/Sec: 0.28, Epoch: 0.11401088223863194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5868, "loss": 0.30008649826049805, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1560497283936, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 05:53:50] (step=0005868) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.11403031480761756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5869, "loss": 0.2602502405643463, "memory_gb": 7.721559524536133, "step_time_ms": 3374.2690086364746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:53] (step=0005869) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.11404974737660319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:53:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5870, "loss": 0.11526951193809509, "memory_gb": 7.721559524536133, "step_time_ms": 3372.419834136963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:53:57] (step=0005870) Train Loss: 0.1629, Train Steps/Sec: 0.28, Epoch: 0.11406917994558881, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5871, "loss": 0.17438329756259918, "memory_gb": 7.721559524536133, "step_time_ms": 3371.8745708465576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:01] (step=0005871) Train Loss: 0.1905, Train Steps/Sec: 0.28, Epoch: 0.11408861251457443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5872, "loss": 0.2335207760334015, "memory_gb": 7.721559524536133, "step_time_ms": 3366.213083267212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:04] (step=0005872) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.11410804508356005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5873, "loss": 0.28604593873023987, "memory_gb": 7.721559524536133, "step_time_ms": 3362.36834526062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:08] (step=0005873) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.11412747765254566, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 05:54:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5874, "loss": 0.21316510438919067, "memory_gb": 7.721559524536133, "step_time_ms": 3369.412660598755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:12] (step=0005874) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.11414691022153128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5875, "loss": 0.30964788794517517, "memory_gb": 7.721559524536133, "step_time_ms": 3373.2504844665527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:15] (step=0005875) Train Loss: 0.2933, Train Steps/Sec: 0.28, Epoch: 0.1141663427905169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5876, "loss": 0.2745988070964813, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2611923217773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:19] (step=0005876) Train Loss: 0.2993, Train Steps/Sec: 0.27, Epoch: 0.11418577535950253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5877, "loss": 0.1966802477836609, "memory_gb": 7.721559524536133, "step_time_ms": 3364.774227142334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:23] (step=0005877) Train Loss: 0.1494, Train Steps/Sec: 0.28, Epoch: 0.11420520792848815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5878, "loss": 0.28526273369789124, "memory_gb": 7.721559524536133, "step_time_ms": 3368.925094604492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:26] (step=0005878) Train Loss: 0.2107, Train Steps/Sec: 0.28, Epoch: 0.11422464049747377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5879, "loss": 0.30529525876045227, "memory_gb": 7.721559524536133, "step_time_ms": 3373.5318183898926, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:30] (step=0005879) Train Loss: 0.2893, Train Steps/Sec: 0.28, Epoch: 0.11424407306645938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 5880, "loss": 0.3028900623321533, "memory_gb": 7.721559524536133, "step_time_ms": 3374.195337295532, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:34] (step=0005880) Train Loss: 0.3027, Train Steps/Sec: 0.28, Epoch: 0.114263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5881, "loss": 0.2122502326965332, "memory_gb": 7.721559524536133, "step_time_ms": 3373.380422592163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:37] (step=0005881) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.11428293820443063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5882, "loss": 0.25119271874427795, "memory_gb": 7.721559524536133, "step_time_ms": 3375.596284866333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:41] (step=0005882) Train Loss: 0.3077, Train Steps/Sec: 0.28, Epoch: 0.11430237077341625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5883, "loss": 0.13641013205051422, "memory_gb": 7.721559524536133, "step_time_ms": 3375.479221343994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:44] (step=0005883) Train Loss: 0.1769, Train Steps/Sec: 0.28, Epoch: 0.11432180334240187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5884, "loss": 0.20628108084201813, "memory_gb": 7.721559524536133, "step_time_ms": 3499.298572540283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:48] (step=0005884) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.11434123591138748, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 05:54:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5885, "loss": 0.27982276678085327, "memory_gb": 7.721559524536133, "step_time_ms": 3381.0505867004395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:52] (step=0005885) Train Loss: 0.3050, Train Steps/Sec: 0.28, Epoch: 0.1143606684803731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5886, "loss": 0.32580870389938354, "memory_gb": 7.721559524536133, "step_time_ms": 3380.735158920288, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:55] (step=0005886) Train Loss: 0.2772, Train Steps/Sec: 0.28, Epoch: 0.11438010104935872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:54:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5887, "loss": 0.14934775233268738, "memory_gb": 7.721559524536133, "step_time_ms": 3376.178741455078, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:54:59] (step=0005887) Train Loss: 0.1592, Train Steps/Sec: 0.28, Epoch: 0.11439953361834435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5888, "loss": 0.22438345849514008, "memory_gb": 7.721559524536133, "step_time_ms": 3379.455089569092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:03] (step=0005888) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.11441896618732997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5889, "loss": 0.262410432100296, "memory_gb": 7.721559524536133, "step_time_ms": 3378.4754276275635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:06] (step=0005889) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 0.11443839875631559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5890, "loss": 0.21401827037334442, "memory_gb": 7.721559524536133, "step_time_ms": 
3375.762939453125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:10] (step=0005890) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.1144578313253012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5891, "loss": 0.2737783193588257, "memory_gb": 7.721559524536133, "step_time_ms": 3381.254196166992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:13] (step=0005891) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.11447726389428682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5892, "loss": 0.2504040598869324, "memory_gb": 7.721559524536133, "step_time_ms": 3378.373384475708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:17] (step=0005892) Train Loss: 0.2082, Train Steps/Sec: 0.28, Epoch: 0.11449669646327244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5893, "loss": 0.19792987406253815, "memory_gb": 7.721559524536133, "step_time_ms": 3382.9522132873535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:21] (step=0005893) Train Loss: 0.2599, Train Steps/Sec: 0.28, Epoch: 0.11451612903225807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5894, "loss": 0.28309938311576843, "memory_gb": 7.721559524536133, "step_time_ms": 3378.7529468536377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:24] (step=0005894) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.11453556160124369, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5895, "loss": 0.173412024974823, "memory_gb": 7.721559524536133, "step_time_ms": 3381.1025619506836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:28] (step=0005895) Train Loss: 0.1874, Train Steps/Sec: 0.28, Epoch: 0.11455499417022931, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5896, "loss": 0.24166326224803925, "memory_gb": 7.721559524536133, "step_time_ms": 3379.2166709899902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:32] (step=0005896) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.11457442673921492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5897, "loss": 0.1367165446281433, "memory_gb": 7.721559524536133, "step_time_ms": 3376.437187194824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:35] (step=0005897) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.11459385930820054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5898, "loss": 0.21277287602424622, "memory_gb": 7.721559524536133, "step_time_ms": 3379.61745262146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:39] (step=0005898) Train Loss: 0.2029, Train Steps/Sec: 0.28, Epoch: 0.11461329187718616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5899, "loss": 0.22561736404895782, "memory_gb": 7.721559524536133, "step_time_ms": 3373.241901397705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:43] (step=0005899) Train Loss: 0.1779, Train Steps/Sec: 0.28, Epoch: 0.11463272444617179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5900, "loss": 0.30147141218185425, "memory_gb": 7.721559524536133, "step_time_ms": 3379.789113998413, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:46] (step=0005900) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.11465215701515741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5901, "loss": 0.10523636639118195, "memory_gb": 7.721559524536133, 
"step_time_ms": 3375.310182571411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:50] (step=0005901) Train Loss: 0.1696, Train Steps/Sec: 0.28, Epoch: 0.11467158958414303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 5902, "loss": 0.16547992825508118, "memory_gb": 7.721559524536133, "step_time_ms": 3391.963243484497, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:53] (step=0005902) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.11469102215312864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:55:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5903, "loss": 0.25032711029052734, "memory_gb": 7.721559524536133, "step_time_ms": 3381.775140762329, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:55:57] (step=0005903) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.11471045472211426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5904, "loss": 0.27111148834228516, "memory_gb": 7.715639114379883, "step_time_ms": 3359.346866607666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:01] (step=0005904) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.11472988729109988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5905, "loss": 0.2392364889383316, "memory_gb": 7.721559524536133, "step_time_ms": 3437.7224445343018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:04] (step=0005905) Train Loss: 0.2197, Train Steps/Sec: 0.27, Epoch: 0.1147493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5906, "loss": 0.27998512983322144, "memory_gb": 7.721559524536133, "step_time_ms": 3433.220148086548, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:08] (step=0005906) Train Loss: 0.2722, Train Steps/Sec: 0.27, Epoch: 
0.11476875242907113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5907, "loss": 0.3203776180744171, "memory_gb": 7.721559524536133, "step_time_ms": 3380.110502243042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:12] (step=0005907) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.11478818499805675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5908, "loss": 0.1856188029050827, "memory_gb": 7.721559524536133, "step_time_ms": 3381.2098503112793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:15] (step=0005908) Train Loss: 0.2140, Train Steps/Sec: 0.28, Epoch: 0.11480761756704236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5909, "loss": 0.3261992633342743, "memory_gb": 7.715639114379883, "step_time_ms": 3356.348991394043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:19] (step=0005909) Train Loss: 0.2783, Train Steps/Sec: 0.28, Epoch: 0.11482705013602798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5910, "loss": 0.28059929609298706, "memory_gb": 7.721559524536133, "step_time_ms": 3375.312089920044, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:23] (step=0005910) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.1148464827050136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5911, "loss": 0.32380375266075134, "memory_gb": 7.721559524536133, "step_time_ms": 3376.9173622131348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:26] (step=0005911) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.11486591527399923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5912, "loss": 0.270685613155365, "memory_gb": 
7.721559524536133, "step_time_ms": 3377.807140350342, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:30] (step=0005912) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.11488534784298485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5913, "loss": 0.25778746604919434, "memory_gb": 7.721559524536133, "step_time_ms": 3380.392074584961, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:33] (step=0005913) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.11490478041197046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5914, "loss": 0.2272934764623642, "memory_gb": 7.721559524536133, "step_time_ms": 3378.133535385132, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:37] (step=0005914) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.11492421298095608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5915, "loss": 0.21424056589603424, "memory_gb": 7.721559524536133, "step_time_ms": 3379.855155944824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:41] (step=0005915) Train Loss: 0.1875, Train Steps/Sec: 0.28, Epoch: 0.1149436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5916, "loss": 0.3316904604434967, "memory_gb": 7.721559524536133, "step_time_ms": 3377.185583114624, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:44] (step=0005916) Train Loss: 0.3047, Train Steps/Sec: 0.28, Epoch: 0.11496307811892732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5917, "loss": 0.17910848557949066, "memory_gb": 7.721559524536133, "step_time_ms": 3376.474618911743, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:48] (step=0005917) Train Loss: 0.1722, Train Steps/Sec: 
0.28, Epoch: 0.11498251068791294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5918, "loss": 0.2190292775630951, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9870319366455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:52] (step=0005918) Train Loss: 0.1716, Train Steps/Sec: 0.28, Epoch: 0.11500194325689857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5919, "loss": 0.21466492116451263, "memory_gb": 7.721559524536133, "step_time_ms": 3376.270532608032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:55] (step=0005919) Train Loss: 0.2395, Train Steps/Sec: 0.27, Epoch: 0.11502137582588418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:56:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5920, "loss": 0.2617743909358978, "memory_gb": 7.721559524536133, "step_time_ms": 3375.890016555786, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:56:59] (step=0005920) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.1150408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5921, "loss": 0.2750825881958008, "memory_gb": 7.721559524536133, "step_time_ms": 3380.5229663848877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:03] (step=0005921) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.11506024096385542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5922, "loss": 0.2087959349155426, "memory_gb": 7.721559524536133, "step_time_ms": 3377.3398399353027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:06] (step=0005922) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.11507967353284104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5923, "loss": 0.20125506818294525, 
"memory_gb": 7.721559524536133, "step_time_ms": 3372.8997707366943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:10] (step=0005923) Train Loss: 0.1847, Train Steps/Sec: 0.28, Epoch: 0.11509910610182666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 5924, "loss": 0.20617030560970306, "memory_gb": 7.721559524536133, "step_time_ms": 3374.6540546417236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:14] (step=0005924) Train Loss: 0.2130, Train Steps/Sec: 0.27, Epoch: 0.11511853867081229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5925, "loss": 0.24911203980445862, "memory_gb": 7.721559524536133, "step_time_ms": 3388.8397216796875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:17] (step=0005925) Train Loss: 0.2413, Train Steps/Sec: 0.27, Epoch: 0.1151379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5926, "loss": 0.1617102175951004, "memory_gb": 7.721559524536133, "step_time_ms": 3381.0949325561523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:21] (step=0005926) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.11515740380878352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5927, "loss": 0.24557143449783325, "memory_gb": 7.721559524536133, "step_time_ms": 3374.83811378479, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:25] (step=0005927) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.11517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5928, "loss": 0.22927068173885345, "memory_gb": 7.721559524536133, "step_time_ms": 3377.9523372650146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:28] (step=0005928) Train Loss: 0.2933, 
Train Steps/Sec: 0.28, Epoch: 0.11519626894675476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5929, "loss": 0.2676328122615814, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1559982299805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:32] (step=0005929) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.11521570151574038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5930, "loss": 0.22460408508777618, "memory_gb": 7.721559524536133, "step_time_ms": 3381.321907043457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:35] (step=0005930) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.11523513408472601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5931, "loss": 0.18625128269195557, "memory_gb": 7.721559524536133, "step_time_ms": 3525.423049926758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:39] (step=0005931) Train Loss: 0.2555, Train Steps/Sec: 0.28, Epoch: 0.11525456665371162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5932, "loss": 0.12224990874528885, "memory_gb": 7.721559524536133, "step_time_ms": 3381.9377422332764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:43] (step=0005932) Train Loss: 0.1938, Train Steps/Sec: 0.28, Epoch: 0.11527399922269724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5933, "loss": 0.169132798910141, "memory_gb": 7.721559524536133, "step_time_ms": 3376.260995864868, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:46] (step=0005933) Train Loss: 0.1751, Train Steps/Sec: 0.28, Epoch: 0.11529343179168286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5934, "loss": 
0.1620825082063675, "memory_gb": 7.721559524536133, "step_time_ms": 3376.7387866973877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:50] (step=0005934) Train Loss: 0.1671, Train Steps/Sec: 0.28, Epoch: 0.11531286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5935, "loss": 0.28629249334335327, "memory_gb": 7.721559524536133, "step_time_ms": 3384.533166885376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:54] (step=0005935) Train Loss: 0.3075, Train Steps/Sec: 0.28, Epoch: 0.1153322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:57:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5936, "loss": 0.20834854245185852, "memory_gb": 7.721559524536133, "step_time_ms": 3381.110668182373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:57:57] (step=0005936) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.11535172949863973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5937, "loss": 0.24537281692028046, "memory_gb": 7.721559524536133, "step_time_ms": 3378.4074783325195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:01] (step=0005937) Train Loss: 0.2525, Train Steps/Sec: 0.28, Epoch: 0.11537116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5938, "loss": 0.2853745222091675, "memory_gb": 7.721559524536133, "step_time_ms": 3374.5317459106445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:04] (step=0005938) Train Loss: 0.2813, Train Steps/Sec: 0.28, Epoch: 0.11539059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5939, "loss": 0.23244765400886536, "memory_gb": 7.721559524536133, "step_time_ms": 3377.8445720672607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:08] (step=0005939) 
Train Loss: 0.1834, Train Steps/Sec: 0.28, Epoch: 0.11541002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5940, "loss": 0.2514186203479767, "memory_gb": 7.721559524536133, "step_time_ms": 3359.304189682007, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:12] (step=0005940) Train Loss: 0.2505, Train Steps/Sec: 0.28, Epoch: 0.1154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5941, "loss": 0.2170284390449524, "memory_gb": 7.715639114379883, "step_time_ms": 3357.969045639038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:15] (step=0005941) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.11544889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5942, "loss": 0.1634993553161621, "memory_gb": 7.721559524536133, "step_time_ms": 3381.727457046509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:19] (step=0005942) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.11546832491255343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5943, "loss": 0.21348971128463745, "memory_gb": 7.721559524536133, "step_time_ms": 3393.6944007873535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:23] (step=0005943) Train Loss: 0.1947, Train Steps/Sec: 0.28, Epoch: 0.11548775748153906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5944, "loss": 0.23295657336711884, "memory_gb": 7.721559524536133, "step_time_ms": 3394.908905029297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:26] (step=0005944) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.11550719005052468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:30] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 5945, "loss": 0.16206708550453186, "memory_gb": 7.721559524536133, "step_time_ms": 3386.117935180664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:30] (step=0005945) Train Loss: 0.2262, Train Steps/Sec: 0.28, Epoch: 0.1155266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5946, "loss": 0.3056119680404663, "memory_gb": 7.721559524536133, "step_time_ms": 3383.1987380981445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:33] (step=0005946) Train Loss: 0.1996, Train Steps/Sec: 0.28, Epoch: 0.11554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5947, "loss": 0.1923455148935318, "memory_gb": 7.721559524536133, "step_time_ms": 3383.484125137329, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:37] (step=0005947) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.11556548775748154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5948, "loss": 0.21256642043590546, "memory_gb": 7.721559524536133, "step_time_ms": 3384.5393657684326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:41] (step=0005948) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.11558492032646715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5949, "loss": 0.23218324780464172, "memory_gb": 7.721559524536133, "step_time_ms": 3388.072729110718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:44] (step=0005949) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.11560435289545277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5950, "loss": 0.09414222836494446, "memory_gb": 7.721559524536133, "step_time_ms": 3381.65545463562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
05:58:48] (step=0005950) Train Loss: 0.1673, Train Steps/Sec: 0.28, Epoch: 0.1156237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5951, "loss": 0.29011452198028564, "memory_gb": 7.721559524536133, "step_time_ms": 3382.4994564056396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:52] (step=0005951) Train Loss: 0.2801, Train Steps/Sec: 0.28, Epoch: 0.11564321803342402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5952, "loss": 0.23645342886447906, "memory_gb": 7.721559524536133, "step_time_ms": 3385.7531547546387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:55] (step=0005952) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.11566265060240964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:58:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5953, "loss": 0.2727510929107666, "memory_gb": 7.721559524536133, "step_time_ms": 3384.7687244415283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:58:59] (step=0005953) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.11568208317139526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 5954, "loss": 0.23618602752685547, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4566230773926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:03] (step=0005954) Train Loss: 0.2649, Train Steps/Sec: 0.28, Epoch: 0.11570151574038087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5955, "loss": 0.14538609981536865, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0550327301025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:06] (step=0005955) Train Loss: 0.1809, Train Steps/Sec: 0.28, Epoch: 0.1157209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:10] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 5956, "loss": 0.19792433083057404, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7806339263916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:10] (step=0005956) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.11574038087835212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5957, "loss": 0.2618813216686249, "memory_gb": 7.721559524536133, "step_time_ms": 3370.758533477783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:13] (step=0005957) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.11575981344733774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5958, "loss": 0.3353457450866699, "memory_gb": 7.721559524536133, "step_time_ms": 3374.0456104278564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:17] (step=0005958) Train Loss: 0.3051, Train Steps/Sec: 0.28, Epoch: 0.11577924601632336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 5959, "loss": 0.1640641987323761, "memory_gb": 7.721559524536133, "step_time_ms": 3378.265857696533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:21] (step=0005959) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.11579867858530898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5960, "loss": 0.24230214953422546, "memory_gb": 7.721559524536133, "step_time_ms": 3376.99818611145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:24] (step=0005960) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.11581811115429459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5961, "loss": 0.21616855263710022, "memory_gb": 7.721559524536133, "step_time_ms": 3375.1437664031982, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 05:59:28] (step=0005961) Train Loss: 0.1835, Train Steps/Sec: 0.28, Epoch: 0.11583754372328021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 5962, "loss": 0.1872791349887848, "memory_gb": 7.721559524536133, "step_time_ms": 3373.8486766815186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:32] (step=0005962) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.11585697629226584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5963, "loss": 0.30121344327926636, "memory_gb": 7.721559524536133, "step_time_ms": 3376.973867416382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:35] (step=0005963) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.11587640886125146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 5964, "loss": 0.2543211579322815, "memory_gb": 7.721559524536133, "step_time_ms": 3370.01895904541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:39] (step=0005964) Train Loss: 0.2290, Train Steps/Sec: 0.27, Epoch: 0.11589584143023708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 5965, "loss": 0.2792455852031708, "memory_gb": 7.721559524536133, "step_time_ms": 3369.697093963623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:43] (step=0005965) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.1159152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 5966, "loss": 0.23424726724624634, "memory_gb": 7.721559524536133, "step_time_ms": 3372.7123737335205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:46] (step=0005966) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.11593470656820831, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 05:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 5967, "loss": 0.13454672694206238, "memory_gb": 7.721559524536133, "step_time_ms": 3371.1791038513184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:50] (step=0005967) Train Loss: 0.1408, Train Steps/Sec: 0.28, Epoch: 0.11595413913719393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 5968, "loss": 0.19672587513923645, "memory_gb": 7.721559524536133, "step_time_ms": 3372.5764751434326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:54] (step=0005968) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.11597357170617956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 05:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 5969, "loss": 0.16012859344482422, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6846027374268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 05:59:57] (step=0005969) Train Loss: 0.2065, Train Steps/Sec: 0.28, Epoch: 0.11599300427516518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 5970, "loss": 0.14527371525764465, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2569007873535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:01] (step=0005970) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.1160124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 5971, "loss": 0.21221692860126495, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5224056243896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:04] (step=0005971) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.11603186941313641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 5972, "loss": 0.258428156375885, "memory_gb": 7.721559524536133, "step_time_ms": 3366.819143295288, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:08] (step=0005972) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.11605130198212203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 5973, "loss": 0.20177114009857178, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7706718444824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:12] (step=0005973) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.11607073455110765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 5974, "loss": 0.29914790391921997, "memory_gb": 7.721559524536133, "step_time_ms": 3369.845151901245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:15] (step=0005974) Train Loss: 0.2962, Train Steps/Sec: 0.28, Epoch: 0.11609016712009328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 5975, "loss": 0.3144475519657135, "memory_gb": 7.721559524536133, "step_time_ms": 3372.2736835479736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:19] (step=0005975) Train Loss: 0.3101, Train Steps/Sec: 0.28, Epoch: 0.1161095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 5976, "loss": 0.2599656581878662, "memory_gb": 7.721559524536133, "step_time_ms": 3367.607593536377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:23] (step=0005976) Train Loss: 0.2554, Train Steps/Sec: 0.28, Epoch: 0.11612903225806452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 5977, "loss": 0.26020222902297974, "memory_gb": 7.721559524536133, "step_time_ms": 3372.589349746704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:26] (step=0005977) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.11614846482705013, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 06:00:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 5978, "loss": 0.262279212474823, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2664165496826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:30] (step=0005978) Train Loss: 0.2892, Train Steps/Sec: 0.28, Epoch: 0.11616789739603575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 5979, "loss": 0.2509145140647888, "memory_gb": 7.721559524536133, "step_time_ms": 3511.7666721343994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:33] (step=0005979) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.11618732996502137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 5980, "loss": 0.21920043230056763, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2924976348877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:37] (step=0005980) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.116206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 5981, "loss": 0.1694936454296112, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6772842407227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:41] (step=0005981) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.11622619510299262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 5982, "loss": 0.2201302945613861, "memory_gb": 7.715639114379883, "step_time_ms": 3333.379030227661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:44] (step=0005982) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.11624562767197824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 5983, "loss": 0.16524958610534668, "memory_gb": 7.721559524536133, "step_time_ms": 
3368.3810234069824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:48] (step=0005983) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.11626506024096385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 5984, "loss": 0.21657098829746246, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7233505249023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:52] (step=0005984) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.11628449280994947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 5985, "loss": 0.24264204502105713, "memory_gb": 7.721559524536133, "step_time_ms": 3366.147756576538, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:55] (step=0005985) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.1163039253789351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:00:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 5986, "loss": 0.1895401030778885, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1996898651123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:00:59] (step=0005986) Train Loss: 0.1929, Train Steps/Sec: 0.28, Epoch: 0.11632335794792072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 5987, "loss": 0.1364317089319229, "memory_gb": 7.721559524536133, "step_time_ms": 3359.923839569092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:02] (step=0005987) Train Loss: 0.1844, Train Steps/Sec: 0.28, Epoch: 0.11634279051690634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 5988, "loss": 0.2762719392776489, "memory_gb": 7.721559524536133, "step_time_ms": 3357.173442840576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:06] (step=0005988) Train Loss: 0.2215, Train Steps/Sec: 0.28, Epoch: 0.11636222308589196, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 5989, "loss": 0.30364614725112915, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6362857818604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:10] (step=0005989) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.11638165565487757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 5990, "loss": 0.28269141912460327, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2487831115723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:13] (step=0005990) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.11640108822386319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 5991, "loss": 0.17507970333099365, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7164669036865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:17] (step=0005991) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.11642052079284881, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 5992, "loss": 0.13297511637210846, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6835136413574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:20] (step=0005992) Train Loss: 0.1844, Train Steps/Sec: 0.28, Epoch: 0.11643995336183444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 5993, "loss": 0.28775694966316223, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4107627868652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:24] (step=0005993) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.11645938593082006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 5994, "loss": 0.177134171128273, "memory_gb": 
7.721559524536133, "step_time_ms": 3357.8789234161377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:28] (step=0005994) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.11647881849980568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 5995, "loss": 0.25818175077438354, "memory_gb": 7.721559524536133, "step_time_ms": 3358.917474746704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:31] (step=0005995) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.11649825106879129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 5996, "loss": 0.15291918814182281, "memory_gb": 7.721559524536133, "step_time_ms": 3357.053756713867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:35] (step=0005996) Train Loss: 0.2034, Train Steps/Sec: 0.28, Epoch: 0.11651768363777691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 5997, "loss": 0.2094918191432953, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6714267730713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:38] (step=0005997) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.11653711620676253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 5998, "loss": 0.19191256165504456, "memory_gb": 7.721559524536133, "step_time_ms": 3354.226589202881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:42] (step=0005998) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.11655654877574816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 5999, "loss": 0.27481526136398315, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6889972686768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:45] (step=0005999) Train Loss: 0.2456, Train 
Steps/Sec: 0.28, Epoch: 0.11657598134473378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6000, "loss": 0.327021062374115, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9140434265137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:49] (step=0006000) Train Loss: 0.3110, Train Steps/Sec: 0.28, Epoch: 0.11659541391371939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:49] Saved checkpoint to /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0006000/ [2025-07-29 06:01:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6001, "loss": 0.160682812333107, "memory_gb": 7.721559524536133, "step_time_ms": 3351.1011600494385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:52] (step=0006001) Train Loss: 0.1865, Train Steps/Sec: 0.28, Epoch: 0.11661484648270501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:01:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6002, "loss": 0.24530331790447235, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4423999786377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:01:56] (step=0006002) Train Loss: 0.2860, Train Steps/Sec: 0.28, Epoch: 0.11663427905169063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6003, "loss": 0.13261285424232483, "memory_gb": 7.721559524536133, "step_time_ms": 3359.522819519043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:00] (step=0006003) Train Loss: 0.1791, Train Steps/Sec: 0.28, Epoch: 0.11665371162067625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6004, "loss": 0.3073020577430725, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8007221221924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:03] (step=0006004) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.11667314418966188, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6005, "loss": 0.28061044216156006, "memory_gb": 7.721559524536133, "step_time_ms": 3358.370780944824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:07] (step=0006005) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.1166925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6006, "loss": 0.310617595911026, "memory_gb": 7.721559524536133, "step_time_ms": 3361.16099357605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:10] (step=0006006) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.1167120093276331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6007, "loss": 0.23337149620056152, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3996295928955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:14] (step=0006007) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.11673144189661873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6008, "loss": 0.20165109634399414, "memory_gb": 7.721559524536133, "step_time_ms": 3357.604742050171, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:17] (step=0006008) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.11675087446560435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6009, "loss": 0.22273661196231842, "memory_gb": 7.721559524536133, "step_time_ms": 3358.250379562378, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:21] (step=0006009) Train Loss: 0.1917, Train Steps/Sec: 0.28, Epoch: 0.11677030703458997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6010, "loss": 0.22206330299377441, "memory_gb": 7.721559524536133, 
"step_time_ms": 3356.6153049468994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:24] (step=0006010) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.1167897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6011, "loss": 0.27244311571121216, "memory_gb": 7.721559524536133, "step_time_ms": 3361.206293106079, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:28] (step=0006011) Train Loss: 0.2952, Train Steps/Sec: 0.28, Epoch: 0.11680917217256122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6012, "loss": 0.2194610983133316, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6697788238525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:32] (step=0006012) Train Loss: 0.2311, Train Steps/Sec: 0.27, Epoch: 0.11682860474154683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6013, "loss": 0.25773757696151733, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9184284210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:35] (step=0006013) Train Loss: 0.3046, Train Steps/Sec: 0.28, Epoch: 0.11684803731053245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6014, "loss": 0.22755444049835205, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0494441986084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:39] (step=0006014) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.11686746987951807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6015, "loss": 0.2791520953178406, "memory_gb": 7.721559524536133, "step_time_ms": 3357.616901397705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:42] (step=0006015) Train Loss: 0.2221, Train Steps/Sec: 0.28, Epoch: 
0.1168869024485037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6016, "loss": 0.19229096174240112, "memory_gb": 7.721559524536133, "step_time_ms": 3358.159303665161, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:46] (step=0006016) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.11690633501748932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6017, "loss": 0.27469420433044434, "memory_gb": 7.721559524536133, "step_time_ms": 3359.189748764038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:49] (step=0006017) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.11692576758647494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6018, "loss": 0.15481820702552795, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2399826049805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:53] (step=0006018) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.11694520015546055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:02:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6019, "loss": 0.2153608351945877, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1039905548096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:02:56] (step=0006019) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.11696463272444617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6020, "loss": 0.32150691747665405, "memory_gb": 7.721559524536133, "step_time_ms": 3356.734275817871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:00] (step=0006020) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.11698406529343179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6021, "loss": 0.2454935908317566, 
"memory_gb": 7.721559524536133, "step_time_ms": 3357.393264770508, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:03] (step=0006021) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.11700349786241741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6022, "loss": 0.15987816452980042, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9447269439697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:07] (step=0006022) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.11702293043140304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6023, "loss": 0.251203328371048, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3357334136963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:11] (step=0006023) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 0.11704236300038866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6024, "loss": 0.2921144962310791, "memory_gb": 7.721559524536133, "step_time_ms": 3500.7948875427246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:14] (step=0006024) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.11706179556937427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6025, "loss": 0.21965469419956207, "memory_gb": 7.721559524536133, "step_time_ms": 3347.3355770111084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:18] (step=0006025) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.11708122813835989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6026, "loss": 0.3069659173488617, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0706844329834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:21] (step=0006026) Train Loss: 0.3325, 
Train Steps/Sec: 0.28, Epoch: 0.11710066070734551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6027, "loss": 0.24032112956047058, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7067127227783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:25] (step=0006027) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.11712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6028, "loss": 0.24016742408275604, "memory_gb": 7.721559524536133, "step_time_ms": 3354.365110397339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:28] (step=0006028) Train Loss: 0.2607, Train Steps/Sec: 0.28, Epoch: 0.11713952584531676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6029, "loss": 0.24249379336833954, "memory_gb": 7.721559524536133, "step_time_ms": 3354.151725769043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:32] (step=0006029) Train Loss: 0.1903, Train Steps/Sec: 0.28, Epoch: 0.11715895841430238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6030, "loss": 0.24152716994285583, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5098514556885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:35] (step=0006030) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.11717839098328799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6031, "loss": 0.2805825471878052, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7517738342285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:39] (step=0006031) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.11719782355227361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6032, "loss": 
0.2353389859199524, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5727519989014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:42] (step=0006032) Train Loss: 0.3028, Train Steps/Sec: 0.28, Epoch: 0.11721725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6033, "loss": 0.32480332255363464, "memory_gb": 7.721559524536133, "step_time_ms": 3352.764368057251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:46] (step=0006033) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.11723668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6034, "loss": 0.13126541674137115, "memory_gb": 7.721559524536133, "step_time_ms": 3355.488061904907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:49] (step=0006034) Train Loss: 0.2200, Train Steps/Sec: 0.28, Epoch: 0.11725612125923048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6035, "loss": 0.1961841732263565, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1182956695557, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:53] (step=0006035) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.11727555382821608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:03:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6036, "loss": 0.2847565710544586, "memory_gb": 7.721559524536133, "step_time_ms": 3359.508991241455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:03:57] (step=0006036) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.1172949863972017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6037, "loss": 0.23898936808109283, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5516986846924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:00] (step=0006037) 
Train Loss: 0.2922, Train Steps/Sec: 0.28, Epoch: 0.11731441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6038, "loss": 0.23721377551555634, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1925678253174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:04] (step=0006038) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.11733385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6039, "loss": 0.2208804190158844, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5234413146973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:07] (step=0006039) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.11735328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6040, "loss": 0.22262537479400635, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8638820648193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:11] (step=0006040) Train Loss: 0.1890, Train Steps/Sec: 0.28, Epoch: 0.1173727166731442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6041, "loss": 0.1977061927318573, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0826778411865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:14] (step=0006041) Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.1173921492421298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6042, "loss": 0.20449820160865784, "memory_gb": 7.721559524536133, "step_time_ms": 3353.43599319458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:18] (step=0006042) Train Loss: 0.1558, Train Steps/Sec: 0.28, Epoch: 0.11741158181111543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:22] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 6043, "loss": 0.17490898072719574, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5033416748047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:22] (step=0006043) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.11743101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6044, "loss": 0.2819100022315979, "memory_gb": 7.721559524536133, "step_time_ms": 3357.347249984741, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:25] (step=0006044) Train Loss: 0.2946, Train Steps/Sec: 0.28, Epoch: 0.11745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6045, "loss": 0.2858870327472687, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4484844207764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:29] (step=0006045) Train Loss: 0.2927, Train Steps/Sec: 0.28, Epoch: 0.11746987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6046, "loss": 0.28418228030204773, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4463596343994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:32] (step=0006046) Train Loss: 0.2816, Train Steps/Sec: 0.28, Epoch: 0.11748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6047, "loss": 0.2973039746284485, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1806678771973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:36] (step=0006047) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.11750874465604352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6048, "loss": 0.30686646699905396, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0831756591797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
06:04:40] (step=0006048) Train Loss: 0.2744, Train Steps/Sec: 0.28, Epoch: 0.11752817722502915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6049, "loss": 0.2651704251766205, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6105365753174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:43] (step=0006049) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.11754760979401477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6050, "loss": 0.23360972106456757, "memory_gb": 7.721559524536133, "step_time_ms": 3359.774351119995, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:47] (step=0006050) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.11756704236300039, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6051, "loss": 0.33684730529785156, "memory_gb": 7.721559524536133, "step_time_ms": 3362.497329711914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:50] (step=0006051) Train Loss: 0.3483, Train Steps/Sec: 0.28, Epoch: 0.11758647493198601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6052, "loss": 0.21915212273597717, "memory_gb": 7.721559524536133, "step_time_ms": 3359.018564224243, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:54] (step=0006052) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.11760590750097163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:04:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6053, "loss": 0.128501296043396, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1085205078125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:04:58] (step=0006053) Train Loss: 0.1491, Train Steps/Sec: 0.27, Epoch: 0.11762534006995724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:01] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 6054, "loss": 0.25839436054229736, "memory_gb": 7.721559524536133, "step_time_ms": 3361.661911010742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:01] (step=0006054) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.11764477263894287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6055, "loss": 0.2453785538673401, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8692417144775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:05] (step=0006055) Train Loss: 0.2926, Train Steps/Sec: 0.28, Epoch: 0.11766420520792849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6056, "loss": 0.18604600429534912, "memory_gb": 7.721559524536133, "step_time_ms": 3354.905366897583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:09] (step=0006056) Train Loss: 0.1848, Train Steps/Sec: 0.28, Epoch: 0.11768363777691411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6057, "loss": 0.22769513726234436, "memory_gb": 7.721559524536133, "step_time_ms": 3358.757734298706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:12] (step=0006057) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.11770307034589973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6058, "loss": 0.20386821031570435, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3296699523926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:16] (step=0006058) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.11772250291488535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6059, "loss": 0.1808699518442154, "memory_gb": 7.721559524536133, "step_time_ms": 3364.449977874756, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 06:05:20] (step=0006059) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.11774193548387096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6060, "loss": 0.17889556288719177, "memory_gb": 7.721559524536133, "step_time_ms": 3425.5526065826416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:23] (step=0006060) Train Loss: 0.2239, Train Steps/Sec: 0.27, Epoch: 0.11776136805285659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6061, "loss": 0.30917203426361084, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5667095184326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:27] (step=0006061) Train Loss: 0.2752, Train Steps/Sec: 0.27, Epoch: 0.11778080062184221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6062, "loss": 0.16324618458747864, "memory_gb": 7.721559524536133, "step_time_ms": 3359.452247619629, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:30] (step=0006062) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.11780023319082783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6063, "loss": 0.2345506101846695, "memory_gb": 7.721559524536133, "step_time_ms": 3389.833450317383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:34] (step=0006063) Train Loss: 0.2448, Train Steps/Sec: 0.27, Epoch: 0.11781966575981345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6064, "loss": 0.3485768735408783, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8155460357666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:38] (step=0006064) Train Loss: 0.2977, Train Steps/Sec: 0.28, Epoch: 0.11783909832879906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 06:05:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6065, "loss": 0.22799062728881836, "memory_gb": 7.721559524536133, "step_time_ms": 3511.768579483032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:41] (step=0006065) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.11785853089778468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6066, "loss": 0.20465990900993347, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7044219970703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:45] (step=0006066) Train Loss: 0.2568, Train Steps/Sec: 0.28, Epoch: 0.1178779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6067, "loss": 0.23322691023349762, "memory_gb": 7.721559524536133, "step_time_ms": 3341.7439460754395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:48] (step=0006067) Train Loss: 0.3048, Train Steps/Sec: 0.29, Epoch: 0.11789739603575593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6068, "loss": 0.23192214965820312, "memory_gb": 7.721559524536133, "step_time_ms": 3348.1650352478027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:52] (step=0006068) Train Loss: 0.2923, Train Steps/Sec: 0.28, Epoch: 0.11791682860474155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6069, "loss": 0.3328338861465454, "memory_gb": 7.721559524536133, "step_time_ms": 3358.82830619812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:56] (step=0006069) Train Loss: 0.2737, Train Steps/Sec: 0.28, Epoch: 0.11793626117372717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:05:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6070, "loss": 0.17815423011779785, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9698543548584, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 06:05:59] (step=0006070) Train Loss: 0.1614, Train Steps/Sec: 0.28, Epoch: 0.11795569374271278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6071, "loss": 0.2966676652431488, "memory_gb": 7.721559524536133, "step_time_ms": 3363.269567489624, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:03] (step=0006071) Train Loss: 0.2782, Train Steps/Sec: 0.28, Epoch: 0.1179751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6072, "loss": 0.24249380826950073, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1391525268555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:07] (step=0006072) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.11799455888068403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6073, "loss": 0.1993575096130371, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0079498291016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:10] (step=0006073) Train Loss: 0.1756, Train Steps/Sec: 0.28, Epoch: 0.11801399144966965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6074, "loss": 0.29718881845474243, "memory_gb": 7.721559524536133, "step_time_ms": 3362.931728363037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:14] (step=0006074) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.11803342401865527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6075, "loss": 0.295419305562973, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3738498687744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:17] (step=0006075) Train Loss: 0.2701, Train Steps/Sec: 0.28, Epoch: 0.11805285658764089, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 06:06:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6076, "loss": 0.1932729035615921, "memory_gb": 7.721559524536133, "step_time_ms": 3354.836940765381, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:21] (step=0006076) Train Loss: 0.1677, Train Steps/Sec: 0.28, Epoch: 0.1180722891566265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6077, "loss": 0.17673009634017944, "memory_gb": 7.721559524536133, "step_time_ms": 3367.353916168213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:25] (step=0006077) Train Loss: 0.1860, Train Steps/Sec: 0.28, Epoch: 0.11809172172561212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6078, "loss": 0.22263622283935547, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8529357910156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:28] (step=0006078) Train Loss: 0.1852, Train Steps/Sec: 0.28, Epoch: 0.11811115429459774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6079, "loss": 0.2428412288427353, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9534034729004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:32] (step=0006079) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.11813058686358337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6080, "loss": 0.21591044962406158, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3200817108154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:35] (step=0006080) Train Loss: 0.2568, Train Steps/Sec: 0.28, Epoch: 0.11815001943256899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6081, "loss": 0.24183455109596252, "memory_gb": 7.721559524536133, "step_time_ms": 
3356.158971786499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:39] (step=0006081) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.11816945200155461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6082, "loss": 0.23908761143684387, "memory_gb": 7.721559524536133, "step_time_ms": 3373.070001602173, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:43] (step=0006082) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.11818888457054022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6083, "loss": 0.24610450863838196, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8624839782715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:46] (step=0006083) Train Loss: 0.2771, Train Steps/Sec: 0.27, Epoch: 0.11820831713952584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6084, "loss": 0.2984924018383026, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7949390411377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:50] (step=0006084) Train Loss: 0.2851, Train Steps/Sec: 0.27, Epoch: 0.11822774970851146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6085, "loss": 0.28158003091812134, "memory_gb": 7.721559524536133, "step_time_ms": 3369.494915008545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:54] (step=0006085) Train Loss: 0.2448, Train Steps/Sec: 0.27, Epoch: 0.11824718227749709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:06:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6086, "loss": 0.2457350492477417, "memory_gb": 7.721559524536133, "step_time_ms": 3375.5135536193848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:06:57] (step=0006086) Train Loss: 0.2195, Train Steps/Sec: 0.27, Epoch: 0.11826661484648271, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6087, "loss": 0.2960338294506073, "memory_gb": 7.721559524536133, "step_time_ms": 3377.8235912323, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:01] (step=0006087) Train Loss: 0.2552, Train Steps/Sec: 0.27, Epoch: 0.11828604741546833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6088, "loss": 0.21875780820846558, "memory_gb": 7.721559524536133, "step_time_ms": 3382.1072578430176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:05] (step=0006088) Train Loss: 0.2451, Train Steps/Sec: 0.27, Epoch: 0.11830547998445394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6089, "loss": 0.23685701191425323, "memory_gb": 7.721559524536133, "step_time_ms": 3380.3014755249023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:08] (step=0006089) Train Loss: 0.2481, Train Steps/Sec: 0.27, Epoch: 0.11832491255343956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6090, "loss": 0.26374661922454834, "memory_gb": 7.721559524536133, "step_time_ms": 3378.244638442993, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:12] (step=0006090) Train Loss: 0.2670, Train Steps/Sec: 0.28, Epoch: 0.11834434512242518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6091, "loss": 0.23049969971179962, "memory_gb": 7.721559524536133, "step_time_ms": 3377.94828414917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:16] (step=0006091) Train Loss: 0.1809, Train Steps/Sec: 0.27, Epoch: 0.11836377769141081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6092, "loss": 0.24797481298446655, "memory_gb": 7.721559524536133, 
"step_time_ms": 3386.5437507629395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:19] (step=0006092) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.11838321026039643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6093, "loss": 0.1404862403869629, "memory_gb": 7.721559524536133, "step_time_ms": 3379.990339279175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:23] (step=0006093) Train Loss: 0.1800, Train Steps/Sec: 0.28, Epoch: 0.11840264282938204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6094, "loss": 0.2987302243709564, "memory_gb": 7.721559524536133, "step_time_ms": 3381.805658340454, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:27] (step=0006094) Train Loss: 0.2881, Train Steps/Sec: 0.28, Epoch: 0.11842207539836766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6095, "loss": 0.3024809956550598, "memory_gb": 7.721559524536133, "step_time_ms": 3374.258279800415, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:30] (step=0006095) Train Loss: 0.2908, Train Steps/Sec: 0.28, Epoch: 0.11844150796735328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6096, "loss": 0.2125779092311859, "memory_gb": 7.721559524536133, "step_time_ms": 3368.980884552002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:34] (step=0006096) Train Loss: 0.1841, Train Steps/Sec: 0.28, Epoch: 0.1184609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6097, "loss": 0.2502623498439789, "memory_gb": 7.721559524536133, "step_time_ms": 3380.9456825256348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:37] (step=0006097) Train Loss: 0.2725, Train Steps/Sec: 0.28, Epoch: 
0.11848037310532453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6098, "loss": 0.22313739359378815, "memory_gb": 7.721559524536133, "step_time_ms": 3382.6708793640137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:41] (step=0006098) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.11849980567431015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6099, "loss": 0.24819305539131165, "memory_gb": 7.721559524536133, "step_time_ms": 3374.237298965454, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:45] (step=0006099) Train Loss: 0.2156, Train Steps/Sec: 0.28, Epoch: 0.11851923824329576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6100, "loss": 0.28846222162246704, "memory_gb": 7.721559524536133, "step_time_ms": 3380.749225616455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:48] (step=0006100) Train Loss: 0.3187, Train Steps/Sec: 0.28, Epoch: 0.11853867081228138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6101, "loss": 0.26314908266067505, "memory_gb": 7.721559524536133, "step_time_ms": 3387.3631954193115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:52] (step=0006101) Train Loss: 0.2248, Train Steps/Sec: 0.26, Epoch: 0.118558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6102, "loss": 0.2020302712917328, "memory_gb": 7.721559524536133, "step_time_ms": 3493.3505058288574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:56] (step=0006102) Train Loss: 0.2279, Train Steps/Sec: 0.26, Epoch: 0.11857753595025262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:07:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6103, "loss": 0.2328384518623352, 
"memory_gb": 7.721559524536133, "step_time_ms": 3371.7358112335205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:07:59] (step=0006103) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.11859696851923825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6104, "loss": 0.2977215647697449, "memory_gb": 7.721559524536133, "step_time_ms": 3381.1237812042236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:03] (step=0006104) Train Loss: 0.3029, Train Steps/Sec: 0.28, Epoch: 0.11861640108822387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6105, "loss": 0.35704857110977173, "memory_gb": 7.721559524536133, "step_time_ms": 3413.1429195404053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:07] (step=0006105) Train Loss: 0.3281, Train Steps/Sec: 0.27, Epoch: 0.11863583365720948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6106, "loss": 0.21598535776138306, "memory_gb": 7.721559524536133, "step_time_ms": 3388.556480407715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:10] (step=0006106) Train Loss: 0.1629, Train Steps/Sec: 0.27, Epoch: 0.1186552662261951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6107, "loss": 0.27437400817871094, "memory_gb": 7.721559524536133, "step_time_ms": 3375.3604888916016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:14] (step=0006107) Train Loss: 0.2794, Train Steps/Sec: 0.28, Epoch: 0.11867469879518072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6108, "loss": 0.20946744084358215, "memory_gb": 7.721559524536133, "step_time_ms": 3382.336139678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:18] (step=0006108) Train Loss: 0.2238, 
Train Steps/Sec: 0.28, Epoch: 0.11869413136416634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6109, "loss": 0.22129608690738678, "memory_gb": 7.721559524536133, "step_time_ms": 3376.323938369751, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:21] (step=0006109) Train Loss: 0.2988, Train Steps/Sec: 0.28, Epoch: 0.11871356393315197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6110, "loss": 0.21826979517936707, "memory_gb": 7.721559524536133, "step_time_ms": 3379.7881603240967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:25] (step=0006110) Train Loss: 0.2825, Train Steps/Sec: 0.28, Epoch: 0.11873299650213759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6111, "loss": 0.19701316952705383, "memory_gb": 7.721559524536133, "step_time_ms": 3374.7286796569824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:29] (step=0006111) Train Loss: 0.1982, Train Steps/Sec: 0.28, Epoch: 0.1187524290711232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6112, "loss": 0.20603761076927185, "memory_gb": 7.721559524536133, "step_time_ms": 3376.6493797302246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:32] (step=0006112) Train Loss: 0.1906, Train Steps/Sec: 0.28, Epoch: 0.11877186164010882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6113, "loss": 0.34497103095054626, "memory_gb": 7.721559524536133, "step_time_ms": 3510.220527648926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:36] (step=0006113) Train Loss: 0.3169, Train Steps/Sec: 0.28, Epoch: 0.11879129420909444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6114, "loss": 
0.21614189445972443, "memory_gb": 7.721559524536133, "step_time_ms": 3374.0057945251465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:39] (step=0006114) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.11881072677808006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6115, "loss": 0.1769181340932846, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0010051727295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:43] (step=0006115) Train Loss: 0.1787, Train Steps/Sec: 0.28, Epoch: 0.11883015934706569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6116, "loss": 0.2772563695907593, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7188415527344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:47] (step=0006116) Train Loss: 0.2813, Train Steps/Sec: 0.28, Epoch: 0.11884959191605131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6117, "loss": 0.21863943338394165, "memory_gb": 7.721559524536133, "step_time_ms": 3377.819776535034, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:50] (step=0006117) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.11886902448503692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6118, "loss": 0.33541303873062134, "memory_gb": 7.721559524536133, "step_time_ms": 3385.219097137451, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:54] (step=0006118) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.11888845705402254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:08:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6119, "loss": 0.25189483165740967, "memory_gb": 7.721559524536133, "step_time_ms": 3387.4545097351074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:08:58] (step=0006119) 
Train Loss: 0.2744, Train Steps/Sec: 0.27, Epoch: 0.11890788962300816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6120, "loss": 0.12322120368480682, "memory_gb": 7.721559524536133, "step_time_ms": 3372.8787899017334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:01] (step=0006120) Train Loss: 0.1800, Train Steps/Sec: 0.28, Epoch: 0.11892732219199378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6121, "loss": 0.22997009754180908, "memory_gb": 7.721559524536133, "step_time_ms": 3383.5365772247314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:05] (step=0006121) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.1189467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6122, "loss": 0.15703856945037842, "memory_gb": 7.721559524536133, "step_time_ms": 3380.5694580078125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:08] (step=0006122) Train Loss: 0.1971, Train Steps/Sec: 0.28, Epoch: 0.11896618732996501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6123, "loss": 0.2289896309375763, "memory_gb": 7.721559524536133, "step_time_ms": 3379.477024078369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:12] (step=0006123) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.11898561989895064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6124, "loss": 0.36845892667770386, "memory_gb": 7.721559524536133, "step_time_ms": 3378.7875175476074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:16] (step=0006124) Train Loss: 0.3009, Train Steps/Sec: 0.28, Epoch: 0.11900505246793626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:19] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 6125, "loss": 0.18078774213790894, "memory_gb": 7.721559524536133, "step_time_ms": 3377.384901046753, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:19] (step=0006125) Train Loss: 0.2229, Train Steps/Sec: 0.27, Epoch: 0.11902448503692188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6126, "loss": 0.24660329520702362, "memory_gb": 7.721559524536133, "step_time_ms": 3377.6535987854004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:23] (step=0006126) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.1190439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6127, "loss": 0.17708173394203186, "memory_gb": 7.721559524536133, "step_time_ms": 3377.8858184814453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:27] (step=0006127) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.11906335017489313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6128, "loss": 0.1650436520576477, "memory_gb": 7.721559524536133, "step_time_ms": 3371.8018531799316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:30] (step=0006128) Train Loss: 0.1742, Train Steps/Sec: 0.28, Epoch: 0.11908278274387873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6129, "loss": 0.19564104080200195, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1064071655273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:34] (step=0006129) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.11910221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6130, "loss": 0.1941329836845398, "memory_gb": 7.721559524536133, "step_time_ms": 3372.4825382232666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
06:09:37] (step=0006130) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.11912164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6131, "loss": 0.3860935568809509, "memory_gb": 7.721559524536133, "step_time_ms": 3369.030475616455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:41] (step=0006131) Train Loss: 0.3021, Train Steps/Sec: 0.28, Epoch: 0.1191410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6132, "loss": 0.16050292551517487, "memory_gb": 7.721559524536133, "step_time_ms": 3372.037887573242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:45] (step=0006132) Train Loss: 0.1543, Train Steps/Sec: 0.28, Epoch: 0.11916051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6133, "loss": 0.23630467057228088, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6301288604736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:48] (step=0006133) Train Loss: 0.3198, Train Steps/Sec: 0.28, Epoch: 0.11917994558880685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6134, "loss": 0.206525057554245, "memory_gb": 7.721559524536133, "step_time_ms": 3367.042303085327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:52] (step=0006134) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.11919937815779245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6135, "loss": 0.3084166646003723, "memory_gb": 7.721559524536133, "step_time_ms": 3375.01859664917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:56] (step=0006135) Train Loss: 0.2741, Train Steps/Sec: 0.27, Epoch: 0.11921881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:09:59] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 6136, "loss": 0.3076675534248352, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5200424194336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:09:59] (step=0006136) Train Loss: 0.3269, Train Steps/Sec: 0.28, Epoch: 0.1192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6137, "loss": 0.23281437158584595, "memory_gb": 7.721559524536133, "step_time_ms": 3371.227025985718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:03] (step=0006137) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.11925767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6138, "loss": 0.2230251431465149, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6326484680176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:06] (step=0006138) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.11927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6139, "loss": 0.20323887467384338, "memory_gb": 7.721559524536133, "step_time_ms": 3375.02121925354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:10] (step=0006139) Train Loss: 0.1742, Train Steps/Sec: 0.28, Epoch: 0.11929654100272057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6140, "loss": 0.29211103916168213, "memory_gb": 7.721559524536133, "step_time_ms": 3372.298002243042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:14] (step=0006140) Train Loss: 0.2994, Train Steps/Sec: 0.28, Epoch: 0.11931597357170617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6141, "loss": 0.22354410588741302, "memory_gb": 7.721559524536133, "step_time_ms": 3374.5367527008057, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 06:10:17] (step=0006141) Train Loss: 0.2502, Train Steps/Sec: 0.26, Epoch: 0.1193354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6142, "loss": 0.2609415054321289, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5423908233643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:21] (step=0006142) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.11935483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6143, "loss": 0.18991142511367798, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7775344848633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:25] (step=0006143) Train Loss: 0.2155, Train Steps/Sec: 0.27, Epoch: 0.11937427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6144, "loss": 0.21503108739852905, "memory_gb": 7.721559524536133, "step_time_ms": 3370.637893676758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:28] (step=0006144) Train Loss: 0.1682, Train Steps/Sec: 0.28, Epoch: 0.11939370384764866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6145, "loss": 0.19113188982009888, "memory_gb": 7.721559524536133, "step_time_ms": 3369.9233531951904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:32] (step=0006145) Train Loss: 0.2064, Train Steps/Sec: 0.27, Epoch: 0.11941313641663429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6146, "loss": 0.2434363067150116, "memory_gb": 7.721559524536133, "step_time_ms": 3357.558250427246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:36] (step=0006146) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.1194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 06:10:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6147, "loss": 0.2836487293243408, "memory_gb": 7.721559524536133, "step_time_ms": 3365.961790084839, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:39] (step=0006147) Train Loss: 0.2753, Train Steps/Sec: 0.28, Epoch: 0.11945200155460552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6148, "loss": 0.2882903516292572, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7726726531982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:43] (step=0006148) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.11947143412359114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6149, "loss": 0.24588236212730408, "memory_gb": 7.721559524536133, "step_time_ms": 3371.98805809021, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:47] (step=0006149) Train Loss: 0.2583, Train Steps/Sec: 0.27, Epoch: 0.11949086669257676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6150, "loss": 0.2535278797149658, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3222789764404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:50] (step=0006150) Train Loss: 0.2756, Train Steps/Sec: 0.27, Epoch: 0.11951029926156238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6151, "loss": 0.3504270911216736, "memory_gb": 7.715639114379883, "step_time_ms": 3334.8162174224854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:54] (step=0006151) Train Loss: 0.3195, Train Steps/Sec: 0.28, Epoch: 0.11952973183054799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:10:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6152, "loss": 0.17057281732559204, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1296367645264, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 06:10:57] (step=0006152) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.11954916439953361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:11:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6153, "loss": 0.29019251465797424, "memory_gb": 7.721559524536133, "step_time_ms": 3503.4587383270264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:11:01] (step=0006153) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.11956859696851924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:11:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6154, "loss": 0.16579987108707428, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4600944519043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:11:05] (step=0006154) Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.11958802953750486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:11:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6155, "loss": 0.2712685763835907, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8316135406494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:11:08] (step=0006155) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.11960746210649048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:11:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6156, "loss": 0.15645450353622437, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1598949432373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:11:12] (step=0006156) Train Loss: 0.2412, Train Steps/Sec: 0.28, Epoch: 0.1196268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:11:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6157, "loss": 0.30529606342315674, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4135932922363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:11:16] (step=0006157) Train Loss: 0.2551, Train Steps/Sec: 0.28, Epoch: 0.11964632724446171, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6158, "loss": 0.3663553297519684, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9726219177246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:19] (step=0006158) Train Loss: 0.2780, Train Steps/Sec: 0.28, Epoch: 0.11966575981344733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6159, "loss": 0.23416388034820557, "memory_gb": 7.721559524536133, "step_time_ms": 3363.455295562744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:23] (step=0006159) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.11968519238243296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6160, "loss": 0.22230932116508484, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8289909362793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:26] (step=0006160) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.11970462495141858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6161, "loss": 0.23331457376480103, "memory_gb": 7.721559524536133, "step_time_ms": 3354.896306991577, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:30] (step=0006161) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.1197240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6162, "loss": 0.20346413552761078, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3549518585205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:34] (step=0006162) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.11974349008938982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6163, "loss": 0.18235978484153748, "memory_gb": 7.721559524536133, "step_time_ms": 3364.393472671509, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:37] (step=0006163) Train Loss: 0.2629, Train Steps/Sec: 0.28, Epoch: 0.11976292265837543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6164, "loss": 0.1783650815486908, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1208877563477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:41] (step=0006164) Train Loss: 0.2637, Train Steps/Sec: 0.27, Epoch: 0.11978235522736105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6165, "loss": 0.10018131881952286, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6669387817383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:45] (step=0006165) Train Loss: 0.1862, Train Steps/Sec: 0.28, Epoch: 0.11980178779634668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6166, "loss": 0.3812676668167114, "memory_gb": 7.721559524536133, "step_time_ms": 3365.906000137329, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:48] (step=0006166) Train Loss: 0.3023, Train Steps/Sec: 0.28, Epoch: 0.1198212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6167, "loss": 0.14733947813510895, "memory_gb": 7.721559524536133, "step_time_ms": 3362.795114517212, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:52] (step=0006167) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.11984065293431792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6168, "loss": 0.2073131799697876, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2265796661377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:55] (step=0006168) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.11986008550330354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:11:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6169, "loss": 0.248548686504364, "memory_gb": 7.721559524536133, "step_time_ms": 3363.401412963867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:11:59] (step=0006169) Train Loss: 0.2893, Train Steps/Sec: 0.28, Epoch: 0.11987951807228915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6170, "loss": 0.22991971671581268, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1131134033203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:03] (step=0006170) Train Loss: 0.1786, Train Steps/Sec: 0.27, Epoch: 0.11989895064127477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6171, "loss": 0.21297577023506165, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7892990112305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:06] (step=0006171) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.1199183832102604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6172, "loss": 0.3080269694328308, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2917404174805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:10] (step=0006172) Train Loss: 0.2683, Train Steps/Sec: 0.27, Epoch: 0.11993781577924602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6173, "loss": 0.32935014367103577, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5222702026367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:14] (step=0006173) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.11995724834823164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6174, "loss": 0.27585121989250183, "memory_gb": 7.721559524536133, "step_time_ms": 3362.640380859375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:17] (step=0006174) Train Loss: 0.2893, Train Steps/Sec: 0.28, Epoch: 0.11997668091721726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6175, "loss": 0.2518988847732544, "memory_gb": 7.721559524536133, "step_time_ms": 3360.028028488159, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:21] (step=0006175) Train Loss: 0.2106, Train Steps/Sec: 0.27, Epoch: 0.11999611348620287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6176, "loss": 0.36970221996307373, "memory_gb": 7.721559524536133, "step_time_ms": 3364.055871963501, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:25] (step=0006176) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.1200155460551885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6177, "loss": 0.3181421756744385, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4862174987793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:28] (step=0006177) Train Loss: 0.2373, Train Steps/Sec: 0.27, Epoch: 0.12003497862417412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6178, "loss": 0.2588537037372589, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1319999694824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:32] (step=0006178) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.12005441119315974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6179, "loss": 0.28749722242355347, "memory_gb": 7.715639114379883, "step_time_ms": 3333.678722381592, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:36] (step=0006179) Train Loss: 0.2623, Train Steps/Sec: 0.27, Epoch: 0.12007384376214536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6180, "loss": 0.18962843716144562, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4632358551025, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:39] (step=0006180) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.12009327633113097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6181, "loss": 0.26004961133003235, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3758087158203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:43] (step=0006181) Train Loss: 0.2882, Train Steps/Sec: 0.27, Epoch: 0.12011270890011659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6182, "loss": 0.20019418001174927, "memory_gb": 7.721559524536133, "step_time_ms": 3360.935926437378, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:46] (step=0006182) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.12013214146910221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6183, "loss": 0.19313906133174896, "memory_gb": 7.721559524536133, "step_time_ms": 3370.666980743408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:50] (step=0006183) Train Loss: 0.2690, Train Steps/Sec: 0.27, Epoch: 0.12015157403808784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6184, "loss": 0.24529924988746643, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0900382995605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:54] (step=0006184) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.12017100660707346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:12:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6185, "loss": 0.23109662532806396, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6647930145264, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:12:57] (step=0006185) Train Loss: 0.2070, Train Steps/Sec: 0.27, Epoch: 0.12019043917605908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6186, "loss": 0.28389155864715576, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0804290771484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:01] (step=0006186) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.12020987174504469, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6187, "loss": 0.21266144514083862, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8622035980225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:05] (step=0006187) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.12022930431403031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6188, "loss": 0.25746792554855347, "memory_gb": 7.721559524536133, "step_time_ms": 3365.25297164917, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:08] (step=0006188) Train Loss: 0.2300, Train Steps/Sec: 0.26, Epoch: 0.12024873688301593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6189, "loss": 0.1305031180381775, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8077716827393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:12] (step=0006189) Train Loss: 0.2411, Train Steps/Sec: 0.27, Epoch: 0.12026816945200156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6190, "loss": 0.3262937068939209, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4279499053955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:16] (step=0006190) Train Loss: 0.3203, Train Steps/Sec: 0.28, Epoch: 0.12028760202098718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6191, "loss": 0.24452750384807587, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4121627807617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:19] (step=0006191) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.1203070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6192, "loss": 0.28136032819747925, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0191040039062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:23] (step=0006192) Train Loss: 0.1868, Train Steps/Sec: 0.28, Epoch: 0.12032646715895841, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6193, "loss": 0.23178952932357788, "memory_gb": 7.721559524536133, "step_time_ms": 3367.030620574951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:27] (step=0006193) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.12034589972794403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6194, "loss": 0.26469701528549194, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9462928771973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:30] (step=0006194) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.12036533229692965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6195, "loss": 0.1253843605518341, "memory_gb": 7.721559524536133, "step_time_ms": 3367.673635482788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:34] (step=0006195) Train Loss: 0.1882, Train Steps/Sec: 0.27, Epoch: 0.12038476486591528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6196, "loss": 0.34515658020973206, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7449016571045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:38] (step=0006196) Train Loss: 0.2984, Train Steps/Sec: 0.28, Epoch: 0.1204041974349009, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6197, "loss": 0.16608618199825287, "memory_gb": 7.721559524536133, "step_time_ms": 3369.617700576782, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:41] (step=0006197) Train Loss: 0.1969, Train Steps/Sec: 0.27, Epoch: 0.12042363000388652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6198, "loss": 0.33148467540740967, "memory_gb": 7.721559524536133, "step_time_ms": 3367.97833442688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:45] (step=0006198) Train Loss: 0.2945, Train Steps/Sec: 0.28, Epoch: 0.12044306257287213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6199, "loss": 0.25299742817878723, "memory_gb": 7.721559524536133, "step_time_ms": 3369.978427886963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:48] (step=0006199) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.12046249514185775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6200, "loss": 0.22535505890846252, "memory_gb": 7.721559524536133, "step_time_ms": 3517.0369148254395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:52] (step=0006200) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.12048192771084337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6201, "loss": 0.23227562010288239, "memory_gb": 7.721559524536133, "step_time_ms": 3372.420072555542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:56] (step=0006201) Train Loss: 0.2467, Train Steps/Sec: 0.27, Epoch: 0.120501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:13:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6202, "loss": 0.30950361490249634, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4774169921875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:13:59] (step=0006202) Train Loss: 0.2931, Train Steps/Sec: 0.28, Epoch: 0.12052079284881462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6203, "loss": 0.261863648891449, "memory_gb": 7.721559524536133, "step_time_ms": 3367.396831512451, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:03] (step=0006203) Train Loss: 0.2685, Train Steps/Sec: 0.28, Epoch: 0.12054022541780024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6204, "loss": 0.3320844769477844, "memory_gb": 7.721559524536133, "step_time_ms": 3367.006301879883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:07] (step=0006204) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.12055965798678585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6205, "loss": 0.26082664728164673, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6741333007812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:10] (step=0006205) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.12057909055577147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6206, "loss": 0.12900809943675995, "memory_gb": 7.721559524536133, "step_time_ms": 3364.370346069336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:14] (step=0006206) Train Loss: 0.1714, Train Steps/Sec: 0.28, Epoch: 0.12059852312475709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6207, "loss": 0.208745539188385, "memory_gb": 7.721559524536133, "step_time_ms": 3364.57896232605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:18] (step=0006207) Train Loss: 0.2134, Train Steps/Sec: 0.28, Epoch: 0.12061795569374271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6208, "loss": 0.2820231020450592, "memory_gb": 7.721559524536133, "step_time_ms": 3372.1837997436523, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:21] (step=0006208) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.12063738826272834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6209, "loss": 0.2086297869682312, "memory_gb": 7.721559524536133, "step_time_ms": 3372.8878498077393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:25] (step=0006209) Train Loss: 0.2057, Train Steps/Sec: 0.27, Epoch: 0.12065682083171395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6210, "loss": 0.31889551877975464, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9581413269043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:28] (step=0006210) Train Loss: 0.2737, Train Steps/Sec: 0.28, Epoch: 0.12067625340069957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6211, "loss": 0.18484562635421753, "memory_gb": 7.721559524536133, "step_time_ms": 3373.321771621704, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:32] (step=0006211) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.12069568596968519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6212, "loss": 0.2795960307121277, "memory_gb": 7.721559524536133, "step_time_ms": 3356.149673461914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:36] (step=0006212) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.12071511853867081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6213, "loss": 0.300474613904953, "memory_gb": 7.721559524536133, "step_time_ms": 3376.5504360198975, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:39] (step=0006213) Train Loss: 0.2977, Train Steps/Sec: 0.27, Epoch: 0.12073455110765643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6214, "loss": 0.26681265234947205, "memory_gb": 7.721559524536133, "step_time_ms": 3376.0979175567627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:43] (step=0006214) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.12075398367664206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6215, "loss": 0.20835661888122559, "memory_gb": 7.721559524536133, "step_time_ms": 3375.7224082946777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:47] (step=0006215) Train Loss: 0.2125, Train Steps/Sec: 0.28, Epoch: 0.12077341624562767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6216, "loss": 0.271798312664032, "memory_gb": 7.721559524536133, "step_time_ms": 3377.115249633789, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:50] (step=0006216) Train Loss: 0.2633, Train Steps/Sec: 0.28, Epoch: 0.12079284881461329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6217, "loss": 0.18566745519638062, "memory_gb": 7.721559524536133, "step_time_ms": 3381.2413215637207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:54] (step=0006217) Train Loss: 0.1872, Train Steps/Sec: 0.27, Epoch: 0.12081228138359891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:14:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6218, "loss": 0.1291477084159851, "memory_gb": 7.721559524536133, "step_time_ms": 3380.070447921753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:14:58] (step=0006218) Train Loss: 0.1807, Train Steps/Sec: 0.27, Epoch: 0.12083171395258453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6219, "loss": 0.2639803886413574, "memory_gb": 7.715639114379883, "step_time_ms": 3344.0029621124268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:01] (step=0006219) Train Loss: 0.2773, Train Steps/Sec: 0.28, Epoch: 0.12085114652157015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6220, "loss": 0.34250420331954956, "memory_gb": 7.721559524536133, "step_time_ms": 3383.263349533081, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:05] (step=0006220) Train Loss: 0.3162, Train Steps/Sec: 0.28, Epoch: 0.12087057909055578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6221, "loss": 0.1968156099319458, "memory_gb": 7.721559524536133, "step_time_ms": 3380.6021213531494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:08] (step=0006221) Train Loss: 0.1767, Train Steps/Sec: 0.28, Epoch: 0.12089001165954139, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6222, "loss": 0.1809539943933487, "memory_gb": 7.721559524536133, "step_time_ms": 3380.69486618042, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:12] (step=0006222) Train Loss: 0.1686, Train Steps/Sec: 0.28, Epoch: 0.12090944422852701, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6223, "loss": 0.21702411770820618, "memory_gb": 7.721559524536133, "step_time_ms": 3378.8022994995117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:16] (step=0006223) Train Loss: 0.2280, Train Steps/Sec: 0.27, Epoch: 0.12092887679751263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6224, "loss": 0.24876394867897034, "memory_gb": 7.721559524536133, "step_time_ms": 3379.751205444336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:19] (step=0006224) Train Loss: 0.2932, Train Steps/Sec: 0.28, Epoch: 0.12094830936649825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6225, "loss": 0.14719676971435547, "memory_gb": 7.721559524536133, "step_time_ms": 3379.117965698242, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:23] (step=0006225) Train Loss: 0.2013, Train Steps/Sec: 0.28, Epoch: 0.12096774193548387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6226, "loss": 0.2286646068096161, "memory_gb": 7.721559524536133, "step_time_ms": 3375.915288925171, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:27] (step=0006226) Train Loss: 0.2059, Train Steps/Sec: 0.27, Epoch: 0.1209871745044695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6227, "loss": 0.3196852207183838, "memory_gb": 7.721559524536133, "step_time_ms": 3396.869421005249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:30] (step=0006227) Train Loss: 0.2907, Train Steps/Sec: 0.27, Epoch: 0.1210066070734551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6228, "loss": 0.2454218715429306, "memory_gb": 7.721559524536133, "step_time_ms": 3382.164239883423, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:34] (step=0006228) Train Loss: 0.2489, Train Steps/Sec: 0.27, Epoch: 0.12102603964244073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6229, "loss": 0.3297916054725647, "memory_gb": 7.721559524536133, "step_time_ms": 3384.6564292907715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:38] (step=0006229) Train Loss: 0.2478, Train Steps/Sec: 0.26, Epoch: 0.12104547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6230, "loss": 0.33328258991241455, "memory_gb": 7.721559524536133, "step_time_ms": 3374.2716312408447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:41] (step=0006230) Train Loss: 0.2847, Train Steps/Sec: 0.27, Epoch: 0.12106490478041197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6231, "loss": 0.1592726856470108, "memory_gb": 7.721559524536133, "step_time_ms": 3373.178720474243, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:45] (step=0006231) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.1210843373493976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6232, "loss": 0.2166113257408142, "memory_gb": 7.721559524536133, "step_time_ms": 3373.582124710083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:49] (step=0006232) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.12110376991838322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6233, "loss": 0.2996790409088135, "memory_gb": 7.721559524536133, "step_time_ms": 3372.3926544189453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:52] (step=0006233) Train Loss: 0.3019, Train Steps/Sec: 0.27, Epoch: 0.12112320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:15:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6234, "loss": 0.2961103916168213, "memory_gb": 7.721559524536133, "step_time_ms": 3374.2849826812744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:15:56] (step=0006234) Train Loss: 0.3070, Train Steps/Sec: 0.27, Epoch: 0.12114263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6235, "loss": 0.2290104180574417, "memory_gb": 7.721559524536133, "step_time_ms": 3371.429204940796, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:00] (step=0006235) Train Loss: 0.2689, Train Steps/Sec: 0.28, Epoch: 0.12116206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6236, "loss": 0.31930357217788696, "memory_gb": 7.721559524536133, "step_time_ms": 3372.5502490997314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:03] (step=0006236) Train Loss: 0.2542, Train Steps/Sec: 0.27, Epoch: 0.12118150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6237, "loss": 0.21078050136566162, "memory_gb": 7.721559524536133, "step_time_ms": 3373.162269592285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:07] (step=0006237) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.12120093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6238, "loss": 0.26986926794052124, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2850341796875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:11] (step=0006238) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.12122036533229694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6239, "loss": 0.3809790015220642, "memory_gb": 7.721559524536133, "step_time_ms": 3378.291130065918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:14] (step=0006239) Train Loss: 0.3166, Train Steps/Sec: 0.28, Epoch: 0.12123979790128254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6240, "loss": 0.18722768127918243, "memory_gb": 7.721559524536133, "step_time_ms": 3373.6095428466797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:18] (step=0006240) Train Loss: 0.1820, Train Steps/Sec: 0.28, Epoch: 0.12125923047026817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6241, "loss": 0.24085088074207306, "memory_gb": 7.721559524536133, "step_time_ms": 3516.1776542663574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:21] (step=0006241) Train Loss: 0.2835, Train Steps/Sec: 0.28, Epoch: 0.12127866303925379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6242, "loss": 0.2996543347835541, "memory_gb": 7.721559524536133, "step_time_ms": 3372.858762741089, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:25] (step=0006242) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.12129809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6243, "loss": 0.34791699051856995, "memory_gb": 7.721559524536133, "step_time_ms": 3375.049114227295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:29] (step=0006243) Train Loss: 0.3122, Train Steps/Sec: 0.28, Epoch: 0.12131752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6244, "loss": 0.28857314586639404, "memory_gb": 7.721559524536133, "step_time_ms": 3372.7190494537354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:32] (step=0006244) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.12133696074621064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6245, "loss": 0.2618837356567383, "memory_gb": 7.721559524536133, "step_time_ms": 3371.866464614868, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:36] (step=0006245) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.12135639331519626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6246, "loss": 0.19703897833824158, "memory_gb": 7.721559524536133, "step_time_ms": 3371.09375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:40] (step=0006246) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.12137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6247, "loss": 0.18282181024551392, "memory_gb": 7.721559524536133, "step_time_ms": 3374.404191970825, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:43] (step=0006247) Train Loss: 0.1707, Train Steps/Sec: 0.28, Epoch: 0.12139525845316751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6248, "loss": 0.21869242191314697, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9841709136963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:47] (step=0006248) Train Loss: 0.2412, Train Steps/Sec: 0.27, Epoch: 0.12141469102215313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6249, "loss": 0.28484654426574707, "memory_gb": 7.721559524536133, "step_time_ms": 3371.1366653442383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:51] (step=0006249) Train Loss: 0.2410, Train Steps/Sec: 0.27, Epoch: 0.12143412359113875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6250, "loss": 0.2498030960559845, "memory_gb": 7.721559524536133, "step_time_ms": 3372.8506565093994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:54] (step=0006250) Train Loss: 0.2401, Train Steps/Sec: 0.27, Epoch: 0.12145355616012436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:16:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6251, "loss": 0.1862882673740387, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9944744110107, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:16:58] (step=0006251) Train Loss: 0.2076, Train Steps/Sec: 0.27, Epoch: 0.12147298872910998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6252, "loss": 0.23167598247528076, "memory_gb": 7.721559524536133, "step_time_ms": 3369.605779647827, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:01] (step=0006252) Train Loss: 0.2437, Train Steps/Sec: 0.27, Epoch: 0.12149242129809561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6253, "loss": 0.2961939573287964, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0079498291016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:05] (step=0006253) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.12151185386708123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6254, "loss": 0.2850784957408905, "memory_gb": 7.721559524536133, "step_time_ms": 3371.8864917755127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:09] (step=0006254) Train Loss: 0.2938, Train Steps/Sec: 0.28, Epoch: 0.12153128643606685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6255, "loss": 0.29074689745903015, "memory_gb": 7.721559524536133, "step_time_ms": 3373.7919330596924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:12] (step=0006255) Train Loss: 0.2801, Train Steps/Sec: 0.27, Epoch: 0.12155071900505247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6256, "loss": 0.11399523913860321, "memory_gb": 7.721559524536133, "step_time_ms": 3371.6800212860107, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:16] (step=0006256) Train Loss: 0.1738, Train Steps/Sec: 0.27, Epoch: 0.12157015157403808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6257, "loss": 0.31283652782440186, "memory_gb": 7.721559524536133, "step_time_ms": 3373.748540878296, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:20] (step=0006257) Train Loss: 0.2932, Train Steps/Sec: 0.27, Epoch: 0.1215895841430237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6258, "loss": 0.2564261555671692, "memory_gb": 7.721559524536133, "step_time_ms": 3372.1961975097656, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:23] (step=0006258) Train Loss: 0.2883, Train Steps/Sec: 0.27, Epoch: 0.12160901671200933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6259, "loss": 0.28954142332077026, "memory_gb": 7.721559524536133, "step_time_ms": 3372.957944869995, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:27] (step=0006259) Train Loss: 0.2142, Train Steps/Sec: 0.27, Epoch: 0.12162844928099495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6260, "loss": 0.2997628450393677, "memory_gb": 7.721559524536133, "step_time_ms": 3369.610071182251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:31] (step=0006260) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.12164788184998057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6261, "loss": 0.3124489188194275, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2306747436523, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:34] (step=0006261) Train Loss: 0.2478, Train Steps/Sec: 0.27, Epoch: 0.1216673144189662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6262, "loss": 0.21212491393089294, "memory_gb": 7.721559524536133, "step_time_ms": 3368.95751953125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:38] (step=0006262) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.1216867469879518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6263, "loss": 0.20576128363609314, "memory_gb": 7.721559524536133, "step_time_ms": 3370.9354400634766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:42] (step=0006263) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.12170617955693742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6264, "loss": 0.21813026070594788, "memory_gb": 7.721559524536133, "step_time_ms": 3375.6980895996094, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:45] (step=0006264) Train Loss: 0.2011, Train Steps/Sec: 0.28, Epoch: 0.12172561212592305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6265, "loss": 0.28154903650283813, "memory_gb": 7.721559524536133, "step_time_ms": 3373.6023902893066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:49] (step=0006265) Train Loss: 0.2495, Train Steps/Sec: 0.27, Epoch: 0.12174504469490867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6266, "loss": 0.21688926219940186, "memory_gb": 7.721559524536133, "step_time_ms": 3369.9910640716553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:52] (step=0006266) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.12176447726389429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:17:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6267, "loss": 0.1733373999595642, "memory_gb": 7.721559524536133, "step_time_ms": 3370.149850845337, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:17:56] (step=0006267) Train Loss: 0.1574, Train Steps/Sec: 0.28, Epoch: 0.12178390983287991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:18:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6268, "loss": 0.2152458131313324, "memory_gb": 7.721559524536133, "step_time_ms": 3372.164011001587, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:18:00] (step=0006268) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.12180334240186552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:18:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6269, "loss": 0.1979578733444214, "memory_gb": 7.721559524536133, "step_time_ms": 3373.1844425201416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:18:03] (step=0006269) Train Loss: 0.2162, Train Steps/Sec: 0.27, Epoch: 0.12182277497085114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6270, "loss": 0.32550475001335144, "memory_gb": 7.721559524536133, "step_time_ms": 3372.3433017730713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:18:07] (step=0006270) Train Loss: 0.2710, Train Steps/Sec: 0.27, Epoch: 0.12184220753983677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:18:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6271, "loss": 0.1723865121603012, "memory_gb": 7.721559524536133, "step_time_ms": 3373.0242252349854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:18:11] (step=0006271) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.12186164010882239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:18:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6272, "loss": 0.3216767907142639, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6158657073975, "trainable_params": 4718592, "method": "lora"}
[2025-07-29
06:18:14] (step=0006272) Train Loss: 0.3103, Train Steps/Sec: 0.28, Epoch: 0.12188107267780801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6273, "loss": 0.2579799294471741, "memory_gb": 7.721559524536133, "step_time_ms": 3368.032455444336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:18] (step=0006273) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.12190050524679362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6274, "loss": 0.25829803943634033, "memory_gb": 7.721559524536133, "step_time_ms": 3370.9917068481445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:22] (step=0006274) Train Loss: 0.2813, Train Steps/Sec: 0.27, Epoch: 0.12191993781577924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6275, "loss": 0.20537802577018738, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6226348876953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:25] (step=0006275) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.12193937038476486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6276, "loss": 0.2529803216457367, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4728870391846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:29] (step=0006276) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.12195880295375049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6277, "loss": 0.3364531993865967, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5104122161865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:33] (step=0006277) Train Loss: 0.3303, Train Steps/Sec: 0.27, Epoch: 0.12197823552273611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:36] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 6278, "loss": 0.18808913230895996, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4728870391846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:36] (step=0006278) Train Loss: 0.2156, Train Steps/Sec: 0.28, Epoch: 0.12199766809172173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6279, "loss": 0.2160467654466629, "memory_gb": 7.721559524536133, "step_time_ms": 3367.521047592163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:40] (step=0006279) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.12201710066070734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6280, "loss": 0.32469040155410767, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8065071105957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:44] (step=0006280) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.12203653322969296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6281, "loss": 0.2967880368232727, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2679405212402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:47] (step=0006281) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.12205596579867858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6282, "loss": 0.21374504268169403, "memory_gb": 7.721559524536133, "step_time_ms": 3368.846893310547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:51] (step=0006282) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.1220753983676642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6283, "loss": 0.2790791392326355, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7801876068115, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 06:18:54] (step=0006283) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.12209483093664983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:18:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6284, "loss": 0.24897484481334686, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1822547912598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:18:58] (step=0006284) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.12211426350563545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6285, "loss": 0.21676325798034668, "memory_gb": 7.721559524536133, "step_time_ms": 3372.159957885742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:02] (step=0006285) Train Loss: 0.2042, Train Steps/Sec: 0.27, Epoch: 0.12213369607462106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6286, "loss": 0.2599213123321533, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4651851654053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:05] (step=0006286) Train Loss: 0.2615, Train Steps/Sec: 0.27, Epoch: 0.12215312864360668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6287, "loss": 0.22962313890457153, "memory_gb": 7.721559524536133, "step_time_ms": 3367.889642715454, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:09] (step=0006287) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.1221725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6288, "loss": 0.22033648192882538, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7409629821777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:13] (step=0006288) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.12219199378157793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 06:19:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6289, "loss": 0.2105085253715515, "memory_gb": 7.721559524536133, "step_time_ms": 3519.024610519409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:16] (step=0006289) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.12221142635056355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6290, "loss": 0.33954283595085144, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1586303710938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:20] (step=0006290) Train Loss: 0.3165, Train Steps/Sec: 0.28, Epoch: 0.12223085891954917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6291, "loss": 0.1752636432647705, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9590950012207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:24] (step=0006291) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.12225029148853478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6292, "loss": 0.1971820592880249, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4612255096436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:27] (step=0006292) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.1222697240575204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6293, "loss": 0.21754267811775208, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1071739196777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:31] (step=0006293) Train Loss: 0.2731, Train Steps/Sec: 0.27, Epoch: 0.12228915662650602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6294, "loss": 0.2667781710624695, "memory_gb": 7.721559524536133, "step_time_ms": 3369.462728500366, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:34] (step=0006294) Train Loss: 0.2246, Train Steps/Sec: 0.27, Epoch: 0.12230858919549165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6295, "loss": 0.3141961395740509, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9237365722656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:38] (step=0006295) Train Loss: 0.2964, Train Steps/Sec: 0.28, Epoch: 0.12232802176447727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6296, "loss": 0.3208853006362915, "memory_gb": 7.721559524536133, "step_time_ms": 3367.117404937744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:42] (step=0006296) Train Loss: 0.2701, Train Steps/Sec: 0.28, Epoch: 0.12234745433346289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6297, "loss": 0.18550661206245422, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8063201904297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:45] (step=0006297) Train Loss: 0.1821, Train Steps/Sec: 0.28, Epoch: 0.1223668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6298, "loss": 0.2533208131790161, "memory_gb": 7.721559524536133, "step_time_ms": 3362.56742477417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:49] (step=0006298) Train Loss: 0.2710, Train Steps/Sec: 0.27, Epoch: 0.12238631947143412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:19:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6299, "loss": 0.1296001672744751, "memory_gb": 7.721559524536133, "step_time_ms": 3364.046573638916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:53] (step=0006299) Train Loss: 0.1538, Train Steps/Sec: 0.28, Epoch: 0.12240575204041974, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 06:19:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6300, "loss": 0.23039484024047852, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7820739746094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:19:56] (step=0006300) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.12242518460940537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6301, "loss": 0.24729859828948975, "memory_gb": 7.721559524536133, "step_time_ms": 3364.802837371826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:00] (step=0006301) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.12244461717839099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6302, "loss": 0.30857646465301514, "memory_gb": 7.721559524536133, "step_time_ms": 3347.055435180664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:04] (step=0006302) Train Loss: 0.3203, Train Steps/Sec: 0.28, Epoch: 0.1224640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6303, "loss": 0.2048155665397644, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0738258361816, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:07] (step=0006303) Train Loss: 0.1882, Train Steps/Sec: 0.28, Epoch: 0.12248348231636222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6304, "loss": 0.3703233599662781, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8756275177, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:11] (step=0006304) Train Loss: 0.3159, Train Steps/Sec: 0.28, Epoch: 0.12250291488534784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6305, "loss": 0.26635557413101196, "memory_gb": 7.721559524536133, "step_time_ms": 
3360.0940704345703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:14] (step=0006305) Train Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.12252234745433346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6306, "loss": 0.23407433927059174, "memory_gb": 7.721559524536133, "step_time_ms": 3361.111640930176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:18] (step=0006306) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.12254178002331909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6307, "loss": 0.28838032484054565, "memory_gb": 7.721559524536133, "step_time_ms": 3361.440658569336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:22] (step=0006307) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.12256121259230471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6308, "loss": 0.2489023208618164, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2309951782227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:25] (step=0006308) Train Loss: 0.2493, Train Steps/Sec: 0.28, Epoch: 0.12258064516129032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6309, "loss": 0.2500919699668884, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7282638549805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:29] (step=0006309) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.12260007773027594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6310, "loss": 0.24307331442832947, "memory_gb": 7.721559524536133, "step_time_ms": 3367.255687713623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:33] (step=0006310) Train Loss: 0.2885, Train Steps/Sec: 0.27, Epoch: 0.12261951029926156, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6311, "loss": 0.2099454253911972, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9426231384277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:36] (step=0006311) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.12263894286824718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6312, "loss": 0.18042831122875214, "memory_gb": 7.721559524536133, "step_time_ms": 3361.074686050415, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:40] (step=0006312) Train Loss: 0.1882, Train Steps/Sec: 0.28, Epoch: 0.1226583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6313, "loss": 0.24164779484272003, "memory_gb": 7.721559524536133, "step_time_ms": 3367.687463760376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:44] (step=0006313) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.12267780800621843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6314, "loss": 0.21882349252700806, "memory_gb": 7.721559524536133, "step_time_ms": 3362.940788269043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:47] (step=0006314) Train Loss: 0.1893, Train Steps/Sec: 0.28, Epoch: 0.12269724057520404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6315, "loss": 0.21875852346420288, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5025959014893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:51] (step=0006315) Train Loss: 0.2643, Train Steps/Sec: 0.27, Epoch: 0.12271667314418966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6316, "loss": 0.18582578003406525, "memory_gb": 7.721559524536133, 
"step_time_ms": 3361.232280731201, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:54] (step=0006316) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.12273610571317528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:20:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6317, "loss": 0.2961832284927368, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3578758239746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:20:58] (step=0006317) Train Loss: 0.2518, Train Steps/Sec: 0.27, Epoch: 0.1227555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6318, "loss": 0.27443066239356995, "memory_gb": 7.721559524536133, "step_time_ms": 3361.530542373657, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:02] (step=0006318) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.12277497085114653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6319, "loss": 0.18920272588729858, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3855113983154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:05] (step=0006319) Train Loss: 0.2104, Train Steps/Sec: 0.27, Epoch: 0.12279440342013215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6320, "loss": 0.20676816999912262, "memory_gb": 7.721559524536133, "step_time_ms": 3365.429639816284, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:09] (step=0006320) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.12281383598911776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6321, "loss": 0.2593516707420349, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7334575653076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:13] (step=0006321) Train Loss: 0.2320, Train Steps/Sec: 0.27, Epoch: 
0.12283326855810338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6322, "loss": 0.2721128463745117, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0081157684326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:16] (step=0006322) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.122852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6323, "loss": 0.29129940271377563, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0801181793213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:20] (step=0006323) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.12287213369607462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6324, "loss": 0.17406532168388367, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9046630859375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:24] (step=0006324) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.12289156626506025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6325, "loss": 0.24477504193782806, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9117221832275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:27] (step=0006325) Train Loss: 0.2595, Train Steps/Sec: 0.27, Epoch: 0.12291099883404587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6326, "loss": 0.23116296529769897, "memory_gb": 7.721559524536133, "step_time_ms": 3363.405704498291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:31] (step=0006326) Train Loss: 0.2592, Train Steps/Sec: 0.28, Epoch: 0.12293043140303148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6327, "loss": 0.13231784105300903, 
"memory_gb": 7.721559524536133, "step_time_ms": 3361.45281791687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:35] (step=0006327) Train Loss: 0.2140, Train Steps/Sec: 0.28, Epoch: 0.1229498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6328, "loss": 0.24503794312477112, "memory_gb": 7.721559524536133, "step_time_ms": 3364.525556564331, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:38] (step=0006328) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.12296929654100272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6329, "loss": 0.29936960339546204, "memory_gb": 7.721559524536133, "step_time_ms": 3512.4459266662598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:42] (step=0006329) Train Loss: 0.2793, Train Steps/Sec: 0.27, Epoch: 0.12298872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6330, "loss": 0.2045743465423584, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7587299346924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:46] (step=0006330) Train Loss: 0.2320, Train Steps/Sec: 0.27, Epoch: 0.12300816167897397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6331, "loss": 0.2694427967071533, "memory_gb": 7.721559524536133, "step_time_ms": 3366.865873336792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:49] (step=0006331) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.12302759424795957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6332, "loss": 0.16897615790367126, "memory_gb": 7.721559524536133, "step_time_ms": 3368.776798248291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:53] (step=0006332) Train Loss: 0.1721, Train 
Steps/Sec: 0.28, Epoch: 0.1230470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:21:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6333, "loss": 0.2844840884208679, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1786785125732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:21:56] (step=0006333) Train Loss: 0.2011, Train Steps/Sec: 0.28, Epoch: 0.12306645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6334, "loss": 0.2750091552734375, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7075424194336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:00] (step=0006334) Train Loss: 0.2603, Train Steps/Sec: 0.28, Epoch: 0.12308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6335, "loss": 0.27764835953712463, "memory_gb": 7.721559524536133, "step_time_ms": 3370.5427646636963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:04] (step=0006335) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.12310532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6336, "loss": 0.20822712779045105, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0266609191895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:07] (step=0006336) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.12312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6337, "loss": 0.15210102498531342, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8629398345947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:11] (step=0006337) Train Loss: 0.1814, Train Steps/Sec: 0.28, Epoch: 0.1231441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6338, "loss": 
0.20887801051139832, "memory_gb": 7.721559524536133, "step_time_ms": 3366.154432296753, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:15] (step=0006338) Train Loss: 0.2764, Train Steps/Sec: 0.28, Epoch: 0.12316362223085892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6339, "loss": 0.32551369071006775, "memory_gb": 7.721559524536133, "step_time_ms": 3368.401050567627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:18] (step=0006339) Train Loss: 0.2531, Train Steps/Sec: 0.27, Epoch: 0.12318305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6340, "loss": 0.31008219718933105, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4940547943115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:22] (step=0006340) Train Loss: 0.3561, Train Steps/Sec: 0.28, Epoch: 0.12320248736883016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6341, "loss": 0.30083614587783813, "memory_gb": 7.721559524536133, "step_time_ms": 3368.293523788452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:26] (step=0006341) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.12322191993781578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6342, "loss": 0.21525946259498596, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8457012176514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:29] (step=0006342) Train Loss: 0.1987, Train Steps/Sec: 0.28, Epoch: 0.1232413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:22:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6343, "loss": 0.2266933023929596, "memory_gb": 7.721559524536133, "step_time_ms": 3369.38214302063, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:22:33] (step=0006343) 
Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.12326078507578701, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:22:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6344, "loss": 0.29091954231262207, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1743564605713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:22:36] (step=0006344) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.12328021764477264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:22:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6345, "loss": 0.2171764075756073, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0631275177, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:22:40] (step=0006345) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.12329965021375826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:22:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6346, "loss": 0.20088525116443634, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2514991760254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:22:44] (step=0006346) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.12331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:22:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6347, "loss": 0.19027674198150635, "memory_gb": 7.721559524536133, "step_time_ms": 3370.5451488494873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:22:47] (step=0006347) Train Loss: 0.1785, Train Steps/Sec: 0.28, Epoch: 0.1233385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:22:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6348, "loss": 0.27953052520751953, "memory_gb": 7.721559524536133, "step_time_ms": 3372.4448680877686, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:22:51] (step=0006348) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.12335794792071512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:22:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6349, "loss": 0.3518643379211426, "memory_gb": 7.721559524536133, "step_time_ms": 3373.506784439087, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:22:55] (step=0006349) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.12337738048970073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:22:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6350, "loss": 0.18725869059562683, "memory_gb": 7.721559524536133, "step_time_ms": 3376.380681991577, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:22:58] (step=0006350) Train Loss: 0.2406, Train Steps/Sec: 0.27, Epoch: 0.12339681305868636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6351, "loss": 0.22991809248924255, "memory_gb": 7.721559524536133, "step_time_ms": 3373.3229637145996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:02] (step=0006351) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.12341624562767198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6352, "loss": 0.31774771213531494, "memory_gb": 7.721559524536133, "step_time_ms": 3356.271982192993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:05] (step=0006352) Train Loss: 0.3455, Train Steps/Sec: 0.28, Epoch: 0.1234356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6353, "loss": 0.3252650499343872, "memory_gb": 7.721559524536133, "step_time_ms": 3379.8322677612305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:09] (step=0006353) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.12345511076564322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6354, "loss": 0.11755695194005966, "memory_gb": 7.721559524536133, "step_time_ms": 3380.409002304077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:13] (step=0006354) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.12347454333462884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6355, "loss": 0.24750031530857086, "memory_gb": 7.721559524536133, "step_time_ms": 3379.0698051452637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:16] (step=0006355) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.12349397590361445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6356, "loss": 0.3245319128036499, "memory_gb": 7.721559524536133, "step_time_ms": 3386.7526054382324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:20] (step=0006356) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.12351340847260008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6357, "loss": 0.22830572724342346, "memory_gb": 7.721559524536133, "step_time_ms": 3385.0343227386475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:24] (step=0006357) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 0.1235328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6358, "loss": 0.1836216002702713, "memory_gb": 7.721559524536133, "step_time_ms": 3371.3650703430176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:27] (step=0006358) Train Loss: 0.1762, Train Steps/Sec: 0.28, Epoch: 0.12355227361057132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6359, "loss": 0.28127333521842957, "memory_gb": 7.721559524536133, "step_time_ms": 3379.1661262512207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:31] (step=0006359) Train Loss: 0.2488, Train Steps/Sec: 0.28, Epoch: 0.12357170617955694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6360, "loss": 0.16526947915554047, "memory_gb": 7.721559524536133, "step_time_ms": 3383.9757442474365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:35] (step=0006360) Train Loss: 0.1966, Train Steps/Sec: 0.27, Epoch: 0.12359113874854255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6361, "loss": 0.238165944814682, "memory_gb": 7.721559524536133, "step_time_ms": 3376.2385845184326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:38] (step=0006361) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.12361057131752817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6362, "loss": 0.31244659423828125, "memory_gb": 7.721559524536133, "step_time_ms": 3379.206418991089, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:42] (step=0006362) Train Loss: 0.2661, Train Steps/Sec: 0.28, Epoch: 0.1236300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6363, "loss": 0.2396571934223175, "memory_gb": 7.721559524536133, "step_time_ms": 3385.427951812744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:45] (step=0006363) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.12364943645549942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6364, "loss": 0.3493567407131195, "memory_gb": 7.721559524536133, "step_time_ms": 3385.8752250671387, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:49] (step=0006364) Train Loss: 0.2828, Train Steps/Sec: 0.26, Epoch: 0.12366886902448504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6365, "loss": 0.14457736909389496, "memory_gb": 7.721559524536133, "step_time_ms": 3375.415086746216, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:53] (step=0006365) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.12368830159347066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:23:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6366, "loss": 0.2391243875026703, "memory_gb": 7.721559524536133, "step_time_ms": 3385.6306076049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:23:56] (step=0006366) Train Loss: 0.2330, Train Steps/Sec: 0.27, Epoch: 0.12370773416245627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6367, "loss": 0.21034866571426392, "memory_gb": 7.721559524536133, "step_time_ms": 3376.9760131835938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:00] (step=0006367) Train Loss: 0.2214, Train Steps/Sec: 0.27, Epoch: 0.12372716673144189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6368, "loss": 0.33644500374794006, "memory_gb": 7.721559524536133, "step_time_ms": 3373.321533203125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:04] (step=0006368) Train Loss: 0.3450, Train Steps/Sec: 0.27, Epoch: 0.12374659930042751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6369, "loss": 0.22664806246757507, "memory_gb": 7.721559524536133, "step_time_ms": 3375.5125999450684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:07] (step=0006369) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.12376603186941314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6370, "loss": 0.24319282174110413, "memory_gb": 7.721559524536133, "step_time_ms": 3380.5911540985107, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:11] (step=0006370) Train Loss: 0.2271, Train Steps/Sec: 0.27, Epoch: 0.12378546443839876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6371, "loss": 0.28405192494392395, "memory_gb": 7.721559524536133, "step_time_ms": 3372.746229171753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:15] (step=0006371) Train Loss: 0.2566, Train Steps/Sec: 0.28, Epoch: 0.12380489700738438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6372, "loss": 0.20846067368984222, "memory_gb": 7.721559524536133, "step_time_ms": 3373.400926589966, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:18] (step=0006372) Train Loss: 0.2123, Train Steps/Sec: 0.27, Epoch: 0.12382432957636999, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6373, "loss": 0.1769028604030609, "memory_gb": 7.721559524536133, "step_time_ms": 3367.506504058838, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:22] (step=0006373) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.12384376214535561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6374, "loss": 0.2382698357105255, "memory_gb": 7.721559524536133, "step_time_ms": 3370.121479034424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:26] (step=0006374) Train Loss: 0.2836, Train Steps/Sec: 0.28, Epoch: 0.12386319471434123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6375, "loss": 0.2828575074672699, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2007064819336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:29] (step=0006375) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.12388262728332686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6376, "loss": 0.22587989270687103, "memory_gb": 7.721559524536133, "step_time_ms": 3516.9405937194824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:33] (step=0006376) Train Loss: 0.1815, Train Steps/Sec: 0.27, Epoch: 0.12390205985231248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6377, "loss": 0.2458820641040802, "memory_gb": 7.721559524536133, "step_time_ms": 3369.4703578948975, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:37] (step=0006377) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.1239214924212981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6378, "loss": 0.2756924629211426, "memory_gb": 7.721559524536133, "step_time_ms": 3368.837833404541, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:40] (step=0006378) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.12394092499028371, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6379, "loss": 0.1839543581008911, "memory_gb": 7.721559524536133, "step_time_ms": 3369.830846786499, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:44] (step=0006379) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.12396035755926933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6380, "loss": 0.28120532631874084, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5014457702637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:47] (step=0006380) Train Loss: 0.2559, Train Steps/Sec: 0.27, Epoch: 0.12397979012825495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6381, "loss": 0.2803904116153717, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3550357818604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:51] (step=0006381) Train Loss: 0.2483, Train Steps/Sec: 0.27, Epoch: 0.12399922269724058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6382, "loss": 0.22338944673538208, "memory_gb": 7.721559524536133, "step_time_ms": 3365.870475769043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:55] (step=0006382) Train Loss: 0.2103, Train Steps/Sec: 0.27, Epoch: 0.1240186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:24:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6383, "loss": 0.21292850375175476, "memory_gb": 7.721559524536133, "step_time_ms": 3367.178201675415, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:24:58] (step=0006383) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.12403808783521182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6384, "loss": 0.19573411345481873, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5008335113525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:02] (step=0006384) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.12405752040419743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6385, "loss": 0.1832723319530487, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9752349853516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:06] (step=0006385) Train Loss: 0.2484, Train Steps/Sec: 0.27, Epoch: 0.12407695297318305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6386, "loss": 0.27486178278923035, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9652729034424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:09] (step=0006386) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.12409638554216867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6387, "loss": 0.16774138808250427, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3462448120117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:13] (step=0006387) Train Loss: 0.1506, Train Steps/Sec: 0.28, Epoch: 0.1241158181111543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6388, "loss": 0.13121147453784943, "memory_gb": 7.721559524536133, "step_time_ms": 3375.972032546997, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:17] (step=0006388) Train Loss: 0.2143, Train Steps/Sec: 0.27, Epoch: 0.12413525068013992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6389, "loss": 0.2954542636871338, "memory_gb": 7.721559524536133, "step_time_ms": 3367.861747741699, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:20] (step=0006389) Train Loss: 0.2197, Train Steps/Sec: 0.27, Epoch: 0.12415468324912553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6390, "loss": 0.23951831459999084, "memory_gb": 7.721559524536133, "step_time_ms": 3371.745824813843, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:24] (step=0006390) Train Loss: 0.2291, Train Steps/Sec: 0.27, Epoch: 0.12417411581811115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6391, "loss": 0.2560022175312042, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5925216674805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:27] (step=0006391) Train Loss: 0.2781, Train Steps/Sec: 0.27, Epoch: 0.12419354838709677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6392, "loss": 0.2647797465324402, "memory_gb": 7.721559524536133, "step_time_ms": 3372.3554611206055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:31] (step=0006392) Train Loss: 0.2676, Train Steps/Sec: 0.27, Epoch: 0.1242129809560824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6393, "loss": 0.35346895456314087, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5483207702637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:35] (step=0006393) Train Loss: 0.3363, Train Steps/Sec: 0.28, Epoch: 0.12423241352506802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6394, "loss": 0.18538151681423187, "memory_gb": 7.721559524536133, "step_time_ms": 3371.778726577759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:38] (step=0006394) Train Loss: 0.2852, Train Steps/Sec: 0.28, Epoch: 0.12425184609405364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6395, "loss": 0.34348905086517334, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2938137054443, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:42] (step=0006395) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.12427127866303925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6396, "loss": 0.29924315214157104, "memory_gb": 7.721559524536133, "step_time_ms": 3351.4809608459473, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:46] (step=0006396) Train Loss: 0.2853, Train Steps/Sec: 0.28, Epoch: 0.12429071123202487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6397, "loss": 0.3088105320930481, "memory_gb": 7.721559524536133, "step_time_ms": 3366.494655609131, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:49] (step=0006397) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.12431014380101049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6398, "loss": 0.38688576221466064, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6388988494873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:53] (step=0006398) Train Loss: 0.3051, Train Steps/Sec: 0.28, Epoch: 0.12432957636999611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:25:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6399, "loss": 0.2899147570133209, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1441078186035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:25:57] (step=0006399) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.12434900893898174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6400, "loss": 0.19394758343696594, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0930137634277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:00] (step=0006400) Train Loss: 0.1969, Train Steps/Sec: 0.28, Epoch: 0.12436844150796736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6401, "loss": 0.30139559507369995, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4209327697754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:04] (step=0006401) Train Loss: 0.3315, Train Steps/Sec: 0.27, Epoch: 0.12438787407695297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6402, "loss": 0.18875491619110107, "memory_gb": 7.721559524536133, "step_time_ms": 3368.271827697754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:07] (step=0006402) Train Loss: 0.2490, Train Steps/Sec: 0.27, Epoch: 0.12440730664593859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6403, "loss": 0.19033299386501312, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9497032165527, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:11] (step=0006403) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.12442673921492421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6404, "loss": 0.29720044136047363, "memory_gb": 7.721559524536133, "step_time_ms": 3367.202043533325, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:15] (step=0006404) Train Loss: 0.2779, Train Steps/Sec: 0.28, Epoch: 0.12444617178390983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6405, "loss": 0.2091585099697113, "memory_gb": 7.721559524536133, "step_time_ms": 3362.305164337158, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:19] (step=0006405) Train Loss: 0.2637, Train Steps/Sec: 0.26, Epoch: 0.12446560435289546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6406, "loss": 0.1943684220314026, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2189292907715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:22] (step=0006406) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.12448503692188108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6407, "loss": 0.1627541035413742, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0849800109863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:26] (step=0006407) Train Loss: 0.1790, Train Steps/Sec: 0.28, Epoch: 0.12450446949086669, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6408, "loss": 0.23860694468021393, "memory_gb": 7.721559524536133, "step_time_ms": 3365.478038787842, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:29] (step=0006408) Train Loss: 0.2780, Train Steps/Sec: 0.28, Epoch: 0.12452390205985231, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6409, "loss": 0.16453738510608673, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0896549224854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:33] (step=0006409) Train Loss: 0.2154, Train Steps/Sec: 0.27, Epoch: 0.12454333462883793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6410, "loss": 0.3447321057319641, "memory_gb": 7.721559524536133, "step_time_ms": 3367.521047592163, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:37] (step=0006410) Train Loss: 0.2796, Train Steps/Sec: 0.28, Epoch: 0.12456276719782355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6411, "loss": 0.30834081768989563, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8411502838135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:40] (step=0006411) Train Loss: 0.2677, Train Steps/Sec: 0.28, Epoch: 0.12458219976680918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6412, "loss": 0.15600796043872833, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1367378234863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:44] (step=0006412) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.1246016323357948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6413, "loss": 0.1640177071094513, "memory_gb": 7.721559524536133, "step_time_ms": 3369.614839553833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:48] (step=0006413) Train Loss: 0.1901, Train Steps/Sec: 0.27, Epoch: 0.12462106490478041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6414, "loss": 0.19893357157707214, "memory_gb": 7.721559524536133, "step_time_ms": 3364.096164703369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:51] (step=0006414) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.12464049747376603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6415, "loss": 0.25171637535095215, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5409812927246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:55] (step=0006415) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.12465993004275165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6416, "loss": 0.2324075847864151, "memory_gb": 7.721559524536133, "step_time_ms": 3365.717887878418, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:26:59] (step=0006416) Train Loss: 0.2613, Train Steps/Sec: 0.28, Epoch: 0.12467936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6417, "loss": 0.33001720905303955, "memory_gb": 7.721559524536133, "step_time_ms": 3511.7619037628174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:02] (step=0006417) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.1246987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6418, "loss": 0.14772678911685944, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2139225006104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:06] (step=0006418) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.1247182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6419, "loss": 0.18177148699760437, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8601303100586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:09] (step=0006419) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.12473766031869413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6420, "loss": 0.2580329179763794, "memory_gb": 7.721559524536133, "step_time_ms": 3372.5714683532715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:13] (step=0006420) Train Loss: 0.2834, Train Steps/Sec: 0.28, Epoch: 0.12475709288767975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6421, "loss": 0.15413427352905273, "memory_gb": 7.721559524536133, "step_time_ms": 3362.858533859253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:17] (step=0006421) Train Loss: 0.1532, Train Steps/Sec: 0.27, Epoch: 0.12477652545666537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6422, "loss": 0.28422629833221436, "memory_gb": 7.721559524536133, "step_time_ms": 3369.220972061157, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:20] (step=0006422) Train Loss: 0.3030, Train Steps/Sec: 0.27, Epoch: 0.124795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6423, "loss": 0.193938210606575, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0953979492188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:24] (step=0006423) Train Loss: 0.2366, Train Steps/Sec: 0.27, Epoch: 0.12481539059463662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6424, "loss": 0.2145502120256424, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0411109924316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:28] (step=0006424) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.12483482316362222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6425, "loss": 0.23396092653274536, "memory_gb": 7.721559524536133, "step_time_ms": 3360.361099243164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:31] (step=0006425) Train Loss: 0.2571, Train Steps/Sec: 0.28, Epoch: 0.12485425573260785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6426, "loss": 0.2445732057094574, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1815395355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:35] (step=0006426) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.12487368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6427, "loss": 0.21094611287117004, "memory_gb": 7.721559524536133, "step_time_ms": 3367.811441421509, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:39] (step=0006427) Train Loss: 0.2339, Train Steps/Sec: 0.28, Epoch: 0.12489312087057909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6428, "loss": 0.2678399085998535, "memory_gb": 7.721559524536133, "step_time_ms": 3367.631196975708, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:42] (step=0006428) Train Loss: 0.2248, Train Steps/Sec: 0.27, Epoch: 0.12491255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6429, "loss": 0.24876728653907776, "memory_gb": 7.721559524536133, "step_time_ms": 3366.950750350952, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:46] (step=0006429) Train Loss: 0.2590, Train Steps/Sec: 0.28, Epoch: 0.12493198600855034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6430, "loss": 0.2268444001674652, "memory_gb": 7.721559524536133, "step_time_ms": 3370.083808898926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:49] (step=0006430) Train Loss: 0.2340, Train Steps/Sec: 0.27, Epoch: 0.12495141857753594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6431, "loss": 0.23193681240081787, "memory_gb": 7.721559524536133, "step_time_ms": 3366.786241531372, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:53] (step=0006431) Train Loss: 0.2941, Train Steps/Sec: 0.28, Epoch: 0.12497085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:27:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6432, "loss": 0.2225879430770874, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6555366516113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:27:57] (step=0006432) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.12499028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6433, "loss": 0.18488165736198425, "memory_gb": 7.721559524536133, "step_time_ms": 3364.18080329895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:00] (step=0006433) Train Loss: 0.2288, Train Steps/Sec: 0.28, Epoch: 0.1250097162844928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6434, "loss": 0.2464943528175354, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6320056915283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:04] (step=0006434) Train Loss: 0.2503, Train Steps/Sec: 0.27, Epoch: 0.12502914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6435, "loss": 0.17231279611587524, "memory_gb": 7.721559524536133, "step_time_ms": 3366.436719894409, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:08] (step=0006435) Train Loss: 0.1405, Train Steps/Sec: 0.28, Epoch: 0.12504858142246406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6436, "loss": 0.17785778641700745, "memory_gb": 7.721559524536133, "step_time_ms": 3369.3485260009766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:11] (step=0006436) Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.12506801399144968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6437, "loss": 0.22946035861968994, "memory_gb": 7.721559524536133, "step_time_ms": 3373.31485748291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:15] (step=0006437) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.1250874465604353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6438, "loss": 0.32426148653030396, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8511848449707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:19] (step=0006438) Train Loss: 0.2881, Train Steps/Sec: 0.28, Epoch: 0.12510687912942092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6439, "loss": 0.3702928423881531, "memory_gb": 7.721559524536133, "step_time_ms": 3368.793249130249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:22] (step=0006439) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.12512631169840652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6440, "loss": 0.2377418875694275, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6493167877197, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:26] (step=0006440) Train Loss: 0.2500, Train Steps/Sec: 0.28, Epoch: 0.12514574426739214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6441, "loss": 0.1972929835319519, "memory_gb": 7.721559524536133, "step_time_ms": 3366.114854812622, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:29] (step=0006441) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.12516517683637776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6442, "loss": 0.14179740846157074, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8954582214355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:33] (step=0006442) Train Loss: 0.2055, Train Steps/Sec: 0.28, Epoch: 0.12518460940536338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6443, "loss": 0.22190254926681519, "memory_gb": 7.721559524536133, "step_time_ms": 3369.039297103882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:37] (step=0006443) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.125204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6444, "loss": 0.2207106202840805, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8886165618896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:40] (step=0006444) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.12522347454333463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6445, "loss": 0.15276791155338287, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4520931243896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:44] (step=0006445) Train Loss: 0.1867, Train Steps/Sec: 0.28, Epoch: 0.12524290711232025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6446, "loss": 0.24552102386951447, "memory_gb": 7.721559524536133, "step_time_ms": 3372.8439807891846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:48] (step=0006446) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.12526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6447, "loss": 0.22707504034042358, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8322048187256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:51] (step=0006447) Train Loss: 0.2620, Train Steps/Sec: 0.28, Epoch: 0.1252817722502915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6448, "loss": 0.23048460483551025, "memory_gb": 7.721559524536133, "step_time_ms": 3353.212594985962, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:55] (step=0006448) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 0.12530120481927712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:28:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6449, "loss": 0.2888188064098358, "memory_gb": 7.721559524536133, "step_time_ms": 3373.5506534576416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:28:59] (step=0006449) Train Loss: 0.2520, Train Steps/Sec: 0.27, Epoch: 0.12532063738826274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:29:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6450, "loss": 0.3567328453063965, "memory_gb": 7.721559524536133, "step_time_ms": 3382.5671672821045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:29:02] (step=0006450) Train Loss: 0.2962, Train Steps/Sec: 0.27, Epoch: 0.12534006995724833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:29:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6451, "loss": 0.2950189411640167, "memory_gb": 7.721559524536133, "step_time_ms": 3374.2008209228516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:29:06] (step=0006451) Train Loss: 0.2799, Train Steps/Sec: 0.27, Epoch: 0.12535950252623396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:29:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6452, "loss": 0.26743125915527344, "memory_gb": 7.721559524536133, "step_time_ms": 3376.5945434570312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:29:09] (step=0006452) Train Loss: 0.2842, Train Steps/Sec: 0.27, Epoch: 0.12537893509521958, LR: 0.001, Memory:
7.72GB, Params: 4,718,592 [2025-07-29 06:29:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6453, "loss": 0.20471814274787903, "memory_gb": 7.721559524536133, "step_time_ms": 3373.434066772461, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:13] (step=0006453) Train Loss: 0.1986, Train Steps/Sec: 0.26, Epoch: 0.1253983676642052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6454, "loss": 0.16192063689231873, "memory_gb": 7.721559524536133, "step_time_ms": 3372.514009475708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:17] (step=0006454) Train Loss: 0.1600, Train Steps/Sec: 0.28, Epoch: 0.12541780023319082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6455, "loss": 0.30409932136535645, "memory_gb": 7.721559524536133, "step_time_ms": 3371.096611022949, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:20] (step=0006455) Train Loss: 0.2782, Train Steps/Sec: 0.27, Epoch: 0.12543723280217645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6456, "loss": 0.1943066120147705, "memory_gb": 7.721559524536133, "step_time_ms": 3372.7972507476807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:24] (step=0006456) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.12545666537116207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6457, "loss": 0.2755027115345001, "memory_gb": 7.721559524536133, "step_time_ms": 3372.5745677948, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:28] (step=0006457) Train Loss: 0.2693, Train Steps/Sec: 0.27, Epoch: 0.1254760979401477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6458, "loss": 0.30403316020965576, "memory_gb": 7.721559524536133, "step_time_ms": 
3520.5237865448, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:31] (step=0006458) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.1254955305091333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6459, "loss": 0.24507403373718262, "memory_gb": 7.721559524536133, "step_time_ms": 3372.5178241729736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:35] (step=0006459) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.12551496307811894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6460, "loss": 0.2924552261829376, "memory_gb": 7.721559524536133, "step_time_ms": 3372.6584911346436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:39] (step=0006460) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.12553439564710456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6461, "loss": 0.23222461342811584, "memory_gb": 7.721559524536133, "step_time_ms": 3375.8418560028076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:42] (step=0006461) Train Loss: 0.2088, Train Steps/Sec: 0.27, Epoch: 0.12555382821609018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6462, "loss": 0.18531674146652222, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1533756256104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:46] (step=0006462) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 0.12557326078507577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6463, "loss": 0.22983142733573914, "memory_gb": 7.721559524536133, "step_time_ms": 3371.528387069702, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:50] (step=0006463) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.1255926933540614, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6464, "loss": 0.19039058685302734, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4981079101562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:53] (step=0006464) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.12561212592304702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:29:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6465, "loss": 0.2658287286758423, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1343536376953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:29:57] (step=0006465) Train Loss: 0.2967, Train Steps/Sec: 0.28, Epoch: 0.12563155849203264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6466, "loss": 0.277386337518692, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6069507598877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:00] (step=0006466) Train Loss: 0.2863, Train Steps/Sec: 0.28, Epoch: 0.12565099106101826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6467, "loss": 0.2885819673538208, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1209087371826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:04] (step=0006467) Train Loss: 0.2731, Train Steps/Sec: 0.28, Epoch: 0.12567042363000389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6468, "loss": 0.22676634788513184, "memory_gb": 7.721559524536133, "step_time_ms": 3362.85662651062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:08] (step=0006468) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.1256898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6469, "loss": 0.3251587748527527, "memory_gb": 7.721559524536133, 
"step_time_ms": 3368.159294128418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:11] (step=0006469) Train Loss: 0.3051, Train Steps/Sec: 0.28, Epoch: 0.12570928876797513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6470, "loss": 0.27708446979522705, "memory_gb": 7.721559524536133, "step_time_ms": 3364.830493927002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:15] (step=0006470) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.12572872133696075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6471, "loss": 0.27300897240638733, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3337955474854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:18] (step=0006471) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.12574815390594637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6472, "loss": 0.22645168006420135, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4772300720215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:22] (step=0006472) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.125767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6473, "loss": 0.28870922327041626, "memory_gb": 7.721559524536133, "step_time_ms": 3365.666627883911, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:25] (step=0006473) Train Loss: 0.2746, Train Steps/Sec: 0.28, Epoch: 0.12578701904391762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6474, "loss": 0.18140637874603271, "memory_gb": 7.721559524536133, "step_time_ms": 3364.208698272705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:29] (step=0006474) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 
0.12580645161290321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6475, "loss": 0.39253175258636475, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6304607391357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:33] (step=0006475) Train Loss: 0.3535, Train Steps/Sec: 0.28, Epoch: 0.12582588418188884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6476, "loss": 0.3085482120513916, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1273040771484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:36] (step=0006476) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.12584531675087446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6477, "loss": 0.18062256276607513, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7561588287354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:40] (step=0006477) Train Loss: 0.1647, Train Steps/Sec: 0.28, Epoch: 0.12586474931986008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6478, "loss": 0.2625417113304138, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0191345214844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:43] (step=0006478) Train Loss: 0.2639, Train Steps/Sec: 0.28, Epoch: 0.1258841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6479, "loss": 0.23974479734897614, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6841773986816, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:47] (step=0006479) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.12590361445783133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6480, "loss": 0.23105305433273315, 
"memory_gb": 7.721559524536133, "step_time_ms": 3361.705780029297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:50] (step=0006480) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.12592304702681695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6481, "loss": 0.2259257435798645, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5196266174316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:54] (step=0006481) Train Loss: 0.1765, Train Steps/Sec: 0.28, Epoch: 0.12594247959580257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:30:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6482, "loss": 0.25580084323883057, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4632148742676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:30:57] (step=0006482) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.1259619121647882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6483, "loss": 0.16982732713222504, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8530292510986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:01] (step=0006483) Train Loss: 0.1637, Train Steps/Sec: 0.28, Epoch: 0.12598134473377381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6484, "loss": 0.2496117651462555, "memory_gb": 7.721559524536133, "step_time_ms": 3362.126350402832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:05] (step=0006484) Train Loss: 0.2957, Train Steps/Sec: 0.28, Epoch: 0.12600077730275944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6485, "loss": 0.29430311918258667, "memory_gb": 7.721559524536133, "step_time_ms": 3359.74383354187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:08] (step=0006485) Train Loss: 0.2266, 
Train Steps/Sec: 0.28, Epoch: 0.12602020987174503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6486, "loss": 0.2270616888999939, "memory_gb": 7.721559524536133, "step_time_ms": 3357.511043548584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:12] (step=0006486) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.12603964244073065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6487, "loss": 0.23573128879070282, "memory_gb": 7.721559524536133, "step_time_ms": 3348.928928375244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:15] (step=0006487) Train Loss: 0.2498, Train Steps/Sec: 0.29, Epoch: 0.12605907500971628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6488, "loss": 0.26290276646614075, "memory_gb": 7.721559524536133, "step_time_ms": 3363.085985183716, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:19] (step=0006488) Train Loss: 0.2590, Train Steps/Sec: 0.28, Epoch: 0.1260785075787019, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6489, "loss": 0.25590986013412476, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5077056884766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:22] (step=0006489) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.12609794014768752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6490, "loss": 0.1539517343044281, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7677478790283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:26] (step=0006490) Train Loss: 0.1599, Train Steps/Sec: 0.28, Epoch: 0.12611737271667314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6491, "loss": 
0.2955862879753113, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3685626983643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:29] (step=0006491) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.12613680528565877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6492, "loss": 0.316584050655365, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3695888519287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:33] (step=0006492) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.1261562378546444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6493, "loss": 0.2332257628440857, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2379093170166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:36] (step=0006493) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.12617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6494, "loss": 0.1786835789680481, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8570613861084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:40] (step=0006494) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.12619510299261563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6495, "loss": 0.2555329501628876, "memory_gb": 7.721559524536133, "step_time_ms": 3359.995126724243, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:43] (step=0006495) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.12621453556160125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6496, "loss": 0.14121940732002258, "memory_gb": 7.721559524536133, "step_time_ms": 3358.203887939453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:47] (step=0006496) Train 
Loss: 0.1882, Train Steps/Sec: 0.28, Epoch: 0.12623396813058688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6497, "loss": 0.27445870637893677, "memory_gb": 7.721559524536133, "step_time_ms": 3357.482433319092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:50] (step=0006497) Train Loss: 0.2394, Train Steps/Sec: 0.28, Epoch: 0.12625340069957247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6498, "loss": 0.18369810283184052, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3785552978516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:54] (step=0006498) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.1262728332685581, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:31:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6499, "loss": 0.26383835077285767, "memory_gb": 7.721559524536133, "step_time_ms": 3347.771167755127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:31:57] (step=0006499) Train Loss: 0.2773, Train Steps/Sec: 0.29, Epoch: 0.12629226583754372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6500, "loss": 0.2568095326423645, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1698665618896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:01] (step=0006500) Train Loss: 0.2765, Train Steps/Sec: 0.29, Epoch: 0.12631169840652934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6501, "loss": 0.22206555306911469, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1275424957275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:04] (step=0006501) Train Loss: 0.2328, Train Steps/Sec: 0.27, Epoch: 0.12633113097551496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 
6502, "loss": 0.25491124391555786, "memory_gb": 7.721559524536133, "step_time_ms": 3352.149486541748, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:08] (step=0006502) Train Loss: 0.2280, Train Steps/Sec: 0.29, Epoch: 0.12635056354450058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6503, "loss": 0.27211663126945496, "memory_gb": 7.721559524536133, "step_time_ms": 3354.353189468384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:11] (step=0006503) Train Loss: 0.2266, Train Steps/Sec: 0.29, Epoch: 0.1263699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6504, "loss": 0.31852859258651733, "memory_gb": 7.715639114379883, "step_time_ms": 3323.0490684509277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:15] (step=0006504) Train Loss: 0.2855, Train Steps/Sec: 0.28, Epoch: 0.12638942868247183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6505, "loss": 0.19892702996730804, "memory_gb": 7.721559524536133, "step_time_ms": 3357.513427734375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:18] (step=0006505) Train Loss: 0.1980, Train Steps/Sec: 0.29, Epoch: 0.12640886125145745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6506, "loss": 0.1590600311756134, "memory_gb": 7.721559524536133, "step_time_ms": 3501.1138916015625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:22] (step=0006506) Train Loss: 0.1473, Train Steps/Sec: 0.28, Epoch: 0.12642829382044307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6507, "loss": 0.3005342483520508, "memory_gb": 7.721559524536133, "step_time_ms": 3351.2704372406006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:26] 
(step=0006507) Train Loss: 0.2819, Train Steps/Sec: 0.28, Epoch: 0.1264477263894287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6508, "loss": 0.2535076439380646, "memory_gb": 7.721559524536133, "step_time_ms": 3356.663227081299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:29] (step=0006508) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.1264671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6509, "loss": 0.324760377407074, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8096885681152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:33] (step=0006509) Train Loss: 0.2620, Train Steps/Sec: 0.28, Epoch: 0.1264865915273999, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6510, "loss": 0.19089782238006592, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4512729644775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:36] (step=0006510) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.12650602409638553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6511, "loss": 0.19567935168743134, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6645126342773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:40] (step=0006511) Train Loss: 0.1964, Train Steps/Sec: 0.28, Epoch: 0.12652545666537116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6512, "loss": 0.14514818787574768, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1079711914062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:43] (step=0006512) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.12654488923435678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:47] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 6513, "loss": 0.3227177858352661, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8975200653076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:47] (step=0006513) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.1265643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6514, "loss": 0.3460525870323181, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1789264678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:50] (step=0006514) Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.12658375437232802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6515, "loss": 0.2495027333498001, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9442501068115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:54] (step=0006515) Train Loss: 0.2524, Train Steps/Sec: 0.28, Epoch: 0.12660318694131364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:32:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6516, "loss": 0.23505260050296783, "memory_gb": 7.721559524536133, "step_time_ms": 3359.94815826416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:32:57] (step=0006516) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.12662261951029927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6517, "loss": 0.2742563784122467, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0054721832275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:01] (step=0006517) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.1266420520792849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6518, "loss": 0.17634370923042297, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8175773620605, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 06:33:04] (step=0006518) Train Loss: 0.2109, Train Steps/Sec: 0.28, Epoch: 0.1266614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6519, "loss": 0.25319045782089233, "memory_gb": 7.721559524536133, "step_time_ms": 3361.179828643799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:08] (step=0006519) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.12668091721725613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6520, "loss": 0.29737919569015503, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7374897003174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:12] (step=0006520) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.12670034978624173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6521, "loss": 0.23713114857673645, "memory_gb": 7.721559524536133, "step_time_ms": 3360.874891281128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:15] (step=0006521) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.12671978235522735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6522, "loss": 0.35741037130355835, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1603298187256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:19] (step=0006522) Train Loss: 0.2524, Train Steps/Sec: 0.28, Epoch: 0.12673921492421297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6523, "loss": 0.3337147533893585, "memory_gb": 7.721559524536133, "step_time_ms": 3362.776041030884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:22] (step=0006523) Train Loss: 0.3134, Train Steps/Sec: 0.28, Epoch: 0.1267586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:26] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 6524, "loss": 0.26579993963241577, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3667488098145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:26] (step=0006524) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.12677808006218422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6525, "loss": 0.16235679388046265, "memory_gb": 7.721559524536133, "step_time_ms": 3365.248918533325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:29] (step=0006525) Train Loss: 0.1905, Train Steps/Sec: 0.28, Epoch: 0.12679751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6526, "loss": 0.21117010712623596, "memory_gb": 7.721559524536133, "step_time_ms": 3355.133295059204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:33] (step=0006526) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.12681694520015546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6527, "loss": 0.19867974519729614, "memory_gb": 7.721559524536133, "step_time_ms": 3362.154245376587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:37] (step=0006527) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.12683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6528, "loss": 0.2564539313316345, "memory_gb": 7.721559524536133, "step_time_ms": 3358.767509460449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:40] (step=0006528) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.1268558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6529, "loss": 0.2721146047115326, "memory_gb": 7.721559524536133, "step_time_ms": 3345.165014266968, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 06:33:44] (step=0006529) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.12687524290711233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6530, "loss": 0.20275038480758667, "memory_gb": 7.721559524536133, "step_time_ms": 3360.96453666687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:47] (step=0006530) Train Loss: 0.2029, Train Steps/Sec: 0.28, Epoch: 0.12689467547609795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6531, "loss": 0.241621732711792, "memory_gb": 7.721559524536133, "step_time_ms": 3365.772008895874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:51] (step=0006531) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.12691410804508357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6532, "loss": 0.2379816174507141, "memory_gb": 7.721559524536133, "step_time_ms": 3363.377332687378, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:55] (step=0006532) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.12693354061406917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:33:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6533, "loss": 0.22470861673355103, "memory_gb": 7.721559524536133, "step_time_ms": 3363.039255142212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:33:58] (step=0006533) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.1269529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6534, "loss": 0.19100937247276306, "memory_gb": 7.721559524536133, "step_time_ms": 3364.82310295105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:02] (step=0006534) Train Loss: 0.1971, Train Steps/Sec: 0.28, Epoch: 0.1269724057520404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 06:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6535, "loss": 0.29581236839294434, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0730381011963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:05] (step=0006535) Train Loss: 0.2890, Train Steps/Sec: 0.28, Epoch: 0.12699183832102603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6536, "loss": 0.19599327445030212, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9541606903076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:09] (step=0006536) Train Loss: 0.1982, Train Steps/Sec: 0.28, Epoch: 0.12701127089001166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6537, "loss": 0.26249444484710693, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2752170562744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:13] (step=0006537) Train Loss: 0.2837, Train Steps/Sec: 0.28, Epoch: 0.12703070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6538, "loss": 0.2428179234266281, "memory_gb": 7.721559524536133, "step_time_ms": 3357.328176498413, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:16] (step=0006538) Train Loss: 0.3017, Train Steps/Sec: 0.28, Epoch: 0.1270501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6539, "loss": 0.2776717245578766, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8583374023438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:20] (step=0006539) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.12706956859696852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6540, "loss": 0.19176077842712402, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1925888061523, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:24] (step=0006540) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.12708900116595415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6541, "loss": 0.3272014856338501, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0971488952637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:27] (step=0006541) Train Loss: 0.2498, Train Steps/Sec: 0.27, Epoch: 0.12710843373493977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6542, "loss": 0.19579359889030457, "memory_gb": 7.721559524536133, "step_time_ms": 3357.527017593384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:31] (step=0006542) Train Loss: 0.1621, Train Steps/Sec: 0.28, Epoch: 0.1271278663039254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6543, "loss": 0.1921643316745758, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5664920806885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:35] (step=0006543) Train Loss: 0.1976, Train Steps/Sec: 0.28, Epoch: 0.12714729887291099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6544, "loss": 0.2444792091846466, "memory_gb": 7.721559524536133, "step_time_ms": 3365.140676498413, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:38] (step=0006544) Train Loss: 0.2957, Train Steps/Sec: 0.28, Epoch: 0.1271667314418966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6545, "loss": 0.19253137707710266, "memory_gb": 7.721559524536133, "step_time_ms": 3354.330539703369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:42] (step=0006545) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.12718616401088223, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 06:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6546, "loss": 0.2241448014974594, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6271228790283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:45] (step=0006546) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.12720559657986785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6547, "loss": 0.20905932784080505, "memory_gb": 7.721559524536133, "step_time_ms": 3513.0200386047363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:49] (step=0006547) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.12722502914885347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6548, "loss": 0.2710692584514618, "memory_gb": 7.721559524536133, "step_time_ms": 3343.731164932251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:53] (step=0006548) Train Loss: 0.2875, Train Steps/Sec: 0.28, Epoch: 0.1272444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:34:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6549, "loss": 0.23582175374031067, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7233715057373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:34:56] (step=0006549) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.12726389428682472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6550, "loss": 0.27488577365875244, "memory_gb": 7.721559524536133, "step_time_ms": 3368.589162826538, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:00] (step=0006550) Train Loss: 0.2744, Train Steps/Sec: 0.28, Epoch: 0.12728332685581034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6551, "loss": 0.1853146255016327, "memory_gb": 7.715639114379883, "step_time_ms": 
3332.437753677368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:03] (step=0006551) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.12730275942479596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6552, "loss": 0.30761805176734924, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0097122192383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:07] (step=0006552) Train Loss: 0.3342, Train Steps/Sec: 0.28, Epoch: 0.12732219199378159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6553, "loss": 0.1979820877313614, "memory_gb": 7.721559524536133, "step_time_ms": 3362.741231918335, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:11] (step=0006553) Train Loss: 0.1984, Train Steps/Sec: 0.28, Epoch: 0.1273416245627672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6554, "loss": 0.21318864822387695, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8667335510254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:14] (step=0006554) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.12736105713175283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6555, "loss": 0.20480263233184814, "memory_gb": 7.721559524536133, "step_time_ms": 3356.365442276001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:18] (step=0006555) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.12738048970073843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6556, "loss": 0.28394004702568054, "memory_gb": 7.721559524536133, "step_time_ms": 3364.612102508545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:22] (step=0006556) Train Loss: 0.2800, Train Steps/Sec: 0.28, Epoch: 0.12739992226972405, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6557, "loss": 0.24040473997592926, "memory_gb": 7.721559524536133, "step_time_ms": 3367.093563079834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:25] (step=0006557) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.12741935483870967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6558, "loss": 0.20744159817695618, "memory_gb": 7.721559524536133, "step_time_ms": 3357.821464538574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:29] (step=0006558) Train Loss: 0.2568, Train Steps/Sec: 0.28, Epoch: 0.1274387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6559, "loss": 0.2620762884616852, "memory_gb": 7.721559524536133, "step_time_ms": 3365.779399871826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:32] (step=0006559) Train Loss: 0.1796, Train Steps/Sec: 0.28, Epoch: 0.12745821997668091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6560, "loss": 0.16797614097595215, "memory_gb": 7.721559524536133, "step_time_ms": 3367.558717727661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:36] (step=0006560) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.12747765254566654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6561, "loss": 0.10917655378580093, "memory_gb": 7.721559524536133, "step_time_ms": 3365.629196166992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:40] (step=0006561) Train Loss: 0.1533, Train Steps/Sec: 0.28, Epoch: 0.12749708511465216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6562, "loss": 0.2657756507396698, "memory_gb": 7.721559524536133, 
"step_time_ms": 3368.6206340789795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:43] (step=0006562) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.12751651768363778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6563, "loss": 0.29033803939819336, "memory_gb": 7.721559524536133, "step_time_ms": 3366.075277328491, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:47] (step=0006563) Train Loss: 0.2295, Train Steps/Sec: 0.28, Epoch: 0.1275359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6564, "loss": 0.23134727776050568, "memory_gb": 7.721559524536133, "step_time_ms": 3370.058059692383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:51] (step=0006564) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.12755538282160903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6565, "loss": 0.29491525888442993, "memory_gb": 7.721559524536133, "step_time_ms": 3359.172582626343, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:54] (step=0006565) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.12757481539059465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6566, "loss": 0.16109433770179749, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0808334350586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:35:58] (step=0006566) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.12759424795958027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6567, "loss": 0.1888439655303955, "memory_gb": 7.721559524536133, "step_time_ms": 3368.697166442871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:01] (step=0006567) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 
0.12761368052856586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6568, "loss": 0.2092956304550171, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0079078674316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:05] (step=0006568) Train Loss: 0.2114, Train Steps/Sec: 0.28, Epoch: 0.1276331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6569, "loss": 0.11147641390562057, "memory_gb": 7.721559524536133, "step_time_ms": 3365.906238555908, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:09] (step=0006569) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.1276525456665371, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6570, "loss": 0.16005726158618927, "memory_gb": 7.721559524536133, "step_time_ms": 3366.749048233032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:12] (step=0006570) Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.12767197823552273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6571, "loss": 0.28333473205566406, "memory_gb": 7.721559524536133, "step_time_ms": 3360.98575592041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:16] (step=0006571) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.12769141080450835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6572, "loss": 0.15605898201465607, "memory_gb": 7.721559524536133, "step_time_ms": 3370.703935623169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:20] (step=0006572) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.12771084337349398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6573, "loss": 0.25930607318878174, "memory_gb": 
7.721559524536133, "step_time_ms": 3374.7782707214355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:23] (step=0006573) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.1277302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6574, "loss": 0.22798709571361542, "memory_gb": 7.721559524536133, "step_time_ms": 3370.9700107574463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:27] (step=0006574) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.12774970851146522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6575, "loss": 0.33572524785995483, "memory_gb": 7.721559524536133, "step_time_ms": 3373.6493587493896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:30] (step=0006575) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.12776914108045084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6576, "loss": 0.28537964820861816, "memory_gb": 7.721559524536133, "step_time_ms": 3368.119239807129, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:34] (step=0006576) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.12778857364943647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6577, "loss": 0.2124093770980835, "memory_gb": 7.721559524536133, "step_time_ms": 3373.410224914551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:38] (step=0006577) Train Loss: 0.1901, Train Steps/Sec: 0.28, Epoch: 0.1278080062184221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6578, "loss": 0.3729267716407776, "memory_gb": 7.721559524536133, "step_time_ms": 3365.121841430664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:41] (step=0006578) Train Loss: 0.2884, Train Steps/Sec: 
0.28, Epoch: 0.12782743878740768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6579, "loss": 0.2196984440088272, "memory_gb": 7.721559524536133, "step_time_ms": 3374.8865127563477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:45] (step=0006579) Train Loss: 0.1827, Train Steps/Sec: 0.28, Epoch: 0.1278468713563933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6580, "loss": 0.23376792669296265, "memory_gb": 7.721559524536133, "step_time_ms": 3376.431941986084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:49] (step=0006580) Train Loss: 0.2142, Train Steps/Sec: 0.28, Epoch: 0.12786630392537893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6581, "loss": 0.24461150169372559, "memory_gb": 7.721559524536133, "step_time_ms": 3364.871025085449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:52] (step=0006581) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.12788573649436455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6582, "loss": 0.20521581172943115, "memory_gb": 7.721559524536133, "step_time_ms": 3375.1580715179443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:56] (step=0006582) Train Loss: 0.2619, Train Steps/Sec: 0.28, Epoch: 0.12790516906335017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:36:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6583, "loss": 0.259880006313324, "memory_gb": 7.721559524536133, "step_time_ms": 3376.467227935791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:36:59] (step=0006583) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.1279246016323358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6584, "loss": 0.1306382715702057, 
"memory_gb": 7.721559524536133, "step_time_ms": 3371.5879917144775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:03] (step=0006584) Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.12794403420132142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6585, "loss": 0.23276863992214203, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1300106048584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:07] (step=0006585) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.12796346677030704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6586, "loss": 0.24192698299884796, "memory_gb": 7.721559524536133, "step_time_ms": 3371.609926223755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:10] (step=0006586) Train Loss: 0.2394, Train Steps/Sec: 0.28, Epoch: 0.12798289933929266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6587, "loss": 0.3131425380706787, "memory_gb": 7.721559524536133, "step_time_ms": 3371.1202144622803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:14] (step=0006587) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.12800233190827828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6588, "loss": 0.21465033292770386, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2987480163574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:18] (step=0006588) Train Loss: 0.2122, Train Steps/Sec: 0.27, Epoch: 0.1280217644772639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6589, "loss": 0.251219779253006, "memory_gb": 7.721559524536133, "step_time_ms": 3366.497278213501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:21] (step=0006589) Train Loss: 0.2347, 
Train Steps/Sec: 0.28, Epoch: 0.12804119704624953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6590, "loss": 0.29512253403663635, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6695613861084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:25] (step=0006590) Train Loss: 0.2795, Train Steps/Sec: 0.28, Epoch: 0.12806062961523512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6591, "loss": 0.3145085275173187, "memory_gb": 7.721559524536133, "step_time_ms": 3368.990898132324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:29] (step=0006591) Train Loss: 0.2917, Train Steps/Sec: 0.28, Epoch: 0.12808006218422074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6592, "loss": 0.21685713529586792, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2870864868164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:32] (step=0006592) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.12809949475320637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6593, "loss": 0.1856476366519928, "memory_gb": 7.721559524536133, "step_time_ms": 3369.9746131896973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:36] (step=0006593) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.128118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6594, "loss": 0.2254374921321869, "memory_gb": 7.721559524536133, "step_time_ms": 3513.915538787842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:39] (step=0006594) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.1281383598911776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6595, "loss": 
0.2433982789516449, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1229610443115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:43] (step=0006595) Train Loss: 0.2970, Train Steps/Sec: 0.28, Epoch: 0.12815779246016323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6596, "loss": 0.2614437937736511, "memory_gb": 7.721559524536133, "step_time_ms": 3348.839044570923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:47] (step=0006596) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.12817722502914886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6597, "loss": 0.138915553689003, "memory_gb": 7.721559524536133, "step_time_ms": 3369.076728820801, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:50] (step=0006597) Train Loss: 0.1437, Train Steps/Sec: 0.28, Epoch: 0.12819665759813448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6598, "loss": 0.1402842402458191, "memory_gb": 7.721559524536133, "step_time_ms": 3366.433620452881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:54] (step=0006598) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.1282160901671201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:37:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6599, "loss": 0.19929422438144684, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6234951019287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:37:58] (step=0006599) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.12823552273610572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6600, "loss": 0.2724373936653137, "memory_gb": 7.721559524536133, "step_time_ms": 3368.533134460449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:01] (step=0006600) Train 
Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.12825495530509134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6601, "loss": 0.1589992642402649, "memory_gb": 7.721559524536133, "step_time_ms": 3368.71600151062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:05] (step=0006601) Train Loss: 0.1757, Train Steps/Sec: 0.28, Epoch: 0.12827438787407694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6602, "loss": 0.22236500680446625, "memory_gb": 7.721559524536133, "step_time_ms": 3368.579387664795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:08] (step=0006602) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.12829382044306256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6603, "loss": 0.1956976056098938, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1105213165283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:12] (step=0006603) Train Loss: 0.2895, Train Steps/Sec: 0.28, Epoch: 0.12831325301204818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6604, "loss": 0.2730543613433838, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2045001983643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:16] (step=0006604) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.1283326855810338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6605, "loss": 0.2534671723842621, "memory_gb": 7.721559524536133, "step_time_ms": 3367.663621902466, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:19] (step=0006605) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.12835211815001943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6606, 
"loss": 0.2307410091161728, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5318489074707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:23] (step=0006606) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.12837155071900505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6607, "loss": 0.29060158133506775, "memory_gb": 7.721559524536133, "step_time_ms": 3368.389368057251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:27] (step=0006607) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.12839098328799067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6608, "loss": 0.15485823154449463, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7096881866455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:30] (step=0006608) Train Loss: 0.1499, Train Steps/Sec: 0.28, Epoch: 0.1284104158569763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6609, "loss": 0.19504940509796143, "memory_gb": 7.721559524536133, "step_time_ms": 3366.626024246216, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:34] (step=0006609) Train Loss: 0.1927, Train Steps/Sec: 0.28, Epoch: 0.12842984842596192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6610, "loss": 0.1583346426486969, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1738262176514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:37] (step=0006610) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 0.12844928099494754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6611, "loss": 0.27255040407180786, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0998344421387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:41] 
(step=0006611) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.12846871356393316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6612, "loss": 0.18242725729942322, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5897636413574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:45] (step=0006612) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.12848814613291878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6613, "loss": 0.29546451568603516, "memory_gb": 7.721559524536133, "step_time_ms": 3356.379508972168, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:48] (step=0006613) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.12850757870190438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6614, "loss": 0.16759030520915985, "memory_gb": 7.721559524536133, "step_time_ms": 3362.722396850586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:52] (step=0006614) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.12852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6615, "loss": 0.3622553050518036, "memory_gb": 7.721559524536133, "step_time_ms": 3366.121530532837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:56] (step=0006615) Train Loss: 0.2809, Train Steps/Sec: 0.28, Epoch: 0.12854644383987562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:38:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6616, "loss": 0.3041914403438568, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1035766601562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:38:59] (step=0006616) Train Loss: 0.2912, Train Steps/Sec: 0.28, Epoch: 0.12856587640886125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:39:03] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 6617, "loss": 0.202424556016922, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1770820617676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:39:03] (step=0006617) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.12858530897784687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:39:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6618, "loss": 0.316693514585495, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9236946105957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:39:06] (step=0006618) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.1286047415468325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6619, "loss": 0.2476474493741989, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8264923095703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:39:10] (step=0006619) Train Loss: 0.1875, Train Steps/Sec: 0.28, Epoch: 0.1286241741158181, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:39:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6620, "loss": 0.3390882909297943, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6110858917236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:39:14] (step=0006620) Train Loss: 0.2497, Train Steps/Sec: 0.28, Epoch: 0.12864360668480374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6621, "loss": 0.18120023608207703, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1814975738525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:39:17] (step=0006621) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.12866303925378936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6622, "loss": 0.1968761682510376, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9322052001953, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 06:39:21] (step=0006622) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.12868247182277498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6623, "loss": 0.1797133982181549, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4672260284424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:25] (step=0006623) Train Loss: 0.1531, Train Steps/Sec: 0.28, Epoch: 0.1287019043917606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6624, "loss": 0.21217040717601776, "memory_gb": 7.721559524536133, "step_time_ms": 3360.140562057495, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:28] (step=0006624) Train Loss: 0.2050, Train Steps/Sec: 0.28, Epoch: 0.12872133696074622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6625, "loss": 0.17260727286338806, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7655296325684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:32] (step=0006625) Train Loss: 0.1620, Train Steps/Sec: 0.28, Epoch: 0.12874076952973182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6626, "loss": 0.2943154275417328, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0450592041016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:35] (step=0006626) Train Loss: 0.2810, Train Steps/Sec: 0.28, Epoch: 0.12876020209871744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6627, "loss": 0.1620039939880371, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8786125183105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:39] (step=0006627) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.12877963466770306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6628, "loss": 0.17142055928707123, "memory_gb": 7.721559524536133, "step_time_ms": 3358.463764190674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:43] (step=0006628) Train Loss: 0.2035, Train Steps/Sec: 0.28, Epoch: 0.12879906723668869, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6629, "loss": 0.3209148645401001, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7139587402344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:46] (step=0006629) Train Loss: 0.2569, Train Steps/Sec: 0.27, Epoch: 0.1288184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6630, "loss": 0.1525219976902008, "memory_gb": 7.721559524536133, "step_time_ms": 3358.241558074951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:50] (step=0006630) Train Loss: 0.1695, Train Steps/Sec: 0.28, Epoch: 0.12883793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6631, "loss": 0.2892133891582489, "memory_gb": 7.721559524536133, "step_time_ms": 3358.318090438843, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:54] (step=0006631) Train Loss: 0.2353, Train Steps/Sec: 0.28, Epoch: 0.12885736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6632, "loss": 0.24487565457820892, "memory_gb": 7.721559524536133, "step_time_ms": 3359.666347503662, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:39:57] (step=0006632) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 0.12887679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6633, "loss": 0.35199010372161865, "memory_gb": 7.721559524536133, "step_time_ms": 3356.121778488159, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:01] (step=0006633) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.1288962300816168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6634, "loss": 0.2135181427001953, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5705757141113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:05] (step=0006634) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.12891566265060242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6635, "loss": 0.24150505661964417, "memory_gb": 7.721559524536133, "step_time_ms": 3495.9959983825684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:08] (step=0006635) Train Loss: 0.2296, Train Steps/Sec: 0.28, Epoch: 0.12893509521958804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6636, "loss": 0.2718115448951721, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3945503234863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:12] (step=0006636) Train Loss: 0.3100, Train Steps/Sec: 0.28, Epoch: 0.12895452778857364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6637, "loss": 0.24498823285102844, "memory_gb": 7.721559524536133, "step_time_ms": 3359.360933303833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:15] (step=0006637) Train Loss: 0.1809, Train Steps/Sec: 0.28, Epoch: 0.12897396035755926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6638, "loss": 0.21216967701911926, "memory_gb": 7.721559524536133, "step_time_ms": 3358.128070831299, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:19] (step=0006638) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.12899339292654488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6639, "loss": 0.22496478259563446, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0917587280273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:23] (step=0006639) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.1290128254955305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6640, "loss": 0.37140703201293945, "memory_gb": 7.721559524536133, "step_time_ms": 3357.358932495117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:26] (step=0006640) Train Loss: 0.3688, Train Steps/Sec: 0.28, Epoch: 0.12903225806451613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6641, "loss": 0.18962088227272034, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0942878723145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:30] (step=0006641) Train Loss: 0.2796, Train Steps/Sec: 0.28, Epoch: 0.12905169063350175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6642, "loss": 0.13260069489479065, "memory_gb": 7.721559524536133, "step_time_ms": 3355.745792388916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:33] (step=0006642) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.12907112320248737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6643, "loss": 0.11042596399784088, "memory_gb": 7.721559524536133, "step_time_ms": 3354.787826538086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:37] (step=0006643) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.129090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6644, "loss": 0.21636399626731873, "memory_gb": 7.721559524536133, "step_time_ms": 3359.327554702759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:41] (step=0006644) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.12910998834045861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6645, "loss": 0.2103746384382248, "memory_gb": 7.721559524536133, "step_time_ms": 3357.272148132324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:44] (step=0006645) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.12912942090944424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6646, "loss": 0.25434690713882446, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8508834838867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:48] (step=0006646) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.12914885347842986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6647, "loss": 0.20250001549720764, "memory_gb": 7.721559524536133, "step_time_ms": 3360.41259765625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:51] (step=0006647) Train Loss: 0.2122, Train Steps/Sec: 0.28, Epoch: 0.12916828604741548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6648, "loss": 0.3298654556274414, "memory_gb": 7.715639114379883, "step_time_ms": 3325.321912765503, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:55] (step=0006648) Train Loss: 0.3210, Train Steps/Sec: 0.28, Epoch: 0.12918771861640108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:40:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6649, "loss": 0.1900995373725891, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6951026916504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:40:59] (step=0006649) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.1292071511853867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6650, "loss": 0.14543019235134125, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1317405700684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:02] (step=0006650) Train Loss: 0.1948, Train Steps/Sec: 0.28, Epoch: 0.12922658375437232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6651, "loss": 0.2803199291229248, "memory_gb": 7.721559524536133, "step_time_ms": 3358.093023300171, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:06] (step=0006651) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.12924601632335794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6652, "loss": 0.18619927763938904, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4398498535156, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:10] (step=0006652) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.12926544889234357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6653, "loss": 0.20486772060394287, "memory_gb": 7.721559524536133, "step_time_ms": 3360.645055770874, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:13] (step=0006653) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.1292848814613292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6654, "loss": 0.24511437118053436, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7725887298584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:17] (step=0006654) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.1293043140303148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6655, "loss": 0.2473955750465393, "memory_gb": 7.721559524536133, "step_time_ms": 3363.510847091675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:20] (step=0006655) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.12932374659930043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6656, "loss": 0.18326902389526367, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6557636260986, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:24] (step=0006656) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.12934317916828605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6657, "loss": 0.291224867105484, "memory_gb": 7.721559524536133, "step_time_ms": 3364.441394805908, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:28] (step=0006657) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.12936261173727168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6658, "loss": 0.30144041776657104, "memory_gb": 7.715639114379883, "step_time_ms": 3322.0083713531494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:31] (step=0006658) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.1293820443062573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6659, "loss": 0.3056924045085907, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8832569122314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:35] (step=0006659) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.1294014768752429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6660, "loss": 0.3163028955459595, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5525493621826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:38] (step=0006660) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.12942090944422852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6661, "loss": 0.29185497760772705, "memory_gb": 7.721559524536133, "step_time_ms": 3358.330726623535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:42] (step=0006661) Train Loss: 0.2566, Train Steps/Sec: 0.28, Epoch: 0.12944034201321414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6662, "loss": 0.21897083520889282, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6373748779297, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:46] (step=0006662) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.12945977458219976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6663, "loss": 0.27832257747650146, "memory_gb": 7.721559524536133, "step_time_ms": 3365.705728530884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:49] (step=0006663) Train Loss: 0.2939, Train Steps/Sec: 0.28, Epoch: 0.12947920715118538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6664, "loss": 0.25532984733581543, "memory_gb": 7.721559524536133, "step_time_ms": 3369.1439628601074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:53] (step=0006664) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.129498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:41:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6665, "loss": 0.3073006272315979, "memory_gb": 7.721559524536133, "step_time_ms": 3365.885019302368, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:41:57] (step=0006665) Train Loss: 0.3067, Train Steps/Sec: 0.28, Epoch: 0.12951807228915663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6666, "loss": 0.25880950689315796, "memory_gb": 7.721559524536133, "step_time_ms": 3362.905740737915, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:00] (step=0006666) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.12953750485814225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6667, "loss": 0.2872318625450134, "memory_gb": 7.721559524536133, "step_time_ms": 3369.058609008789, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:04] (step=0006667) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.12955693742712787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6668, "loss": 0.3222704529762268, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9915924072266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:07] (step=0006668) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.1295763699961135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6669, "loss": 0.23895229399204254, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2163791656494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:11] (step=0006669) Train Loss: 0.1876, Train Steps/Sec: 0.28, Epoch: 0.12959580256509912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6670, "loss": 0.18468748033046722, "memory_gb": 7.721559524536133, "step_time_ms": 3368.086338043213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:15] (step=0006670) Train Loss: 0.1827, Train Steps/Sec: 0.28, Epoch: 0.12961523513408474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6671, "loss": 0.2331189215183258, "memory_gb": 7.721559524536133, "step_time_ms": 3357.736110687256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:18] (step=0006671) Train Loss: 0.2862, Train Steps/Sec: 0.28, Epoch: 0.12963466770307033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6672, "loss": 0.26008340716362, "memory_gb": 7.721559524536133, "step_time_ms": 3365.542411804199, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:22] (step=0006672) Train Loss: 0.1820, Train Steps/Sec: 0.28, Epoch: 0.12965410027205596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6673, "loss": 0.18758726119995117, "memory_gb": 7.721559524536133, "step_time_ms": 3360.358476638794, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:25] (step=0006673) Train Loss: 0.2156, Train Steps/Sec: 0.28, Epoch: 0.12967353284104158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6674, "loss": 0.20834049582481384, "memory_gb": 7.721559524536133, "step_time_ms": 3368.189811706543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:29] (step=0006674) Train Loss: 0.2207, Train Steps/Sec: 0.28, Epoch: 0.1296929654100272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6675, "loss": 0.2796283960342407, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5625324249268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:33] (step=0006675) Train Loss: 0.2593, Train Steps/Sec: 0.28, Epoch: 0.12971239797901282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6676, "loss": 0.21473851799964905, "memory_gb": 7.721559524536133, "step_time_ms": 3374.2308616638184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:36] (step=0006676) Train Loss: 0.1848, Train Steps/Sec: 0.28, Epoch: 0.12973183054799844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6677, "loss": 0.28992289304733276, "memory_gb": 7.721559524536133, "step_time_ms": 3361.983060836792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:40] (step=0006677) Train Loss: 0.2200, Train Steps/Sec: 0.27, Epoch: 0.12975126311698407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6678, "loss": 0.2022891640663147, "memory_gb": 7.721559524536133, "step_time_ms": 3348.6080169677734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:44] (step=0006678) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.1297706956859697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6679, "loss": 0.2031710147857666, "memory_gb": 7.721559524536133, "step_time_ms": 3372.1840381622314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:47] (step=0006679) Train Loss: 0.2200, Train Steps/Sec: 0.28, Epoch: 0.1297901282549553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6680, "loss": 0.324046790599823, "memory_gb": 7.721559524536133, "step_time_ms": 3368.398666381836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:51] (step=0006680) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.12980956082394093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6681, "loss": 0.34569233655929565, "memory_gb": 7.721559524536133, "step_time_ms": 3369.870662689209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:55] (step=0006681) Train Loss: 0.2752, Train Steps/Sec: 0.28, Epoch: 0.12982899339292656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:42:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6682, "loss": 0.280324786901474, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7233715057373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:42:58] (step=0006682) Train Loss: 0.2767, Train Steps/Sec: 0.28, Epoch: 0.12984842596191218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6683, "loss": 0.15674236416816711, "memory_gb": 7.721559524536133, "step_time_ms": 3504.23526763916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:02] (step=0006683) Train Loss: 0.1544, Train Steps/Sec: 0.28, Epoch: 0.12986785853089777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6684, "loss": 0.24769026041030884, "memory_gb": 7.721559524536133, "step_time_ms": 3359.715700149536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:05] (step=0006684) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.1298872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6685, "loss": 0.15327611565589905, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9794540405273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:09] (step=0006685) Train Loss: 0.1681, Train Steps/Sec: 0.28, Epoch: 0.12990672366886902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6686, "loss": 0.1515834629535675, "memory_gb": 7.721559524536133, "step_time_ms": 3372.1468448638916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:13] (step=0006686) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.12992615623785464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6687, "loss": 0.26293572783470154, "memory_gb": 7.721559524536133, "step_time_ms": 3364.412307739258, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:16] (step=0006687) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.12994558880684026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6688, "loss": 0.17906157672405243, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3032989501953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:20] (step=0006688) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.12996502137582588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6689, "loss": 0.167193204164505, "memory_gb": 7.721559524536133, "step_time_ms": 3374.706506729126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:23] (step=0006689) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.1299844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6690, "loss": 0.13906216621398926, "memory_gb": 7.721559524536133, "step_time_ms": 3376.549243927002, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:27] (step=0006690) Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.13000388651379713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6691, "loss": 0.2919352650642395, "memory_gb": 7.721559524536133, "step_time_ms": 3368.457317352295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:31] (step=0006691) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.13002331908278275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6692, "loss": 0.29349422454833984, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6266765594482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:34] (step=0006692) Train Loss: 0.2523, Train Steps/Sec: 0.28, Epoch: 0.13004275165176837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6693, "loss": 0.3238789439201355, "memory_gb": 7.721559524536133, "step_time_ms": 3381.929397583008, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:38] (step=0006693) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.130062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6694, "loss": 0.14803791046142578, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7216091156006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:42] (step=0006694) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.1300816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6695, "loss": 0.21953412890434265, "memory_gb": 7.721559524536133, "step_time_ms": 3370.7737922668457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:45] (step=0006695) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.1301010493587252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6696, "loss": 0.2296043485403061, "memory_gb": 7.721559524536133, "step_time_ms": 3370.116710662842, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:49] (step=0006696) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.13012048192771083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6697, "loss": 0.26424989104270935, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5780601501465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:52] (step=0006697) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.13013991449669646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:43:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6698, "loss": 0.15272249281406403, "memory_gb": 7.721559524536133, "step_time_ms": 3366.63556098938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:43:56] (step=0006698) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.13015934706568208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6699, "loss": 0.20935234427452087, "memory_gb": 7.721559524536133, "step_time_ms": 3372.4334239959717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:00] (step=0006699) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.1301787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6700, "loss": 0.3646959960460663, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6579303741455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:03] (step=0006700) Train Loss: 0.3151, Train Steps/Sec: 0.28, Epoch: 0.13019821220365332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6701, "loss": 0.21652504801750183, "memory_gb": 7.721559524536133, "step_time_ms": 3369.983911514282, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:07] (step=0006701) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.13021764477263895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6702, "loss": 0.23757697641849518, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8415756225586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:11] (step=0006702) Train Loss: 0.2922, Train Steps/Sec: 0.28, Epoch: 0.13023707734162457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6703, "loss": 0.2249099314212799, "memory_gb": 7.721559524536133, "step_time_ms": 3362.699508666992, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:14] (step=0006703) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.1302565099106102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6704, "loss": 0.14491915702819824, "memory_gb": 7.721559524536133, "step_time_ms": 3366.565227508545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:18] (step=0006704) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.1302759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6705, "loss": 0.21851947903633118, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1629638671875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:21] (step=0006705) Train Loss: 0.2134, Train Steps/Sec: 0.28, Epoch: 0.13029537504858144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6706, "loss": 0.1784420907497406, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7497844696045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:25] (step=0006706) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.13031480761756703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6707, "loss": 0.1483677476644516, "memory_gb": 7.721559524536133, "step_time_ms": 3371.415615081787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:29] (step=0006707) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.13033424018655265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6708, "loss": 0.27138635516166687, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4866733551025, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:32] (step=0006708) Train Loss: 0.2695, Train Steps/Sec: 0.28, Epoch: 0.13035367275553827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6709, "loss": 0.2122039496898651, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2547855377197, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:36] (step=0006709) Train Loss: 0.2467, Train Steps/Sec: 0.28, Epoch: 0.1303731053245239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6710, "loss": 0.19506768882274628, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8596744537354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:39] (step=0006710) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.13039253789350952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6711, "loss": 0.3435339033603668, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2724075317383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:43] (step=0006711) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.13041197046249514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6712, "loss": 0.30777570605278015, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5765991210938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:47] (step=0006712) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.13043140303148076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6713, "loss": 0.33369606733322144, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4325733184814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:50] (step=0006713) Train Loss: 0.2551, Train Steps/Sec: 0.28, Epoch: 0.13045083560046639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6714, "loss": 0.26380079984664917, "memory_gb": 7.721559524536133, "step_time_ms": 3368.565082550049, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:54] (step=0006714) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.130470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:44:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6715, "loss": 0.23190665245056152, "memory_gb": 7.721559524536133, "step_time_ms": 3346.3051319122314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:44:58] (step=0006715) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.13048970073843763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6716, "loss": 0.2644828259944916, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6865100860596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:01] (step=0006716) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.13050913330742325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6717, "loss": 0.3490937352180481, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7445392608643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:05] (step=0006717) Train Loss: 0.3253, Train Steps/Sec: 0.27, Epoch: 0.13052856587640885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6718, "loss": 0.30498212575912476, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2688007354736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:09] (step=0006718) Train Loss: 0.2936, Train Steps/Sec: 0.28, Epoch: 0.13054799844539447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6719, "loss": 0.25845468044281006, "memory_gb": 7.721559524536133, "step_time_ms": 3369.832992553711, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:12] (step=0006719) Train Loss: 0.2441, Train Steps/Sec: 0.28, Epoch: 0.1305674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6720, "loss": 0.18547435104846954, "memory_gb": 7.721559524536133, "step_time_ms": 3357.370376586914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:16] (step=0006720) Train Loss: 0.1954, Train Steps/Sec: 0.28, Epoch: 0.13058686358336571, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6721, "loss": 0.3029820919036865, "memory_gb": 7.721559524536133, "step_time_ms": 3363.722801208496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:19] (step=0006721) Train Loss: 0.2956, Train Steps/Sec: 0.28, Epoch: 0.13060629615235134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6722, "loss": 0.3735567331314087, "memory_gb": 7.721559524536133, "step_time_ms": 3368.117332458496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:23] (step=0006722) Train Loss: 0.2939, Train Steps/Sec: 0.28, Epoch: 0.13062572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6723, "loss": 0.27864372730255127, "memory_gb": 7.721559524536133, "step_time_ms": 3510.9899044036865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:27] (step=0006723) Train Loss: 0.2892, Train Steps/Sec: 0.28, Epoch: 0.13064516129032258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6724, "loss": 0.2927236557006836, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9418144226074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:30] (step=0006724) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.1306645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6725, "loss": 0.32341456413269043, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4729595184326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:34] (step=0006725) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.13068402642829383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6726, "loss": 0.28111109137535095, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8274154663086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:38] (step=0006726) Train Loss: 0.2996, Train Steps/Sec: 0.28, Epoch: 0.13070345899727945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6727, "loss": 0.257496178150177, "memory_gb": 7.721559524536133, "step_time_ms": 3363.79337310791, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:41] (step=0006727) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.13072289156626507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6728, "loss": 0.3335007131099701, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7827377319336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:45] (step=0006728) Train Loss: 0.3058, Train Steps/Sec: 0.28, Epoch: 0.1307423241352507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6729, "loss": 0.21052682399749756, "memory_gb": 7.721559524536133, "step_time_ms": 3359.973192214966, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:48] (step=0006729) Train Loss: 0.2138, Train Steps/Sec: 0.28, Epoch: 0.1307617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6730, "loss": 0.1623510867357254, "memory_gb": 7.721559524536133, "step_time_ms": 3366.816759109497, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:52] (step=0006730) Train Loss: 0.2276, Train Steps/Sec: 0.28, Epoch: 0.1307811892732219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6731, "loss": 0.32537198066711426, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9278621673584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:56] (step=0006731) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.13080062184220753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:45:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6732, "loss": 0.22253714501857758, "memory_gb": 7.721559524536133, "step_time_ms": 3365.791082382202, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:45:59] (step=0006732) Train Loss: 0.2894, Train Steps/Sec: 0.28, Epoch: 0.13082005441119315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:46:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6733, "loss": 0.22523729503154755, "memory_gb": 7.721559524536133, "step_time_ms": 3363.156795501709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:46:03] (step=0006733) Train Loss: 0.1824, Train Steps/Sec: 0.28, Epoch: 0.13083948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:46:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6734, "loss": 0.24845349788665771, "memory_gb": 7.721559524536133, "step_time_ms": 3358.592987060547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:46:06] (step=0006734) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.1308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:46:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6735, "loss": 0.268765926361084, "memory_gb": 7.721559524536133, "step_time_ms": 3362.987279891968, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:46:10] (step=0006735) Train Loss: 0.2734, Train Steps/Sec: 0.28, Epoch: 0.13087835211815002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:46:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6736, "loss": 0.24458648264408112, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5524768829346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:46:14] (step=0006736) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.13089778468713564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:46:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6737, "loss": 0.237218976020813, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2869091033936,
"trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:17] (step=0006737) Train Loss: 0.2168, Train Steps/Sec: 0.28, Epoch: 0.13091721725612127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6738, "loss": 0.1615264117717743, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4183406829834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:21] (step=0006738) Train Loss: 0.1697, Train Steps/Sec: 0.28, Epoch: 0.1309366498251069, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6739, "loss": 0.23998501896858215, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5292987823486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:25] (step=0006739) Train Loss: 0.1911, Train Steps/Sec: 0.28, Epoch: 0.1309560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6740, "loss": 0.2606801986694336, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7651767730713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:28] (step=0006740) Train Loss: 0.2971, Train Steps/Sec: 0.28, Epoch: 0.13097551496307813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6741, "loss": 0.21345697343349457, "memory_gb": 7.721559524536133, "step_time_ms": 3365.651845932007, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:32] (step=0006741) Train Loss: 0.2450, Train Steps/Sec: 0.28, Epoch: 0.13099494753206373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6742, "loss": 0.18947733938694, "memory_gb": 7.721559524536133, "step_time_ms": 3363.813877105713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:35] (step=0006742) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.13101438010104935, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 06:46:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6743, "loss": 0.2203313708305359, "memory_gb": 7.721559524536133, "step_time_ms": 3364.459991455078, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:39] (step=0006743) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.13103381267003497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6744, "loss": 0.22518864274024963, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8579845428467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:43] (step=0006744) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.1310532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6745, "loss": 0.23736584186553955, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9605350494385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:46] (step=0006745) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.13107267780800622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6746, "loss": 0.1894916594028473, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3201122283936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:50] (step=0006746) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.13109211037699184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6747, "loss": 0.23530958592891693, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1247119903564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:53] (step=0006747) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.13111154294597746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:46:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6748, "loss": 0.18311557173728943, "memory_gb": 7.721559524536133, "step_time_ms": 
3356.468677520752, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:46:57] (step=0006748) Train Loss: 0.2426, Train Steps/Sec: 0.28, Epoch: 0.13113097551496308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6749, "loss": 0.275766521692276, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6199703216553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:01] (step=0006749) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.1311504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6750, "loss": 0.28641337156295776, "memory_gb": 7.721559524536133, "step_time_ms": 3365.957021713257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:04] (step=0006750) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.13116984065293433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6751, "loss": 0.17538145184516907, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3273372650146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:08] (step=0006751) Train Loss: 0.2058, Train Steps/Sec: 0.28, Epoch: 0.13118927322191995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6752, "loss": 0.23429563641548157, "memory_gb": 7.721559524536133, "step_time_ms": 3364.140748977661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:12] (step=0006752) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.13120870579090554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6753, "loss": 0.34375447034835815, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1442527770996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:15] (step=0006753) Train Loss: 0.3370, Train Steps/Sec: 0.28, Epoch: 0.13122813835989117, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6754, "loss": 0.1252095103263855, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4160289764404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:19] (step=0006754) Train Loss: 0.1819, Train Steps/Sec: 0.28, Epoch: 0.1312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6755, "loss": 0.262661874294281, "memory_gb": 7.721559524536133, "step_time_ms": 3362.97869682312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:22] (step=0006755) Train Loss: 0.2167, Train Steps/Sec: 0.28, Epoch: 0.1312670034978624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6756, "loss": 0.22331872582435608, "memory_gb": 7.721559524536133, "step_time_ms": 3363.68465423584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:26] (step=0006756) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.13128643606684803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6757, "loss": 0.1825275719165802, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2962703704834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:30] (step=0006757) Train Loss: 0.1794, Train Steps/Sec: 0.28, Epoch: 0.13130586863583366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6758, "loss": 0.2422563135623932, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1927757263184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:33] (step=0006758) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.13132530120481928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6759, "loss": 0.24448184669017792, "memory_gb": 7.721559524536133, 
"step_time_ms": 3359.280586242676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:37] (step=0006759) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.1313447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6760, "loss": 0.31662648916244507, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9134426116943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:41] (step=0006760) Train Loss: 0.2908, Train Steps/Sec: 0.28, Epoch: 0.13136416634279052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6761, "loss": 0.19744402170181274, "memory_gb": 7.721559524536133, "step_time_ms": 3364.52579498291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:44] (step=0006761) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.13138359891177614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6762, "loss": 0.2683364748954773, "memory_gb": 7.721559524536133, "step_time_ms": 3367.161273956299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:48] (step=0006762) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.13140303148076177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6763, "loss": 0.25641244649887085, "memory_gb": 7.721559524536133, "step_time_ms": 3365.245819091797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:51] (step=0006763) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.1314224640497474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6764, "loss": 0.2063782811164856, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9627952575684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:55] (step=0006764) Train Loss: 0.2358, Train Steps/Sec: 0.27, Epoch: 
0.13144189661873298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:47:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6765, "loss": 0.29028239846229553, "memory_gb": 7.721559524536133, "step_time_ms": 3498.9936351776123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:47:59] (step=0006765) Train Loss: 0.3084, Train Steps/Sec: 0.28, Epoch: 0.1314613291877186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6766, "loss": 0.2781101167201996, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1293773651123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:02] (step=0006766) Train Loss: 0.2727, Train Steps/Sec: 0.28, Epoch: 0.13148076175670423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6767, "loss": 0.16265729069709778, "memory_gb": 7.721559524536133, "step_time_ms": 3356.931686401367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:06] (step=0006767) Train Loss: 0.1723, Train Steps/Sec: 0.28, Epoch: 0.13150019432568985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6768, "loss": 0.19763189554214478, "memory_gb": 7.721559524536133, "step_time_ms": 3359.92693901062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:10] (step=0006768) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.13151962689467547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6769, "loss": 0.22618718445301056, "memory_gb": 7.721559524536133, "step_time_ms": 3365.70405960083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:13] (step=0006769) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.1315390594636611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6770, "loss": 0.2509438991546631, "memory_gb": 
7.721559524536133, "step_time_ms": 3359.039783477783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:17] (step=0006770) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.13155849203264672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6771, "loss": 0.09894368797540665, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0014610290527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:20] (step=0006771) Train Loss: 0.1806, Train Steps/Sec: 0.28, Epoch: 0.13157792460163234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6772, "loss": 0.3515332043170929, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1559867858887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:24] (step=0006772) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.13159735717061796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6773, "loss": 0.33082225918769836, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3204021453857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:28] (step=0006773) Train Loss: 0.3197, Train Steps/Sec: 0.28, Epoch: 0.13161678973960358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6774, "loss": 0.1960734724998474, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7898178100586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:31] (step=0006774) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.1316362223085892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6775, "loss": 0.17371580004692078, "memory_gb": 7.721559524536133, "step_time_ms": 3369.400978088379, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:35] (step=0006775) Train Loss: 0.2102, Train 
Steps/Sec: 0.28, Epoch: 0.13165565487757483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6776, "loss": 0.21857769787311554, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4629764556885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:39] (step=0006776) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.13167508744656042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6777, "loss": 0.309566855430603, "memory_gb": 7.721559524536133, "step_time_ms": 3369.781970977783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:42] (step=0006777) Train Loss: 0.3044, Train Steps/Sec: 0.28, Epoch: 0.13169452001554605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6778, "loss": 0.2277885228395462, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4774169921875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:46] (step=0006778) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.13171395258453167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6779, "loss": 0.13611790537834167, "memory_gb": 7.721559524536133, "step_time_ms": 3359.239339828491, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:49] (step=0006779) Train Loss: 0.2272, Train Steps/Sec: 0.28, Epoch: 0.1317333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6780, "loss": 0.34800630807876587, "memory_gb": 7.721559524536133, "step_time_ms": 3370.134115219116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:53] (step=0006780) Train Loss: 0.2654, Train Steps/Sec: 0.28, Epoch: 0.1317528177225029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:48:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6781, "loss": 
0.17362481355667114, "memory_gb": 7.721559524536133, "step_time_ms": 3375.5693435668945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:48:57] (step=0006781) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.13177225029148854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6782, "loss": 0.2311207354068756, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8016872406006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:00] (step=0006782) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.13179168286047416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6783, "loss": 0.1931319236755371, "memory_gb": 7.721559524536133, "step_time_ms": 3369.128942489624, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:04] (step=0006783) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.13181111542945978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6784, "loss": 0.3401751220226288, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9984130859375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:08] (step=0006784) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.1318305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6785, "loss": 0.3058881461620331, "memory_gb": 7.721559524536133, "step_time_ms": 3364.758014678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:11] (step=0006785) Train Loss: 0.3287, Train Steps/Sec: 0.28, Epoch: 0.13184998056743102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6786, "loss": 0.2625405788421631, "memory_gb": 7.721559524536133, "step_time_ms": 3370.164155960083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:15] (step=0006786) Train 
Loss: 0.2473, Train Steps/Sec: 0.28, Epoch: 0.13186941313641665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6787, "loss": 0.2624412477016449, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7962760925293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:18] (step=0006787) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.13188884570540224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6788, "loss": 0.16723483800888062, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8624629974365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:22] (step=0006788) Train Loss: 0.2017, Train Steps/Sec: 0.28, Epoch: 0.13190827827438786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6789, "loss": 0.2752035856246948, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8313961029053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:26] (step=0006789) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.13192771084337349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6790, "loss": 0.236921489238739, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9916229248047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:29] (step=0006790) Train Loss: 0.2426, Train Steps/Sec: 0.28, Epoch: 0.1319471434123591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6791, "loss": 0.27564698457717896, "memory_gb": 7.721559524536133, "step_time_ms": 3369.770050048828, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:33] (step=0006791) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.13196657598134473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 
6792, "loss": 0.30134186148643494, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8712120056152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:36] (step=0006792) Train Loss: 0.2934, Train Steps/Sec: 0.28, Epoch: 0.13198600855033035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6793, "loss": 0.19637587666511536, "memory_gb": 7.721559524536133, "step_time_ms": 3372.279644012451, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:40] (step=0006793) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.13200544111931597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6794, "loss": 0.18337799608707428, "memory_gb": 7.721559524536133, "step_time_ms": 3364.27903175354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:44] (step=0006794) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.1320248736883016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6795, "loss": 0.13538295030593872, "memory_gb": 7.721559524536133, "step_time_ms": 3362.370491027832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:47] (step=0006795) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.13204430625728722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6796, "loss": 0.3026811480522156, "memory_gb": 7.721559524536133, "step_time_ms": 3374.986171722412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:51] (step=0006796) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.13206373882627284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6797, "loss": 0.34467029571533203, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9777641296387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:55] 
(step=0006797) Train Loss: 0.2940, Train Steps/Sec: 0.28, Epoch: 0.13208317139525846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:49:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6798, "loss": 0.2819812297821045, "memory_gb": 7.721559524536133, "step_time_ms": 3374.453067779541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:49:58] (step=0006798) Train Loss: 0.2829, Train Steps/Sec: 0.28, Epoch: 0.1321026039642441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6799, "loss": 0.2610049247741699, "memory_gb": 7.721559524536133, "step_time_ms": 3372.434377670288, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:02] (step=0006799) Train Loss: 0.3254, Train Steps/Sec: 0.28, Epoch: 0.13212203653322968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6800, "loss": 0.3077298402786255, "memory_gb": 7.721559524536133, "step_time_ms": 3377.4313926696777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:05] (step=0006800) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.1321414691022153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6801, "loss": 0.22287677228450775, "memory_gb": 7.721559524536133, "step_time_ms": 3374.781370162964, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:09] (step=0006801) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.13216090167120093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6802, "loss": 0.19778066873550415, "memory_gb": 7.721559524536133, "step_time_ms": 3375.3223419189453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:13] (step=0006802) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.13218033424018655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:16] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 6803, "loss": 0.24612925946712494, "memory_gb": 7.721559524536133, "step_time_ms": 3372.957468032837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:16] (step=0006803) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.13219976680917217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6804, "loss": 0.2258603721857071, "memory_gb": 7.721559524536133, "step_time_ms": 3372.1659183502197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:20] (step=0006804) Train Loss: 0.2114, Train Steps/Sec: 0.28, Epoch: 0.1322191993781578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6805, "loss": 0.17518775165081024, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6300563812256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:24] (step=0006805) Train Loss: 0.2321, Train Steps/Sec: 0.27, Epoch: 0.13223863194714341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6806, "loss": 0.24009883403778076, "memory_gb": 7.721559524536133, "step_time_ms": 3373.2893466949463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:27] (step=0006806) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.13225806451612904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6807, "loss": 0.30565667152404785, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5109825134277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:31] (step=0006807) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.13227749708511466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6808, "loss": 0.20530100166797638, "memory_gb": 7.721559524536133, "step_time_ms": 3372.978687286377, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 06:50:35] (step=0006808) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.13229692965410028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6809, "loss": 0.30363985896110535, "memory_gb": 7.721559524536133, "step_time_ms": 3371.9840049743652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:38] (step=0006809) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.1323163622230859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6810, "loss": 0.22018073499202728, "memory_gb": 7.721559524536133, "step_time_ms": 3365.997076034546, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:42] (step=0006810) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.1323357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6811, "loss": 0.26325392723083496, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9724884033203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:45] (step=0006811) Train Loss: 0.2640, Train Steps/Sec: 0.28, Epoch: 0.13235522736105712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6812, "loss": 0.24724651873111725, "memory_gb": 7.721559524536133, "step_time_ms": 3519.427537918091, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:49] (step=0006812) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.13237465993004274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:50:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6813, "loss": 0.34535178542137146, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8202838897705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:53] (step=0006813) Train Loss: 0.3019, Train Steps/Sec: 0.28, Epoch: 0.13239409249902837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
06:50:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6814, "loss": 0.3110569417476654, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8478469848633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:50:56] (step=0006814) Train Loss: 0.3061, Train Steps/Sec: 0.28, Epoch: 0.132413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6815, "loss": 0.2367979735136032, "memory_gb": 7.721559524536133, "step_time_ms": 3362.274169921875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:00] (step=0006815) Train Loss: 0.2913, Train Steps/Sec: 0.28, Epoch: 0.1324329576369996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6816, "loss": 0.2830163240432739, "memory_gb": 7.721559524536133, "step_time_ms": 3350.77166557312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:03] (step=0006816) Train Loss: 0.2707, Train Steps/Sec: 0.28, Epoch: 0.13245239020598523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6817, "loss": 0.17855215072631836, "memory_gb": 7.721559524536133, "step_time_ms": 3370.180606842041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:07] (step=0006817) Train Loss: 0.1680, Train Steps/Sec: 0.28, Epoch: 0.13247182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6818, "loss": 0.18952831625938416, "memory_gb": 7.721559524536133, "step_time_ms": 3370.983123779297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:11] (step=0006818) Train Loss: 0.2042, Train Steps/Sec: 0.28, Epoch: 0.13249125534395648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6819, "loss": 0.25063201785087585, "memory_gb": 7.721559524536133, "step_time_ms": 3367.107629776001, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 06:51:14] (step=0006819) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.1325106879129421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6820, "loss": 0.23695839941501617, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5797080993652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:18] (step=0006820) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.13253012048192772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6821, "loss": 0.2646591365337372, "memory_gb": 7.721559524536133, "step_time_ms": 3368.443250656128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:22] (step=0006821) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.13254955305091334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6822, "loss": 0.32348817586898804, "memory_gb": 7.721559524536133, "step_time_ms": 3355.748176574707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:25] (step=0006822) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.13256898561989894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6823, "loss": 0.2559756636619568, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0637817382812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:29] (step=0006823) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.13258841818888456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6824, "loss": 0.24314507842063904, "memory_gb": 7.721559524536133, "step_time_ms": 3366.487979888916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:32] (step=0006824) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.13260785075787018, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 06:51:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6825, "loss": 0.31286489963531494, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9800033569336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:36] (step=0006825) Train Loss: 0.2873, Train Steps/Sec: 0.28, Epoch: 0.1326272833268558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6826, "loss": 0.20132680237293243, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3265705108643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:40] (step=0006826) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.13264671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6827, "loss": 0.30829161405563354, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1602993011475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:43] (step=0006827) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.13266614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6828, "loss": 0.2726701498031616, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0943298339844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:47] (step=0006828) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.13268558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6829, "loss": 0.19122591614723206, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5400886535645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:50] (step=0006829) Train Loss: 0.1867, Train Steps/Sec: 0.28, Epoch: 0.1327050136027983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6830, "loss": 0.2038639783859253, "memory_gb": 7.721559524536133, "step_time_ms": 
3364.7148609161377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:54] (step=0006830) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.13272444617178392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:51:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6831, "loss": 0.22762653231620789, "memory_gb": 7.721559524536133, "step_time_ms": 3366.837501525879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:51:58] (step=0006831) Train Loss: 0.2619, Train Steps/Sec: 0.28, Epoch: 0.13274387874076954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6832, "loss": 0.22836646437644958, "memory_gb": 7.721559524536133, "step_time_ms": 3367.088556289673, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:01] (step=0006832) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.13276331130975516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6833, "loss": 0.3467302918434143, "memory_gb": 7.721559524536133, "step_time_ms": 3364.143133163452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:05] (step=0006833) Train Loss: 0.2538, Train Steps/Sec: 0.28, Epoch: 0.13278274387874078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6834, "loss": 0.26217442750930786, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2052154541016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:09] (step=0006834) Train Loss: 0.2677, Train Steps/Sec: 0.28, Epoch: 0.13280217644772638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6835, "loss": 0.2600312829017639, "memory_gb": 7.721559524536133, "step_time_ms": 3349.0688800811768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:12] (step=0006835) Train Loss: 0.2341, Train Steps/Sec: 0.28, Epoch: 0.132821609016712, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6836, "loss": 0.17343951761722565, "memory_gb": 7.721559524536133, "step_time_ms": 3364.622116088867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:16] (step=0006836) Train Loss: 0.1836, Train Steps/Sec: 0.28, Epoch: 0.13284104158569762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6837, "loss": 0.20085424184799194, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0600910186768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:19] (step=0006837) Train Loss: 0.1537, Train Steps/Sec: 0.28, Epoch: 0.13286047415468324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6838, "loss": 0.30996888875961304, "memory_gb": 7.721559524536133, "step_time_ms": 3362.489938735962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:23] (step=0006838) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.13287990672366887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6839, "loss": 0.21199846267700195, "memory_gb": 7.721559524536133, "step_time_ms": 3365.370750427246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:27] (step=0006839) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.1328993392926545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6840, "loss": 0.17200934886932373, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8517742156982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:30] (step=0006840) Train Loss: 0.1484, Train Steps/Sec: 0.28, Epoch: 0.1329187718616401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6841, "loss": 0.25793132185935974, "memory_gb": 7.721559524536133, 
"step_time_ms": 3362.3037338256836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:34] (step=0006841) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.13293820443062573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6842, "loss": 0.2541603446006775, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1902256011963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:37] (step=0006842) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.13295763699961136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6843, "loss": 0.18409496545791626, "memory_gb": 7.721559524536133, "step_time_ms": 3359.851837158203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:41] (step=0006843) Train Loss: 0.1905, Train Steps/Sec: 0.28, Epoch: 0.13297706956859698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6844, "loss": 0.2841774821281433, "memory_gb": 7.721559524536133, "step_time_ms": 3367.573022842407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:45] (step=0006844) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.1329965021375826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6845, "loss": 0.22329241037368774, "memory_gb": 7.721559524536133, "step_time_ms": 3358.631134033203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:48] (step=0006845) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.1330159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6846, "loss": 0.19375115633010864, "memory_gb": 7.721559524536133, "step_time_ms": 3355.501651763916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:52] (step=0006846) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 
0.13303536727555382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6847, "loss": 0.22773993015289307, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6833686828613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:56] (step=0006847) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.13305479984453944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:52:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6848, "loss": 0.2853873670101166, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3522357940674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:52:59] (step=0006848) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.13307423241352506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6849, "loss": 0.24250172078609467, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6996746063232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:03] (step=0006849) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.13309366498251068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6850, "loss": 0.2074294537305832, "memory_gb": 7.721559524536133, "step_time_ms": 3354.156255722046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:06] (step=0006850) Train Loss: 0.1900, Train Steps/Sec: 0.28, Epoch: 0.1331130975514963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6851, "loss": 0.1896083652973175, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4908714294434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:10] (step=0006851) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.13313253012048193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6852, "loss": 0.2408023178577423, 
"memory_gb": 7.721559524536133, "step_time_ms": 3347.5284576416016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:14] (step=0006852) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.13315196268946755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6853, "loss": 0.2449919432401657, "memory_gb": 7.721559524536133, "step_time_ms": 3495.4686164855957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:17] (step=0006853) Train Loss: 0.1949, Train Steps/Sec: 0.27, Epoch: 0.13317139525845317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6854, "loss": 0.22831805050373077, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0620708465576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:21] (step=0006854) Train Loss: 0.1941, Train Steps/Sec: 0.28, Epoch: 0.1331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6855, "loss": 0.33058086037635803, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6975288391113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:25] (step=0006855) Train Loss: 0.2276, Train Steps/Sec: 0.28, Epoch: 0.13321026039642442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6856, "loss": 0.22384537756443024, "memory_gb": 7.721559524536133, "step_time_ms": 3359.222650527954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:28] (step=0006856) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.13322969296541004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6857, "loss": 0.3364071547985077, "memory_gb": 7.721559524536133, "step_time_ms": 3358.461380004883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:32] (step=0006857) Train Loss: 0.2949, 
Train Steps/Sec: 0.28, Epoch: 0.13324912553439563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6858, "loss": 0.19429154694080353, "memory_gb": 7.721559524536133, "step_time_ms": 3353.013753890991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:35] (step=0006858) Train Loss: 0.1886, Train Steps/Sec: 0.28, Epoch: 0.13326855810338126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6859, "loss": 0.21500453352928162, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3289127349854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:39] (step=0006859) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.13328799067236688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6860, "loss": 0.373737096786499, "memory_gb": 7.721559524536133, "step_time_ms": 3341.855049133301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:43] (step=0006860) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.1333074232413525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6861, "loss": 0.15419337153434753, "memory_gb": 7.721559524536133, "step_time_ms": 3355.168104171753, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:46] (step=0006861) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.13332685581033812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6862, "loss": 0.14537236094474792, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3545894622803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:50] (step=0006862) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.13334628837932375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6863, "loss": 
0.1915922313928604, "memory_gb": 7.721559524536133, "step_time_ms": 3357.980251312256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:54] (step=0006863) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.13336572094830937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:53:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6864, "loss": 0.23170675337314606, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7542304992676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:53:57] (step=0006864) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.133385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6865, "loss": 0.1734502911567688, "memory_gb": 7.721559524536133, "step_time_ms": 3362.185001373291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:01] (step=0006865) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.1334045860862806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6866, "loss": 0.31913453340530396, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1816425323486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:04] (step=0006866) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.13342401865526624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6867, "loss": 0.1801593005657196, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5802059173584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:08] (step=0006867) Train Loss: 0.1730, Train Steps/Sec: 0.28, Epoch: 0.13344345122425186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6868, "loss": 0.20250143110752106, "memory_gb": 7.721559524536133, "step_time_ms": 3359.375238418579, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:12] (step=0006868) Train 
Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.13346288379323745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6869, "loss": 0.19733168184757233, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4650592803955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:15] (step=0006869) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.13348231636222307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6870, "loss": 0.2925288677215576, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4517498016357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:19] (step=0006870) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.1335017489312087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6871, "loss": 0.18360909819602966, "memory_gb": 7.721559524536133, "step_time_ms": 3357.248067855835, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:22] (step=0006871) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.13352118150019432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6872, "loss": 0.1473412811756134, "memory_gb": 7.721559524536133, "step_time_ms": 3359.260082244873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:26] (step=0006872) Train Loss: 0.1365, Train Steps/Sec: 0.28, Epoch: 0.13354061406917994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6873, "loss": 0.1721777766942978, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2494983673096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:30] (step=0006873) Train Loss: 0.1864, Train Steps/Sec: 0.28, Epoch: 0.13356004663816556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 
6874, "loss": 0.2538018226623535, "memory_gb": 7.721559524536133, "step_time_ms": 3353.792667388916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:33] (step=0006874) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.13357947920715119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6875, "loss": 0.23848555982112885, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5684299468994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:37] (step=0006875) Train Loss: 0.2078, Train Steps/Sec: 0.28, Epoch: 0.1335989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6876, "loss": 0.20537523925304413, "memory_gb": 7.721559524536133, "step_time_ms": 3358.346700668335, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:40] (step=0006876) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.13361834434512243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6877, "loss": 0.2046351134777069, "memory_gb": 7.721559524536133, "step_time_ms": 3356.884479522705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:44] (step=0006877) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.13363777691410805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6878, "loss": 0.2468978464603424, "memory_gb": 7.721559524536133, "step_time_ms": 3359.168767929077, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:48] (step=0006878) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.13365720948309368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6879, "loss": 0.3279421925544739, "memory_gb": 7.721559524536133, "step_time_ms": 3339.587688446045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:51] 
(step=0006879) Train Loss: 0.3169, Train Steps/Sec: 0.28, Epoch: 0.1336766420520793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6880, "loss": 0.23524408042430878, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3191890716553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:55] (step=0006880) Train Loss: 0.2747, Train Steps/Sec: 0.28, Epoch: 0.1336960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:54:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6881, "loss": 0.27120763063430786, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2521934509277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:54:58] (step=0006881) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.13371550719005051, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6882, "loss": 0.2344091534614563, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2653274536133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:02] (step=0006882) Train Loss: 0.2914, Train Steps/Sec: 0.28, Epoch: 0.13373493975903614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6883, "loss": 0.1532149314880371, "memory_gb": 7.721559524536133, "step_time_ms": 3358.05606842041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:05] (step=0006883) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.13375437232802176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6884, "loss": 0.16174408793449402, "memory_gb": 7.721559524536133, "step_time_ms": 3357.844591140747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:09] (step=0006884) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.13377380489700738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:13] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 6885, "loss": 0.1571645885705948, "memory_gb": 7.721559524536133, "step_time_ms": 3351.8929481506348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:13] (step=0006885) Train Loss: 0.1536, Train Steps/Sec: 0.28, Epoch: 0.133793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6886, "loss": 0.24828051030635834, "memory_gb": 7.721559524536133, "step_time_ms": 3357.83314704895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:16] (step=0006886) Train Loss: 0.2887, Train Steps/Sec: 0.28, Epoch: 0.13381267003497863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6887, "loss": 0.23328536748886108, "memory_gb": 7.721559524536133, "step_time_ms": 3357.414960861206, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:20] (step=0006887) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.13383210260396425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6888, "loss": 0.2684633731842041, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2959175109863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:23] (step=0006888) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.13385153517294987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6889, "loss": 0.17303021252155304, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8802814483643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:27] (step=0006889) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.1338709677419355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6890, "loss": 0.22127379477024078, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4931106567383, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 06:55:31] (step=0006890) Train Loss: 0.1748, Train Steps/Sec: 0.28, Epoch: 0.13389040031092111, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6891, "loss": 0.2527441382408142, "memory_gb": 7.721559524536133, "step_time_ms": 3356.837749481201, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:34] (step=0006891) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.13390983287990674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6892, "loss": 0.23186299204826355, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7046604156494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:38] (step=0006892) Train Loss: 0.2886, Train Steps/Sec: 0.28, Epoch: 0.13392926544889233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6893, "loss": 0.2740837335586548, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0434017181396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:42] (step=0006893) Train Loss: 0.2638, Train Steps/Sec: 0.27, Epoch: 0.13394869801787795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6894, "loss": 0.19420209527015686, "memory_gb": 7.721559524536133, "step_time_ms": 3365.664482116699, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:45] (step=0006894) Train Loss: 0.1705, Train Steps/Sec: 0.28, Epoch: 0.13396813058686358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 6895, "loss": 0.21699002385139465, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7865314483643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:49] (step=0006895) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.1339875631558492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:52] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 6896, "loss": 0.20011135935783386, "memory_gb": 7.721559524536133, "step_time_ms": 3364.766836166382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:52] (step=0006896) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.13400699572483482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:55:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6897, "loss": 0.20113268494606018, "memory_gb": 7.721559524536133, "step_time_ms": 3369.381904602051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:55:56] (step=0006897) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.13402642829382044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6898, "loss": 0.2518314719200134, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0035133361816, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:00] (step=0006898) Train Loss: 0.2881, Train Steps/Sec: 0.28, Epoch: 0.13404586086280607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6899, "loss": 0.30598771572113037, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0007972717285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:03] (step=0006899) Train Loss: 0.2987, Train Steps/Sec: 0.28, Epoch: 0.1340652934317917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 6900, "loss": 0.2738143503665924, "memory_gb": 7.721559524536133, "step_time_ms": 3373.194932937622, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:07] (step=0006900) Train Loss: 0.2838, Train Steps/Sec: 0.28, Epoch: 0.1340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6901, "loss": 0.28251761198043823, "memory_gb": 7.721559524536133, "step_time_ms": 3519.200086593628, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 06:56:10] (step=0006901) Train Loss: 0.3143, Train Steps/Sec: 0.28, Epoch: 0.13410415856976293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6902, "loss": 0.2405059039592743, "memory_gb": 7.721559524536133, "step_time_ms": 3372.884750366211, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:14] (step=0006902) Train Loss: 0.2794, Train Steps/Sec: 0.28, Epoch: 0.13412359113874855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 6903, "loss": 0.2837013900279999, "memory_gb": 7.721559524536133, "step_time_ms": 3373.4824657440186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:18] (step=0006903) Train Loss: 0.2755, Train Steps/Sec: 0.28, Epoch: 0.13414302370773415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6904, "loss": 0.2692525088787079, "memory_gb": 7.721559524536133, "step_time_ms": 3368.396520614624, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:21] (step=0006904) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.13416245627671977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6905, "loss": 0.3244357705116272, "memory_gb": 7.721559524536133, "step_time_ms": 3373.4889030456543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:25] (step=0006905) Train Loss: 0.3548, Train Steps/Sec: 0.28, Epoch: 0.1341818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 06:56:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6906, "loss": 0.2255009412765503, "memory_gb": 7.721559524536133, "step_time_ms": 3366.563081741333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 06:56:29] (step=0006906) Train Loss: 0.2497, Train Steps/Sec: 0.28, Epoch: 0.13420132141469102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 06:56:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6907, "loss": 0.36392438411712646, "memory_gb": 7.721559524536133, "step_time_ms": 3368.640899658203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:32] (step=0006907) Train Loss: 0.3087, Train Steps/Sec: 0.28, Epoch: 0.13422075398367664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:56:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 6908, "loss": 0.16242341697216034, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4493350982666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:36] (step=0006908) Train Loss: 0.1409, Train Steps/Sec: 0.28, Epoch: 0.13424018655266226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:56:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6909, "loss": 0.1703653335571289, "memory_gb": 7.721559524536133, "step_time_ms": 3368.133544921875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:39] (step=0006909) Train Loss: 0.2638, Train Steps/Sec: 0.28, Epoch: 0.13425961912164788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:56:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 6910, "loss": 0.1937083601951599, "memory_gb": 7.721559524536133, "step_time_ms": 3365.157127380371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:43] (step=0006910) Train Loss: 0.1935, Train Steps/Sec: 0.28, Epoch: 0.1342790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:56:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6911, "loss": 0.2696555256843567, "memory_gb": 7.715639114379883, "step_time_ms": 3342.283010482788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:47] (step=0006911) Train Loss: 0.2977, Train Steps/Sec: 0.28, Epoch: 0.13429848425961913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:56:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6912, "loss": 0.23578429222106934, "memory_gb": 7.721559524536133, "step_time_ms": 3367.72084236145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:50] (step=0006912) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.13431791682860475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:56:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 6913, "loss": 0.18068450689315796, "memory_gb": 7.721559524536133, "step_time_ms": 3370.34273147583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:54] (step=0006913) Train Loss: 0.1725, Train Steps/Sec: 0.28, Epoch: 0.13433734939759037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:56:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6914, "loss": 0.2444664090871811, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0086345672607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:56:57] (step=0006914) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.134356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 6915, "loss": 0.33350005745887756, "memory_gb": 7.721559524536133, "step_time_ms": 3369.110107421875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:01] (step=0006915) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.1343762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6916, "loss": 0.26599282026290894, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0803985595703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:05] (step=0006916) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.1343956471045472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6917, "loss": 0.3225584924221039, "memory_gb": 7.721559524536133, "step_time_ms": 3368.333101272583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:08] (step=0006917) Train Loss: 0.2872, Train Steps/Sec: 0.28, Epoch: 0.13441507967353283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6918, "loss": 0.15512745082378387, "memory_gb": 7.721559524536133, "step_time_ms": 3371.2079524993896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:12] (step=0006918) Train Loss: 0.1760, Train Steps/Sec: 0.28, Epoch: 0.13443451224251846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6919, "loss": 0.24490398168563843, "memory_gb": 7.721559524536133, "step_time_ms": 3370.258331298828, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:16] (step=0006919) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.13445394481150408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6920, "loss": 0.2950994372367859, "memory_gb": 7.721559524536133, "step_time_ms": 3370.137929916382, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:19] (step=0006920) Train Loss: 0.2799, Train Steps/Sec: 0.28, Epoch: 0.1344733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6921, "loss": 0.1760956346988678, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7450675964355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:23] (step=0006921) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.13449280994947532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6922, "loss": 0.18216077983379364, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9826259613037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:26] (step=0006922) Train Loss: 0.1912, Train Steps/Sec: 0.28, Epoch: 0.13451224251846094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 6923, "loss": 0.24133451282978058, "memory_gb": 7.721559524536133, "step_time_ms": 3370.488405227661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:30] (step=0006923) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.13453167508744657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6924, "loss": 0.26715511083602905, "memory_gb": 7.721559524536133, "step_time_ms": 3371.730327606201, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:34] (step=0006924) Train Loss: 0.2855, Train Steps/Sec: 0.28, Epoch: 0.1345511076564322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6925, "loss": 0.2081824243068695, "memory_gb": 7.721559524536133, "step_time_ms": 3372.3621368408203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:37] (step=0006925) Train Loss: 0.1806, Train Steps/Sec: 0.28, Epoch: 0.1345705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6926, "loss": 0.15858277678489685, "memory_gb": 7.721559524536133, "step_time_ms": 3371.591806411743, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:41] (step=0006926) Train Loss: 0.1701, Train Steps/Sec: 0.28, Epoch: 0.1345899727944034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6927, "loss": 0.1610957384109497, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7195987701416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:44] (step=0006927) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.13460940536338903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6928, "loss": 0.2084672451019287, "memory_gb": 7.721559524536133, "step_time_ms": 3373.890161514282, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:48] (step=0006928) Train Loss: 0.2590, Train Steps/Sec: 0.28, Epoch: 0.13462883793237465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6929, "loss": 0.22334103286266327, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8224601745605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:52] (step=0006929) Train Loss: 0.1987, Train Steps/Sec: 0.28, Epoch: 0.13464827050136027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6930, "loss": 0.30460530519485474, "memory_gb": 7.721559524536133, "step_time_ms": 3365.492343902588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:55] (step=0006930) Train Loss: 0.2982, Train Steps/Sec: 0.28, Epoch: 0.1346677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:57:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6931, "loss": 0.12341783195734024, "memory_gb": 7.721559524536133, "step_time_ms": 3371.997833251953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:57:59] (step=0006931) Train Loss: 0.1724, Train Steps/Sec: 0.28, Epoch: 0.13468713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6932, "loss": 0.28752684593200684, "memory_gb": 7.721559524536133, "step_time_ms": 3368.784189224243, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:03] (step=0006932) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.13470656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6933, "loss": 0.2042754590511322, "memory_gb": 7.721559524536133, "step_time_ms": 3346.578359603882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:06] (step=0006933) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.13472600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6934, "loss": 0.2799041271209717, "memory_gb": 7.721559524536133, "step_time_ms": 3368.375301361084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:10] (step=0006934) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.13474543334628838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 6935, "loss": 0.1887267678976059, "memory_gb": 7.721559524536133, "step_time_ms": 3358.257532119751, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:13] (step=0006935) Train Loss: 0.2084, Train Steps/Sec: 0.28, Epoch: 0.134764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6936, "loss": 0.18091422319412231, "memory_gb": 7.721559524536133, "step_time_ms": 3364.694833755493, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:17] (step=0006936) Train Loss: 0.1908, Train Steps/Sec: 0.28, Epoch: 0.13478429848425963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6937, "loss": 0.2891041040420532, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4191493988037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:21] (step=0006937) Train Loss: 0.2619, Train Steps/Sec: 0.28, Epoch: 0.13480373105324525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 6938, "loss": 0.3280520439147949, "memory_gb": 7.721559524536133, "step_time_ms": 3359.356641769409, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:24] (step=0006938) Train Loss: 0.2754, Train Steps/Sec: 0.28, Epoch: 0.13482316362223085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6939, "loss": 0.187255859375, "memory_gb": 7.721559524536133, "step_time_ms": 3366.670608520508, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:28] (step=0006939) Train Loss: 0.1976, Train Steps/Sec: 0.28, Epoch: 0.13484259619121647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6940, "loss": 0.24673834443092346, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4062328338623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:32] (step=0006940) Train Loss: 0.2764, Train Steps/Sec: 0.27, Epoch: 0.1348620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6941, "loss": 0.286035418510437, "memory_gb": 7.721559524536133, "step_time_ms": 3514.6968364715576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:35] (step=0006941) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.1348814613291877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6942, "loss": 0.23025384545326233, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5480918884277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:39] (step=0006942) Train Loss: 0.2138, Train Steps/Sec: 0.28, Epoch: 0.13490089389817334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6943, "loss": 0.2969133257865906, "memory_gb": 7.715639114379883, "step_time_ms": 3329.000949859619, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:42] (step=0006943) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.13492032646715896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6944, "loss": 0.15034794807434082, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7701740264893, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:46] (step=0006944) Train Loss: 0.1788, Train Steps/Sec: 0.28, Epoch: 0.13493975903614458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6945, "loss": 0.29726019501686096, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3689575195312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:50] (step=0006945) Train Loss: 0.2877, Train Steps/Sec: 0.28, Epoch: 0.1349591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6946, "loss": 0.1567447930574417, "memory_gb": 7.721559524536133, "step_time_ms": 3366.943597793579, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:53] (step=0006946) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.13497862417411582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:58:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6947, "loss": 0.2343282401561737, "memory_gb": 7.721559524536133, "step_time_ms": 3365.525722503662, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:58:57] (step=0006947) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.13499805674310145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6948, "loss": 0.18738195300102234, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9416179656982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:00] (step=0006948) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.13501748931208707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6949, "loss": 0.1535172462463379, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4685020446777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:04] (step=0006949) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.1350369218810727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 6950, "loss": 0.24073994159698486, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4774684906006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:08] (step=0006950) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.13505635445005829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 6951, "loss": 0.16871780157089233, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6675815582275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:11] (step=0006951) Train Loss: 0.2685, Train Steps/Sec: 0.28, Epoch: 0.1350757870190439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 6952, "loss": 0.30112379789352417, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7121658325195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:15] (step=0006952) Train Loss: 0.2713, Train Steps/Sec: 0.28, Epoch: 0.13509521958802953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 6953, "loss": 0.2013959288597107, "memory_gb": 7.721559524536133, "step_time_ms": 3363.884925842285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:19] (step=0006953) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.13511465215701515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 6954, "loss": 0.2720814347267151, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5928840637207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:22] (step=0006954) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.13513408472600077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 6955, "loss": 0.3419961631298065, "memory_gb": 7.721559524536133, "step_time_ms": 3363.936185836792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:26] (step=0006955) Train Loss: 0.3050, Train Steps/Sec: 0.28, Epoch: 0.1351535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 6956, "loss": 0.19692328572273254, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0189685821533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:29] (step=0006956) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.13517294986397202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 6957, "loss": 0.20255479216575623, "memory_gb": 7.721559524536133, "step_time_ms": 3357.13267326355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:33] (step=0006957) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.13519238243295764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 6958, "loss": 0.21565555036067963, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3132705688477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:37] (step=0006958) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.13521181500194326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 6959, "loss": 0.3082951307296753, "memory_gb": 7.721559524536133, "step_time_ms": 3362.405300140381, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:40] (step=0006959) Train Loss: 0.2779, Train Steps/Sec: 0.28, Epoch: 0.1352312475709289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 6960, "loss": 0.26947224140167236, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5177612304688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:44] (step=0006960) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.1352506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 6961, "loss": 0.26562851667404175, "memory_gb": 7.721559524536133, "step_time_ms": 3358.351945877075, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:47] (step=0006961) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.1352701127089001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 6962, "loss": 0.20468059182167053, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7708473205566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:51] (step=0006962) Train Loss: 0.1719, Train Steps/Sec: 0.28, Epoch: 0.13528954527788573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 6963, "loss": 0.25947296619415283, "memory_gb": 7.721559524536133, "step_time_ms": 3361.659288406372, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:55] (step=0006963) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.13530897784687135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 06:59:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 6964, "loss": 0.14655020833015442, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0384769439697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 06:59:58] (step=0006964) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.13532841041585697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 6965, "loss": 0.2598155736923218, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7185096740723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:02] (step=0006965) Train Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.1353478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 6966, "loss": 0.2708771526813507, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7828102111816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:05] (step=0006966) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.13536727555382821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 6967, "loss": 0.20802681148052216, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8878688812256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:09] (step=0006967) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.13538670812281384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 6968, "loss": 0.3172699213027954, "memory_gb": 7.721559524536133, "step_time_ms": 3363.129138946533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:12] (step=0006968) Train Loss: 0.3310, Train Steps/Sec: 0.28, Epoch: 0.13540614069179946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 6969, "loss": 0.30515074729919434, "memory_gb": 7.721559524536133, "step_time_ms": 3359.821319580078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:16] (step=0006969) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.13542557326078508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 6970, "loss": 0.260759174823761, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0994606018066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:20] (step=0006970) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.1354450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 6971, "loss": 0.24883529543876648, "memory_gb": 7.721559524536133, "step_time_ms": 3360.642671585083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:23] (step=0006971) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.13546443839875633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 6972, "loss": 0.3110840618610382, "memory_gb": 7.721559524536133, "step_time_ms": 3362.238883972168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:27] (step=0006972) Train Loss: 0.3267, Train Steps/Sec: 0.28, Epoch: 0.13548387096774195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 6973, "loss": 0.0876135379076004, "memory_gb": 7.721559524536133, "step_time_ms": 3357.243537902832, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:31] (step=0006973) Train Loss: 0.1400, Train Steps/Sec: 0.28, Epoch: 0.13550330353672754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 6974, "loss": 0.16845543682575226, "memory_gb": 7.721559524536133, "step_time_ms": 3363.994359970093, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:34] (step=0006974) Train Loss: 0.1496, Train Steps/Sec: 0.28, Epoch: 0.13552273610571317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 6975, "loss": 0.21393054723739624, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1299266815186, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:38] (step=0006975) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.1355421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 6976, "loss": 0.29833927750587463, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2285385131836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:41] (step=0006976) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.1355616012436844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 6977, "loss": 0.20988795161247253, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2897605895996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:45] (step=0006977) Train Loss: 0.1910, Train Steps/Sec: 0.28, Epoch: 0.13558103381267003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 6978, "loss": 0.17882195115089417, "memory_gb": 7.721559524536133, "step_time_ms": 3368.795156478882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:49] (step=0006978) Train Loss: 0.2634, Train Steps/Sec: 0.28, Epoch: 0.13560046638165565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 6979, "loss": 0.27287963032722473, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2922897338867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:52] (step=0006979) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.13561989895064128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 6980, "loss": 0.2933845520019531, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7031059265137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:56] (step=0006980) Train Loss: 0.3046, Train Steps/Sec: 0.28, Epoch: 0.1356393315196269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:00:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 6981, "loss": 0.1778930127620697, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1836433410645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:00:59] (step=0006981) Train Loss: 0.1676, Train Steps/Sec: 0.28, Epoch: 0.13565876408861252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 6982, "loss": 0.2605869174003601, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9933643341064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:03] (step=0006982) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.13567819665759814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 6983, "loss": 0.3015730381011963, "memory_gb": 7.721559524536133, "step_time_ms": 3366.628408432007, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:06] (step=0006983) Train Loss: 0.2767, Train Steps/Sec: 0.28, Epoch: 0.13569762922658377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 6984, "loss": 0.1933894157409668, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9463443756104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:10] (step=0006984) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.1357170617955694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 6985, "loss": 0.2640954554080963, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5054454803467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:14] (step=0006985) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.13573649436455498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 6986, "loss": 0.16593921184539795, "memory_gb": 7.721559524536133, "step_time_ms": 3363.050699234009, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:17] (step=0006986) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.1357559269335406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 6987, "loss": 0.24325615167617798, "memory_gb": 7.721559524536133, "step_time_ms": 3359.375476837158, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:21] (step=0006987) Train Loss: 0.1961, Train Steps/Sec: 0.28, Epoch: 0.13577535950252623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 6988, "loss": 0.1892213225364685, "memory_gb": 7.721559524536133, "step_time_ms": 3510.594606399536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:25] (step=0006988) Train Loss: 0.2340, Train Steps/Sec: 0.27, Epoch: 0.13579479207151185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 6989, "loss": 0.4057998061180115, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1227741241455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:28] (step=0006989) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.13581422464049747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 6990, "loss": 0.1791684627532959, "memory_gb": 7.721559524536133, "step_time_ms": 3346.135139465332, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:32] (step=0006990) Train Loss: 0.1617, Train Steps/Sec: 0.28, Epoch: 0.1358336572094831, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 6991, "loss": 0.3487841486930847, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5474910736084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:35] (step=0006991) Train Loss: 0.2695, Train Steps/Sec: 0.28, Epoch: 0.13585308977846872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 6992, "loss": 0.22591623663902283, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4824962615967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:39] (step=0006992) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.13587252234745434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 6993, "loss": 0.33157870173454285, "memory_gb": 7.721559524536133, "step_time_ms": 3362.44535446167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:42] (step=0006993) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.13589195491643996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 6994, "loss": 0.31213438510894775, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7494316101074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:46] (step=0006994) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.13591138748542558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 6995, "loss": 0.2516275644302368, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5622005462646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:50] (step=0006995) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.1359308200544112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 6996, "loss": 0.24414782226085663, "memory_gb": 7.721559524536133, "step_time_ms": 3370.793104171753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:53] (step=0006996) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.1359502526233968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:01:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 6997, "loss": 0.17912346124649048, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2170639038086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:01:57] (step=0006997) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.13596968519238242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 6998, "loss": 0.22781498730182648, "memory_gb": 7.721559524536133, "step_time_ms": 3360.764503479004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:00] (step=0006998) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.13598911776136804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 6999, "loss": 0.2188052535057068, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8289184570312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:04] (step=0006999) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.13600855033035367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7000, "loss": 0.18830767273902893, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0220165252686, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:08] (step=0007000) Train Loss: 0.1594, Train Steps/Sec: 0.28, Epoch: 0.1360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7001, "loss": 0.19143053889274597, "memory_gb": 7.721559524536133, "step_time_ms": 3364.081621170044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:11] (step=0007001) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.1360474154683249, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7002, "loss": 0.2963900864124298, "memory_gb": 7.721559524536133, "step_time_ms": 3365.882396697998, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:15] (step=0007002) Train Loss: 0.2853, Train Steps/Sec: 0.28, Epoch: 0.13606684803731053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7003, "loss": 0.13829174637794495, "memory_gb": 7.721559524536133, "step_time_ms": 3360.443353652954, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:18] (step=0007003) Train Loss: 0.1471, Train Steps/Sec: 0.28, Epoch: 0.13608628060629616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7004, "loss": 0.29614219069480896, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6634559631348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:22] (step=0007004) Train Loss: 0.2295, Train Steps/Sec: 0.28, Epoch: 0.13610571317528178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7005, "loss": 0.30024027824401855, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9563064575195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:26] (step=0007005) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.1361251457442674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7006, "loss": 0.26149171590805054, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0660724639893, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:29] (step=0007006) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 0.13614457831325302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7007, "loss": 0.21926872432231903, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9977703094482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:33] (step=0007007) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.13616401088223865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7008, "loss": 0.2831893563270569, "memory_gb": 7.715639114379883, "step_time_ms": 3336.1313343048096, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:37] (step=0007008) Train Loss: 0.3234, Train Steps/Sec: 0.28, Epoch: 0.13618344345122424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7009, "loss": 0.2320927530527115, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9660091400146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:40] (step=0007009) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.13620287602020986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:02:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7010, "loss": 0.3026600480079651, "memory_gb": 7.721559524536133, "step_time_ms": 3365.983724594116, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:02:44] (step=0007010) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.13622230858919548,
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:02:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7011, "loss": 0.2619357109069824, "memory_gb": 7.721559524536133, "step_time_ms": 3366.441488265991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:02:47] (step=0007011) Train Loss: 0.1896, Train Steps/Sec: 0.28, Epoch: 0.1362417411581811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:02:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7012, "loss": 0.2100001871585846, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0990467071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:02:51] (step=0007012) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.13626117372716673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:02:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7013, "loss": 0.21031621098518372, "memory_gb": 7.721559524536133, "step_time_ms": 3368.520975112915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:02:55] (step=0007013) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.13628060629615235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:02:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7014, "loss": 0.23254331946372986, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5784130096436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:02:58] (step=0007014) Train Loss: 0.2034, Train Steps/Sec: 0.28, Epoch: 0.13630003886513797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7015, "loss": 0.24279235303401947, "memory_gb": 7.721559524536133, "step_time_ms": 3365.706443786621, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:02] (step=0007015) Train Loss: 0.1997, Train Steps/Sec: 0.28, Epoch: 0.1363194714341236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7016, "loss": 0.23710894584655762, "memory_gb": 7.721559524536133, 
"step_time_ms": 3364.3453121185303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:05] (step=0007016) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.13633890400310922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7017, "loss": 0.28887486457824707, "memory_gb": 7.721559524536133, "step_time_ms": 3363.136053085327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:09] (step=0007017) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.13635833657209484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7018, "loss": 0.34408414363861084, "memory_gb": 7.721559524536133, "step_time_ms": 3367.546558380127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:13] (step=0007018) Train Loss: 0.2755, Train Steps/Sec: 0.28, Epoch: 0.13637776914108046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7019, "loss": 0.3708754777908325, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2123889923096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:16] (step=0007019) Train Loss: 0.2846, Train Steps/Sec: 0.28, Epoch: 0.13639720171006606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7020, "loss": 0.31603527069091797, "memory_gb": 7.721559524536133, "step_time_ms": 3366.924285888672, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:20] (step=0007020) Train Loss: 0.2791, Train Steps/Sec: 0.28, Epoch: 0.13641663427905168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7021, "loss": 0.1508202999830246, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7513999938965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:23] (step=0007021) Train Loss: 0.1880, Train Steps/Sec: 0.28, Epoch: 
0.1364360668480373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7022, "loss": 0.15633077919483185, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3131256103516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:27] (step=0007022) Train Loss: 0.1630, Train Steps/Sec: 0.28, Epoch: 0.13645549941702292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7023, "loss": 0.1617797166109085, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1493740081787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:31] (step=0007023) Train Loss: 0.1956, Train Steps/Sec: 0.28, Epoch: 0.13647493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7024, "loss": 0.26903122663497925, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8220977783203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:34] (step=0007024) Train Loss: 0.2799, Train Steps/Sec: 0.28, Epoch: 0.13649436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7025, "loss": 0.24592693150043488, "memory_gb": 7.721559524536133, "step_time_ms": 3365.738868713379, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:38] (step=0007025) Train Loss: 0.2809, Train Steps/Sec: 0.28, Epoch: 0.1365137971239798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7026, "loss": 0.19246241450309753, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6210384368896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:42] (step=0007026) Train Loss: 0.2276, Train Steps/Sec: 0.28, Epoch: 0.1365332296929654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7027, "loss": 0.29139792919158936, 
"memory_gb": 7.721559524536133, "step_time_ms": 3370.321750640869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:45] (step=0007027) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.13655266226195104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7028, "loss": 0.2815975844860077, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7967529296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:49] (step=0007028) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.13657209483093666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7029, "loss": 0.23326121270656586, "memory_gb": 7.721559524536133, "step_time_ms": 3526.0634422302246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:52] (step=0007029) Train Loss: 0.2579, Train Steps/Sec: 0.27, Epoch: 0.13659152739992228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:03:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7030, "loss": 0.23060724139213562, "memory_gb": 7.721559524536133, "step_time_ms": 3370.110511779785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:03:56] (step=0007030) Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.1366109599689079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7031, "loss": 0.19832777976989746, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2098903656006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:00] (step=0007031) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.1366303925378935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7032, "loss": 0.19698718190193176, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5030002593994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:03] (step=0007032) Train Loss: 0.2634, 
Train Steps/Sec: 0.28, Epoch: 0.13664982510687912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7033, "loss": 0.29792487621307373, "memory_gb": 7.715639114379883, "step_time_ms": 3337.237596511841, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:07] (step=0007033) Train Loss: 0.2770, Train Steps/Sec: 0.28, Epoch: 0.13666925767586474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7034, "loss": 0.18654218316078186, "memory_gb": 7.721559524536133, "step_time_ms": 3366.82391166687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:11] (step=0007034) Train Loss: 0.1783, Train Steps/Sec: 0.28, Epoch: 0.13668869024485036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7035, "loss": 0.19450610876083374, "memory_gb": 7.721559524536133, "step_time_ms": 3348.155975341797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:14] (step=0007035) Train Loss: 0.2054, Train Steps/Sec: 0.28, Epoch: 0.13670812281383599, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7036, "loss": 0.15970483422279358, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8406734466553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:18] (step=0007036) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.1367275553828216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7037, "loss": 0.3075239062309265, "memory_gb": 7.721559524536133, "step_time_ms": 3369.28391456604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:21] (step=0007037) Train Loss: 0.2895, Train Steps/Sec: 0.28, Epoch: 0.13674698795180723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7038, "loss": 
0.19507741928100586, "memory_gb": 7.721559524536133, "step_time_ms": 3369.54927444458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:25] (step=0007038) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.13676642052079285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7039, "loss": 0.2571391761302948, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4447536468506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:29] (step=0007039) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.13678585308977848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7040, "loss": 0.24780824780464172, "memory_gb": 7.721559524536133, "step_time_ms": 3361.971139907837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:32] (step=0007040) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.1368052856587641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7041, "loss": 0.2837887406349182, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5148277282715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:36] (step=0007041) Train Loss: 0.2854, Train Steps/Sec: 0.28, Epoch: 0.13682471822774972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7042, "loss": 0.2840343117713928, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1178817749023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:39] (step=0007042) Train Loss: 0.2920, Train Steps/Sec: 0.28, Epoch: 0.13684415079673534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7043, "loss": 0.15673133730888367, "memory_gb": 7.721559524536133, "step_time_ms": 3365.528106689453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:43] (step=0007043) 
Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.13686358336572094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7044, "loss": 0.200076162815094, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6627082824707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:46] (step=0007044) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.13688301593470656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7045, "loss": 0.18919511139392853, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0990467071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:50] (step=0007045) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.13690244850369218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7046, "loss": 0.2763049602508545, "memory_gb": 7.721559524536133, "step_time_ms": 3363.663673400879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:54] (step=0007046) Train Loss: 0.2263, Train Steps/Sec: 0.28, Epoch: 0.1369218810726778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7047, "loss": 0.24411271512508392, "memory_gb": 7.721559524536133, "step_time_ms": 3366.553544998169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:04:57] (step=0007047) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.13694131364166343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7048, "loss": 0.17330560088157654, "memory_gb": 7.721559524536133, "step_time_ms": 3347.882032394409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:01] (step=0007048) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.13696074621064905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:04] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 7049, "loss": 0.32333487272262573, "memory_gb": 7.721559524536133, "step_time_ms": 3354.51602935791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:04] (step=0007049) Train Loss: 0.3154, Train Steps/Sec: 0.28, Epoch: 0.13698017877963467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7050, "loss": 0.25892016291618347, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3088760375977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:08] (step=0007050) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.1369996113486203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7051, "loss": 0.20219656825065613, "memory_gb": 7.721559524536133, "step_time_ms": 3371.9708919525146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:12] (step=0007051) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.13701904391760591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7052, "loss": 0.15679669380187988, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2546195983887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:15] (step=0007052) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.13703847648659154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7053, "loss": 0.3481687605381012, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3575649261475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:19] (step=0007053) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.13705790905557716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7054, "loss": 0.2672857642173767, "memory_gb": 7.721559524536133, "step_time_ms": 3368.818759918213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
07:05:22] (step=0007054) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.13707734162456275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7055, "loss": 0.22666509449481964, "memory_gb": 7.721559524536133, "step_time_ms": 3363.734245300293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:26] (step=0007055) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.13709677419354838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7056, "loss": 0.20191657543182373, "memory_gb": 7.721559524536133, "step_time_ms": 3357.896089553833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:30] (step=0007056) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.137116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7057, "loss": 0.16454485058784485, "memory_gb": 7.721559524536133, "step_time_ms": 3366.741418838501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:33] (step=0007057) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.13713563933151962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7058, "loss": 0.27546969056129456, "memory_gb": 7.715639114379883, "step_time_ms": 3332.0963382720947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:37] (step=0007058) Train Loss: 0.2174, Train Steps/Sec: 0.28, Epoch: 0.13715507190050524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7059, "loss": 0.25121045112609863, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8359050750732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:40] (step=0007059) Train Loss: 0.2707, Train Steps/Sec: 0.28, Epoch: 0.13717450446949087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:44] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 7060, "loss": 0.2926098704338074, "memory_gb": 7.721559524536133, "step_time_ms": 3363.631248474121, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:44] (step=0007060) Train Loss: 0.2902, Train Steps/Sec: 0.28, Epoch: 0.1371939370384765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7061, "loss": 0.10969855636358261, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0522956848145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:48] (step=0007061) Train Loss: 0.1790, Train Steps/Sec: 0.28, Epoch: 0.1372133696074621, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7062, "loss": 0.26582154631614685, "memory_gb": 7.721559524536133, "step_time_ms": 3361.518383026123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:51] (step=0007062) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.13723280217644773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7063, "loss": 0.313496470451355, "memory_gb": 7.721559524536133, "step_time_ms": 3365.788221359253, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:55] (step=0007063) Train Loss: 0.2833, Train Steps/Sec: 0.28, Epoch: 0.13725223474543335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7064, "loss": 0.3502563536167145, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9797019958496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:05:58] (step=0007064) Train Loss: 0.2885, Train Steps/Sec: 0.28, Epoch: 0.13727166731441898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7065, "loss": 0.1533580720424652, "memory_gb": 7.721559524536133, "step_time_ms": 3363.621234893799, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 07:06:02] (step=0007065) Train Loss: 0.1712, Train Steps/Sec: 0.28, Epoch: 0.1372910998834046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7066, "loss": 0.2802489995956421, "memory_gb": 7.721559524536133, "step_time_ms": 3360.321521759033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:06] (step=0007066) Train Loss: 0.2381, Train Steps/Sec: 0.28, Epoch: 0.1373105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7067, "loss": 0.19013211131095886, "memory_gb": 7.721559524536133, "step_time_ms": 3363.279342651367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:09] (step=0007067) Train Loss: 0.2152, Train Steps/Sec: 0.28, Epoch: 0.13732996502137582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7068, "loss": 0.18446847796440125, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3477478027344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:13] (step=0007068) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.13734939759036144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7069, "loss": 0.2413029968738556, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8407974243164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:16] (step=0007069) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.13736883015934706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7070, "loss": 0.3136448860168457, "memory_gb": 7.721559524536133, "step_time_ms": 3507.554292678833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:20] (step=0007070) Train Loss: 0.2581, Train Steps/Sec: 0.27, Epoch: 0.13738826272833268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 07:06:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7071, "loss": 0.3326683044433594, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4067096710205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:24] (step=0007071) Train Loss: 0.3166, Train Steps/Sec: 0.28, Epoch: 0.1374076952973183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7072, "loss": 0.30585455894470215, "memory_gb": 7.721559524536133, "step_time_ms": 3365.034341812134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:27] (step=0007072) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.13742712786630393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7073, "loss": 0.2003973424434662, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9039363861084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:31] (step=0007073) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.13744656043528955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7074, "loss": 0.3672882318496704, "memory_gb": 7.721559524536133, "step_time_ms": 3354.504108428955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:35] (step=0007074) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.13746599300427517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7075, "loss": 0.2550375163555145, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8998832702637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:38] (step=0007075) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.1374854255732608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7076, "loss": 0.3336767554283142, "memory_gb": 7.721559524536133, "step_time_ms": 3346.7535972595215, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:42] (step=0007076) Train Loss: 0.2477, Train Steps/Sec: 0.28, Epoch: 0.13750485814224642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7077, "loss": 0.2954425811767578, "memory_gb": 7.721559524536133, "step_time_ms": 3361.287832260132, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:45] (step=0007077) Train Loss: 0.2764, Train Steps/Sec: 0.28, Epoch: 0.137524290711232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7078, "loss": 0.24978943169116974, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1394939422607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:49] (step=0007078) Train Loss: 0.2790, Train Steps/Sec: 0.28, Epoch: 0.13754372328021763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7079, "loss": 0.30601364374160767, "memory_gb": 7.721559524536133, "step_time_ms": 3356.736660003662, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:52] (step=0007079) Train Loss: 0.3180, Train Steps/Sec: 0.28, Epoch: 0.13756315584920326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:06:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7080, "loss": 0.3605204224586487, "memory_gb": 7.721559524536133, "step_time_ms": 3343.6810970306396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:06:56] (step=0007080) Train Loss: 0.3341, Train Steps/Sec: 0.28, Epoch: 0.13758258841818888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7081, "loss": 0.32758796215057373, "memory_gb": 7.721559524536133, "step_time_ms": 3359.752893447876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:00] (step=0007081) Train Loss: 0.2951, Train Steps/Sec: 0.28, Epoch: 0.1376020209871745, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:07:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7082, "loss": 0.27727606892585754, "memory_gb": 7.721559524536133, "step_time_ms": 3354.778289794922, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:03] (step=0007082) Train Loss: 0.3318, Train Steps/Sec: 0.28, Epoch: 0.13762145355616012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7083, "loss": 0.23115819692611694, "memory_gb": 7.721559524536133, "step_time_ms": 3355.055093765259, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:07] (step=0007083) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.13764088612514574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7084, "loss": 0.16143906116485596, "memory_gb": 7.721559524536133, "step_time_ms": 3353.23166847229, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:10] (step=0007084) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.13766031869413137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7085, "loss": 0.3529645800590515, "memory_gb": 7.721559524536133, "step_time_ms": 3345.881700515747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:14] (step=0007085) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.137679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7086, "loss": 0.35052669048309326, "memory_gb": 7.721559524536133, "step_time_ms": 3357.248067855835, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:18] (step=0007086) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.1376991838321026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7087, "loss": 0.23540224134922028, "memory_gb": 7.721559524536133, "step_time_ms": 
3364.5360469818115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:21] (step=0007087) Train Loss: 0.1812, Train Steps/Sec: 0.28, Epoch: 0.13771861640108823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7088, "loss": 0.32848379015922546, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8790378570557, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:25] (step=0007088) Train Loss: 0.3344, Train Steps/Sec: 0.28, Epoch: 0.13773804897007386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7089, "loss": 0.2474355399608612, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0548763275146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:28] (step=0007089) Train Loss: 0.2414, Train Steps/Sec: 0.28, Epoch: 0.13775748153905945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7090, "loss": 0.2597470283508301, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9485416412354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:32] (step=0007090) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.13777691410804507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7091, "loss": 0.35554492473602295, "memory_gb": 7.721559524536133, "step_time_ms": 3360.08358001709, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:36] (step=0007091) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.1377963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7092, "loss": 0.2789919376373291, "memory_gb": 7.721559524536133, "step_time_ms": 3363.574743270874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:39] (step=0007092) Train Loss: 0.3016, Train Steps/Sec: 0.28, Epoch: 0.13781577924601632, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7093, "loss": 0.23977583646774292, "memory_gb": 7.721559524536133, "step_time_ms": 3361.509084701538, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:43] (step=0007093) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.13783521181500194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7094, "loss": 0.25492602586746216, "memory_gb": 7.721559524536133, "step_time_ms": 3362.56742477417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:46] (step=0007094) Train Loss: 0.3018, Train Steps/Sec: 0.28, Epoch: 0.13785464438398756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7095, "loss": 0.26633626222610474, "memory_gb": 7.721559524536133, "step_time_ms": 3357.168197631836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:50] (step=0007095) Train Loss: 0.2931, Train Steps/Sec: 0.28, Epoch: 0.13787407695297318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7096, "loss": 0.24449853599071503, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2286319732666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:53] (step=0007096) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.1378935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:07:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7097, "loss": 0.27701884508132935, "memory_gb": 7.721559524536133, "step_time_ms": 3362.884759902954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:07:57] (step=0007097) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.13791294209094443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7098, "loss": 0.22585180401802063, "memory_gb": 7.721559524536133, 
"step_time_ms": 3362.936496734619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:01] (step=0007098) Train Loss: 0.1769, Train Steps/Sec: 0.28, Epoch: 0.13793237465993005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7099, "loss": 0.29138725996017456, "memory_gb": 7.721559524536133, "step_time_ms": 3363.588809967041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:04] (step=0007099) Train Loss: 0.2681, Train Steps/Sec: 0.28, Epoch: 0.13795180722891567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7100, "loss": 0.24296776950359344, "memory_gb": 7.721559524536133, "step_time_ms": 3346.1718559265137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:08] (step=0007100) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.1379712397979013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7101, "loss": 0.25785183906555176, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3080253601074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:11] (step=0007101) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.1379906723668869, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7102, "loss": 0.22142037749290466, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4773025512695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:15] (step=0007102) Train Loss: 0.2620, Train Steps/Sec: 0.28, Epoch: 0.1380101049358725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7103, "loss": 0.2728702425956726, "memory_gb": 7.715639114379883, "step_time_ms": 3325.770616531372, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:19] (step=0007103) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 
0.13802953750485814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7104, "loss": 0.2597420811653137, "memory_gb": 7.721559524536133, "step_time_ms": 3361.217260360718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:22] (step=0007104) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.13804897007384376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7105, "loss": 0.32788872718811035, "memory_gb": 7.715639114379883, "step_time_ms": 3324.1090774536133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:26] (step=0007105) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.13806840264282938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7106, "loss": 0.24892380833625793, "memory_gb": 7.721559524536133, "step_time_ms": 3360.549211502075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:29] (step=0007106) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.138087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7107, "loss": 0.26024124026298523, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2140159606934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:33] (step=0007107) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 0.13810726778080062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7108, "loss": 0.16231496632099152, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8007221221924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:37] (step=0007108) Train Loss: 0.1758, Train Steps/Sec: 0.28, Epoch: 0.13812670034978625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7109, "loss": 0.2818007469177246, 
"memory_gb": 7.721559524536133, "step_time_ms": 3363.332986831665, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:40] (step=0007109) Train Loss: 0.3088, Train Steps/Sec: 0.28, Epoch: 0.13814613291877187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7110, "loss": 0.22166527807712555, "memory_gb": 7.721559524536133, "step_time_ms": 3358.581066131592, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:44] (step=0007110) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.1381655654877575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7111, "loss": 0.2952605187892914, "memory_gb": 7.721559524536133, "step_time_ms": 3364.112138748169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:47] (step=0007111) Train Loss: 0.2856, Train Steps/Sec: 0.28, Epoch: 0.1381849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7112, "loss": 0.15733104944229126, "memory_gb": 7.721559524536133, "step_time_ms": 3357.252359390259, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:51] (step=0007112) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.1382044306257287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7113, "loss": 0.27111250162124634, "memory_gb": 7.721559524536133, "step_time_ms": 3345.921516418457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:54] (step=0007113) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.13822386319471433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:08:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7114, "loss": 0.2634693384170532, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0272827148438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:08:58] (step=0007114) Train Loss: 0.2634, Train 
Steps/Sec: 0.28, Epoch: 0.13824329576369995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7115, "loss": 0.23325014114379883, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1177673339844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:02] (step=0007115) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.13826272833268557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7116, "loss": 0.29894131422042847, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6511096954346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:05] (step=0007116) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.1382821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7117, "loss": 0.15029317140579224, "memory_gb": 7.721559524536133, "step_time_ms": 3360.988140106201, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:09] (step=0007117) Train Loss: 0.2238, Train Steps/Sec: 0.27, Epoch: 0.13830159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7118, "loss": 0.22508248686790466, "memory_gb": 7.721559524536133, "step_time_ms": 3515.976667404175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:13] (step=0007118) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.13832102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7119, "loss": 0.35602691769599915, "memory_gb": 7.721559524536133, "step_time_ms": 3362.325668334961, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:16] (step=0007119) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.13834045860862806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7120, "loss": 
0.3441540598869324, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7093048095703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:20] (step=0007120) Train Loss: 0.2942, Train Steps/Sec: 0.28, Epoch: 0.1383598911776137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7121, "loss": 0.21956020593643188, "memory_gb": 7.721559524536133, "step_time_ms": 3364.828586578369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:23] (step=0007121) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.1383793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7122, "loss": 0.1538545936346054, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8506145477295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:27] (step=0007122) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.13839875631558493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7123, "loss": 0.23527811467647552, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0690574645996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:31] (step=0007123) Train Loss: 0.2454, Train Steps/Sec: 0.28, Epoch: 0.13841818888457055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7124, "loss": 0.27166205644607544, "memory_gb": 7.721559524536133, "step_time_ms": 3367.584228515625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:34] (step=0007124) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.13843762145355615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7125, "loss": 0.20707422494888306, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6696548461914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:38] (step=0007125) 
Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.13845705402254177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7126, "loss": 0.23841528594493866, "memory_gb": 7.721559524536133, "step_time_ms": 3362.870693206787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:41] (step=0007126) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.1384764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7127, "loss": 0.15526965260505676, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0710067749023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:45] (step=0007127) Train Loss: 0.2341, Train Steps/Sec: 0.28, Epoch: 0.13849591916051301, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7128, "loss": 0.260786235332489, "memory_gb": 7.721559524536133, "step_time_ms": 3369.342088699341, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:49] (step=0007128) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 0.13851535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7129, "loss": 0.22608166933059692, "memory_gb": 7.721559524536133, "step_time_ms": 3370.443344116211, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:52] (step=0007129) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.13853478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7130, "loss": 0.24568229913711548, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6277141571045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:56] (step=0007130) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.13855421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:09:59] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 7131, "loss": 0.2913461923599243, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4981079101562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:09:59] (step=0007131) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.1385736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7132, "loss": 0.19444069266319275, "memory_gb": 7.721559524536133, "step_time_ms": 3371.38295173645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:03] (step=0007132) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.13859308200544113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7133, "loss": 0.22683371603488922, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5758113861084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:07] (step=0007133) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.13861251457442675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7134, "loss": 0.2299300730228424, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4230060577393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:10] (step=0007134) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.13863194714341237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7135, "loss": 0.24628956615924835, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0103759765625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:14] (step=0007135) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.13865137971239797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7136, "loss": 0.18703946471214294, "memory_gb": 7.721559524536133, "step_time_ms": 3372.7829456329346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
07:10:17] (step=0007136) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.1386708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7137, "loss": 0.20562559366226196, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3648834228516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:21] (step=0007137) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.1386902448503692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7138, "loss": 0.16282245516777039, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5407638549805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:25] (step=0007138) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.13870967741935483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7139, "loss": 0.34199386835098267, "memory_gb": 7.721559524536133, "step_time_ms": 3367.229223251343, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:28] (step=0007139) Train Loss: 0.3106, Train Steps/Sec: 0.28, Epoch: 0.13872910998834045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7140, "loss": 0.28390923142433167, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6446418762207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:32] (step=0007140) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.13874854255732608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7141, "loss": 0.1891949474811554, "memory_gb": 7.721559524536133, "step_time_ms": 3366.560935974121, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:36] (step=0007141) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.1387679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:39] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 7142, "loss": 0.28343862295150757, "memory_gb": 7.721559524536133, "step_time_ms": 3367.723226547241, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:39] (step=0007142) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.13878740769529732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7143, "loss": 0.3056567907333374, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4256801605225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:43] (step=0007143) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.13880684026428294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7144, "loss": 0.1665300726890564, "memory_gb": 7.721559524536133, "step_time_ms": 3367.79522895813, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:46] (step=0007144) Train Loss: 0.1901, Train Steps/Sec: 0.28, Epoch: 0.13882627283326857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7145, "loss": 0.2908221185207367, "memory_gb": 7.721559524536133, "step_time_ms": 3369.612693786621, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:50] (step=0007145) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.1388457054022542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7146, "loss": 0.22527733445167542, "memory_gb": 7.721559524536133, "step_time_ms": 3368.975877761841, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:10:54] (step=0007146) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.1388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:10:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7147, "loss": 0.19805888831615448, "memory_gb": 7.721559524536133, "step_time_ms": 3366.473913192749, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 07:10:57] (step=0007147) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.1388845705402254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7148, "loss": 0.17865273356437683, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6723499298096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:01] (step=0007148) Train Loss: 0.1415, Train Steps/Sec: 0.28, Epoch: 0.13890400310921103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7149, "loss": 0.17188337445259094, "memory_gb": 7.721559524536133, "step_time_ms": 3361.349582672119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:04] (step=0007149) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.13892343567819665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7150, "loss": 0.2361888289451599, "memory_gb": 7.721559524536133, "step_time_ms": 3363.58380317688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:08] (step=0007150) Train Loss: 0.1912, Train Steps/Sec: 0.28, Epoch: 0.13894286824718227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7151, "loss": 0.25158247351646423, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9354190826416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:12] (step=0007151) Train Loss: 0.2829, Train Steps/Sec: 0.28, Epoch: 0.1389623008161679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7152, "loss": 0.23093582689762115, "memory_gb": 7.721559524536133, "step_time_ms": 3362.321615219116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:15] (step=0007152) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.13898173338515352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 07:11:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7153, "loss": 0.3008357286453247, "memory_gb": 7.721559524536133, "step_time_ms": 3363.956928253174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:19] (step=0007153) Train Loss: 0.2884, Train Steps/Sec: 0.28, Epoch: 0.13900116595413914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7154, "loss": 0.2548348307609558, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4255867004395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:22] (step=0007154) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.13902059852312476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7155, "loss": 0.21685239672660828, "memory_gb": 7.715639114379883, "step_time_ms": 3323.2080936431885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:26] (step=0007155) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.13904003109211038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7156, "loss": 0.2653096616268158, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0335540771484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:30] (step=0007156) Train Loss: 0.2152, Train Steps/Sec: 0.28, Epoch: 0.139059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7157, "loss": 0.1493227481842041, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9286708831787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:33] (step=0007157) Train Loss: 0.1470, Train Steps/Sec: 0.27, Epoch: 0.13907889623008163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7158, "loss": 0.1621587574481964, "memory_gb": 7.721559524536133, "step_time_ms": 3363.037109375, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:37] (step=0007158) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.13909832879906725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7159, "loss": 0.1794579029083252, "memory_gb": 7.721559524536133, "step_time_ms": 3506.2577724456787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:41] (step=0007159) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.13911776136805284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7160, "loss": 0.27192074060440063, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8553619384766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:44] (step=0007160) Train Loss: 0.2790, Train Steps/Sec: 0.28, Epoch: 0.13913719393703847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7161, "loss": 0.2531018853187561, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2852096557617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:48] (step=0007161) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.1391566265060241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7162, "loss": 0.19600021839141846, "memory_gb": 7.721559524536133, "step_time_ms": 3368.135690689087, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:51] (step=0007162) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.1391760590750097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:11:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7163, "loss": 0.18250055611133575, "memory_gb": 7.721559524536133, "step_time_ms": 3358.078956604004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:55] (step=0007163) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.13919549164399533, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:11:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7164, "loss": 0.30875569581985474, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3883514404297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:11:59] (step=0007164) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.13921492421298096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7165, "loss": 0.1789451539516449, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7617359161377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:02] (step=0007165) Train Loss: 0.1906, Train Steps/Sec: 0.28, Epoch: 0.13923435678196658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7166, "loss": 0.23480209708213806, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7099056243896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:06] (step=0007166) Train Loss: 0.1946, Train Steps/Sec: 0.28, Epoch: 0.1392537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7167, "loss": 0.2731874883174896, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7288856506348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:09] (step=0007167) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.13927322191993782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7168, "loss": 0.16808673739433289, "memory_gb": 7.721559524536133, "step_time_ms": 3361.186742782593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:13] (step=0007168) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.13929265448892345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7169, "loss": 0.2406749278306961, "memory_gb": 7.721559524536133, "step_time_ms": 
3360.605239868164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:17] (step=0007169) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.13931208705790907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7170, "loss": 0.20624591410160065, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8085384368896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:20] (step=0007170) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.13933151962689466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7171, "loss": 0.20276886224746704, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1042499542236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:24] (step=0007171) Train Loss: 0.2050, Train Steps/Sec: 0.28, Epoch: 0.13935095219588028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7172, "loss": 0.21460719406604767, "memory_gb": 7.721559524536133, "step_time_ms": 3355.708122253418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:27] (step=0007172) Train Loss: 0.2851, Train Steps/Sec: 0.28, Epoch: 0.1393703847648659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7173, "loss": 0.38681185245513916, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3669147491455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:31] (step=0007173) Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.13938981733385153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7174, "loss": 0.25520479679107666, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6107654571533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:35] (step=0007174) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 
0.13940924990283715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7175, "loss": 0.21156808733940125, "memory_gb": 7.721559524536133, "step_time_ms": 3358.181953430176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:38] (step=0007175) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.13942868247182277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7176, "loss": 0.20356741547584534, "memory_gb": 7.721559524536133, "step_time_ms": 3360.691547393799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:42] (step=0007176) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.1394481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7177, "loss": 0.2528197765350342, "memory_gb": 7.721559524536133, "step_time_ms": 3356.309652328491, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:45] (step=0007177) Train Loss: 0.2082, Train Steps/Sec: 0.28, Epoch: 0.13946754760979402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7178, "loss": 0.2099481225013733, "memory_gb": 7.721559524536133, "step_time_ms": 3368.091106414795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:49] (step=0007178) Train Loss: 0.2979, Train Steps/Sec: 0.28, Epoch: 0.13948698017877964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7179, "loss": 0.2426966428756714, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3620738983154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:12:53] (step=0007179) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.13950641274776526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:12:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7180, "loss": 0.2915400564670563, "memory_gb": 
7.721559524536133, "step_time_ms": 3364.4630908966064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:12:56] (step=0007180) Train Loss: 0.2804, Train Steps/Sec: 0.28, Epoch: 0.13952584531675088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7181, "loss": 0.1808953434228897, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3696098327637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:00] (step=0007181) Train Loss: 0.1853, Train Steps/Sec: 0.28, Epoch: 0.1395452778857365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7182, "loss": 0.16596350073814392, "memory_gb": 7.721559524536133, "step_time_ms": 3355.607748031616, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:03] (step=0007182) Train Loss: 0.1938, Train Steps/Sec: 0.28, Epoch: 0.1395647104547221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7183, "loss": 0.27860134840011597, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4931831359863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:07] (step=0007183) Train Loss: 0.2622, Train Steps/Sec: 0.28, Epoch: 0.13958414302370772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7184, "loss": 0.17386718094348907, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4398498535156, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:11] (step=0007184) Train Loss: 0.2247, Train Steps/Sec: 0.28, Epoch: 0.13960357559269335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7185, "loss": 0.25266289710998535, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0973358154297, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:14] (step=0007185) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.13962300816167897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7186, "loss": 0.22487156093120575, "memory_gb": 7.721559524536133, "step_time_ms": 3356.593370437622, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:18] (step=0007186) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.1396424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7187, "loss": 0.38583263754844666, "memory_gb": 7.721559524536133, "step_time_ms": 3363.276243209839, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:21] (step=0007187) Train Loss: 0.3345, Train Steps/Sec: 0.28, Epoch: 0.1396618732996502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7188, "loss": 0.24550729990005493, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6795539855957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:25] (step=0007188) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.13968130586863584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7189, "loss": 0.32168036699295044, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0451641082764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:29] (step=0007189) Train Loss: 0.2795, Train Steps/Sec: 0.28, Epoch: 0.13970073843762146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7190, "loss": 0.33769989013671875, "memory_gb": 7.721559524536133, "step_time_ms": 3365.361452102661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:32] (step=0007190) Train Loss: 0.3203, Train Steps/Sec: 0.28, Epoch: 0.13972017100660708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7191, "loss": 0.2161211371421814, "memory_gb": 7.721559524536133, "step_time_ms": 3365.457773208618, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:36] (step=0007191) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.1397396035755927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7192, "loss": 0.22543448209762573, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7639541625977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:39] (step=0007192) Train Loss: 0.2868, Train Steps/Sec: 0.28, Epoch: 0.13975903614457832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7193, "loss": 0.20730876922607422, "memory_gb": 7.721559524536133, "step_time_ms": 3365.208864212036, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:43] (step=0007193) Train Loss: 0.1694, Train Steps/Sec: 0.28, Epoch: 0.13977846871356395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7194, "loss": 0.22854727506637573, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9600582122803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:46] (step=0007194) Train Loss: 0.2341, Train Steps/Sec: 0.28, Epoch: 0.13979790128254954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7195, "loss": 0.20378616452217102, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9366931915283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:50] (step=0007195) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.13981733385153516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7196, "loss": 0.2879539132118225, "memory_gb": 7.715639114379883, "step_time_ms": 3329.9689292907715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:54] (step=0007196) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.13983676642052079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:13:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7197, "loss": 0.1919286847114563, "memory_gb": 7.721559524536133, "step_time_ms": 3361.595869064331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:13:57] (step=0007197) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.1398561989895064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7198, "loss": 0.19152702391147614, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2658977508545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:01] (step=0007198) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.13987563155849203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7199, "loss": 0.3428998291492462, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3975982666016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:04] (step=0007199) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.13989506412747765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7200, "loss": 0.17379963397979736, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0592098236084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:08] (step=0007200) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.13991449669646328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7201, "loss": 0.2726152539253235, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5577430725098, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:12] (step=0007201) Train Loss: 0.2382, Train Steps/Sec: 0.28, Epoch: 0.1399339292654489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7202, "loss": 0.2126440554857254, "memory_gb": 7.721559524536133, "step_time_ms": 3360.456943511963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:15] (step=0007202) Train Loss: 0.2580, Train Steps/Sec: 0.28, Epoch: 0.13995336183443452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7203, "loss": 0.27738818526268005, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4495010375977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:19] (step=0007203) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.13997279440342014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7204, "loss": 0.20076367259025574, "memory_gb": 7.721559524536133, "step_time_ms": 3362.778902053833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:22] (step=0007204) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.13999222697240576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7205, "loss": 0.17725080251693726, "memory_gb": 7.721559524536133, "step_time_ms": 3366.044759750366, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:26] (step=0007205) Train Loss: 0.2075, Train Steps/Sec: 0.27, Epoch: 0.14001165954139136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7206, "loss": 0.15400834381580353, "memory_gb": 7.721559524536133, "step_time_ms": 3513.105869293213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:30] (step=0007206) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.14003109211037698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7207, "loss": 0.2425224483013153, "memory_gb": 7.721559524536133, "step_time_ms": 3357.548952102661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:33] (step=0007207) Train Loss: 0.2707, Train Steps/Sec: 0.28, Epoch: 0.1400505246793626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7208, "loss": 0.18856662511825562, "memory_gb": 7.721559524536133, "step_time_ms": 3360.738754272461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:37] (step=0007208) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.14006995724834823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7209, "loss": 0.1964227259159088, "memory_gb": 7.721559524536133, "step_time_ms": 3358.217477798462, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:41] (step=0007209) Train Loss: 0.1861, Train Steps/Sec: 0.28, Epoch: 0.14008938981733385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7210, "loss": 0.2689089775085449, "memory_gb": 7.721559524536133, "step_time_ms": 3362.741708755493, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:44] (step=0007210) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.14010882238631947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7211, "loss": 0.23394016921520233, "memory_gb": 7.721559524536133, "step_time_ms": 3366.509199142456, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:48] (step=0007211) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.1401282549553051, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7212, "loss": 0.2170119285583496, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2844429016113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:51] (step=0007212) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.14014768752429071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7213, "loss": 0.2098030298948288, "memory_gb": 7.721559524536133, "step_time_ms": 3366.990327835083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:55] (step=0007213) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.14016712009327634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:14:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7214, "loss": 0.1898956149816513, "memory_gb": 7.721559524536133, "step_time_ms": 3345.4079627990723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:14:59] (step=0007214) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.14018655266226196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7215, "loss": 0.31915634870529175, "memory_gb": 7.721559524536133, "step_time_ms": 3361.196756362915, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:02] (step=0007215) Train Loss: 0.3155, Train Steps/Sec: 0.28, Epoch: 0.14020598523124758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7216, "loss": 0.20239216089248657, "memory_gb": 7.721559524536133, "step_time_ms": 3363.342046737671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:06] (step=0007216) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.1402254178002332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7217, "loss": 0.15294744074344635, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5170249938965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:09] (step=0007217) Train Loss: 0.2123, Train Steps/Sec: 0.28, Epoch: 0.1402448503692188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7218, "loss": 0.3697410225868225, "memory_gb": 7.721559524536133, "step_time_ms": 3367.166042327881, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:13] (step=0007218) Train Loss: 0.3473, Train Steps/Sec: 0.28, Epoch: 0.14026428293820442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7219, "loss": 0.2969460189342499, "memory_gb": 7.721559524536133, "step_time_ms": 3362.520456314087, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:17] (step=0007219) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.14028371550719004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7220, "loss": 0.31166303157806396, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5825901031494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:20] (step=0007220) Train Loss: 0.3193, Train Steps/Sec: 0.28, Epoch: 0.14030314807617567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7221, "loss": 0.19193078577518463, "memory_gb": 7.721559524536133, "step_time_ms": 3365.501880645752, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:24] (step=0007221) Train Loss: 0.2029, Train Steps/Sec: 0.28, Epoch: 0.1403225806451613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7222, "loss": 0.39020204544067383, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1436100006104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:27] (step=0007222) Train Loss: 0.3247, Train Steps/Sec: 0.28, Epoch: 0.1403420132141469, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7223, "loss": 0.22307388484477997, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8523349761963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:31] (step=0007223) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.14036144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7224, "loss": 0.1965947300195694, "memory_gb": 7.721559524536133, "step_time_ms": 3364.077091217041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:35] (step=0007224) Train Loss: 0.1958, Train Steps/Sec: 0.28, Epoch: 0.14038087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7225, "loss": 0.15428847074508667, "memory_gb": 7.721559524536133, "step_time_ms": 3366.267442703247, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:38] (step=0007225) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.14040031092110378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7226, "loss": 0.23617956042289734, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9227924346924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:42] (step=0007226) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.1404197434900894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7227, "loss": 0.22147047519683838, "memory_gb": 7.721559524536133, "step_time_ms": 3366.184949874878, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:46] (step=0007227) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.14043917605907502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7228, "loss": 0.2508077025413513, "memory_gb": 7.715639114379883, "step_time_ms": 3328.925132751465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:49] (step=0007228) Train Loss: 0.2450, Train Steps/Sec: 0.28, Epoch: 0.14045860862806062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7229, "loss": 0.28266826272010803, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3369464874268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:53] (step=0007229) Train Loss: 0.2644, Train Steps/Sec: 0.28, Epoch: 0.14047804119704624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:15:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7230, "loss": 0.18932583928108215, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1579875946045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:15:56] (step=0007230) Train Loss: 0.1845, Train Steps/Sec: 0.28, Epoch: 0.14049747376603186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7231, "loss": 0.3373017907142639, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1047897338867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:00] (step=0007231) Train Loss: 0.2942, Train Steps/Sec: 0.28, Epoch: 0.14051690633501748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7232, "loss": 0.18101584911346436, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3156757354736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:04] (step=0007232) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.1405363389040031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7233, "loss": 0.18929427862167358, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1631088256836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:07] (step=0007233) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.14055577147298873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7234, "loss": 0.16201704740524292, "memory_gb": 7.721559524536133, "step_time_ms": 3367.04683303833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:11] (step=0007234) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.14057520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7235, "loss": 0.29814475774765015, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2773838043213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:14] (step=0007235) Train Loss: 0.2973, Train Steps/Sec: 0.28, Epoch: 0.14059463661095997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7236, "loss": 0.29694825410842896, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0427589416504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:18] (step=0007236) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.1406140691799456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7237, "loss": 0.1398838460445404, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0713176727295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:22] (step=0007237) Train Loss: 0.1607, Train Steps/Sec: 0.28, Epoch: 0.14063350174893122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7238, "loss": 0.1790444701910019, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9539947509766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:25] (step=0007238) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.14065293431791684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7239, "loss": 0.24058671295642853, "memory_gb": 7.721559524536133, "step_time_ms": 3365.762710571289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:29] (step=0007239) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.14067236688690246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7240, "loss": 0.2791787385940552, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3480796813965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:32] (step=0007240) Train Loss: 0.3104, Train Steps/Sec: 0.28, Epoch: 0.14069179945588806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7241, "loss": 0.2795030474662781, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1629219055176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:36] (step=0007241) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.14071123202487368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7242, "loss": 0.2359311282634735, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1581020355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:40] (step=0007242) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.1407306645938593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7243, "loss": 0.29951128363609314, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3459758758545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:43] (step=0007243) Train Loss: 0.3121, Train Steps/Sec: 0.28, Epoch: 0.14075009716284492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7244, "loss": 0.21578359603881836, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2642707824707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:47] (step=0007244) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.14076952973183054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7245, "loss": 0.23540912568569183, "memory_gb": 7.721559524536133, "step_time_ms": 3365.445852279663, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:51] (step=0007245) Train Loss: 0.2568, Train Steps/Sec: 0.27, Epoch: 0.14078896230081617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7246, "loss": 0.2222832888364792, "memory_gb": 7.721559524536133, "step_time_ms": 3363.09814453125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:54] (step=0007246) Train Loss: 0.1955, Train Steps/Sec: 0.28, Epoch: 0.1408083948698018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:16:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7247, "loss": 0.3248552083969116, "memory_gb": 7.721559524536133, "step_time_ms": 3505.6960582733154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:16:58] (step=0007247) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.1408278274387874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7248, "loss": 0.274387389421463, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5686264038086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:01] (step=0007248) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.14084726000777303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7249, "loss": 0.22884756326675415, "memory_gb": 7.721559524536133, "step_time_ms": 3364.015817642212, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:05] (step=0007249) Train Loss: 0.2020, Train Steps/Sec: 0.28, Epoch: 0.14086669257675866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7250, "loss": 0.23892322182655334, "memory_gb": 7.721559524536133, "step_time_ms": 3363.72971534729, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:09] (step=0007250) Train Loss: 0.2759, Train Steps/Sec: 0.28, Epoch: 0.14088612514574428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7251, "loss": 0.20271512866020203, "memory_gb": 7.721559524536133, "step_time_ms": 3358.565330505371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:12] (step=0007251) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.1409055577147299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7252, "loss": 0.25083547830581665, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3759021759033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:16] (step=0007252) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.1409249902837155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7253, "loss": 0.14514565467834473, "memory_gb": 7.721559524536133, "step_time_ms": 3361.811876296997, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:19] (step=0007253) Train Loss: 0.1864, Train Steps/Sec: 0.28, Epoch: 0.14094442285270112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7254, "loss": 0.2794201374053955, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8860969543457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:23] (step=0007254) Train Loss: 0.3080, Train Steps/Sec: 0.28, Epoch: 0.14096385542168674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7255, "loss": 0.19158485531806946, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9777431488037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:27] (step=0007255) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.14098328799067236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7256, "loss": 0.2472509741783142, "memory_gb": 7.721559524536133, "step_time_ms": 3363.347291946411, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:30] (step=0007256) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.14100272055965798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7257, "loss": 0.2511235475540161, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4047508239746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:34] (step=0007257) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.1410221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7258, "loss": 0.2773634195327759, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8440837860107, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:38] (step=0007258) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.14104158569762923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7259, "loss": 0.2927786111831665, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0212802886963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:41] (step=0007259) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.14106101826661485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7260, "loss": 0.21085798740386963, "memory_gb": 7.721559524536133, "step_time_ms": 3361.046552658081, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:45] (step=0007260) Train Loss: 0.1771, Train Steps/Sec: 0.28, Epoch: 0.14108045083560047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7261, "loss": 0.26669567823410034, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5245819091797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:48] (step=0007261) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.1410998834045861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7262, "loss": 0.2545977234840393, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5878868103027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:52] (step=0007262) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.14111931597357172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7263, "loss": 0.2723868489265442, "memory_gb": 7.721559524536133, "step_time_ms": 3355.173349380493, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:56] (step=0007263) Train Loss: 0.2733, Train Steps/Sec: 0.28, Epoch: 0.1411387485425573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:17:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7264, "loss": 0.23943069577217102, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2238216400146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:17:59] (step=0007264) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.14115818111154294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7265, "loss": 0.2448282241821289, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8176708221436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:03] (step=0007265) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.14117761368052856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7266, "loss": 0.274204820394516, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3057861328125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:06] (step=0007266) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.14119704624951418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7267, "loss": 0.2621348202228546, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0585765838623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:10] (step=0007267) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.1412164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7268, "loss": 0.2461746782064438, "memory_gb": 7.721559524536133, "step_time_ms": 3359.78364944458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:14] (step=0007268) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.14123591138748542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7269, "loss": 0.2959805727005005, "memory_gb": 7.721559524536133, "step_time_ms": 3342.0557975769043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:17] (step=0007269) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.14125534395647105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7270, "loss": 0.3196398615837097, "memory_gb": 7.721559524536133, "step_time_ms": 3361.652135848999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:21] (step=0007270) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.14127477652545667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7271, "loss": 0.284071683883667, "memory_gb": 7.721559524536133, "step_time_ms": 3357.943058013916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:24] (step=0007271) Train Loss: 0.2472, Train Steps/Sec: 0.28, Epoch: 0.1412942090944423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7272, "loss": 0.23376145958900452, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3312969207764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:28] (step=0007272) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.1413136416634279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7273, "loss": 0.23946133255958557, "memory_gb": 7.721559524536133, "step_time_ms": 3352.091073989868, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:32] (step=0007273) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.14133307423241354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7274, "loss": 0.27227431535720825, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1271686553955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:35] (step=0007274) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.14135250680139916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7275, "loss": 0.22868448495864868, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8010749816895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:39] (step=0007275) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.14137193937038475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7276, "loss": 0.1619548499584198, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5023155212402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:42] (step=0007276) Train Loss: 0.2049, Train Steps/Sec: 0.28, Epoch: 0.14139137193937037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7277, "loss": 0.3664679527282715, "memory_gb": 7.721559524536133, "step_time_ms": 3358.616590499878, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:46] (step=0007277) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.141410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7278, "loss": 0.16584211587905884, "memory_gb": 7.721559524536133, "step_time_ms": 3355.278968811035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:50] (step=0007278) Train Loss: 0.1680, Train Steps/Sec: 0.28, Epoch: 0.14143023707734162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7279, "loss": 0.22374257445335388, "memory_gb": 7.721559524536133, "step_time_ms": 3361.964225769043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:53] (step=0007279) Train Loss: 0.1929, Train Steps/Sec: 0.28, Epoch: 0.14144966964632724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7280, "loss": 0.21436406672000885, "memory_gb": 7.721559524536133, "step_time_ms": 3362.082004547119, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:18:57] (step=0007280) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.14146910221531286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7281, "loss": 0.20513565838336945, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9790382385254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:00] (step=0007281) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.1414885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7282, "loss": 0.2520551085472107, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3928604125977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:04] (step=0007282) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.1415079673532841, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7283, "loss": 0.20337775349617004, "memory_gb": 7.721559524536133, "step_time_ms": 3359.64035987854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:08] (step=0007283) Train Loss: 0.1787, Train Steps/Sec: 0.28, Epoch: 0.14152739992226973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7284, "loss": 0.1813274323940277, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4345531463623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:11] (step=0007284) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.14154683249125535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7285, "loss": 0.23108619451522827, "memory_gb": 7.721559524536133, "step_time_ms": 3357.163429260254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:15] (step=0007285) Train Loss: 0.2889, Train Steps/Sec: 0.28, Epoch: 0.14156626506024098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7286, "loss": 0.19021785259246826, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4349155426025, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:18] (step=0007286) Train Loss: 0.1869, Train Steps/Sec: 0.27, Epoch: 0.14158569762922657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7287, "loss": 0.20363710820674896, "memory_gb": 7.721559524536133, "step_time_ms": 3359.083652496338, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:22] (step=0007287) Train Loss: 0.1946, Train Steps/Sec: 0.28, Epoch: 0.1416051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7288, "loss": 0.3044995963573456, "memory_gb": 7.721559524536133, "step_time_ms": 3357.236862182617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:26] (step=0007288) Train Loss: 0.3189, Train Steps/Sec: 0.28, Epoch: 0.14162456276719781, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7289, "loss": 0.20885464549064636, "memory_gb": 7.721559524536133, "step_time_ms": 3351.1505126953125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:29] (step=0007289) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.14164399533618344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7290, "loss": 0.24945276975631714, "memory_gb": 7.721559524536133, "step_time_ms": 3358.342170715332, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:33] (step=0007290) Train Loss: 0.2056, Train Steps/Sec: 0.28, Epoch: 0.14166342790516906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7291, "loss": 0.2705419659614563, "memory_gb": 7.721559524536133, "step_time_ms": 3357.550859451294, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:36] (step=0007291) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.14168286047415468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7292, "loss": 0.24550414085388184, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1769981384277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:40] (step=0007292) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.1417022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7293, "loss": 0.3007500171661377, "memory_gb": 7.721559524536133, "step_time_ms": 3356.34708404541, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:44] (step=0007293) Train Loss: 0.3201, Train Steps/Sec: 0.28, Epoch: 0.14172172561212593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7294, "loss": 0.29638803005218506, "memory_gb": 7.721559524536133, "step_time_ms": 3360.670328140259, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:19:47] (step=0007294) Train Loss: 0.2701, Train Steps/Sec: 0.28, Epoch: 0.14174115818111155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:19:51]
EFFICIENCY_METRICS: {"epoch": 0, "step": 7295, "loss": 0.24616029858589172, "memory_gb": 7.721559524536133, "step_time_ms": 3505.9049129486084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:19:51] (step=0007295) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.14176059075009717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:19:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7296, "loss": 0.27472221851348877, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5104427337646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:19:54] (step=0007296) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.1417800233190828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:19:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7297, "loss": 0.2588910460472107, "memory_gb": 7.721559524536133, "step_time_ms": 3359.973907470703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:19:58] (step=0007297) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.14179945588806842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7298, "loss": 0.22679877281188965, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6018085479736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:02] (step=0007298) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.141818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7299, "loss": 0.2775357961654663, "memory_gb": 7.721559524536133, "step_time_ms": 3363.635301589966, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:05] (step=0007299) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.14183832102603963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7300, "loss": 0.25021806359291077, "memory_gb": 7.721559524536133, "step_time_ms": 3360.208749771118, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 07:20:09] (step=0007300) Train Loss: 0.2168, Train Steps/Sec: 0.28, Epoch: 0.14185775359502525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7301, "loss": 0.2180446833372116, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0756187438965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:12] (step=0007301) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.14187718616401088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7302, "loss": 0.32493215799331665, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1696491241455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:16] (step=0007302) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.1418966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7303, "loss": 0.17964142560958862, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8036556243896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:20] (step=0007303) Train Loss: 0.1938, Train Steps/Sec: 0.28, Epoch: 0.14191605130198212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7304, "loss": 0.157198965549469, "memory_gb": 7.721559524536133, "step_time_ms": 3358.383893966675, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:23] (step=0007304) Train Loss: 0.1347, Train Steps/Sec: 0.28, Epoch: 0.14193548387096774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7305, "loss": 0.19780310988426208, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4048957824707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:27] (step=0007305) Train Loss: 0.2330, Train Steps/Sec: 0.28, Epoch: 0.14195491643995337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 07:20:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7306, "loss": 0.24394050240516663, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0292415618896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:30] (step=0007306) Train Loss: 0.3092, Train Steps/Sec: 0.28, Epoch: 0.141974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7307, "loss": 0.28191205859184265, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4909648895264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:34] (step=0007307) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.1419937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7308, "loss": 0.31367453932762146, "memory_gb": 7.721559524536133, "step_time_ms": 3356.273651123047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:37] (step=0007308) Train Loss: 0.2452, Train Steps/Sec: 0.28, Epoch: 0.14201321414691023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7309, "loss": 0.19131174683570862, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9918823242188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:41] (step=0007309) Train Loss: 0.2065, Train Steps/Sec: 0.28, Epoch: 0.14203264671589585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7310, "loss": 0.35397934913635254, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8907718658447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:45] (step=0007310) Train Loss: 0.2884, Train Steps/Sec: 0.28, Epoch: 0.14205207928488145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7311, "loss": 0.2484813630580902, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9088401794434, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:48] (step=0007311) Train Loss: 0.2461, Train Steps/Sec: 0.28, Epoch: 0.14207151185386707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7312, "loss": 0.23979796469211578, "memory_gb": 7.721559524536133, "step_time_ms": 3361.637830734253, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:52] (step=0007312) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.1420909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7313, "loss": 0.22635935246944427, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2379512786865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:55] (step=0007313) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.14211037699183832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:20:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7314, "loss": 0.375306099653244, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6814613342285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:20:59] (step=0007314) Train Loss: 0.3021, Train Steps/Sec: 0.28, Epoch: 0.14212980956082394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7315, "loss": 0.22259044647216797, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2048835754395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:03] (step=0007315) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.14214924212980956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7316, "loss": 0.2263949066400528, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3065223693848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:06] (step=0007316) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.14216867469879518, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:21:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7317, "loss": 0.3206822872161865, "memory_gb": 7.721559524536133, "step_time_ms": 3358.720064163208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:10] (step=0007317) Train Loss: 0.3057, Train Steps/Sec: 0.28, Epoch: 0.1421881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7318, "loss": 0.09400345385074615, "memory_gb": 7.721559524536133, "step_time_ms": 3345.3731536865234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:13] (step=0007318) Train Loss: 0.1729, Train Steps/Sec: 0.28, Epoch: 0.14220753983676643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7319, "loss": 0.34687650203704834, "memory_gb": 7.721559524536133, "step_time_ms": 3360.349178314209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:17] (step=0007319) Train Loss: 0.3004, Train Steps/Sec: 0.28, Epoch: 0.14222697240575205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7320, "loss": 0.26866501569747925, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0670051574707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:21] (step=0007320) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.14224640497473767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7321, "loss": 0.16657763719558716, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6147861480713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:24] (step=0007321) Train Loss: 0.1503, Train Steps/Sec: 0.28, Epoch: 0.14226583754372327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7322, "loss": 0.11864347755908966, "memory_gb": 7.721559524536133, "step_time_ms": 
3365.8342361450195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:28] (step=0007322) Train Loss: 0.1774, Train Steps/Sec: 0.28, Epoch: 0.1422852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7323, "loss": 0.3078291416168213, "memory_gb": 7.721559524536133, "step_time_ms": 3367.539882659912, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:31] (step=0007323) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.1423047026816945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7324, "loss": 0.22034096717834473, "memory_gb": 7.721559524536133, "step_time_ms": 3371.2191581726074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:35] (step=0007324) Train Loss: 0.2500, Train Steps/Sec: 0.28, Epoch: 0.14232413525068013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7325, "loss": 0.2289334237575531, "memory_gb": 7.721559524536133, "step_time_ms": 3361.705780029297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:38] (step=0007325) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.14234356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7326, "loss": 0.30521726608276367, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6413764953613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:42] (step=0007326) Train Loss: 0.3562, Train Steps/Sec: 0.28, Epoch: 0.14236300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7327, "loss": 0.29375216364860535, "memory_gb": 7.721559524536133, "step_time_ms": 3366.506576538086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:46] (step=0007327) Train Loss: 0.2656, Train Steps/Sec: 0.28, Epoch: 0.142382432957637, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7328, "loss": 0.25871482491493225, "memory_gb": 7.715639114379883, "step_time_ms": 3330.1193714141846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:49] (step=0007328) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.14240186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7329, "loss": 0.2204016149044037, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6482906341553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:53] (step=0007329) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.14242129809560825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:21:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7330, "loss": 0.21746689081192017, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2188053131104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:21:56] (step=0007330) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.14244073066459387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7331, "loss": 0.23889777064323425, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3369884490967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:00] (step=0007331) Train Loss: 0.1884, Train Steps/Sec: 0.28, Epoch: 0.1424601632335795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7332, "loss": 0.25815626978874207, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7517642974854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:04] (step=0007332) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.1424795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7333, "loss": 0.2569388151168823, "memory_gb": 7.721559524536133, 
"step_time_ms": 3366.9886589050293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:07] (step=0007333) Train Loss: 0.2643, Train Steps/Sec: 0.27, Epoch: 0.1424990283715507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7334, "loss": 0.26720505952835083, "memory_gb": 7.721559524536133, "step_time_ms": 3360.732078552246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:11] (step=0007334) Train Loss: 0.3131, Train Steps/Sec: 0.28, Epoch: 0.14251846094053633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7335, "loss": 0.20710641145706177, "memory_gb": 7.721559524536133, "step_time_ms": 3507.5905323028564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:15] (step=0007335) Train Loss: 0.2144, Train Steps/Sec: 0.28, Epoch: 0.14253789350952195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7336, "loss": 0.31783902645111084, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5593395233154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:18] (step=0007336) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.14255732607850757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7337, "loss": 0.24162769317626953, "memory_gb": 7.721559524536133, "step_time_ms": 3361.318349838257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:22] (step=0007337) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.1425767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7338, "loss": 0.21732071042060852, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9070472717285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:25] (step=0007338) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 
0.14259619121647882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7339, "loss": 0.20474082231521606, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9162521362305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:29] (step=0007339) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.14261562378546444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7340, "loss": 0.2693455219268799, "memory_gb": 7.721559524536133, "step_time_ms": 3368.277072906494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:33] (step=0007340) Train Loss: 0.3006, Train Steps/Sec: 0.28, Epoch: 0.14263505635445006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7341, "loss": 0.25772106647491455, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5240745544434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:36] (step=0007341) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.14265448892343568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7342, "loss": 0.2517321705818176, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1821823120117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:40] (step=0007342) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.1426739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7343, "loss": 0.19804775714874268, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7963485717773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:43] (step=0007343) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.14269335406140693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7344, "loss": 0.23505353927612305, 
"memory_gb": 7.721559524536133, "step_time_ms": 3371.2191581726074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:47] (step=0007344) Train Loss: 0.2350, Train Steps/Sec: 0.28, Epoch: 0.14271278663039252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7345, "loss": 0.15163598954677582, "memory_gb": 7.721559524536133, "step_time_ms": 3372.3607063293457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:51] (step=0007345) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.14273221919937815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7346, "loss": 0.1200726106762886, "memory_gb": 7.721559524536133, "step_time_ms": 3371.579647064209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:54] (step=0007346) Train Loss: 0.1319, Train Steps/Sec: 0.28, Epoch: 0.14275165176836377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:22:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7347, "loss": 0.2796253561973572, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4587478637695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:22:58] (step=0007347) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.1427710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7348, "loss": 0.22210481762886047, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5379028320312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:01] (step=0007348) Train Loss: 0.2389, Train Steps/Sec: 0.28, Epoch: 0.142790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7349, "loss": 0.199868842959404, "memory_gb": 7.721559524536133, "step_time_ms": 3370.089054107666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:05] (step=0007349) Train Loss: 0.1768, Train 
Steps/Sec: 0.28, Epoch: 0.14280994947532064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7350, "loss": 0.22282110154628754, "memory_gb": 7.721559524536133, "step_time_ms": 3372.0405101776123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:09] (step=0007350) Train Loss: 0.1892, Train Steps/Sec: 0.28, Epoch: 0.14282938204430626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7351, "loss": 0.22977332770824432, "memory_gb": 7.721559524536133, "step_time_ms": 3364.121675491333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:12] (step=0007351) Train Loss: 0.1906, Train Steps/Sec: 0.28, Epoch: 0.14284881461329188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7352, "loss": 0.18420791625976562, "memory_gb": 7.721559524536133, "step_time_ms": 3369.659423828125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:16] (step=0007352) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.1428682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7353, "loss": 0.22240032255649567, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2073097229004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:19] (step=0007353) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.14288767975126312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7354, "loss": 0.2508236765861511, "memory_gb": 7.721559524536133, "step_time_ms": 3372.0712661743164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:23] (step=0007354) Train Loss: 0.2882, Train Steps/Sec: 0.28, Epoch: 0.14290711232024875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7355, "loss": 
0.3387272357940674, "memory_gb": 7.721559524536133, "step_time_ms": 3369.159698486328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:27] (step=0007355) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.14292654488923437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7356, "loss": 0.24573901295661926, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4847145080566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:30] (step=0007356) Train Loss: 0.3081, Train Steps/Sec: 0.28, Epoch: 0.14294597745821996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7357, "loss": 0.1981419324874878, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9702281951904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:34] (step=0007357) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.14296541002720559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7358, "loss": 0.2507319450378418, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1458797454834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:37] (step=0007358) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.1429848425961912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7359, "loss": 0.25451236963272095, "memory_gb": 7.721559524536133, "step_time_ms": 3368.664264678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:41] (step=0007359) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.14300427516517683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7360, "loss": 0.1731039583683014, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7360801696777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:45] (step=0007360) 
Train Loss: 0.2055, Train Steps/Sec: 0.28, Epoch: 0.14302370773416245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7361, "loss": 0.22683657705783844, "memory_gb": 7.721559524536133, "step_time_ms": 3364.041328430176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:48] (step=0007361) Train Loss: 0.1874, Train Steps/Sec: 0.28, Epoch: 0.14304314030314808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7362, "loss": 0.24368909001350403, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6985244750977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:52] (step=0007362) Train Loss: 0.2793, Train Steps/Sec: 0.28, Epoch: 0.1430625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7363, "loss": 0.3162040710449219, "memory_gb": 7.721559524536133, "step_time_ms": 3367.511987686157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:56] (step=0007363) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.14308200544111932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:23:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7364, "loss": 0.2920193076133728, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0175800323486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:23:59] (step=0007364) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.14310143801010494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7365, "loss": 0.30629652738571167, "memory_gb": 7.715639114379883, "step_time_ms": 3334.8770141601562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:03] (step=0007365) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.14312087057909056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:06] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 7366, "loss": 0.23145978152751923, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8720512390137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:06] (step=0007366) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.1431403031480762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7367, "loss": 0.37734997272491455, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2283210754395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:10] (step=0007367) Train Loss: 0.3610, Train Steps/Sec: 0.28, Epoch: 0.1431597357170618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7368, "loss": 0.1638987958431244, "memory_gb": 7.721559524536133, "step_time_ms": 3366.59574508667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:14] (step=0007368) Train Loss: 0.1446, Train Steps/Sec: 0.28, Epoch: 0.1431791682860474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7369, "loss": 0.2954835891723633, "memory_gb": 7.715639114379883, "step_time_ms": 3327.421188354492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:17] (step=0007369) Train Loss: 0.2969, Train Steps/Sec: 0.28, Epoch: 0.14319860085503303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7370, "loss": 0.24425393342971802, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9982051849365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:21] (step=0007370) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.14321803342401865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7371, "loss": 0.23301346600055695, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4317016601562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
07:24:24] (step=0007371) Train Loss: 0.2488, Train Steps/Sec: 0.28, Epoch: 0.14323746599300427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7372, "loss": 0.3123016357421875, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2520179748535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:28] (step=0007372) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 0.1432568985619899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7373, "loss": 0.1529937982559204, "memory_gb": 7.721559524536133, "step_time_ms": 3363.295793533325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:32] (step=0007373) Train Loss: 0.2065, Train Steps/Sec: 0.28, Epoch: 0.14327633113097551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7374, "loss": 0.3168995678424835, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1232509613037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:35] (step=0007374) Train Loss: 0.2916, Train Steps/Sec: 0.27, Epoch: 0.14329576369996114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7375, "loss": 0.26045292615890503, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7680797576904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:39] (step=0007375) Train Loss: 0.2140, Train Steps/Sec: 0.28, Epoch: 0.14331519626894676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7376, "loss": 0.2810857892036438, "memory_gb": 7.721559524536133, "step_time_ms": 3355.337619781494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:43] (step=0007376) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.14333462883793238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:46] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 7377, "loss": 0.2561671733856201, "memory_gb": 7.721559524536133, "step_time_ms": 3502.9420852661133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:46] (step=0007377) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.143354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7378, "loss": 0.1846570372581482, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4671955108643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:50] (step=0007378) Train Loss: 0.1839, Train Steps/Sec: 0.28, Epoch: 0.14337349397590363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7379, "loss": 0.2658829689025879, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0662689208984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:53] (step=0007379) Train Loss: 0.3122, Train Steps/Sec: 0.28, Epoch: 0.14339292654488922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:24:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7380, "loss": 0.3672637343406677, "memory_gb": 7.721559524536133, "step_time_ms": 3359.691619873047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:24:57] (step=0007380) Train Loss: 0.3211, Train Steps/Sec: 0.28, Epoch: 0.14341235911387484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7381, "loss": 0.28054624795913696, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7560863494873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:01] (step=0007381) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.14343179168286047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7382, "loss": 0.3897656202316284, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9770793914795, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 07:25:04] (step=0007382) Train Loss: 0.3232, Train Steps/Sec: 0.28, Epoch: 0.1434512242518461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7383, "loss": 0.29295364022254944, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6902618408203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:08] (step=0007383) Train Loss: 0.2951, Train Steps/Sec: 0.28, Epoch: 0.1434706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7384, "loss": 0.2942047417163849, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4295978546143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:11] (step=0007384) Train Loss: 0.2174, Train Steps/Sec: 0.28, Epoch: 0.14349008938981733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7385, "loss": 0.1512155532836914, "memory_gb": 7.721559524536133, "step_time_ms": 3363.11411857605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:15] (step=0007385) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.14350952195880295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7386, "loss": 0.27489280700683594, "memory_gb": 7.721559524536133, "step_time_ms": 3355.358839035034, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:19] (step=0007386) Train Loss: 0.2851, Train Steps/Sec: 0.28, Epoch: 0.14352895452778858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7387, "loss": 0.2583003640174866, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3058586120605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:22] (step=0007387) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.1435483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 07:25:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7388, "loss": 0.18599933385849, "memory_gb": 7.721559524536133, "step_time_ms": 3352.473258972168, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:26] (step=0007388) Train Loss: 0.1590, Train Steps/Sec: 0.28, Epoch: 0.14356781966575982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7389, "loss": 0.2322182059288025, "memory_gb": 7.721559524536133, "step_time_ms": 3359.032392501831, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:29] (step=0007389) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.14358725223474544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7390, "loss": 0.22151726484298706, "memory_gb": 7.721559524536133, "step_time_ms": 3357.365846633911, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:33] (step=0007390) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.14360668480373107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7391, "loss": 0.3150683641433716, "memory_gb": 7.715639114379883, "step_time_ms": 3329.991340637207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:37] (step=0007391) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.14362611737271666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7392, "loss": 0.2394462525844574, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1721057891846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:40] (step=0007392) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.14364554994170228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7393, "loss": 0.18909473717212677, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6816997528076, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:44] (step=0007393) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.1436649825106879, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7394, "loss": 0.29024258255958557, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7703914642334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:47] (step=0007394) Train Loss: 0.2910, Train Steps/Sec: 0.28, Epoch: 0.14368441507967353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7395, "loss": 0.2345849722623825, "memory_gb": 7.721559524536133, "step_time_ms": 3360.633611679077, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:51] (step=0007395) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.14370384764865915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7396, "loss": 0.24253982305526733, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3953380584717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:55] (step=0007396) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.14372328021764477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:25:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7397, "loss": 0.3191438317298889, "memory_gb": 7.721559524536133, "step_time_ms": 3352.780818939209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:25:58] (step=0007397) Train Loss: 0.2782, Train Steps/Sec: 0.28, Epoch: 0.1437427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7398, "loss": 0.2955286502838135, "memory_gb": 7.721559524536133, "step_time_ms": 3359.466075897217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:02] (step=0007398) Train Loss: 0.2966, Train Steps/Sec: 0.28, Epoch: 0.14376214535561602, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:26:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7399, "loss": 0.22674980759620667, "memory_gb": 7.721559524536133, "step_time_ms": 3341.1571979522705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:05] (step=0007399) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.14378157792460164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7400, "loss": 0.175387904047966, "memory_gb": 7.721559524536133, "step_time_ms": 3347.081184387207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:09] (step=0007400) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.14380101049358726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7401, "loss": 0.3316381573677063, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2872104644775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:12] (step=0007401) Train Loss: 0.3112, Train Steps/Sec: 0.28, Epoch: 0.14382044306257288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7402, "loss": 0.15121933817863464, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0214557647705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:16] (step=0007402) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.1438398756315585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7403, "loss": 0.17936035990715027, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7991466522217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:20] (step=0007403) Train Loss: 0.1753, Train Steps/Sec: 0.28, Epoch: 0.1438593082005441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7404, "loss": 0.2181692123413086, "memory_gb": 7.721559524536133, "step_time_ms": 
3361.387252807617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:23] (step=0007404) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.14387874076952972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7405, "loss": 0.2864518165588379, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6832542419434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:27] (step=0007405) Train Loss: 0.2912, Train Steps/Sec: 0.28, Epoch: 0.14389817333851534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7406, "loss": 0.173649400472641, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7159385681152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:30] (step=0007406) Train Loss: 0.1992, Train Steps/Sec: 0.28, Epoch: 0.14391760590750097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7407, "loss": 0.23934012651443481, "memory_gb": 7.721559524536133, "step_time_ms": 3361.236572265625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:34] (step=0007407) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.1439370384764866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7408, "loss": 0.21509161591529846, "memory_gb": 7.721559524536133, "step_time_ms": 3358.820915222168, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:38] (step=0007408) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.1439564710454722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7409, "loss": 0.21381047368049622, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1300201416016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:41] (step=0007409) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.14397590361445783, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7410, "loss": 0.20280584692955017, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0566902160645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:45] (step=0007410) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.14399533618344346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7411, "loss": 0.22620701789855957, "memory_gb": 7.721559524536133, "step_time_ms": 3362.110376358032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:48] (step=0007411) Train Loss: 0.1934, Train Steps/Sec: 0.28, Epoch: 0.14401476875242908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7412, "loss": 0.22161024808883667, "memory_gb": 7.721559524536133, "step_time_ms": 3362.687349319458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:52] (step=0007412) Train Loss: 0.2843, Train Steps/Sec: 0.28, Epoch: 0.1440342013214147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7413, "loss": 0.24802187085151672, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7851943969727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:56] (step=0007413) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.14405363389040032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7414, "loss": 0.29310929775238037, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4719848632812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:26:59] (step=0007414) Train Loss: 0.2454, Train Steps/Sec: 0.28, Epoch: 0.14407306645938592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7415, "loss": 0.16059517860412598, "memory_gb": 
7.721559524536133, "step_time_ms": 3350.517511367798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:03] (step=0007415) Train Loss: 0.1834, Train Steps/Sec: 0.28, Epoch: 0.14409249902837154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7416, "loss": 0.19718950986862183, "memory_gb": 7.721559524536133, "step_time_ms": 3362.877130508423, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:06] (step=0007416) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.14411193159735716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7417, "loss": 0.3126910328865051, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0165119171143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:10] (step=0007417) Train Loss: 0.3422, Train Steps/Sec: 0.28, Epoch: 0.14413136416634278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7418, "loss": 0.20853003859519958, "memory_gb": 7.721559524536133, "step_time_ms": 3365.638017654419, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:14] (step=0007418) Train Loss: 0.1854, Train Steps/Sec: 0.28, Epoch: 0.1441507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7419, "loss": 0.24009519815444946, "memory_gb": 7.721559524536133, "step_time_ms": 3358.513832092285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:17] (step=0007419) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.14417022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7420, "loss": 0.28626346588134766, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7031993865967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:21] (step=0007420) Train Loss: 0.2892, Train Steps/Sec: 
0.28, Epoch: 0.14418966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7421, "loss": 0.19312481582164764, "memory_gb": 7.721559524536133, "step_time_ms": 3365.436553955078, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:24] (step=0007421) Train Loss: 0.1874, Train Steps/Sec: 0.28, Epoch: 0.14420909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7422, "loss": 0.2378610223531723, "memory_gb": 7.721559524536133, "step_time_ms": 3364.238977432251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:28] (step=0007422) Train Loss: 0.2533, Train Steps/Sec: 0.27, Epoch: 0.1442285270112709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7423, "loss": 0.22944720089435577, "memory_gb": 7.721559524536133, "step_time_ms": 3361.997127532959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:32] (step=0007423) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 0.14424795958025652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7424, "loss": 0.25617367029190063, "memory_gb": 7.721559524536133, "step_time_ms": 3511.564254760742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:35] (step=0007424) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.14426739214924214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7425, "loss": 0.3152431547641754, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2992553710938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:39] (step=0007425) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.14428682471822776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7426, "loss": 0.2722519040107727, 
"memory_gb": 7.721559524536133, "step_time_ms": 3360.3131771087646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:42] (step=0007426) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.14430625728721336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7427, "loss": 0.19394241273403168, "memory_gb": 7.721559524536133, "step_time_ms": 3363.415002822876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:46] (step=0007427) Train Loss: 0.1867, Train Steps/Sec: 0.28, Epoch: 0.14432568985619898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7428, "loss": 0.2245955467224121, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5902404785156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:50] (step=0007428) Train Loss: 0.1957, Train Steps/Sec: 0.28, Epoch: 0.1443451224251846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7429, "loss": 0.24489739537239075, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5293502807617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:53] (step=0007429) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.14436455499417022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:27:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7430, "loss": 0.234248548746109, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5167865753174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:27:57] (step=0007430) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.14438398756315585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7431, "loss": 0.1989677995443344, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0850315093994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:00] (step=0007431) Train Loss: 0.2429, 
Train Steps/Sec: 0.28, Epoch: 0.14440342013214147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7432, "loss": 0.22245952486991882, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5433654785156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:04] (step=0007432) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.1444228527011271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7433, "loss": 0.264423131942749, "memory_gb": 7.721559524536133, "step_time_ms": 3366.903066635132, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:08] (step=0007433) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.1444422852701127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7434, "loss": 0.2575589418411255, "memory_gb": 7.721559524536133, "step_time_ms": 3364.207983016968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:11] (step=0007434) Train Loss: 0.2810, Train Steps/Sec: 0.28, Epoch: 0.14446171783909834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7435, "loss": 0.3065325915813446, "memory_gb": 7.721559524536133, "step_time_ms": 3370.626449584961, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:15] (step=0007435) Train Loss: 0.3061, Train Steps/Sec: 0.28, Epoch: 0.14448115040808396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7436, "loss": 0.1481148600578308, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7214641571045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:19] (step=0007436) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.14450058297706958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7437, "loss": 
0.14398963749408722, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8430786132812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:22] (step=0007437) Train Loss: 0.1656, Train Steps/Sec: 0.28, Epoch: 0.14452001554605517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7438, "loss": 0.29525768756866455, "memory_gb": 7.721559524536133, "step_time_ms": 3353.382349014282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:26] (step=0007438) Train Loss: 0.3293, Train Steps/Sec: 0.28, Epoch: 0.1445394481150408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7439, "loss": 0.2522395849227905, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7193908691406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:29] (step=0007439) Train Loss: 0.2633, Train Steps/Sec: 0.28, Epoch: 0.14455888068402642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7440, "loss": 0.1935884952545166, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8336143493652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:33] (step=0007440) Train Loss: 0.1909, Train Steps/Sec: 0.28, Epoch: 0.14457831325301204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7441, "loss": 0.3265928030014038, "memory_gb": 7.715639114379883, "step_time_ms": 3334.980249404907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:37] (step=0007441) Train Loss: 0.3042, Train Steps/Sec: 0.28, Epoch: 0.14459774582199766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7442, "loss": 0.36910703778266907, "memory_gb": 7.721559524536133, "step_time_ms": 3366.725444793701, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:40] (step=0007442) 
Train Loss: 0.3247, Train Steps/Sec: 0.28, Epoch: 0.1446171783909833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7443, "loss": 0.2602557837963104, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7961616516113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:44] (step=0007443) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.1446366109599689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7444, "loss": 0.235337033867836, "memory_gb": 7.721559524536133, "step_time_ms": 3371.6681003570557, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:47] (step=0007444) Train Loss: 0.2744, Train Steps/Sec: 0.28, Epoch: 0.14465604352895453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7445, "loss": 0.2077454924583435, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5338497161865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:51] (step=0007445) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.14467547609794015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7446, "loss": 0.20299793779850006, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7097091674805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:55] (step=0007446) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.14469490866692578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:28:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7447, "loss": 0.22181683778762817, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5077896118164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:28:58] (step=0007447) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.1447143412359114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:02] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 7448, "loss": 0.31174179911613464, "memory_gb": 7.721559524536133, "step_time_ms": 3370.5694675445557, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:02] (step=0007448) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.14473377380489702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7449, "loss": 0.20025020837783813, "memory_gb": 7.721559524536133, "step_time_ms": 3367.260456085205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:05] (step=0007449) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.14475320637388261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7450, "loss": 0.19042912125587463, "memory_gb": 7.721559524536133, "step_time_ms": 3366.548776626587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:09] (step=0007450) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.14477263894286824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7451, "loss": 0.2514542043209076, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0172691345215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:13] (step=0007451) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.14479207151185386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7452, "loss": 0.28360456228256226, "memory_gb": 7.721559524536133, "step_time_ms": 3355.313301086426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:16] (step=0007452) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.14481150408083948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7453, "loss": 0.23892313241958618, "memory_gb": 7.721559524536133, "step_time_ms": 3362.168073654175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
07:29:20] (step=0007453) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.1448309366498251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7454, "loss": 0.22090375423431396, "memory_gb": 7.721559524536133, "step_time_ms": 3364.898681640625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:23] (step=0007454) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.14485036921881073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7455, "loss": 0.2729206681251526, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7003688812256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:27] (step=0007455) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.14486980178779635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7456, "loss": 0.28220105171203613, "memory_gb": 7.721559524536133, "step_time_ms": 3362.823724746704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:31] (step=0007456) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.14488923435678197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7457, "loss": 0.10241421312093735, "memory_gb": 7.721559524536133, "step_time_ms": 3368.22509765625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:34] (step=0007457) Train Loss: 0.1530, Train Steps/Sec: 0.28, Epoch: 0.1449086669257676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7458, "loss": 0.25367245078086853, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1513328552246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:29:38] (step=0007458) Train Loss: 0.2194, Train Steps/Sec: 0.28, Epoch: 0.14492809949475322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:29:42] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 7459, "loss": 0.26509514451026917, "memory_gb": 7.721559524536133, "step_time_ms": 3365.612506866455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:29:42] (step=0007459) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.14494753206373884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:29:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7460, "loss": 0.2729403078556061, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5564365386963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:29:45] (step=0007460) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.14496696463272446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:29:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7461, "loss": 0.3289443254470825, "memory_gb": 7.721559524536133, "step_time_ms": 3358.637809753418, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:29:49] (step=0007461) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.14498639720171005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:29:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7462, "loss": 0.23970669507980347, "memory_gb": 7.721559524536133, "step_time_ms": 3364.525318145752, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:29:52] (step=0007462) Train Loss: 0.2668, Train Steps/Sec: 0.27, Epoch: 0.14500582977069568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:29:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7463, "loss": 0.1325901448726654, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0734939575195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:29:56] (step=0007463) Train Loss: 0.1641, Train Steps/Sec: 0.28, Epoch: 0.1450252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7464, "loss": 0.14088848233222961, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1184616088867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:00] (step=0007464) Train Loss: 0.1752, Train Steps/Sec: 0.28, Epoch: 0.14504469490866692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7465, "loss": 0.2576371133327484, "memory_gb": 7.721559524536133, "step_time_ms": 3504.96244430542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:03] (step=0007465) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.14506412747765254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7466, "loss": 0.26852479577064514, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5999641418457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:07] (step=0007466) Train Loss: 0.1893, Train Steps/Sec: 0.28, Epoch: 0.14508356004663817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7467, "loss": 0.241996631026268, "memory_gb": 7.721559524536133, "step_time_ms": 3359.45200920105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:10] (step=0007467) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.1451029926156238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7468, "loss": 0.15867723524570465, "memory_gb": 7.715639114379883, "step_time_ms": 3323.911428451538, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:14] (step=0007468) Train Loss: 0.1984, Train Steps/Sec: 0.28, Epoch: 0.1451224251846094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7469, "loss": 0.3129670023918152, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3251190185547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:18] (step=0007469) Train Loss: 0.2461, Train Steps/Sec: 0.28, Epoch: 0.14514185775359503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7470, "loss": 0.2722768783569336, "memory_gb": 7.721559524536133, "step_time_ms": 3361.415386199951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:21] (step=0007470) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.14516129032258066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7471, "loss": 0.13659965991973877, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8428707122803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:25] (step=0007471) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.14518072289156628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7472, "loss": 0.18913638591766357, "memory_gb": 7.721559524536133, "step_time_ms": 3358.405590057373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:29] (step=0007472) Train Loss: 0.1618, Train Steps/Sec: 0.28, Epoch: 0.14520015546055187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7473, "loss": 0.3376615643501282, "memory_gb": 7.721559524536133, "step_time_ms": 3359.736919403076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:32] (step=0007473) Train Loss: 0.2914, Train Steps/Sec: 0.28, Epoch: 0.1452195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7474, "loss": 0.31852659583091736, "memory_gb": 7.721559524536133, "step_time_ms": 3362.182378768921, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:36] (step=0007474) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.14523902059852312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7475, "loss": 0.17277425527572632, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3191890716553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:39] (step=0007475) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.14525845316750874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7476, "loss": 0.10589192062616348, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2841415405273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:43] (step=0007476) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.14527788573649436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7477, "loss": 0.21294640004634857, "memory_gb": 7.721559524536133, "step_time_ms": 3357.234001159668, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:47] (step=0007477) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.14529731830547998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7478, "loss": 0.20284181833267212, "memory_gb": 7.721559524536133, "step_time_ms": 3355.846643447876, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:50] (step=0007478) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.1453167508744656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7479, "loss": 0.13748809695243835, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5286865234375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:54] (step=0007479) Train Loss: 0.1695, Train Steps/Sec: 0.28, Epoch: 0.14533618344345123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:30:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7480, "loss": 0.34248054027557373, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1275005340576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:30:57] (step=0007480) Train Loss: 0.2713, Train Steps/Sec: 0.28, Epoch: 0.14535561601243685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7481, "loss": 0.2823698818683624, "memory_gb": 7.721559524536133, "step_time_ms": 3354.642152786255, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:01] (step=0007481) Train Loss: 0.3451, Train Steps/Sec: 0.28, Epoch: 0.14537504858142247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7482, "loss": 0.2870321273803711, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5662841796875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:05] (step=0007482) Train Loss: 0.2737, Train Steps/Sec: 0.28, Epoch: 0.1453944811504081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7483, "loss": 0.25894102454185486, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2352867126465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:08] (step=0007483) Train Loss: 0.1941, Train Steps/Sec: 0.28, Epoch: 0.14541391371939372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7484, "loss": 0.2128320336341858, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9467067718506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:12] (step=0007484) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.1454333462883793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7485, "loss": 0.13620391488075256, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6857833862305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:15] (step=0007485) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.14545277885736493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7486, "loss": 0.30890870094299316, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4700565338135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:19] (step=0007486) Train Loss: 0.2640, Train Steps/Sec: 0.28, Epoch: 0.14547221142635056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7487, "loss": 0.2249404788017273, "memory_gb": 7.721559524536133, "step_time_ms": 3358.21270942688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:23] (step=0007487) Train Loss: 0.2009, Train Steps/Sec: 0.28, Epoch: 0.14549164399533618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7488, "loss": 0.35807621479034424, "memory_gb": 7.721559524536133, "step_time_ms": 3355.168104171753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:26] (step=0007488) Train Loss: 0.2897, Train Steps/Sec: 0.28, Epoch: 0.1455110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7489, "loss": 0.24668283760547638, "memory_gb": 7.721559524536133, "step_time_ms": 3357.50675201416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:30] (step=0007489) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.14553050913330742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7490, "loss": 0.28386732935905457, "memory_gb": 7.721559524536133, "step_time_ms": 3357.469320297241, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:33] (step=0007490) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.14554994170229305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7491, "loss": 0.25577306747436523, "memory_gb": 7.721559524536133, "step_time_ms": 3355.367422103882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:37] (step=0007491) Train Loss: 0.2739, Train Steps/Sec: 0.28, Epoch: 0.14556937427127867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7492, "loss": 0.2889630198478699, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8661937713623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:41] (step=0007492) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.1455888068402643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7493, "loss": 0.20709867775440216, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9676361083984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:44] (step=0007493) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.1456082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7494, "loss": 0.2633892297744751, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9061965942383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:48] (step=0007494) Train Loss: 0.2662, Train Steps/Sec: 0.28, Epoch: 0.14562767197823553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7495, "loss": 0.24631375074386597, "memory_gb": 7.721559524536133, "step_time_ms": 3357.663154602051, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:51] (step=0007495) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 0.14564710454722113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7496, "loss": 0.22246582806110382, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3242168426514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:55] (step=0007496) Train Loss: 0.2823, Train Steps/Sec: 0.28, Epoch: 0.14566653711620675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:31:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7497, "loss": 0.20926931500434875, "memory_gb": 7.721559524536133, "step_time_ms": 3355.405807495117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:31:58] (step=0007497) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.14568596968519237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7498, "loss": 0.284562885761261, "memory_gb": 7.721559524536133, "step_time_ms": 3359.429359436035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:02] (step=0007498) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.145705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7499, "loss": 0.2365477979183197, "memory_gb": 7.721559524536133, "step_time_ms": 3352.276563644409, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:06] (step=0007499) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.14572483482316362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7500, "loss": 0.29868248105049133, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1388931274414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:09] (step=0007500) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.14574426739214924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7501, "loss": 0.24211028218269348, "memory_gb": 7.721559524536133, "step_time_ms": 3359.867811203003, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:13] (step=0007501) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.14576369996113486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7502, "loss": 0.24521757662296295, "memory_gb": 7.721559524536133, "step_time_ms": 3354.095458984375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:16] (step=0007502) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.14578313253012049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7503, "loss": 0.30012601613998413, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1832389831543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:20] (step=0007503) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.1458025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7504, "loss": 0.1140148714184761, "memory_gb": 7.721559524536133, "step_time_ms": 3359.461784362793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:24] (step=0007504) Train Loss: 0.1748, Train Steps/Sec: 0.28, Epoch: 0.14582199766809173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7505, "loss": 0.16515782475471497, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6389293670654, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:27] (step=0007505) Train Loss: 0.2389, Train Steps/Sec: 0.28, Epoch: 0.14584143023707735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7506, "loss": 0.31120938062667847, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3389987945557, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:31] (step=0007506) Train Loss: 0.2696, Train Steps/Sec: 0.28, Epoch: 0.14586086280606297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7507, "loss": 0.2631533443927765, "memory_gb": 7.715639114379883, "step_time_ms": 3322.4339485168457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:34] (step=0007507) Train Loss: 0.2701, Train Steps/Sec: 0.28, Epoch: 0.14588029537504857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7508, "loss": 0.1916303038597107, "memory_gb": 7.721559524536133, "step_time_ms": 3358.82830619812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:38] (step=0007508) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.1458997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7509, "loss": 0.29410243034362793, "memory_gb": 7.721559524536133, "step_time_ms": 3361.926794052124, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:42] (step=0007509) Train Loss: 0.2749, Train Steps/Sec: 0.27, Epoch: 0.1459191605130198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7510, "loss": 0.3334949016571045, "memory_gb": 7.721559524536133, "step_time_ms": 3360.245704650879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:45] (step=0007510) Train Loss: 0.2441, Train Steps/Sec: 0.28, Epoch: 0.14593859308200544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7511, "loss": 0.2564355134963989, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2627353668213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:49] (step=0007511) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.14595802565099106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7512, "loss": 0.3227614760398865, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1579456329346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:53] (step=0007512) Train Loss: 0.3353, Train Steps/Sec: 0.28, Epoch: 0.14597745821997668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:32:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7513, "loss": 0.28943759202957153, "memory_gb": 7.721559524536133, "step_time_ms": 3501.0457038879395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:32:56] (step=0007513) Train Loss: 0.3135, Train Steps/Sec: 0.28, Epoch: 0.1459968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7514, "loss": 0.31467294692993164, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8580360412598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:00] (step=0007514) Train Loss: 0.2768, Train Steps/Sec: 0.28, Epoch: 0.14601632335794792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7515, "loss": 0.1542535126209259, "memory_gb": 7.721559524536133, "step_time_ms": 3359.434127807617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:03] (step=0007515) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.14603575592693355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7516, "loss": 0.24700774252414703, "memory_gb": 7.721559524536133, "step_time_ms": 3359.187602996826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:07] (step=0007516) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.14605518849591917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7517, "loss": 0.2248181253671646, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8760318756104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:11] (step=0007517) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.1460746210649048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7518, "loss": 0.2357107549905777, "memory_gb": 7.715639114379883, "step_time_ms": 3316.9801235198975, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:14] (step=0007518) Train Loss: 0.2059, Train Steps/Sec: 0.28, Epoch: 0.14609405363389041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7519, "loss": 0.20103834569454193, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5830669403076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:18] (step=0007519) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.146113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7520, "loss": 0.22358837723731995, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0688705444336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:21] (step=0007520) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.14613291877186163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7521, "loss": 0.08073539286851883, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7367839813232, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:25] (step=0007521) Train Loss: 0.1799, Train Steps/Sec: 0.28, Epoch: 0.14615235134084725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7522, "loss": 0.2078172117471695, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3552112579346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:29] (step=0007522) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.14617178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7523, "loss": 0.24630779027938843, "memory_gb": 7.721559524536133, "step_time_ms": 3362.048625946045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:32] (step=0007523) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.1461912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7524, "loss": 0.20283502340316772, "memory_gb": 7.721559524536133, "step_time_ms": 3353.503465652466, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:36] (step=0007524) Train Loss: 0.2972, Train Steps/Sec: 0.28, Epoch: 0.14621064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7525, "loss": 0.2627876102924347, "memory_gb": 7.721559524536133, "step_time_ms": 3357.46431350708, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:39] (step=0007525) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.14623008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7526, "loss": 0.17033840715885162, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3293380737305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:43] (step=0007526) Train Loss: 0.1438, Train Steps/Sec: 0.28, Epoch: 0.14624951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7527, "loss": 0.2472616285085678, "memory_gb": 7.721559524536133, "step_time_ms": 3357.682228088379, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:47] (step=0007527) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.146268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7528, "loss": 0.31700822710990906, "memory_gb": 7.721559524536133, "step_time_ms": 3364.290475845337, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:50] (step=0007528) Train Loss: 0.2769, Train Steps/Sec: 0.28, Epoch: 0.1462883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7529, "loss": 0.16174063086509705, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3576889038086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:54] (step=0007529) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.14630781189273223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7530, "loss": 0.18153323233127594, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0644855499268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:33:57] (step=0007530) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.14632724446171783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7531, "loss": 0.28044337034225464, "memory_gb": 7.721559524536133, "step_time_ms": 3342.59295463562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:01] (step=0007531) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.14634667703070345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7532, "loss": 0.29853689670562744, "memory_gb": 7.721559524536133, "step_time_ms": 3359.874486923218, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:05] (step=0007532) Train Loss: 0.2745, Train Steps/Sec: 0.28, Epoch: 0.14636610959968907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7533, "loss": 0.3604327440261841, "memory_gb": 7.721559524536133, "step_time_ms": 3345.0493812561035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:08] (step=0007533) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.1463855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7534, "loss": 0.1874295026063919, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6083602905273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:12] (step=0007534) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.14640497473766031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7535, "loss": 0.22444167733192444, "memory_gb": 7.721559524536133, "step_time_ms": 3359.333276748657, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:15] (step=0007535) Train Loss: 0.1924, Train Steps/Sec: 0.28, Epoch: 0.14642440730664594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7536, "loss": 0.28110960125923157, "memory_gb": 7.721559524536133, "step_time_ms": 3360.701560974121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:19] (step=0007536) Train Loss: 0.2640, Train Steps/Sec: 0.28, Epoch: 0.14644383987563156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7537, "loss": 0.30361735820770264, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8177642822266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:23] (step=0007537) Train Loss: 0.2477, Train Steps/Sec: 0.28, Epoch: 0.14646327244461718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7538, "loss": 0.23055604100227356, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3128871917725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:26] (step=0007538) Train Loss: 0.2389, Train Steps/Sec: 0.28, Epoch: 0.1464827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7539, "loss": 0.20365139842033386, "memory_gb": 7.721559524536133, "step_time_ms": 3364.311933517456, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:30] (step=0007539) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.14650213758258843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7540, "loss": 0.15803609788417816, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2771759033203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:33] (step=0007540) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.14652157015157405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7541, "loss": 0.22913210093975067, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0799522399902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:37] (step=0007541) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.14654100272055967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7542, "loss": 0.1575969159603119, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3149604797363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:41] (step=0007542) Train Loss: 0.1534, Train Steps/Sec: 0.28, Epoch: 0.14656043528954527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7543, "loss": 0.2203463762998581, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7705574035645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:44] (step=0007543) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.1465798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7544, "loss": 0.20296955108642578, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3320331573486, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:48] (step=0007544) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.1465993004275165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7545, "loss": 0.28804415464401245, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9556427001953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:51] (step=0007545) Train Loss: 0.2693, Train Steps/Sec: 0.28, Epoch: 0.14661873299650213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7546, "loss": 0.21252021193504333, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7612075805664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:55] (step=0007546) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.14663816556548775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:34:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7547, "loss": 0.14135846495628357, "memory_gb": 7.721559524536133, "step_time_ms": 3360.182285308838, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:34:59] (step=0007547) Train Loss: 0.1995, Train Steps/Sec: 0.28, Epoch: 0.14665759813447338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7548, "loss": 0.16168580949306488, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2803897857666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:02] (step=0007548) Train Loss: 0.1798, Train Steps/Sec: 0.28, Epoch: 0.146677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7549, "loss": 0.18300174176692963, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0641441345215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:06] (step=0007549) Train Loss: 0.1827, Train Steps/Sec: 0.28, Epoch: 0.14669646327244462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7550, "loss": 0.23881760239601135, "memory_gb": 7.721559524536133, "step_time_ms": 3363.393545150757, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:09] (step=0007550) Train Loss: 0.2372, Train Steps/Sec: 0.27, Epoch: 0.14671589584143024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7551, "loss": 0.21556417644023895, "memory_gb": 7.721559524536133, "step_time_ms": 3364.950180053711, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:13] (step=0007551) Train Loss: 0.2276, Train Steps/Sec: 0.28, Epoch: 0.14673532841041587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7552, "loss": 0.16146135330200195, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9446964263916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:17] (step=0007552) Train Loss: 0.1707, Train Steps/Sec: 0.28, Epoch: 0.1467547609794015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7553, "loss": 0.263958603143692, "memory_gb": 7.721559524536133, "step_time_ms": 3492.180109024048, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:20] (step=0007553) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.14677419354838708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7554, "loss": 0.1166524738073349, "memory_gb": 7.721559524536133, "step_time_ms": 3364.474296569824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:24] (step=0007554) Train Loss: 0.1826, Train Steps/Sec: 0.28, Epoch: 0.1467936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7555, "loss": 0.18508900701999664, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5687103271484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:27] (step=0007555) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.14681305868635833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7556, "loss": 0.27519848942756653, "memory_gb": 7.721559524536133, "step_time_ms": 3366.992950439453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:31] (step=0007556) Train Loss: 0.2839, Train Steps/Sec: 0.28, Epoch: 0.14683249125534395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7557, "loss": 0.16977161169052124, "memory_gb": 7.721559524536133, "step_time_ms": 3366.87970161438, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:35] (step=0007557) Train Loss: 0.1795, Train Steps/Sec: 0.28, Epoch: 0.14685192382432957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7558, "loss": 0.283186674118042, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3857707977295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:38] (step=0007558) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.1468713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7559, "loss": 0.2689506709575653, "memory_gb": 7.721559524536133, "step_time_ms": 3363.988161087036, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:42] (step=0007559) Train Loss: 0.3060, Train Steps/Sec: 0.28, Epoch: 0.14689078896230082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7560, "loss": 0.20593149960041046, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1888885498047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:45] (step=0007560) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.14691022153128644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7561, "loss": 0.2564506530761719, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8342361450195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:49] (step=0007561) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.14692965410027206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7562, "loss": 0.27679362893104553, "memory_gb": 7.721559524536133, "step_time_ms": 3366.525173187256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:53] (step=0007562) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.14694908666925768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:35:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7563, "loss": 0.26532280445098877, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3416633605957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:35:56] (step=0007563) Train Loss: 0.3011, Train Steps/Sec: 0.28, Epoch: 0.1469685192382433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7564, "loss": 0.21195977926254272, "memory_gb": 7.721559524536133, "step_time_ms": 3366.710662841797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:00] (step=0007564) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.14698795180722893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7565, "loss": 0.20554161071777344, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4356212615967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:03] (step=0007565) Train Loss: 0.1860, Train Steps/Sec: 0.28, Epoch: 0.14700738437621452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7566, "loss": 0.26076120138168335, "memory_gb": 7.721559524536133, "step_time_ms": 3363.875389099121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:07] (step=0007566) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.14702681694520014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7567, "loss": 0.2901795506477356, "memory_gb": 7.721559524536133, "step_time_ms": 3367.616653442383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:11] (step=0007567) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.14704624951418577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7568, "loss": 0.20303812623023987, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6280040740967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:14] (step=0007568) Train Loss: 0.2150, Train Steps/Sec: 0.28, Epoch: 0.1470656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7569, "loss": 0.2441897690296173, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0192489624023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:18] (step=0007569) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.147085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7570, "loss": 0.14520548284053802, "memory_gb": 7.721559524536133, "step_time_ms": 3365.122079849243, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:22] (step=0007570) Train Loss: 0.1689, Train Steps/Sec: 0.28, Epoch: 0.14710454722114263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7571, "loss": 0.17111660540103912, "memory_gb": 7.721559524536133, "step_time_ms": 3366.412878036499, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:25] (step=0007571) Train Loss: 0.1854, Train Steps/Sec: 0.28, Epoch: 0.14712397979012826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7572, "loss": 0.18472689390182495, "memory_gb": 7.721559524536133, "step_time_ms": 3367.110252380371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:29] (step=0007572) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.14714341235911388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:36:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7573, "loss": 0.25640231370925903, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8946495056152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:36:32] (step=0007573) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.1471628449280995,
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:36:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7574, "loss": 0.22625377774238586, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4233474731445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:36:36] (step=0007574) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.14718227749708512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:36:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7575, "loss": 0.20416787266731262, "memory_gb": 7.721559524536133, "step_time_ms": 3361.430883407593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:36:40] (step=0007575) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.14720171006607075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:36:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7576, "loss": 0.23629038035869598, "memory_gb": 7.721559524536133, "step_time_ms": 3363.325834274292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:36:43] (step=0007576) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.14722114263505637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:36:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7577, "loss": 0.18794305622577667, "memory_gb": 7.721559524536133, "step_time_ms": 3356.426000595093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:36:47] (step=0007577) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.14724057520404196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:36:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7578, "loss": 0.17261618375778198, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9559020996094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:36:50] (step=0007578) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.14726000777302758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:36:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7579, "loss": 0.17907650768756866, "memory_gb": 
7.721559524536133, "step_time_ms": 3363.0735874176025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:36:54] (step=0007579) Train Loss: 0.2014, Train Steps/Sec: 0.28, Epoch: 0.1472794403420132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:36:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7580, "loss": 0.2660139501094818, "memory_gb": 7.721559524536133, "step_time_ms": 3361.103296279907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:36:58] (step=0007580) Train Loss: 0.3386, Train Steps/Sec: 0.28, Epoch: 0.14729887291099883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7581, "loss": 0.2519431710243225, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8365993499756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:01] (step=0007581) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.14731830547998445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7582, "loss": 0.2245972752571106, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4001998901367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:05] (step=0007582) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.14733773804897007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7583, "loss": 0.2887762784957886, "memory_gb": 7.715639114379883, "step_time_ms": 3328.6023139953613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:08] (step=0007583) Train Loss: 0.2525, Train Steps/Sec: 0.28, Epoch: 0.1473571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7584, "loss": 0.22171688079833984, "memory_gb": 7.721559524536133, "step_time_ms": 3360.128879547119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:12] (step=0007584) Train Loss: 0.2523, Train Steps/Sec: 
0.28, Epoch: 0.14737660318694132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7585, "loss": 0.1899813860654831, "memory_gb": 7.721559524536133, "step_time_ms": 3356.822729110718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:16] (step=0007585) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.14739603575592694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7586, "loss": 0.2704674005508423, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3523502349854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:19] (step=0007586) Train Loss: 0.2109, Train Steps/Sec: 0.28, Epoch: 0.14741546832491256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7587, "loss": 0.25685447454452515, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7021827697754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:23] (step=0007587) Train Loss: 0.2002, Train Steps/Sec: 0.28, Epoch: 0.14743490089389819, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7588, "loss": 0.2479577362537384, "memory_gb": 7.721559524536133, "step_time_ms": 3362.575054168701, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:26] (step=0007588) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.14745433346288378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7589, "loss": 0.2319929301738739, "memory_gb": 7.721559524536133, "step_time_ms": 3362.279176712036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:30] (step=0007589) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.1474737660318694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7590, "loss": 0.35607314109802246, 
"memory_gb": 7.721559524536133, "step_time_ms": 3358.9112758636475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:34] (step=0007590) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.14749319860085502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7591, "loss": 0.3289158046245575, "memory_gb": 7.721559524536133, "step_time_ms": 3361.705541610718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:37] (step=0007591) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.14751263116984065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7592, "loss": 0.1381479799747467, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8852367401123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:41] (step=0007592) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.14753206373882627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7593, "loss": 0.267713725566864, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9463233947754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:45] (step=0007593) Train Loss: 0.3057, Train Steps/Sec: 0.28, Epoch: 0.1475514963078119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7594, "loss": 0.21139928698539734, "memory_gb": 7.721559524536133, "step_time_ms": 3345.7820415496826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:48] (step=0007594) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.1475709288767975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7595, "loss": 0.18108007311820984, "memory_gb": 7.721559524536133, "step_time_ms": 3357.342481613159, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:52] (step=0007595) Train Loss: 0.1636, Train 
Steps/Sec: 0.28, Epoch: 0.14759036144578314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7596, "loss": 0.2071736454963684, "memory_gb": 7.721559524536133, "step_time_ms": 3358.201503753662, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:55] (step=0007596) Train Loss: 0.1788, Train Steps/Sec: 0.28, Epoch: 0.14760979401476876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:37:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7597, "loss": 0.23053471744060516, "memory_gb": 7.721559524536133, "step_time_ms": 3356.882333755493, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:37:59] (step=0007597) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.14762922658375438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7598, "loss": 0.22429460287094116, "memory_gb": 7.721559524536133, "step_time_ms": 3358.873128890991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:03] (step=0007598) Train Loss: 0.2191, Train Steps/Sec: 0.27, Epoch: 0.14764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7599, "loss": 0.25456175208091736, "memory_gb": 7.721559524536133, "step_time_ms": 3353.219747543335, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:06] (step=0007599) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.14766809172172563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7600, "loss": 0.2679247558116913, "memory_gb": 7.721559524536133, "step_time_ms": 3502.17604637146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:10] (step=0007600) Train Loss: 0.2555, Train Steps/Sec: 0.28, Epoch: 0.14768752429071122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7601, "loss": 
0.27376824617385864, "memory_gb": 7.721559524536133, "step_time_ms": 3353.921413421631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:13] (step=0007601) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.14770695685969684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7602, "loss": 0.21949362754821777, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6151905059814, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:17] (step=0007602) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.14772638942868246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7603, "loss": 0.14166578650474548, "memory_gb": 7.715639114379883, "step_time_ms": 3316.1699771881104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:21] (step=0007603) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.1477458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7604, "loss": 0.30225199460983276, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6691875457764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:24] (step=0007604) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.1477652545666537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7605, "loss": 0.26638054847717285, "memory_gb": 7.721559524536133, "step_time_ms": 3356.947660446167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:28] (step=0007605) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.14778468713563933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7606, "loss": 0.2544013261795044, "memory_gb": 7.721559524536133, "step_time_ms": 3358.49928855896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:31] (step=0007606) 
Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.14780411970462495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7607, "loss": 0.26816099882125854, "memory_gb": 7.721559524536133, "step_time_ms": 3350.0816822052, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:35] (step=0007607) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.14782355227361058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7608, "loss": 0.18482640385627747, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0492267608643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:39] (step=0007608) Train Loss: 0.1744, Train Steps/Sec: 0.28, Epoch: 0.1478429848425962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7609, "loss": 0.2514059543609619, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2633571624756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:42] (step=0007609) Train Loss: 0.2465, Train Steps/Sec: 0.28, Epoch: 0.14786241741158182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7610, "loss": 0.36179065704345703, "memory_gb": 7.721559524536133, "step_time_ms": 3357.966184616089, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:46] (step=0007610) Train Loss: 0.3270, Train Steps/Sec: 0.28, Epoch: 0.14788184998056744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7611, "loss": 0.21602092683315277, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6696643829346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:49] (step=0007611) Train Loss: 0.2782, Train Steps/Sec: 0.28, Epoch: 0.14790128254955306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:53] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 7612, "loss": 0.3573305606842041, "memory_gb": 7.715639114379883, "step_time_ms": 3318.4638023376465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:53] (step=0007612) Train Loss: 0.3045, Train Steps/Sec: 0.28, Epoch: 0.14792071511853866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:38:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7613, "loss": 0.3291943669319153, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8596420288086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:38:56] (step=0007613) Train Loss: 0.2880, Train Steps/Sec: 0.28, Epoch: 0.14794014768752428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7614, "loss": 0.31730878353118896, "memory_gb": 7.721559524536133, "step_time_ms": 3357.954263687134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:00] (step=0007614) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.1479595802565099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7615, "loss": 0.23708824813365936, "memory_gb": 7.721559524536133, "step_time_ms": 3355.107545852661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:04] (step=0007615) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.14797901282549553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7616, "loss": 0.23766443133354187, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4398708343506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:07] (step=0007616) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.14799844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7617, "loss": 0.11608663201332092, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8778553009033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
07:39:11] (step=0007617) Train Loss: 0.1847, Train Steps/Sec: 0.28, Epoch: 0.14801787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7618, "loss": 0.19418345391750336, "memory_gb": 7.721559524536133, "step_time_ms": 3361.294984817505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:14] (step=0007618) Train Loss: 0.1948, Train Steps/Sec: 0.28, Epoch: 0.1480373105324524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7619, "loss": 0.252335786819458, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8430366516113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:18] (step=0007619) Train Loss: 0.2784, Train Steps/Sec: 0.28, Epoch: 0.14805674310143802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7620, "loss": 0.2469763457775116, "memory_gb": 7.721559524536133, "step_time_ms": 3355.386257171631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:22] (step=0007620) Train Loss: 0.2447, Train Steps/Sec: 0.28, Epoch: 0.14807617567042364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7621, "loss": 0.2987937033176422, "memory_gb": 7.721559524536133, "step_time_ms": 3362.448215484619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:25] (step=0007621) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.14809560823940926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7622, "loss": 0.20644095540046692, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5098724365234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:29] (step=0007622) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.14811504080839488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:32] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 7623, "loss": 0.19411452114582062, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1248359680176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:32] (step=0007623) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.14813447337738048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7624, "loss": 0.3192099332809448, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0081672668457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:36] (step=0007624) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.1481539059463661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7625, "loss": 0.25444960594177246, "memory_gb": 7.721559524536133, "step_time_ms": 3359.372854232788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:40] (step=0007625) Train Loss: 0.2645, Train Steps/Sec: 0.28, Epoch: 0.14817333851535172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7626, "loss": 0.21764151751995087, "memory_gb": 7.721559524536133, "step_time_ms": 3364.044427871704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:43] (step=0007626) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.14819277108433734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7627, "loss": 0.271826833486557, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7561588287354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:47] (step=0007627) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.14821220365332297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7628, "loss": 0.23787984251976013, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6739253997803, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 07:39:50] (step=0007628) Train Loss: 0.1988, Train Steps/Sec: 0.28, Epoch: 0.1482316362223086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7629, "loss": 0.2925615906715393, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4610691070557, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:54] (step=0007629) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.1482510687912942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:39:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7630, "loss": 0.21409913897514343, "memory_gb": 7.721559524536133, "step_time_ms": 3362.271785736084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:39:58] (step=0007630) Train Loss: 0.2125, Train Steps/Sec: 0.28, Epoch: 0.14827050136027983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7631, "loss": 0.30368369817733765, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5760078430176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:01] (step=0007631) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.14828993392926546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7632, "loss": 0.30165669322013855, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1670989990234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:05] (step=0007632) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.14830936649825108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7633, "loss": 0.2867720425128937, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8343811035156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:08] (step=0007633) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.1483287990672367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 07:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7634, "loss": 0.2552376687526703, "memory_gb": 7.721559524536133, "step_time_ms": 3366.025686264038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:12] (step=0007634) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.14834823163622232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7635, "loss": 0.29450711607933044, "memory_gb": 7.721559524536133, "step_time_ms": 3359.224319458008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:15] (step=0007635) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.14836766420520792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7636, "loss": 0.33519062399864197, "memory_gb": 7.721559524536133, "step_time_ms": 3360.248327255249, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:19] (step=0007636) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.14838709677419354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7637, "loss": 0.23912882804870605, "memory_gb": 7.721559524536133, "step_time_ms": 3360.521078109741, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:23] (step=0007637) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.14840652934317916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7638, "loss": 0.37247854471206665, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2192821502686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:26] (step=0007638) Train Loss: 0.3100, Train Steps/Sec: 0.27, Epoch: 0.14842596191216478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7639, "loss": 0.3251403570175171, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0782413482666, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:30] (step=0007639) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.1484453944811504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7640, "loss": 0.18567484617233276, "memory_gb": 7.721559524536133, "step_time_ms": 3364.907741546631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:34] (step=0007640) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.14846482705013603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7641, "loss": 0.26724839210510254, "memory_gb": 7.721559524536133, "step_time_ms": 3513.1916999816895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:37] (step=0007641) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.14848425961912165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7642, "loss": 0.29891759157180786, "memory_gb": 7.721559524536133, "step_time_ms": 3363.619089126587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:41] (step=0007642) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.14850369218810727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7643, "loss": 0.36879223585128784, "memory_gb": 7.721559524536133, "step_time_ms": 3365.225076675415, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:44] (step=0007643) Train Loss: 0.3683, Train Steps/Sec: 0.28, Epoch: 0.1485231247570929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7644, "loss": 0.17753764986991882, "memory_gb": 7.721559524536133, "step_time_ms": 3359.961986541748, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:48] (step=0007644) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.14854255732607852, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:40:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7645, "loss": 0.2629087269306183, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6186542510986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:52] (step=0007645) Train Loss: 0.2634, Train Steps/Sec: 0.28, Epoch: 0.14856198989506414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7646, "loss": 0.2804678678512573, "memory_gb": 7.721559524536133, "step_time_ms": 3363.269329071045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:55] (step=0007646) Train Loss: 0.2945, Train Steps/Sec: 0.28, Epoch: 0.14858142246404973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:40:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7647, "loss": 0.22729328274726868, "memory_gb": 7.721559524536133, "step_time_ms": 3364.006519317627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:40:59] (step=0007647) Train Loss: 0.1953, Train Steps/Sec: 0.28, Epoch: 0.14860085503303536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7648, "loss": 0.1341819018125534, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2705421447754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:02] (step=0007648) Train Loss: 0.2419, Train Steps/Sec: 0.28, Epoch: 0.14862028760202098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7649, "loss": 0.09913265705108643, "memory_gb": 7.721559524536133, "step_time_ms": 3365.201234817505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:06] (step=0007649) Train Loss: 0.1750, Train Steps/Sec: 0.28, Epoch: 0.1486397201710066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7650, "loss": 0.1934167444705963, "memory_gb": 7.721559524536133, "step_time_ms": 
3353.3408641815186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:09] (step=0007650) Train Loss: 0.1756, Train Steps/Sec: 0.28, Epoch: 0.14865915273999222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7651, "loss": 0.27737319469451904, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7027225494385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:13] (step=0007651) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.14867858530897785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7652, "loss": 0.24548479914665222, "memory_gb": 7.721559524536133, "step_time_ms": 3363.318681716919, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:17] (step=0007652) Train Loss: 0.2726, Train Steps/Sec: 0.28, Epoch: 0.14869801787796347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7653, "loss": 0.24860697984695435, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1424598693848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:20] (step=0007653) Train Loss: 0.2563, Train Steps/Sec: 0.28, Epoch: 0.1487174504469491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7654, "loss": 0.2588430941104889, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8602962493896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:24] (step=0007654) Train Loss: 0.3112, Train Steps/Sec: 0.28, Epoch: 0.1487368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7655, "loss": 0.25025463104248047, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7495250701904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:27] (step=0007655) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 
0.14875631558492033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7656, "loss": 0.2964961528778076, "memory_gb": 7.721559524536133, "step_time_ms": 3348.9391803741455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:31] (step=0007656) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.14877574815390596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7657, "loss": 0.29133158922195435, "memory_gb": 7.721559524536133, "step_time_ms": 3366.271495819092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:35] (step=0007657) Train Loss: 0.3145, Train Steps/Sec: 0.28, Epoch: 0.14879518072289158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7658, "loss": 0.24271488189697266, "memory_gb": 7.721559524536133, "step_time_ms": 3360.764265060425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:38] (step=0007658) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.14881461329187717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7659, "loss": 0.27531862258911133, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2847537994385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:42] (step=0007659) Train Loss: 0.2335, Train Steps/Sec: 0.28, Epoch: 0.1488340458608628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7660, "loss": 0.0818922221660614, "memory_gb": 7.721559524536133, "step_time_ms": 3349.1969108581543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:45] (step=0007660) Train Loss: 0.1276, Train Steps/Sec: 0.28, Epoch: 0.14885347842984842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7661, "loss": 0.17532426118850708, 
"memory_gb": 7.721559524536133, "step_time_ms": 3369.739532470703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:49] (step=0007661) Train Loss: 0.1781, Train Steps/Sec: 0.28, Epoch: 0.14887291099883404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7662, "loss": 0.26128068566322327, "memory_gb": 7.721559524536133, "step_time_ms": 3366.677761077881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:53] (step=0007662) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.14889234356781966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:41:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7663, "loss": 0.12824514508247375, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9814853668213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:41:56] (step=0007663) Train Loss: 0.2005, Train Steps/Sec: 0.28, Epoch: 0.14891177613680529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7664, "loss": 0.33639243245124817, "memory_gb": 7.721559524536133, "step_time_ms": 3374.9072551727295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:00] (step=0007664) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.1489312087057909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7665, "loss": 0.25187039375305176, "memory_gb": 7.721559524536133, "step_time_ms": 3365.248441696167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:03] (step=0007665) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.14895064127477653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7666, "loss": 0.26195961236953735, "memory_gb": 7.721559524536133, "step_time_ms": 3368.644952774048, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:07] (step=0007666) Train Loss: 0.2564, 
Train Steps/Sec: 0.28, Epoch: 0.14897007384376215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7667, "loss": 0.17969253659248352, "memory_gb": 7.721559524536133, "step_time_ms": 3369.234323501587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:11] (step=0007667) Train Loss: 0.1568, Train Steps/Sec: 0.28, Epoch: 0.14898950641274777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7668, "loss": 0.2739212214946747, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0454235076904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:14] (step=0007668) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.1490089389817334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7669, "loss": 0.3620312213897705, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3994331359863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:18] (step=0007669) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.14902837155071902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7670, "loss": 0.2741973400115967, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6701316833496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:21] (step=0007670) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.1490478041197046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7671, "loss": 0.20989936590194702, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0661239624023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:25] (step=0007671) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.14906723668869024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7672, "loss": 
0.23966169357299805, "memory_gb": 7.721559524536133, "step_time_ms": 3365.232467651367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:29] (step=0007672) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.14908666925767586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7673, "loss": 0.1689300239086151, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4277324676514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:32] (step=0007673) Train Loss: 0.1712, Train Steps/Sec: 0.28, Epoch: 0.14910610182666148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7674, "loss": 0.38008081912994385, "memory_gb": 7.721559524536133, "step_time_ms": 3367.32816696167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:36] (step=0007674) Train Loss: 0.3232, Train Steps/Sec: 0.28, Epoch: 0.1491255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7675, "loss": 0.2905290722846985, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5756244659424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:40] (step=0007675) Train Loss: 0.3219, Train Steps/Sec: 0.28, Epoch: 0.14914496696463272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7676, "loss": 0.13171307742595673, "memory_gb": 7.721559524536133, "step_time_ms": 3363.975763320923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:43] (step=0007676) Train Loss: 0.1975, Train Steps/Sec: 0.28, Epoch: 0.14916439953361835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7677, "loss": 0.23549868166446686, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3482971191406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:47] (step=0007677) 
Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.14918383210260397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7678, "loss": 0.2005210667848587, "memory_gb": 7.721559524536133, "step_time_ms": 3362.762928009033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:50] (step=0007678) Train Loss: 0.2150, Train Steps/Sec: 0.28, Epoch: 0.1492032646715896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7679, "loss": 0.16763900220394135, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7477416992188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:54] (step=0007679) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.14922269724057521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:42:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7680, "loss": 0.2514130771160126, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7590923309326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:42:58] (step=0007680) Train Loss: 0.1881, Train Steps/Sec: 0.28, Epoch: 0.14924212980956084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7681, "loss": 0.22615790367126465, "memory_gb": 7.721559524536133, "step_time_ms": 3362.020254135132, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:01] (step=0007681) Train Loss: 0.2800, Train Steps/Sec: 0.28, Epoch: 0.14926156237854643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7682, "loss": 0.24273261427879333, "memory_gb": 7.721559524536133, "step_time_ms": 3492.408275604248, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:05] (step=0007682) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.14928099494753205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:08] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 7683, "loss": 0.2985619306564331, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2403354644775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:08] (step=0007683) Train Loss: 0.3211, Train Steps/Sec: 0.28, Epoch: 0.14930042751651768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7684, "loss": 0.2144956886768341, "memory_gb": 7.721559524536133, "step_time_ms": 3362.177848815918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:12] (step=0007684) Train Loss: 0.1919, Train Steps/Sec: 0.28, Epoch: 0.1493198600855033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7685, "loss": 0.20401634275913239, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6787662506104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:16] (step=0007685) Train Loss: 0.2141, Train Steps/Sec: 0.27, Epoch: 0.14933929265448892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7686, "loss": 0.211374893784523, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4347496032715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:19] (step=0007686) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.14935872522347454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7687, "loss": 0.23217761516571045, "memory_gb": 7.721559524536133, "step_time_ms": 3357.508659362793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:23] (step=0007687) Train Loss: 0.2758, Train Steps/Sec: 0.28, Epoch: 0.14937815779246016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7688, "loss": 0.2993069589138031, "memory_gb": 7.721559524536133, "step_time_ms": 3356.290817260742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
07:43:27] (step=0007688) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.1493975903614458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7689, "loss": 0.23658215999603271, "memory_gb": 7.721559524536133, "step_time_ms": 3361.825466156006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:30] (step=0007689) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.1494170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7690, "loss": 0.12627574801445007, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9016456604004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:34] (step=0007690) Train Loss: 0.1849, Train Steps/Sec: 0.28, Epoch: 0.14943645549941703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7691, "loss": 0.25045859813690186, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3639812469482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:37] (step=0007691) Train Loss: 0.2811, Train Steps/Sec: 0.28, Epoch: 0.14945588806840265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7692, "loss": 0.2267451137304306, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3763675689697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:41] (step=0007692) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.14947532063738828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7693, "loss": 0.22604894638061523, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9629917144775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:45] (step=0007693) Train Loss: 0.2047, Train Steps/Sec: 0.28, Epoch: 0.14949475320637387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:48] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 7694, "loss": 0.19423680007457733, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7335090637207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:48] (step=0007694) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.1495141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7695, "loss": 0.32630443572998047, "memory_gb": 7.721559524536133, "step_time_ms": 3365.571975708008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:52] (step=0007695) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.14953361834434512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7696, "loss": 0.21999068558216095, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4351749420166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:55] (step=0007696) Train Loss: 0.1932, Train Steps/Sec: 0.28, Epoch: 0.14955305091333074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:43:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7697, "loss": 0.1510055661201477, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6174716949463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:43:59] (step=0007697) Train Loss: 0.1660, Train Steps/Sec: 0.28, Epoch: 0.14957248348231636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7698, "loss": 0.1828828752040863, "memory_gb": 7.721559524536133, "step_time_ms": 3360.475540161133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:03] (step=0007698) Train Loss: 0.1914, Train Steps/Sec: 0.28, Epoch: 0.14959191605130198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7699, "loss": 0.22198382019996643, "memory_gb": 7.721559524536133, "step_time_ms": 3360.027313232422, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 07:44:06] (step=0007699) Train Loss: 0.2592, Train Steps/Sec: 0.28, Epoch: 0.1496113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7700, "loss": 0.13581973314285278, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8248958587646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:10] (step=0007700) Train Loss: 0.1914, Train Steps/Sec: 0.28, Epoch: 0.14963078118927323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7701, "loss": 0.2548549175262451, "memory_gb": 7.721559524536133, "step_time_ms": 3338.5133743286133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:13] (step=0007701) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.14965021375825885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7702, "loss": 0.26076582074165344, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5992889404297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:17] (step=0007702) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.14966964632724447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7703, "loss": 0.285388708114624, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0821285247803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:21] (step=0007703) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.1496890788962301, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7704, "loss": 0.2977049946784973, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6789836883545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:24] (step=0007704) Train Loss: 0.3262, Train Steps/Sec: 0.28, Epoch: 0.1497085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 07:44:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7705, "loss": 0.2797696590423584, "memory_gb": 7.721559524536133, "step_time_ms": 3359.042167663574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:28] (step=0007705) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.1497279440342013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7706, "loss": 0.32638126611709595, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9628467559814, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:31] (step=0007706) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.14974737660318693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7707, "loss": 0.20003780722618103, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7403297424316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:35] (step=0007707) Train Loss: 0.2125, Train Steps/Sec: 0.28, Epoch: 0.14976680917217255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7708, "loss": 0.1517878770828247, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7914447784424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:39] (step=0007708) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.14978624174115818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7709, "loss": 0.2650299668312073, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8356761932373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:42] (step=0007709) Train Loss: 0.2164, Train Steps/Sec: 0.28, Epoch: 0.1498056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7710, "loss": 0.34800344705581665, "memory_gb": 7.721559524536133, "step_time_ms": 3341.769218444824, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:46] (step=0007710) Train Loss: 0.3241, Train Steps/Sec: 0.28, Epoch: 0.14982510687912942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7711, "loss": 0.14779026806354523, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1181297302246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:49] (step=0007711) Train Loss: 0.2131, Train Steps/Sec: 0.28, Epoch: 0.14984453944811504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7712, "loss": 0.2729552388191223, "memory_gb": 7.721559524536133, "step_time_ms": 3353.400707244873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:53] (step=0007712) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.14986397201710067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:44:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7713, "loss": 0.2352602481842041, "memory_gb": 7.721559524536133, "step_time_ms": 3353.962182998657, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:44:56] (step=0007713) Train Loss: 0.1880, Train Steps/Sec: 0.28, Epoch: 0.1498834045860863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7714, "loss": 0.19233158230781555, "memory_gb": 7.721559524536133, "step_time_ms": 3356.126308441162, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:00] (step=0007714) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.1499028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7715, "loss": 0.3176301121711731, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2894592285156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:04] (step=0007715) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.14992226972405753, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:45:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7716, "loss": 0.21106019616127014, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6128482818604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:07] (step=0007716) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.14994170229304313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7717, "loss": 0.2233658730983734, "memory_gb": 7.721559524536133, "step_time_ms": 3342.698812484741, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:11] (step=0007717) Train Loss: 0.2168, Train Steps/Sec: 0.28, Epoch: 0.14996113486202875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7718, "loss": 0.29800915718078613, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6827354431152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:14] (step=0007718) Train Loss: 0.2703, Train Steps/Sec: 0.28, Epoch: 0.14998056743101437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7719, "loss": 0.32479938864707947, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8990020751953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:18] (step=0007719) Train Loss: 0.3041, Train Steps/Sec: 0.28, Epoch: 0.15, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7720, "loss": 0.1926984339952469, "memory_gb": 7.721559524536133, "step_time_ms": 3355.431318283081, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:22] (step=0007720) Train Loss: 0.1810, Train Steps/Sec: 0.28, Epoch: 0.15001943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7721, "loss": 0.24790425598621368, "memory_gb": 7.721559524536133, "step_time_ms": 
3356.0914993286133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:25] (step=0007721) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.15003886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7722, "loss": 0.2221715897321701, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4108352661133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:29] (step=0007722) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.15005829770695686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7723, "loss": 0.16832399368286133, "memory_gb": 7.721559524536133, "step_time_ms": 3358.004093170166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:32] (step=0007723) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.15007773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7724, "loss": 0.21875786781311035, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3645820617676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:36] (step=0007724) Train Loss: 0.2494, Train Steps/Sec: 0.28, Epoch: 0.1500971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7725, "loss": 0.21916073560714722, "memory_gb": 7.721559524536133, "step_time_ms": 3364.144802093506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:39] (step=0007725) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.15011659541391373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7726, "loss": 0.18689826130867004, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6157817840576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:43] (step=0007726) Train Loss: 0.1732, Train Steps/Sec: 0.27, Epoch: 
0.15013602798289935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7727, "loss": 0.32357844710350037, "memory_gb": 7.721559524536133, "step_time_ms": 3359.361410140991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:47] (step=0007727) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.15015546055188497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7728, "loss": 0.20204825699329376, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5001487731934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:50] (step=0007728) Train Loss: 0.1861, Train Steps/Sec: 0.28, Epoch: 0.15017489312087057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7729, "loss": 0.29359084367752075, "memory_gb": 7.721559524536133, "step_time_ms": 3364.419937133789, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:54] (step=0007729) Train Loss: 0.2875, Train Steps/Sec: 0.28, Epoch: 0.1501943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:45:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7730, "loss": 0.23498858511447906, "memory_gb": 7.721559524536133, "step_time_ms": 3501.1651515960693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:45:58] (step=0007730) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.1502137582588418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7731, "loss": 0.23029440641403198, "memory_gb": 7.721559524536133, "step_time_ms": 3351.606845855713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:01] (step=0007731) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.15023319082782743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7732, "loss": 0.2753552794456482, 
"memory_gb": 7.721559524536133, "step_time_ms": 3356.4724922180176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:05] (step=0007732) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.15025262339681306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7733, "loss": 0.13530781865119934, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2729778289795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:08] (step=0007733) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.15027205596579868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7734, "loss": 0.2177051305770874, "memory_gb": 7.721559524536133, "step_time_ms": 3360.276460647583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:12] (step=0007734) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.1502914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7735, "loss": 0.2157350480556488, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0620288848877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:16] (step=0007735) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.15031092110376992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7736, "loss": 0.23360595107078552, "memory_gb": 7.721559524536133, "step_time_ms": 3362.184524536133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:19] (step=0007736) Train Loss: 0.3056, Train Steps/Sec: 0.28, Epoch: 0.15033035367275555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7737, "loss": 0.18864330649375916, "memory_gb": 7.721559524536133, "step_time_ms": 3362.046241760254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:23] (step=0007737) Train Loss: 0.1901, 
Train Steps/Sec: 0.28, Epoch: 0.15034978624174117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7738, "loss": 0.21106548607349396, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7422370910645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:26] (step=0007738) Train Loss: 0.2062, Train Steps/Sec: 0.28, Epoch: 0.1503692188107268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7739, "loss": 0.36259156465530396, "memory_gb": 7.721559524536133, "step_time_ms": 3363.262414932251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:30] (step=0007739) Train Loss: 0.3317, Train Steps/Sec: 0.28, Epoch: 0.15038865137971238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7740, "loss": 0.2467736303806305, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0267429351807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:34] (step=0007740) Train Loss: 0.2198, Train Steps/Sec: 0.28, Epoch: 0.150408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7741, "loss": 0.24286362528800964, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0709018707275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:37] (step=0007741) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.15042751651768363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7742, "loss": 0.30221694707870483, "memory_gb": 7.721559524536133, "step_time_ms": 3365.011692047119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:46:41] (step=0007742) Train Loss: 0.3027, Train Steps/Sec: 0.28, Epoch: 0.15044694908666925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:46:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7743, "loss": 
0.268475741147995, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7894859313965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:46:44] (step=0007743) Train Loss: 0.2775, Train Steps/Sec: 0.28, Epoch: 0.15046638165565487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:46:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7744, "loss": 0.28982022404670715, "memory_gb": 7.721559524536133, "step_time_ms": 3352.580785751343, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:46:48] (step=0007744) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.1504858142246405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:46:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7745, "loss": 0.1692812740802765, "memory_gb": 7.721559524536133, "step_time_ms": 3346.5893268585205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:46:51] (step=0007745) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.15050524679362612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:46:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7746, "loss": 0.28800010681152344, "memory_gb": 7.721559524536133, "step_time_ms": 3365.43345451355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:46:55] (step=0007746) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.15052467936261174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7747, "loss": 0.1895935833454132, "memory_gb": 7.721559524536133, "step_time_ms": 3356.945037841797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:46:59] (step=0007747) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.15054411193159736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7748, "loss": 0.12714746594429016, "memory_gb": 7.721559524536133, "step_time_ms": 3360.816240310669, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:02] (step=0007748) Train Loss: 0.1814, Train Steps/Sec: 0.28, Epoch: 0.15056354450058299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7749, "loss": 0.27735280990600586, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5475425720215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:06] (step=0007749) Train Loss: 0.2353, Train Steps/Sec: 0.28, Epoch: 0.1505829770695686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7750, "loss": 0.17053255438804626, "memory_gb": 7.721559524536133, "step_time_ms": 3360.795497894287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:10] (step=0007750) Train Loss: 0.1784, Train Steps/Sec: 0.28, Epoch: 0.15060240963855423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7751, "loss": 0.3070496618747711, "memory_gb": 7.721559524536133, "step_time_ms": 3362.438917160034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:13] (step=0007751) Train Loss: 0.2819, Train Steps/Sec: 0.28, Epoch: 0.15062184220753982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7752, "loss": 0.22998544573783875, "memory_gb": 7.721559524536133, "step_time_ms": 3360.826015472412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:17] (step=0007752) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.15064127477652545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7753, "loss": 0.30116015672683716, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7775440216064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:20] (step=0007753) Train Loss: 0.2765, Train Steps/Sec: 0.28, Epoch: 0.15066070734551107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7754, "loss": 0.3882765769958496, "memory_gb": 7.715639114379883, "step_time_ms": 3327.5904655456543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:24] (step=0007754) Train Loss: 0.3228, Train Steps/Sec: 0.28, Epoch: 0.1506801399144967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7755, "loss": 0.16732144355773926, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7400398254395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:28] (step=0007755) Train Loss: 0.1754, Train Steps/Sec: 0.28, Epoch: 0.1506995724834823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7756, "loss": 0.29240038990974426, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2615852355957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:31] (step=0007756) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.15071900505246794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7757, "loss": 0.275537371635437, "memory_gb": 7.721559524536133, "step_time_ms": 3363.999128341675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:35] (step=0007757) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.15073843762145356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7758, "loss": 0.18042126297950745, "memory_gb": 7.721559524536133, "step_time_ms": 3363.267660140991, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:38] (step=0007758) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.15075787019043918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7759, "loss": 0.40707260370254517, "memory_gb": 7.721559524536133, "step_time_ms": 3361.95707321167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:42] (step=0007759) Train Loss: 0.2996, Train Steps/Sec: 0.28, Epoch: 0.1507773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7760, "loss": 0.22629475593566895, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6366176605225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:46] (step=0007760) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.15079673532841043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7761, "loss": 0.31557172536849976, "memory_gb": 7.721559524536133, "step_time_ms": 3365.837574005127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:49] (step=0007761) Train Loss: 0.3018, Train Steps/Sec: 0.28, Epoch: 0.15081616789739605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7762, "loss": 0.2248488962650299, "memory_gb": 7.721559524536133, "step_time_ms": 3363.727807998657, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:53] (step=0007762) Train Loss: 0.2367, Train Steps/Sec: 0.28, Epoch: 0.15083560046638164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:47:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7763, "loss": 0.12386134266853333, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8858795166016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:47:56] (step=0007763) Train Loss: 0.1638, Train Steps/Sec: 0.28, Epoch: 0.15085503303536726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7764, "loss": 0.25169432163238525, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1936054229736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:00] (step=0007764) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.1508744656043529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7765, "loss": 0.2804217040538788, "memory_gb": 7.715639114379883, "step_time_ms": 3332.8425884246826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:04] (step=0007765) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.1508938981733385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7766, "loss": 0.24741387367248535, "memory_gb": 7.721559524536133, "step_time_ms": 3360.074996948242, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:07] (step=0007766) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.15091333074232413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7767, "loss": 0.2331247329711914, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5059127807617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:11] (step=0007767) Train Loss: 0.2839, Train Steps/Sec: 0.28, Epoch: 0.15093276331130975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7768, "loss": 0.2538120150566101, "memory_gb": 7.721559524536133, "step_time_ms": 3362.870693206787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:14] (step=0007768) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.15095219588029538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7769, "loss": 0.19375178217887878, "memory_gb": 7.721559524536133, "step_time_ms": 3366.114377975464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:18] (step=0007769) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.150971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7770, "loss": 0.27744826674461365, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4467849731445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:22] (step=0007770) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.15099106101826662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7771, "loss": 0.2977752387523651, "memory_gb": 7.721559524536133, "step_time_ms": 3511.775016784668, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:25] (step=0007771) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.15101049358725224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7772, "loss": 0.27056899666786194, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7944202423096, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:29] (step=0007772) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.15102992615623786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7773, "loss": 0.17242006957530975, "memory_gb": 7.721559524536133, "step_time_ms": 3352.468967437744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:32] (step=0007773) Train Loss: 0.2047, Train Steps/Sec: 0.28, Epoch: 0.1510493587252235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7774, "loss": 0.21947692334651947, "memory_gb": 7.715639114379883, "step_time_ms": 3338.0191326141357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:36] (step=0007774) Train Loss: 0.2409, Train Steps/Sec: 0.27, Epoch: 0.15106879129420908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7775, "loss": 0.25429099798202515, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9888763427734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:40] (step=0007775) Train Loss: 0.1898, Train Steps/Sec: 0.28, Epoch: 0.1510882238631947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7776, "loss": 0.233432799577713, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0128116607666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:43] (step=0007776) Train Loss: 0.2152, Train Steps/Sec: 0.28, Epoch: 0.15110765643218033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7777, "loss": 0.1505422294139862, "memory_gb": 7.721559524536133, "step_time_ms": 3367.696762084961, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:47] (step=0007777) Train Loss: 0.1760, Train Steps/Sec: 0.28, Epoch: 0.15112708900116595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7778, "loss": 0.25677061080932617, "memory_gb": 7.721559524536133, "step_time_ms": 3368.595838546753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:51] (step=0007778) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.15114652157015157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7779, "loss": 0.2547738254070282, "memory_gb": 7.721559524536133, "step_time_ms": 3365.070104598999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:54] (step=0007779) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.1511659541391372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:48:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7780, "loss": 0.2597288489341736, "memory_gb": 7.721559524536133, "step_time_ms": 3361.942768096924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:48:58] (step=0007780) Train Loss: 0.2892, Train Steps/Sec: 0.28, Epoch: 0.15118538670812282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7781, "loss": 0.1539783626794815, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0616455078125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:01] (step=0007781) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.15120481927710844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7782, "loss": 0.24091078341007233, "memory_gb": 7.721559524536133, "step_time_ms": 3362.344741821289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:05] (step=0007782) Train Loss: 0.2939, Train Steps/Sec: 0.28, Epoch: 0.15122425184609406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7783, "loss": 0.19938835501670837, "memory_gb": 7.721559524536133, "step_time_ms": 3351.609945297241, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:09] (step=0007783) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.15124368441507968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7784, "loss": 0.3300742208957672, "memory_gb": 7.721559524536133, "step_time_ms": 3362.546443939209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:12] (step=0007784) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.1512631169840653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7785, "loss": 0.2476048320531845, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6536598205566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:16] (step=0007785) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.15128254955305093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7786, "loss": 0.16371099650859833, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0893955230713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:19] (step=0007786) Train Loss: 0.1588, Train Steps/Sec: 0.28, Epoch: 0.15130198212203652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7787, "loss": 0.23977141082286835, "memory_gb": 7.721559524536133, "step_time_ms": 3366.386890411377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:23] (step=0007787) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.15132141469102214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7788, "loss": 0.18128718435764313, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4050827026367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:27] (step=0007788) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.15134084726000777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7789, "loss": 0.16423772275447845, "memory_gb": 7.721559524536133, "step_time_ms": 3363.076686859131, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:30] (step=0007789) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.1513602798289934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7790, "loss": 0.21326494216918945, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8076782226562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:34] (step=0007790) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.151379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7791, "loss": 0.25591880083084106, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7330741882324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:38] (step=0007791) Train Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.15139914496696463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7792, "loss": 0.21209126710891724, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4096126556396, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:41] (step=0007792) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.15141857753595026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7793, "loss": 0.18296483159065247, "memory_gb": 7.721559524536133, "step_time_ms": 3342.397689819336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:45] (step=0007793) Train Loss: 0.1560, Train Steps/Sec: 0.28, Epoch: 0.15143801010493588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7794, "loss": 0.33997058868408203, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6036853790283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:48] (step=0007794) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.1514574426739215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7795, "loss": 0.23359934985637665, "memory_gb": 7.721559524536133, "step_time_ms": 3363.854169845581, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:52] (step=0007795) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.15147687524290712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7796, "loss": 0.24142003059387207, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3840293884277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:56] (step=0007796) Train Loss: 0.3146, Train Steps/Sec: 0.28, Epoch: 0.15149630781189274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:49:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7797, "loss": 0.24469681084156036, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3765544891357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:49:59] (step=0007797) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.15151574038087834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7798, "loss": 0.2321651130914688, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1581535339355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:03] (step=0007798) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.15153517294986396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7799, "loss": 0.10104985535144806, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9956245422363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:06] (step=0007799) Train Loss: 0.1250, Train Steps/Sec: 0.28, Epoch: 0.15155460551884958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7800, "loss": 0.3347034752368927, "memory_gb": 7.721559524536133, "step_time_ms": 3361.459493637085, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:10] (step=0007800) Train Loss: 0.2783, Train Steps/Sec: 0.28, Epoch: 0.1515740380878352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7801, "loss": 0.18789085745811462, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7682456970215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:14] (step=0007801) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.15159347065682083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7802, "loss": 0.20228612422943115, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2072257995605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:17] (step=0007802) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.15161290322580645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7803, "loss": 0.31114986538887024, "memory_gb": 7.721559524536133, "step_time_ms": 3363.948106765747, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:21] (step=0007803) Train Loss: 0.3124, Train Steps/Sec: 0.28, Epoch: 0.15163233579479207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7804, "loss": 0.25068116188049316, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0261421203613, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:24] (step=0007804) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.1516517683637777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7805, "loss": 0.214707612991333, "memory_gb": 7.721559524536133, "step_time_ms": 3361.987352371216, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:28] (step=0007805) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.15167120093276332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7806, "loss": 0.2360507994890213, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6251010894775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:32] (step=0007806) Train Loss: 0.2493, Train Steps/Sec: 0.28, Epoch: 0.15169063350174894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7807, "loss": 0.2508368492126465, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0969314575195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:35] (step=0007807) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.15171006607073456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7808, "loss": 0.1783464550971985, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6884994506836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:39] (step=0007808) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.15172949863972018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7809, "loss": 0.2993911802768707, "memory_gb": 7.721559524536133, "step_time_ms": 3359.741449356079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:42] (step=0007809) Train Loss: 0.2842, Train Steps/Sec: 0.28, Epoch: 0.15174893120870578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7810, "loss": 0.25413069128990173, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3217391967773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:46] (step=0007810) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.1517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7811, "loss": 0.2765411138534546, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6317043304443, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:50] (step=0007811) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.15178779634667702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7812, "loss": 0.29410749673843384, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4483184814453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:53] (step=0007812) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.15180722891566265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:50:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7813, "loss": 0.20710022747516632, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1854572296143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:50:57] (step=0007813) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.15182666148464827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7814, "loss": 0.28320610523223877, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3625087738037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:01] (step=0007814) Train Loss: 0.2932, Train Steps/Sec: 0.27, Epoch: 0.1518460940536339, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7815, "loss": 0.286385178565979, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5562496185303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:04] (step=0007815) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.1518655266226195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7816, "loss": 0.24713771045207977, "memory_gb": 7.721559524536133, "step_time_ms": 3361.274242401123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:08] (step=0007816) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.15188495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7817, "loss": 0.2501941919326782, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6933403015137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:11] (step=0007817) Train Loss: 0.3213, Train Steps/Sec: 0.28, Epoch: 0.15190439176059076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7818, "loss": 0.17347249388694763, "memory_gb": 7.721559524536133, "step_time_ms": 3497.84779548645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:15] (step=0007818) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.15192382432957638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7819, "loss": 0.16597971320152283, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7099571228027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:19] (step=0007819) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.151943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7820, "loss": 0.31653520464897156, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8150482177734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:22] (step=0007820) Train Loss: 0.3191, Train Steps/Sec: 0.28, Epoch: 0.15196268946754762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7821, "loss": 0.25461849570274353, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6931743621826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:26] (step=0007821) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.15198212203653322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7822, "loss": 0.19293546676635742, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6277751922607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:29] (step=0007822) Train Loss: 0.1824, Train Steps/Sec: 0.28, Epoch: 0.15200155460551884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7823, "loss": 0.22408165037631989, "memory_gb": 7.721559524536133, "step_time_ms": 3352.750778198242, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:33] (step=0007823) Train Loss: 0.2339, Train Steps/Sec: 0.28, Epoch: 0.15202098717450446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7824, "loss": 0.24729038774967194, "memory_gb": 7.721559524536133, "step_time_ms": 3358.860731124878, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:37] (step=0007824) Train Loss: 0.3027, Train Steps/Sec: 0.28, Epoch: 0.15204041974349009, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7825, "loss": 0.2140045464038849, "memory_gb": 7.715639114379883, "step_time_ms": 3322.9308128356934, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:40] (step=0007825) Train Loss: 0.2461, Train Steps/Sec: 0.28, Epoch: 0.1520598523124757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7826, "loss": 0.16103389859199524, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0949306488037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:44] (step=0007826) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.15207928488146133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7827, "loss": 0.1752358078956604, "memory_gb": 7.721559524536133, "step_time_ms": 3358.125686645508, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:47] (step=0007827) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.15209871745044695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7828, "loss": 0.19401243329048157, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8057899475098, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:51] (step=0007828) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.15211815001943257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7829, "loss": 0.33114540576934814, "memory_gb": 7.721559524536133, "step_time_ms": 3358.94513130188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:54] (step=0007829) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.1521375825884182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:51:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7830, "loss": 0.26668769121170044, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5258255004883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:51:58] (step=0007830) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.15215701515740382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7831, "loss": 0.3279256224632263, "memory_gb": 7.721559524536133, "step_time_ms": 3359.856367111206, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:02] (step=0007831) Train Loss: 0.3246, Train Steps/Sec: 0.28, Epoch: 0.15217644772638944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7832, "loss": 0.24704763293266296, "memory_gb": 7.721559524536133, "step_time_ms": 3359.194278717041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:05] (step=0007832) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.15219588029537504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7833, "loss": 0.2261856496334076, "memory_gb": 7.721559524536133, "step_time_ms": 3357.179641723633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:09] (step=0007833) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.15221531286436066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7834, "loss": 0.31801509857177734, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7235469818115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:12] (step=0007834) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.15223474543334628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7835, "loss": 0.2112523466348648, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4699630737305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:16] (step=0007835) Train Loss: 0.2063, Train Steps/Sec: 0.28, Epoch: 0.1522541780023319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7836, "loss": 0.31495431065559387, "memory_gb": 7.721559524536133, "step_time_ms": 3355.785608291626, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:20] (step=0007836) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.15227361057131752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7837, "loss": 0.17679567635059357, "memory_gb": 7.721559524536133, "step_time_ms": 3356.497049331665, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:23] (step=0007837) Train Loss: 0.2353, Train Steps/Sec: 0.28, Epoch: 0.15229304314030315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7838, "loss": 0.1530405879020691, "memory_gb": 7.721559524536133, "step_time_ms": 3359.612226486206, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:27] (step=0007838) Train Loss: 0.1627, Train Steps/Sec: 0.28, Epoch: 0.15231247570928877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7839, "loss": 0.21483853459358215, "memory_gb": 7.721559524536133, "step_time_ms": 3343.278169631958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:30] (step=0007839) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.1523319082782744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7840, "loss": 0.2608243227005005, "memory_gb": 7.721559524536133, "step_time_ms": 3356.325149536133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:34] (step=0007840) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.15235134084726001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7841, "loss": 0.20492148399353027, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6225509643555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:38] (step=0007841) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.15237077341624564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7842, "loss": 0.23862186074256897, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0237369537354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:41] (step=0007842) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.15239020598523126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7843, "loss": 0.23688837885856628, "memory_gb": 7.721559524536133, "step_time_ms": 3362.293243408203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:45] (step=0007843) Train Loss: 0.1842, Train Steps/Sec: 0.28, Epoch: 0.15240963855421688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7844, "loss": 0.3190753757953644, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7053546905518, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:48] (step=0007844) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.15242907112320248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7845, "loss": 0.2740069031715393, "memory_gb": 7.721559524536133, "step_time_ms": 3358.656406402588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:52] (step=0007845) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.1524485036921881, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7846, "loss": 0.21218305826187134, "memory_gb": 7.721559524536133, "step_time_ms": 3343.393087387085, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:56] (step=0007846) Train Loss: 0.1852, Train Steps/Sec: 0.28, Epoch: 0.15246793626117372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:52:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 7847, "loss": 0.18203237652778625, "memory_gb": 7.721559524536133, "step_time_ms": 3365.569829940796, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:52:59] (step=0007847) Train Loss: 0.2014, Train Steps/Sec: 0.28, Epoch: 0.15248736883015934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:53:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7848, "loss": 0.260304719209671, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3243312835693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:53:03] (step=0007848) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.15250680139914496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7849, "loss": 0.23214159905910492, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0523681640625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:53:06] (step=0007849) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.1525262339681306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:53:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7850, "loss": 0.22859250009059906, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0028705596924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:53:10] (step=0007850) Train Loss: 0.2455, Train Steps/Sec: 0.28, Epoch: 0.1525456665371162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7851, "loss": 0.3106692433357239, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0081672668457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 07:53:14] (step=0007851) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.15256509910610183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 07:53:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 7852, "loss": 0.3191283941268921, "memory_gb": 7.715639114379883, "step_time_ms": 3325.935125350952, "trainable_params": 4718592,
"method": "lora"} [2025-07-29 07:53:17] (step=0007852) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.15258453167508745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7853, "loss": 0.30845344066619873, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4184341430664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:21] (step=0007853) Train Loss: 0.2968, Train Steps/Sec: 0.28, Epoch: 0.15260396424407308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7854, "loss": 0.22635966539382935, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0830516815186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:24] (step=0007854) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.1526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7855, "loss": 0.19950130581855774, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7185707092285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:28] (step=0007855) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.1526428293820443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7856, "loss": 0.2824845016002655, "memory_gb": 7.721559524536133, "step_time_ms": 3343.90926361084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:31] (step=0007856) Train Loss: 0.3166, Train Steps/Sec: 0.28, Epoch: 0.15266226195102992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 7857, "loss": 0.3294830322265625, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9044246673584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:35] (step=0007857) Train Loss: 0.2953, Train Steps/Sec: 0.28, Epoch: 0.15268169452001554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 07:53:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7858, "loss": 0.22181211411952972, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0664138793945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:39] (step=0007858) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.15270112708900116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7859, "loss": 0.24921181797981262, "memory_gb": 7.721559524536133, "step_time_ms": 3491.9703006744385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:42] (step=0007859) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.15272055965798678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7860, "loss": 0.27233052253723145, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4332885742188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:46] (step=0007860) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.1527399922269724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7861, "loss": 0.2236301302909851, "memory_gb": 7.721559524536133, "step_time_ms": 3367.971420288086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:50] (step=0007861) Train Loss: 0.2215, Train Steps/Sec: 0.27, Epoch: 0.15275942479595803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 7862, "loss": 0.1957838237285614, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2750511169434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:53] (step=0007862) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.15277885736494365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:53:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7863, "loss": 0.16112646460533142, "memory_gb": 7.721559524536133, "step_time_ms": 3362.138509750366, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:53:57] (step=0007863) Train Loss: 0.1602, Train Steps/Sec: 0.28, Epoch: 0.15279828993392927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7864, "loss": 0.259635865688324, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8366718292236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:00] (step=0007864) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.1528177225029149, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7865, "loss": 0.21608251333236694, "memory_gb": 7.721559524536133, "step_time_ms": 3367.042303085327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:04] (step=0007865) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.15283715507190052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7866, "loss": 0.2061476707458496, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1341667175293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:08] (step=0007866) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.15285658764088614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7867, "loss": 0.1888427436351776, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0709857940674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:11] (step=0007867) Train Loss: 0.1924, Train Steps/Sec: 0.28, Epoch: 0.15287602020987173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7868, "loss": 0.2185652107000351, "memory_gb": 7.721559524536133, "step_time_ms": 3367.990493774414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:15] (step=0007868) Train Loss: 0.1905, Train Steps/Sec: 0.28, Epoch: 0.15289545277885735, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:54:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7869, "loss": 0.1755334436893463, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6700382232666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:18] (step=0007869) Train Loss: 0.1934, Train Steps/Sec: 0.28, Epoch: 0.15291488534784298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7870, "loss": 0.21974590420722961, "memory_gb": 7.721559524536133, "step_time_ms": 3366.482973098755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:22] (step=0007870) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.1529343179168286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7871, "loss": 0.2632540762424469, "memory_gb": 7.721559524536133, "step_time_ms": 3364.877700805664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:26] (step=0007871) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.15295375048581422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7872, "loss": 0.1927100419998169, "memory_gb": 7.721559524536133, "step_time_ms": 3365.250825881958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:29] (step=0007872) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.15297318305479984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7873, "loss": 0.20456427335739136, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3929748535156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:33] (step=0007873) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.15299261562378547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7874, "loss": 0.276729017496109, "memory_gb": 7.721559524536133, "step_time_ms": 
3364.362955093384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:37] (step=0007874) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.1530120481927711, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7875, "loss": 0.2995101809501648, "memory_gb": 7.721559524536133, "step_time_ms": 3363.819122314453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:40] (step=0007875) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.1530314807617567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7876, "loss": 0.227627694606781, "memory_gb": 7.721559524536133, "step_time_ms": 3366.86635017395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:44] (step=0007876) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.15305091333074233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7877, "loss": 0.1623547375202179, "memory_gb": 7.721559524536133, "step_time_ms": 3366.985559463501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:47] (step=0007877) Train Loss: 0.1695, Train Steps/Sec: 0.28, Epoch: 0.15307034589972796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7878, "loss": 0.18387362360954285, "memory_gb": 7.721559524536133, "step_time_ms": 3362.752676010132, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:51] (step=0007878) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.15308977846871358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7879, "loss": 0.21988026797771454, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7708892822266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:55] (step=0007879) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.15310921103769917, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:54:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7880, "loss": 0.3179008960723877, "memory_gb": 7.721559524536133, "step_time_ms": 3365.316867828369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:54:58] (step=0007880) Train Loss: 0.2536, Train Steps/Sec: 0.28, Epoch: 0.1531286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7881, "loss": 0.1641809344291687, "memory_gb": 7.721559524536133, "step_time_ms": 3364.772319793701, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:02] (step=0007881) Train Loss: 0.1883, Train Steps/Sec: 0.28, Epoch: 0.15314807617567042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7882, "loss": 0.25870373845100403, "memory_gb": 7.721559524536133, "step_time_ms": 3366.621732711792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:05] (step=0007882) Train Loss: 0.2571, Train Steps/Sec: 0.28, Epoch: 0.15316750874465604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7883, "loss": 0.19402751326560974, "memory_gb": 7.721559524536133, "step_time_ms": 3365.750312805176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:09] (step=0007883) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.15318694131364166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7884, "loss": 0.22527191042900085, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5878772735596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:13] (step=0007884) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.15320637388262728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7885, "loss": 0.2704959809780121, "memory_gb": 7.721559524536133, 
"step_time_ms": 3368.4134483337402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:16] (step=0007885) Train Loss: 0.2246, Train Steps/Sec: 0.28, Epoch: 0.1532258064516129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7886, "loss": 0.2013244926929474, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8127784729004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:20] (step=0007886) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.15324523902059853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7887, "loss": 0.28100043535232544, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7375316619873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:24] (step=0007887) Train Loss: 0.3105, Train Steps/Sec: 0.28, Epoch: 0.15326467158958415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7888, "loss": 0.34552502632141113, "memory_gb": 7.721559524536133, "step_time_ms": 3362.513542175293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:27] (step=0007888) Train Loss: 0.3196, Train Steps/Sec: 0.28, Epoch: 0.15328410415856977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7889, "loss": 0.250368595123291, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7771606445312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:31] (step=0007889) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.1533035367275554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7890, "loss": 0.2490847110748291, "memory_gb": 7.721559524536133, "step_time_ms": 3357.936382293701, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:34] (step=0007890) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 
0.153322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7891, "loss": 0.25215214490890503, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9002361297607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:38] (step=0007891) Train Loss: 0.3223, Train Steps/Sec: 0.28, Epoch: 0.1533424018655266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7892, "loss": 0.18460765480995178, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3004264831543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:42] (step=0007892) Train Loss: 0.1844, Train Steps/Sec: 0.28, Epoch: 0.15336183443451223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7893, "loss": 0.2560063898563385, "memory_gb": 7.721559524536133, "step_time_ms": 3362.226724624634, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:45] (step=0007893) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 0.15338126700349786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7894, "loss": 0.17359337210655212, "memory_gb": 7.721559524536133, "step_time_ms": 3364.795446395874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:49] (step=0007894) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.15340069957248348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7895, "loss": 0.23146890103816986, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2589111328125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:52] (step=0007895) Train Loss: 0.1776, Train Steps/Sec: 0.28, Epoch: 0.1534201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:55:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7896, "loss": 0.3287252187728882, "memory_gb": 
7.721559524536133, "step_time_ms": 3350.2564430236816, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:55:56] (step=0007896) Train Loss: 0.2902, Train Steps/Sec: 0.28, Epoch: 0.15343956471045472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7897, "loss": 0.21957726776599884, "memory_gb": 7.721559524536133, "step_time_ms": 3365.264892578125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:00] (step=0007897) Train Loss: 0.2100, Train Steps/Sec: 0.28, Epoch: 0.15345899727944035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7898, "loss": 0.24307802319526672, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9556636810303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:03] (step=0007898) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.15347842984842597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7899, "loss": 0.28678056597709656, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8431816101074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:07] (step=0007899) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.1534978624174116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7900, "loss": 0.16981810331344604, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9397411346436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:10] (step=0007900) Train Loss: 0.1938, Train Steps/Sec: 0.28, Epoch: 0.1535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7901, "loss": 0.18447346985340118, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6514205932617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:14] (step=0007901) Train Loss: 0.1997, Train 
Steps/Sec: 0.28, Epoch: 0.15353672755538283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7902, "loss": 0.19688424468040466, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5187454223633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:18] (step=0007902) Train Loss: 0.2333, Train Steps/Sec: 0.27, Epoch: 0.15355616012436843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7903, "loss": 0.23794840276241302, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9753589630127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:21] (step=0007903) Train Loss: 0.2904, Train Steps/Sec: 0.28, Epoch: 0.15357559269335405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7904, "loss": 0.28129300475120544, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4144535064697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:25] (step=0007904) Train Loss: 0.2728, Train Steps/Sec: 0.28, Epoch: 0.15359502526233967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7905, "loss": 0.30756956338882446, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6152954101562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:29] (step=0007905) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.1536144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7906, "loss": 0.33881324529647827, "memory_gb": 7.721559524536133, "step_time_ms": 3361.393451690674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:32] (step=0007906) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.15363389040031092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7907, "loss": 
0.3144378960132599, "memory_gb": 7.721559524536133, "step_time_ms": 3506.7684650421143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:36] (step=0007907) Train Loss: 0.2844, Train Steps/Sec: 0.28, Epoch: 0.15365332296929654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7908, "loss": 0.4129951596260071, "memory_gb": 7.721559524536133, "step_time_ms": 3368.184804916382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:39] (step=0007908) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.15367275553828216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7909, "loss": 0.19185099005699158, "memory_gb": 7.721559524536133, "step_time_ms": 3357.163906097412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:43] (step=0007909) Train Loss: 0.1887, Train Steps/Sec: 0.28, Epoch: 0.15369218810726779, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7910, "loss": 0.2422928512096405, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5545291900635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:47] (step=0007910) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.1537116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7911, "loss": 0.2131132334470749, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6060485839844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:50] (step=0007911) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.15373105324523903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7912, "loss": 0.17014643549919128, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4158840179443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:54] (step=0007912) 
Train Loss: 0.1715, Train Steps/Sec: 0.28, Epoch: 0.15375048581422465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:56:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7913, "loss": 0.2437768280506134, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0456199645996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:56:57] (step=0007913) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.15376991838321025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7914, "loss": 0.30976617336273193, "memory_gb": 7.721559524536133, "step_time_ms": 3358.25777053833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:01] (step=0007914) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.15378935095219587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7915, "loss": 0.2632961869239807, "memory_gb": 7.721559524536133, "step_time_ms": 3365.522623062134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:05] (step=0007915) Train Loss: 0.2125, Train Steps/Sec: 0.28, Epoch: 0.1538087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7916, "loss": 0.33596184849739075, "memory_gb": 7.721559524536133, "step_time_ms": 3358.532190322876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:08] (step=0007916) Train Loss: 0.3086, Train Steps/Sec: 0.28, Epoch: 0.1538282160901671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 7917, "loss": 0.2435058355331421, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0422401428223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:12] (step=0007917) Train Loss: 0.1772, Train Steps/Sec: 0.28, Epoch: 0.15384764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 
7918, "loss": 0.326171875, "memory_gb": 7.721559524536133, "step_time_ms": 3364.091157913208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:15] (step=0007918) Train Loss: 0.2879, Train Steps/Sec: 0.28, Epoch: 0.15386708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7919, "loss": 0.20020830631256104, "memory_gb": 7.721559524536133, "step_time_ms": 3359.250068664551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:19] (step=0007919) Train Loss: 0.1979, Train Steps/Sec: 0.28, Epoch: 0.15388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7920, "loss": 0.12177594006061554, "memory_gb": 7.721559524536133, "step_time_ms": 3356.295347213745, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:23] (step=0007920) Train Loss: 0.1624, Train Steps/Sec: 0.28, Epoch: 0.1539059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7921, "loss": 0.2519545257091522, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2493953704834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:26] (step=0007921) Train Loss: 0.2962, Train Steps/Sec: 0.28, Epoch: 0.15392537893509523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 7922, "loss": 0.22786331176757812, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0704669952393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:30] (step=0007922) Train Loss: 0.2069, Train Steps/Sec: 0.28, Epoch: 0.15394481150408085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7923, "loss": 0.2935170531272888, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5856895446777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:33] 
(step=0007923) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.15396424407306647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7924, "loss": 0.1583031415939331, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9176712036133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:37] (step=0007924) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.1539836766420521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 7925, "loss": 0.15071363747119904, "memory_gb": 7.721559524536133, "step_time_ms": 3357.197046279907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:41] (step=0007925) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.1540031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7926, "loss": 0.23195728659629822, "memory_gb": 7.721559524536133, "step_time_ms": 3362.487316131592, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:44] (step=0007926) Train Loss: 0.2917, Train Steps/Sec: 0.28, Epoch: 0.1540225417800233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 7927, "loss": 0.1575995832681656, "memory_gb": 7.721559524536133, "step_time_ms": 3366.49489402771, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:48] (step=0007927) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.15404197434900893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7928, "loss": 0.2887609004974365, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8414192199707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:51] (step=0007928) Train Loss: 0.2723, Train Steps/Sec: 0.28, Epoch: 0.15406140691799455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:55] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 7929, "loss": 0.21612605452537537, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2355670928955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:55] (step=0007929) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.15408083948698018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:57:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7930, "loss": 0.24921515583992004, "memory_gb": 7.721559524536133, "step_time_ms": 3363.722562789917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:57:58] (step=0007930) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.1541002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7931, "loss": 0.2785501480102539, "memory_gb": 7.721559524536133, "step_time_ms": 3362.967014312744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:02] (step=0007931) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.15411970462495142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 7932, "loss": 0.17354005575180054, "memory_gb": 7.721559524536133, "step_time_ms": 3362.734794616699, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:06] (step=0007932) Train Loss: 0.1894, Train Steps/Sec: 0.28, Epoch: 0.15413913719393704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7933, "loss": 0.3168661296367645, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9488525390625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:09] (step=0007933) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.15415856976292266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7934, "loss": 0.3146302103996277, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4977855682373, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 07:58:13] (step=0007934) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.1541780023319083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7935, "loss": 0.25051578879356384, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0928058624268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:16] (step=0007935) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.1541974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7936, "loss": 0.27594438195228577, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5762977600098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:20] (step=0007936) Train Loss: 0.3077, Train Steps/Sec: 0.28, Epoch: 0.15421686746987953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 7937, "loss": 0.32053059339523315, "memory_gb": 7.721559524536133, "step_time_ms": 3361.59086227417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:24] (step=0007937) Train Loss: 0.2999, Train Steps/Sec: 0.28, Epoch: 0.15423630003886513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7938, "loss": 0.2576290965080261, "memory_gb": 7.721559524536133, "step_time_ms": 3364.036798477173, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:27] (step=0007938) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.15425573260785075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7939, "loss": 0.24701562523841858, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0728092193604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:31] (step=0007939) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.15427516517683637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:34] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 7940, "loss": 0.16155585646629333, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3154888153076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:34] (step=0007940) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7941, "loss": 0.21642836928367615, "memory_gb": 7.721559524536133, "step_time_ms": 3358.175039291382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:38] (step=0007941) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.15431403031480762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7942, "loss": 0.2644021511077881, "memory_gb": 7.721559524536133, "step_time_ms": 3361.299991607666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:42] (step=0007942) Train Loss: 0.2887, Train Steps/Sec: 0.28, Epoch: 0.15433346288379324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7943, "loss": 0.20586863160133362, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5188179016113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:45] (step=0007943) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.15435289545277886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7944, "loss": 0.2901538610458374, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3219051361084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:49] (step=0007944) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.15437232802176448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7945, "loss": 0.25685954093933105, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5473766326904, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 07:58:52] (step=0007945) Train Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.1543917605907501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:58:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7946, "loss": 0.2118481993675232, "memory_gb": 7.721559524536133, "step_time_ms": 3349.5397567749023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:58:56] (step=0007946) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.15441119315973573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7947, "loss": 0.2839721739292145, "memory_gb": 7.721559524536133, "step_time_ms": 3509.6752643585205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:00] (step=0007947) Train Loss: 0.2649, Train Steps/Sec: 0.28, Epoch: 0.15443062572872135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7948, "loss": 0.1762043982744217, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3921661376953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:03] (step=0007948) Train Loss: 0.2542, Train Steps/Sec: 0.28, Epoch: 0.15445005829770694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7949, "loss": 0.2800077795982361, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2027473449707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:07] (step=0007949) Train Loss: 0.3031, Train Steps/Sec: 0.28, Epoch: 0.15446949086669257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 7950, "loss": 0.22988305985927582, "memory_gb": 7.721559524536133, "step_time_ms": 3363.187074661255, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:10] (step=0007950) Train Loss: 0.2511, Train Steps/Sec: 0.27, Epoch: 0.1544889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
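The EFFICIENCY_METRICS payloads interleaved above are machine-readable JSON, which makes the log easy to post-process. A minimal parsing sketch (the regex pattern and function name are assumptions based on the entry format visible in this log, not part of the training script):

```python
import json
import re

# Matches the flat JSON object that follows each "EFFICIENCY_METRICS:" tag.
# The payloads contain no nested braces, so a non-greedy match to the first
# closing brace is sufficient.
METRICS_RE = re.compile(r'EFFICIENCY_METRICS:\s*(\{.*?\})')

def parse_efficiency_metrics(log_text):
    """Yield one dict per EFFICIENCY_METRICS entry found in log_text."""
    for match in METRICS_RE.finditer(log_text):
        yield json.loads(match.group(1))

# Example on a single entry copied (abbreviated) from this log:
sample = ('[2025-07-29 07:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7924, '
          '"loss": 0.1583, "memory_gb": 7.72, "step_time_ms": 3352.9, '
          '"trainable_params": 4718592, "method": "lora"}')
records = list(parse_efficiency_metrics(sample))
```

Note that in the raw file an entry's JSON can wrap across physical lines, so matching against the whole log text (rather than line by line) is the safer approach.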
[2025-07-29 07:59:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 7951, "loss": 0.37124359607696533, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9529991149902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:14] (step=0007951) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.1545083560046638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 7952, "loss": 0.31021687388420105, "memory_gb": 7.721559524536133, "step_time_ms": 3358.90531539917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:18] (step=0007952) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.15452778857364943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 7953, "loss": 0.15110178291797638, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7419986724854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:21] (step=0007953) Train Loss: 0.1487, Train Steps/Sec: 0.28, Epoch: 0.15454722114263506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 7954, "loss": 0.1789199411869049, "memory_gb": 7.721559524536133, "step_time_ms": 3358.036518096924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:25] (step=0007954) Train Loss: 0.1703, Train Steps/Sec: 0.28, Epoch: 0.15456665371162068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 7955, "loss": 0.30203041434288025, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7402572631836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:28] (step=0007955) Train Loss: 0.3225, Train Steps/Sec: 0.28, Epoch: 0.1545860862806063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 7956, "loss": 0.16515086591243744, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6819171905518, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:32] (step=0007956) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.15460551884959192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 7957, "loss": 0.2180163860321045, "memory_gb": 7.721559524536133, "step_time_ms": 3359.018325805664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:36] (step=0007957) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.15462495141857754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 7958, "loss": 0.24931809306144714, "memory_gb": 7.721559524536133, "step_time_ms": 3357.222080230713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:39] (step=0007958) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.15464438398756317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 7959, "loss": 0.29079651832580566, "memory_gb": 7.721559524536133, "step_time_ms": 3362.600088119507, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:43] (step=0007959) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.1546638165565488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 7960, "loss": 0.15854205191135406, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6531620025635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:46] (step=0007960) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.15468324912553438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 7961, "loss": 0.2762015461921692, "memory_gb": 7.721559524536133, "step_time_ms": 3361.076831817627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:50] (step=0007961) Train Loss: 0.2232, Train Steps/Sec: 0.28, Epoch: 0.15470268169452, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 07:59:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 7962, "loss": 0.2200062870979309, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7195148468018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:54] (step=0007962) Train Loss: 0.1909, Train Steps/Sec: 0.28, Epoch: 0.15472211426350563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 07:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 7963, "loss": 0.3175097703933716, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7991466522217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 07:59:57] (step=0007963) Train Loss: 0.3183, Train Steps/Sec: 0.28, Epoch: 0.15474154683249125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 7964, "loss": 0.2024729698896408, "memory_gb": 7.721559524536133, "step_time_ms": 3363.708257675171, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:01] (step=0007964) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.15476097940147687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 7965, "loss": 0.2924826145172119, "memory_gb": 7.721559524536133, "step_time_ms": 3363.936424255371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:04] (step=0007965) Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.1547804119704625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 7966, "loss": 0.19674184918403625, "memory_gb": 7.721559524536133, "step_time_ms": 3360.638380050659, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:08] (step=0007966) Train Loss: 0.1747, Train Steps/Sec: 0.28, Epoch: 0.15479984453944812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 7967, "loss": 0.2112731635570526, "memory_gb": 7.721559524536133, "step_time_ms": 
3358.6087226867676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:11] (step=0007967) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.15481927710843374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 7968, "loss": 0.3179172873497009, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1722202301025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:15] (step=0007968) Train Loss: 0.2748, Train Steps/Sec: 0.28, Epoch: 0.15483870967741936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 7969, "loss": 0.28891971707344055, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7244911193848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:19] (step=0007969) Train Loss: 0.3269, Train Steps/Sec: 0.28, Epoch: 0.15485814224640498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 7970, "loss": 0.25248467922210693, "memory_gb": 7.721559524536133, "step_time_ms": 3363.588333129883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:22] (step=0007970) Train Loss: 0.2367, Train Steps/Sec: 0.28, Epoch: 0.1548775748153906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 7971, "loss": 0.19756834208965302, "memory_gb": 7.721559524536133, "step_time_ms": 3366.360902786255, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:26] (step=0007971) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.1548970073843762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 7972, "loss": 0.09058453142642975, "memory_gb": 7.721559524536133, "step_time_ms": 3363.213539123535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:29] (step=0007972) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.15491643995336182, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 7973, "loss": 0.1766739785671234, "memory_gb": 7.721559524536133, "step_time_ms": 3363.274097442627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:33] (step=0007973) Train Loss: 0.1962, Train Steps/Sec: 0.28, Epoch: 0.15493587252234745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 7974, "loss": 0.24643607437610626, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5847358703613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:37] (step=0007974) Train Loss: 0.2512, Train Steps/Sec: 0.28, Epoch: 0.15495530509133307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 7975, "loss": 0.21404609084129333, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3147315979004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:40] (step=0007975) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.1549747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 7976, "loss": 0.3154292106628418, "memory_gb": 7.721559524536133, "step_time_ms": 3363.285541534424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:44] (step=0007976) Train Loss: 0.2985, Train Steps/Sec: 0.28, Epoch: 0.1549941702293043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 7977, "loss": 0.28966522216796875, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5846424102783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:47] (step=0007977) Train Loss: 0.2737, Train Steps/Sec: 0.28, Epoch: 0.15501360279828993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 7978, "loss": 0.22002504765987396, "memory_gb": 7.721559524536133, 
"step_time_ms": 3364.1600608825684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:51] (step=0007978) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.15503303536727556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 7979, "loss": 0.14996545016765594, "memory_gb": 7.721559524536133, "step_time_ms": 3367.173910140991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:55] (step=0007979) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.15505246793626118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:00:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 7980, "loss": 0.24385060369968414, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7824268341064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:00:58] (step=0007980) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.1550719005052468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 7981, "loss": 0.24820701777935028, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2213554382324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:02] (step=0007981) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.15509133307423242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 7982, "loss": 0.2564336359500885, "memory_gb": 7.721559524536133, "step_time_ms": 3368.361473083496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:05] (step=0007982) Train Loss: 0.2770, Train Steps/Sec: 0.28, Epoch: 0.15511076564321805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 7983, "loss": 0.21504992246627808, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7827796936035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:09] (step=0007983) Train Loss: 0.2024, Train Steps/Sec: 0.28, Epoch: 
0.15513019821220364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 7984, "loss": 0.18940186500549316, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6703910827637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:13] (step=0007984) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.15514963078118926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 7985, "loss": 0.18396247923374176, "memory_gb": 7.721559524536133, "step_time_ms": 3369.276523590088, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:16] (step=0007985) Train Loss: 0.2015, Train Steps/Sec: 0.28, Epoch: 0.15516906335017489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 7986, "loss": 0.12126439064741135, "memory_gb": 7.721559524536133, "step_time_ms": 3371.873140335083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:20] (step=0007986) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.1551884959191605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 7987, "loss": 0.2011551856994629, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5387840270996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:23] (step=0007987) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.15520792848814613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 7988, "loss": 0.1318838894367218, "memory_gb": 7.721559524536133, "step_time_ms": 3366.151809692383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:27] (step=0007988) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.15522736105713175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 7989, "loss": 0.20866340398788452, 
"memory_gb": 7.721559524536133, "step_time_ms": 3515.976667404175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:31] (step=0007989) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.15524679362611737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 7990, "loss": 0.23683342337608337, "memory_gb": 7.721559524536133, "step_time_ms": 3367.922067642212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:34] (step=0007990) Train Loss: 0.2212, Train Steps/Sec: 0.27, Epoch: 0.155266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 7991, "loss": 0.29732879996299744, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1609210968018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:38] (step=0007991) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.15528565876408862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 7992, "loss": 0.3396410346031189, "memory_gb": 7.721559524536133, "step_time_ms": 3369.978189468384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:42] (step=0007992) Train Loss: 0.3086, Train Steps/Sec: 0.28, Epoch: 0.15530509133307424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 7993, "loss": 0.22195695340633392, "memory_gb": 7.721559524536133, "step_time_ms": 3362.555503845215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:45] (step=0007993) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.15532452390205986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 7994, "loss": 0.2687549591064453, "memory_gb": 7.721559524536133, "step_time_ms": 3365.98539352417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:49] (step=0007994) Train Loss: 0.2550, Train 
Steps/Sec: 0.28, Epoch: 0.15534395647104549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 7995, "loss": 0.26227492094039917, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3603534698486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:52] (step=0007995) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.15536338904003108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:01:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 7996, "loss": 0.23578374087810516, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2430725097656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:01:56] (step=0007996) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.1553828216090167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 7997, "loss": 0.27429741621017456, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4704818725586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:00] (step=0007997) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.15540225417800232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 7998, "loss": 0.3191390633583069, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0238723754883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:03] (step=0007998) Train Loss: 0.2924, Train Steps/Sec: 0.28, Epoch: 0.15542168674698795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 7999, "loss": 0.17809070646762848, "memory_gb": 7.721559524536133, "step_time_ms": 3373.7027645111084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:07] (step=0007999) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.15544111931597357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8000, "loss": 
0.1606665700674057, "memory_gb": 7.721559524536133, "step_time_ms": 3366.744041442871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:10] (step=0008000) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.1554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:11] Saved checkpoint to /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0008000/ [2025-07-29 08:02:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8001, "loss": 0.20929837226867676, "memory_gb": 7.721559524536133, "step_time_ms": 3359.907388687134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:14] (step=0008001) Train Loss: 0.1721, Train Steps/Sec: 0.27, Epoch: 0.15547998445394481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8002, "loss": 0.28873470425605774, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0759201049805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:18] (step=0008002) Train Loss: 0.3052, Train Steps/Sec: 0.28, Epoch: 0.15549941702293044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8003, "loss": 0.2740548849105835, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2861328125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:21] (step=0008003) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.15551884959191606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8004, "loss": 0.18515148758888245, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8810176849365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:25] (step=0008004) Train Loss: 0.1827, Train Steps/Sec: 0.28, Epoch: 0.15553828216090168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8005, "loss": 0.219629168510437, "memory_gb": 7.721559524536133, 
"step_time_ms": 3357.508420944214, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:29] (step=0008005) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.1555577147298873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8006, "loss": 0.2488555759191513, "memory_gb": 7.721559524536133, "step_time_ms": 3365.96941947937, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:32] (step=0008006) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.1555771472988729, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8007, "loss": 0.20494389533996582, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1695976257324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:36] (step=0008007) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.15559657986785852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8008, "loss": 0.2832046151161194, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4615154266357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:39] (step=0008008) Train Loss: 0.2888, Train Steps/Sec: 0.28, Epoch: 0.15561601243684414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8009, "loss": 0.27601566910743713, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2897396087646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:43] (step=0008009) Train Loss: 0.2855, Train Steps/Sec: 0.28, Epoch: 0.15563544500582976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8010, "loss": 0.23473653197288513, "memory_gb": 7.721559524536133, "step_time_ms": 3358.346939086914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:47] (step=0008010) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 
0.1556548775748154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8011, "loss": 0.186583012342453, "memory_gb": 7.721559524536133, "step_time_ms": 3363.443374633789, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:50] (step=0008011) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.155674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8012, "loss": 0.19482797384262085, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2731437683105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:54] (step=0008012) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.15569374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:02:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8013, "loss": 0.2532534599304199, "memory_gb": 7.721559524536133, "step_time_ms": 3363.103151321411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:02:57] (step=0008013) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.15571317528177225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8014, "loss": 0.20889604091644287, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5387210845947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:01] (step=0008014) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.15573260785075788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8015, "loss": 0.24971413612365723, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5051975250244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:05] (step=0008015) Train Loss: 0.2785, Train Steps/Sec: 0.28, Epoch: 0.1557520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8016, "loss": 0.29652974009513855, "memory_gb": 
7.721559524536133, "step_time_ms": 3361.142873764038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:08] (step=0008016) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.15577147298872912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8017, "loss": 0.33631184697151184, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7752323150635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:12] (step=0008017) Train Loss: 0.2523, Train Steps/Sec: 0.28, Epoch: 0.15579090555771474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8018, "loss": 0.37705808877944946, "memory_gb": 7.721559524536133, "step_time_ms": 3358.858108520508, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:16] (step=0008018) Train Loss: 0.3189, Train Steps/Sec: 0.28, Epoch: 0.15581033812670034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8019, "loss": 0.27506184577941895, "memory_gb": 7.721559524536133, "step_time_ms": 3359.229803085327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:19] (step=0008019) Train Loss: 0.2661, Train Steps/Sec: 0.28, Epoch: 0.15582977069568596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8020, "loss": 0.2585582733154297, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7443313598633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:23] (step=0008020) Train Loss: 0.2406, Train Steps/Sec: 0.28, Epoch: 0.15584920326467158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8021, "loss": 0.357795774936676, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3670902252197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:26] (step=0008021) Train Loss: 0.2768, Train Steps/Sec: 
0.28, Epoch: 0.1558686358336572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8022, "loss": 0.22287723422050476, "memory_gb": 7.721559524536133, "step_time_ms": 3352.867841720581, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:30] (step=0008022) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.15588806840264283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8023, "loss": 0.25038355588912964, "memory_gb": 7.721559524536133, "step_time_ms": 3356.457471847534, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:34] (step=0008023) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.15590750097162845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8024, "loss": 0.2541564106941223, "memory_gb": 7.721559524536133, "step_time_ms": 3359.760522842407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:37] (step=0008024) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.15592693354061407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8025, "loss": 0.1403522789478302, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4467010498047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:41] (step=0008025) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.1559463661095997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8026, "loss": 0.19865408539772034, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0545654296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:44] (step=0008026) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.15596579867858532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8027, "loss": 0.19400468468666077, 
"memory_gb": 7.721559524536133, "step_time_ms": 3351.541757583618, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:48] (step=0008027) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.15598523124757094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8028, "loss": 0.26593977212905884, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2021560668945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:51] (step=0008028) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.15600466381655656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8029, "loss": 0.2206384837627411, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1784706115723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:55] (step=0008029) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.15602409638554218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:03:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8030, "loss": 0.20399844646453857, "memory_gb": 7.721559524536133, "step_time_ms": 3512.1402740478516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:03:59] (step=0008030) Train Loss: 0.2588, Train Steps/Sec: 0.28, Epoch: 0.15604352895452778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8031, "loss": 0.2881179451942444, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7608852386475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:02] (step=0008031) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.1560629615235134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8032, "loss": 0.26615795493125916, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8028984069824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:06] (step=0008032) Train Loss: 0.2220, 
Train Steps/Sec: 0.28, Epoch: 0.15608239409249902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8033, "loss": 0.2171800434589386, "memory_gb": 7.721559524536133, "step_time_ms": 3353.883743286133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:09] (step=0008033) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.15610182666148464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8034, "loss": 0.2283332347869873, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2192192077637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:13] (step=0008034) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.15612125923047027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8035, "loss": 0.25541096925735474, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8666496276855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:17] (step=0008035) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.1561406917994559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8036, "loss": 0.23479154706001282, "memory_gb": 7.721559524536133, "step_time_ms": 3357.367992401123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:20] (step=0008036) Train Loss: 0.2429, Train Steps/Sec: 0.28, Epoch: 0.1561601243684415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8037, "loss": 0.2923513948917389, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8127365112305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:24] (step=0008037) Train Loss: 0.2166, Train Steps/Sec: 0.27, Epoch: 0.15617955693742713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8038, "loss": 
0.2100924402475357, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5592041015625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:28] (step=0008038) Train Loss: 0.1889, Train Steps/Sec: 0.28, Epoch: 0.15619898950641276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8039, "loss": 0.2705920338630676, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0791015625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:31] (step=0008039) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.15621842207539838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8040, "loss": 0.24309486150741577, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0703735351562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:35] (step=0008040) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.156237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8041, "loss": 0.31899118423461914, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1979274749756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:38] (step=0008041) Train Loss: 0.2474, Train Steps/Sec: 0.28, Epoch: 0.1562572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8042, "loss": 0.24489212036132812, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5711975097656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:42] (step=0008042) Train Loss: 0.2755, Train Steps/Sec: 0.28, Epoch: 0.15627671978235522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8043, "loss": 0.26044774055480957, "memory_gb": 7.721559524536133, "step_time_ms": 3355.220317840576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:46] (step=0008043) Train 
Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.15629615235134084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8044, "loss": 0.2513153851032257, "memory_gb": 7.721559524536133, "step_time_ms": 3353.579044342041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:49] (step=0008044) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.15631558492032646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8045, "loss": 0.31133192777633667, "memory_gb": 7.721559524536133, "step_time_ms": 3353.882312774658, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:53] (step=0008045) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.15633501748931208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:04:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8046, "loss": 0.21470819413661957, "memory_gb": 7.721559524536133, "step_time_ms": 3348.7555980682373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:04:56] (step=0008046) Train Loss: 0.2747, Train Steps/Sec: 0.28, Epoch: 0.1563544500582977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8047, "loss": 0.2966473400592804, "memory_gb": 7.721559524536133, "step_time_ms": 3359.320640563965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:00] (step=0008047) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 0.15637388262728333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8048, "loss": 0.2813098132610321, "memory_gb": 7.721559524536133, "step_time_ms": 3359.431028366089, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:04] (step=0008048) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.15639331519626895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 
8049, "loss": 0.23016393184661865, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6318283081055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:07] (step=0008049) Train Loss: 0.2820, Train Steps/Sec: 0.28, Epoch: 0.15641274776525457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8050, "loss": 0.27759140729904175, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3248596191406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:11] (step=0008050) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.1564321803342402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8051, "loss": 0.20273613929748535, "memory_gb": 7.721559524536133, "step_time_ms": 3355.26967048645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:14] (step=0008051) Train Loss: 0.2033, Train Steps/Sec: 0.28, Epoch: 0.15645161290322582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8052, "loss": 0.2807620167732239, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2406044006348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:18] (step=0008052) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.15647104547221144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8053, "loss": 0.17143511772155762, "memory_gb": 7.721559524536133, "step_time_ms": 3357.599973678589, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:21] (step=0008053) Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.15649047804119703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8054, "loss": 0.2046988308429718, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8009395599365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:25] 
(step=0008054) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.15650991061018266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8055, "loss": 0.28823310136795044, "memory_gb": 7.721559524536133, "step_time_ms": 3353.677272796631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:29] (step=0008055) Train Loss: 0.3223, Train Steps/Sec: 0.28, Epoch: 0.15652934317916828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8056, "loss": 0.1925111711025238, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5617542266846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:32] (step=0008056) Train Loss: 0.2599, Train Steps/Sec: 0.28, Epoch: 0.1565487757481539, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8057, "loss": 0.17059621214866638, "memory_gb": 7.721559524536133, "step_time_ms": 3358.426570892334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:36] (step=0008057) Train Loss: 0.2087, Train Steps/Sec: 0.28, Epoch: 0.15656820831713952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8058, "loss": 0.3791477084159851, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4267578125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:39] (step=0008058) Train Loss: 0.2801, Train Steps/Sec: 0.28, Epoch: 0.15658764088612515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8059, "loss": 0.19285894930362701, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8574867248535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:43] (step=0008059) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.15660707345511077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:47] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 8060, "loss": 0.3328242897987366, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3323650360107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:47] (step=0008060) Train Loss: 0.2882, Train Steps/Sec: 0.28, Epoch: 0.1566265060240964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8061, "loss": 0.17626596987247467, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7826232910156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:50] (step=0008061) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.156645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8062, "loss": 0.28309088945388794, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2489280700684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:54] (step=0008062) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.15666537116206763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:05:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8063, "loss": 0.21442648768424988, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6434078216553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:05:57] (step=0008063) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.15668480373105326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8064, "loss": 0.34600192308425903, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4189414978027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:01] (step=0008064) Train Loss: 0.3370, Train Steps/Sec: 0.28, Epoch: 0.15670423630003885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8065, "loss": 0.3271528482437134, "memory_gb": 7.721559524536133, "step_time_ms": 3364.098072052002, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 08:06:05] (step=0008065) Train Loss: 0.2921, Train Steps/Sec: 0.28, Epoch: 0.15672366886902447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8066, "loss": 0.2740732431411743, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3945598602295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:08] (step=0008066) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.1567431014380101, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8067, "loss": 0.18190248310565948, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8813285827637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:12] (step=0008067) Train Loss: 0.1971, Train Steps/Sec: 0.28, Epoch: 0.15676253400699572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8068, "loss": 0.14429882168769836, "memory_gb": 7.721559524536133, "step_time_ms": 3365.133047103882, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:15] (step=0008068) Train Loss: 0.2563, Train Steps/Sec: 0.28, Epoch: 0.15678196657598134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8069, "loss": 0.2840847969055176, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0431423187256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:19] (step=0008069) Train Loss: 0.2819, Train Steps/Sec: 0.28, Epoch: 0.15680139914496696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8070, "loss": 0.18760403990745544, "memory_gb": 7.721559524536133, "step_time_ms": 3361.964464187622, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:23] (step=0008070) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.15682083171395259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:26] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 8071, "loss": 0.3349790573120117, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8906269073486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:26] (step=0008071) Train Loss: 0.3140, Train Steps/Sec: 0.28, Epoch: 0.1568402642829382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8072, "loss": 0.21259135007858276, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5176887512207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:30] (step=0008072) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.15685969685192383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8073, "loss": 0.33298689126968384, "memory_gb": 7.721559524536133, "step_time_ms": 3356.346607208252, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:33] (step=0008073) Train Loss: 0.3217, Train Steps/Sec: 0.28, Epoch: 0.15687912942090945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8074, "loss": 0.19532153010368347, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6349697113037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:37] (step=0008074) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.15689856198989507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8075, "loss": 0.14518550038337708, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7382049560547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:41] (step=0008075) Train Loss: 0.1907, Train Steps/Sec: 0.28, Epoch: 0.1569179945588807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8076, "loss": 0.16529127955436707, "memory_gb": 7.721559524536133, "step_time_ms": 3361.541509628296, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 08:06:44] (step=0008076) Train Loss: 0.1773, Train Steps/Sec: 0.28, Epoch: 0.1569374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8077, "loss": 0.29350075125694275, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9189262390137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:48] (step=0008077) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.1569568596968519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8078, "loss": 0.28272348642349243, "memory_gb": 7.721559524536133, "step_time_ms": 3506.5815448760986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:51] (step=0008078) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.15697629226583754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8079, "loss": 0.2253669649362564, "memory_gb": 7.721559524536133, "step_time_ms": 3363.431692123413, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:55] (step=0008079) Train Loss: 0.2276, Train Steps/Sec: 0.28, Epoch: 0.15699572483482316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8080, "loss": 0.40644121170043945, "memory_gb": 7.721559524536133, "step_time_ms": 3360.685110092163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:06:59] (step=0008080) Train Loss: 0.3450, Train Steps/Sec: 0.28, Epoch: 0.15701515740380878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8081, "loss": 0.2819262742996216, "memory_gb": 7.721559524536133, "step_time_ms": 3358.250856399536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:02] (step=0008081) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.1570345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 08:07:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8082, "loss": 0.2065073549747467, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3979301452637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:06] (step=0008082) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.15705402254178003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8083, "loss": 0.2668713331222534, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8436069488525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:09] (step=0008083) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.15707345511076565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8084, "loss": 0.3416183888912201, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0076179504395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:13] (step=0008084) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.15709288767975127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8085, "loss": 0.23943568766117096, "memory_gb": 7.721559524536133, "step_time_ms": 3371.622085571289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:17] (step=0008085) Train Loss: 0.2072, Train Steps/Sec: 0.27, Epoch: 0.1571123202487369, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8086, "loss": 0.30309316515922546, "memory_gb": 7.721559524536133, "step_time_ms": 3366.962432861328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:20] (step=0008086) Train Loss: 0.3053, Train Steps/Sec: 0.28, Epoch: 0.15713175281772251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8087, "loss": 0.19209924340248108, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8974895477295, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:24] (step=0008087) Train Loss: 0.2122, Train Steps/Sec: 0.28, Epoch: 0.15715118538670814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8088, "loss": 0.19411355257034302, "memory_gb": 7.721559524536133, "step_time_ms": 3360.891103744507, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:27] (step=0008088) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.15717061795569373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8089, "loss": 0.307773232460022, "memory_gb": 7.721559524536133, "step_time_ms": 3367.799758911133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:31] (step=0008089) Train Loss: 0.2676, Train Steps/Sec: 0.28, Epoch: 0.15719005052467935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8090, "loss": 0.16484566032886505, "memory_gb": 7.721559524536133, "step_time_ms": 3364.816427230835, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:35] (step=0008090) Train Loss: 0.1533, Train Steps/Sec: 0.28, Epoch: 0.15720948309366498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8091, "loss": 0.29218509793281555, "memory_gb": 7.721559524536133, "step_time_ms": 3370.7756996154785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:38] (step=0008091) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.1572289156626506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8092, "loss": 0.2443089783191681, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0481910705566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:42] (step=0008092) Train Loss: 0.2088, Train Steps/Sec: 0.28, Epoch: 0.15724834823163622, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 08:07:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8093, "loss": 0.21779529750347137, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4585914611816, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:46] (step=0008093) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.15726778080062184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8094, "loss": 0.20713366568088531, "memory_gb": 7.721559524536133, "step_time_ms": 3359.510898590088, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:49] (step=0008094) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.15728721336960746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8095, "loss": 0.24888236820697784, "memory_gb": 7.721559524536133, "step_time_ms": 3361.992120742798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:53] (step=0008095) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.1573066459385931, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8096, "loss": 0.22697152197360992, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2286529541016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:07:56] (step=0008096) Train Loss: 0.2133, Train Steps/Sec: 0.28, Epoch: 0.1573260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8097, "loss": 0.30596399307250977, "memory_gb": 7.721559524536133, "step_time_ms": 3364.302635192871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:00] (step=0008097) Train Loss: 0.2202, Train Steps/Sec: 0.28, Epoch: 0.15734551107656433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8098, "loss": 0.2931237816810608, "memory_gb": 7.721559524536133, "step_time_ms": 
3365.45991897583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:04] (step=0008098) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.15736494364554995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8099, "loss": 0.22137236595153809, "memory_gb": 7.721559524536133, "step_time_ms": 3364.320993423462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:07] (step=0008099) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.15738437621453555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8100, "loss": 0.2949366569519043, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8176498413086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:11] (step=0008100) Train Loss: 0.3187, Train Steps/Sec: 0.28, Epoch: 0.15740380878352117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8101, "loss": 0.24194225668907166, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6093559265137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:14] (step=0008101) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.1574232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8102, "loss": 0.2602424621582031, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1437759399414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:18] (step=0008102) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.15744267392149242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8103, "loss": 0.29212307929992676, "memory_gb": 7.721559524536133, "step_time_ms": 3365.186929702759, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:22] (step=0008103) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.15746210649047804, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8104, "loss": 0.19471530616283417, "memory_gb": 7.721559524536133, "step_time_ms": 3363.140344619751, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:25] (step=0008104) Train Loss: 0.1824, Train Steps/Sec: 0.28, Epoch: 0.15748153905946366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8105, "loss": 0.23856209218502045, "memory_gb": 7.721559524536133, "step_time_ms": 3367.227792739868, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:29] (step=0008105) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.15750097162844928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8106, "loss": 0.24805928766727448, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6678924560547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:32] (step=0008106) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.1575204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8107, "loss": 0.1998714655637741, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2281761169434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:36] (step=0008107) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.15753983676642053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8108, "loss": 0.307317316532135, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2586212158203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:40] (step=0008108) Train Loss: 0.2772, Train Steps/Sec: 0.28, Epoch: 0.15755926933540615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8109, "loss": 0.22257885336875916, "memory_gb": 7.721559524536133, 
"step_time_ms": 3362.133264541626, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:43] (step=0008109) Train Loss: 0.2247, Train Steps/Sec: 0.28, Epoch: 0.15757870190439177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8110, "loss": 0.22845743596553802, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5451068878174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:47] (step=0008110) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.1575981344733774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8111, "loss": 0.2495330274105072, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6718730926514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:50] (step=0008111) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.157617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8112, "loss": 0.2727012038230896, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9662475585938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:54] (step=0008112) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.1576369996113486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:08:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8113, "loss": 0.2622447907924652, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4658374786377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:08:58] (step=0008113) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.15765643218033423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8114, "loss": 0.2559146285057068, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6229763031006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:01] (step=0008114) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 
0.15767586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8115, "loss": 0.2736249566078186, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3946437835693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:05] (step=0008115) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.15769529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8116, "loss": 0.2093096226453781, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4496879577637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:08] (step=0008116) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.1577147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8117, "loss": 0.12681010365486145, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3861541748047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:12] (step=0008117) Train Loss: 0.1800, Train Steps/Sec: 0.28, Epoch: 0.15773416245627672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8118, "loss": 0.17327702045440674, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7609481811523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:16] (step=0008118) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.15775359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8119, "loss": 0.13028061389923096, "memory_gb": 7.721559524536133, "step_time_ms": 3510.7991695404053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:19] (step=0008119) Train Loss: 0.2200, Train Steps/Sec: 0.28, Epoch: 0.15777302759424797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8120, "loss": 0.3122595548629761, 
"memory_gb": 7.721559524536133, "step_time_ms": 3359.6343994140625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:23] (step=0008120) Train Loss: 0.3254, Train Steps/Sec: 0.28, Epoch: 0.1577924601632336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8121, "loss": 0.33798038959503174, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6923866271973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:26] (step=0008121) Train Loss: 0.3089, Train Steps/Sec: 0.28, Epoch: 0.1578118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8122, "loss": 0.21965673565864563, "memory_gb": 7.721559524536133, "step_time_ms": 3364.830493927002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:30] (step=0008122) Train Loss: 0.1862, Train Steps/Sec: 0.28, Epoch: 0.1578313253012048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8123, "loss": 0.28543877601623535, "memory_gb": 7.721559524536133, "step_time_ms": 3361.778497695923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:34] (step=0008123) Train Loss: 0.3043, Train Steps/Sec: 0.28, Epoch: 0.15785075787019043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8124, "loss": 0.19770190119743347, "memory_gb": 7.721559524536133, "step_time_ms": 3362.295150756836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:37] (step=0008124) Train Loss: 0.2472, Train Steps/Sec: 0.28, Epoch: 0.15787019043917605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8125, "loss": 0.23793570697307587, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7349605560303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:41] (step=0008125) Train Loss: 0.1920, 
Train Steps/Sec: 0.27, Epoch: 0.15788962300816167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8126, "loss": 0.2340823858976364, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2042713165283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:44] (step=0008126) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.1579090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8127, "loss": 0.29030048847198486, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1652851104736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:48] (step=0008127) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.15792848814613292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8128, "loss": 0.3613991141319275, "memory_gb": 7.721559524536133, "step_time_ms": 3359.429121017456, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:52] (step=0008128) Train Loss: 0.3327, Train Steps/Sec: 0.28, Epoch: 0.15794792071511854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8129, "loss": 0.23744915425777435, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1088829040527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:55] (step=0008129) Train Loss: 0.2136, Train Steps/Sec: 0.28, Epoch: 0.15796735328410416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:09:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8130, "loss": 0.25319352746009827, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8096675872803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:09:59] (step=0008130) Train Loss: 0.2715, Train Steps/Sec: 0.28, Epoch: 0.15798678585308978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8131, "loss": 
0.2742404639720917, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2993488311768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:02] (step=0008131) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.1580062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8132, "loss": 0.2812086343765259, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7921390533447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:06] (step=0008132) Train Loss: 0.2804, Train Steps/Sec: 0.28, Epoch: 0.15802565099106103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8133, "loss": 0.2012753188610077, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1900386810303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:10] (step=0008133) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.15804508356004665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8134, "loss": 0.21696093678474426, "memory_gb": 7.721559524536133, "step_time_ms": 3351.559638977051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:13] (step=0008134) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.15806451612903225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8135, "loss": 0.2857320308685303, "memory_gb": 7.721559524536133, "step_time_ms": 3359.35640335083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:17] (step=0008135) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.15808394869801787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8136, "loss": 0.1726711094379425, "memory_gb": 7.721559524536133, "step_time_ms": 3361.328125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:20] (step=0008136) Train Loss: 
0.2543, Train Steps/Sec: 0.28, Epoch: 0.1581033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8137, "loss": 0.1213173121213913, "memory_gb": 7.721559524536133, "step_time_ms": 3361.558437347412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:24] (step=0008137) Train Loss: 0.2315, Train Steps/Sec: 0.28, Epoch: 0.1581228138359891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8138, "loss": 0.3051971197128296, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3106994628906, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:27] (step=0008138) Train Loss: 0.3141, Train Steps/Sec: 0.28, Epoch: 0.15814224640497473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8139, "loss": 0.2602441906929016, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3814373016357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:31] (step=0008139) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.15816167897396036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8140, "loss": 0.30137139558792114, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3411750793457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:35] (step=0008140) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.15818111154294598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8141, "loss": 0.19799235463142395, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0220260620117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:38] (step=0008141) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.1582005441119316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8142, 
"loss": 0.33325672149658203, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4538021087646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:42] (step=0008142) Train Loss: 0.3234, Train Steps/Sec: 0.28, Epoch: 0.15821997668091722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8143, "loss": 0.16801835596561432, "memory_gb": 7.721559524536133, "step_time_ms": 3358.375310897827, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:45] (step=0008143) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.15823940924990285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8144, "loss": 0.28185248374938965, "memory_gb": 7.721559524536133, "step_time_ms": 3358.184814453125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:49] (step=0008144) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.15825884181888847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8145, "loss": 0.20061080157756805, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8573417663574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:53] (step=0008145) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.1582782743878741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:10:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8146, "loss": 0.18068285286426544, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8031063079834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:10:56] (step=0008146) Train Loss: 0.2032, Train Steps/Sec: 0.28, Epoch: 0.15829770695685969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8147, "loss": 0.22495576739311218, "memory_gb": 7.721559524536133, "step_time_ms": 3359.027862548828, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:00] 
(step=0008147) Train Loss: 0.2782, Train Steps/Sec: 0.28, Epoch: 0.1583171395258453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8148, "loss": 0.14664191007614136, "memory_gb": 7.721559524536133, "step_time_ms": 3341.972827911377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:03] (step=0008148) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.15833657209483093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8149, "loss": 0.3169333338737488, "memory_gb": 7.721559524536133, "step_time_ms": 3342.616081237793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:07] (step=0008149) Train Loss: 0.2923, Train Steps/Sec: 0.28, Epoch: 0.15835600466381655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8150, "loss": 0.3148518204689026, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2094230651855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:10] (step=0008150) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.15837543723280217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8151, "loss": 0.3647218346595764, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5408668518066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:14] (step=0008151) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.1583948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8152, "loss": 0.26062706112861633, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7125282287598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:18] (step=0008152) Train Loss: 0.2229, Train Steps/Sec: 0.28, Epoch: 0.15841430237077342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:21] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 8153, "loss": 0.27127307653427124, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5854511260986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:21] (step=0008153) Train Loss: 0.2853, Train Steps/Sec: 0.28, Epoch: 0.15843373493975904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8154, "loss": 0.24133512377738953, "memory_gb": 7.721559524536133, "step_time_ms": 3355.67307472229, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:25] (step=0008154) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.15845316750874466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8155, "loss": 0.251309335231781, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4988327026367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:28] (step=0008155) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.15847260007773029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8156, "loss": 0.23655416071414948, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6908740997314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:32] (step=0008156) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.1584920326467159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8157, "loss": 0.22943486273288727, "memory_gb": 7.721559524536133, "step_time_ms": 3360.769748687744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:36] (step=0008157) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.1585114652157015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8158, "loss": 0.22316738963127136, "memory_gb": 7.721559524536133, "step_time_ms": 3361.992120742798, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 08:11:39] (step=0008158) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.15853089778468712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8159, "loss": 0.20287053287029266, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1996173858643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:43] (step=0008159) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.15855033035367275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8160, "loss": 0.31832849979400635, "memory_gb": 7.721559524536133, "step_time_ms": 3362.032413482666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:46] (step=0008160) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.15856976292265837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8161, "loss": 0.18999476730823517, "memory_gb": 7.721559524536133, "step_time_ms": 3361.095666885376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:50] (step=0008161) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.158589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8162, "loss": 0.30977848172187805, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9187812805176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:53] (step=0008162) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.15860862806062961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:11:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8163, "loss": 0.17909474670886993, "memory_gb": 7.721559524536133, "step_time_ms": 3362.271308898926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:11:57] (step=0008163) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.15862806062961524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:01] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 8164, "loss": 0.16948926448822021, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6228103637695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:01] (step=0008164) Train Loss: 0.2051, Train Steps/Sec: 0.28, Epoch: 0.15864749319860086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8165, "loss": 0.262984037399292, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5833053588867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:04] (step=0008165) Train Loss: 0.3001, Train Steps/Sec: 0.28, Epoch: 0.15866692576758648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8166, "loss": 0.292212575674057, "memory_gb": 7.721559524536133, "step_time_ms": 3511.638641357422, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:08] (step=0008166) Train Loss: 0.3137, Train Steps/Sec: 0.28, Epoch: 0.1586863583365721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8167, "loss": 0.3174394369125366, "memory_gb": 7.721559524536133, "step_time_ms": 3364.105224609375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:11] (step=0008167) Train Loss: 0.2749, Train Steps/Sec: 0.28, Epoch: 0.15870579090555773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8168, "loss": 0.28869014978408813, "memory_gb": 7.721559524536133, "step_time_ms": 3364.542245864868, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:15] (step=0008168) Train Loss: 0.3005, Train Steps/Sec: 0.28, Epoch: 0.15872522347454335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8169, "loss": 0.151407390832901, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7581901550293, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 08:12:19] (step=0008169) Train Loss: 0.1616, Train Steps/Sec: 0.28, Epoch: 0.15874465604352894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8170, "loss": 0.22269432246685028, "memory_gb": 7.721559524536133, "step_time_ms": 3361.408233642578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:22] (step=0008170) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.15876408861251456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8171, "loss": 0.17543670535087585, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6837730407715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:26] (step=0008171) Train Loss: 0.2414, Train Steps/Sec: 0.27, Epoch: 0.1587835211815002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8172, "loss": 0.3171122074127197, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4038486480713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:29] (step=0008172) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.1588029537504858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8173, "loss": 0.24263134598731995, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4049682617188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:33] (step=0008173) Train Loss: 0.2504, Train Steps/Sec: 0.28, Epoch: 0.15882238631947143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8174, "loss": 0.31271272897720337, "memory_gb": 7.721559524536133, "step_time_ms": 3361.858129501343, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:37] (step=0008174) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.15884181888845705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 08:12:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8175, "loss": 0.19051587581634521, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6771183013916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:40] (step=0008175) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.15886125145744268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8176, "loss": 0.2755252718925476, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3205890655518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:44] (step=0008176) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.1588806840264283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8177, "loss": 0.20583905279636383, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3289337158203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:47] (step=0008177) Train Loss: 0.1785, Train Steps/Sec: 0.28, Epoch: 0.15890011659541392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8178, "loss": 0.3434157371520996, "memory_gb": 7.721559524536133, "step_time_ms": 3361.517906188965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:51] (step=0008178) Train Loss: 0.3111, Train Steps/Sec: 0.28, Epoch: 0.15891954916439954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8179, "loss": 0.32654041051864624, "memory_gb": 7.721559524536133, "step_time_ms": 3364.292860031128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:55] (step=0008179) Train Loss: 0.3239, Train Steps/Sec: 0.28, Epoch: 0.15893898173338517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:12:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8180, "loss": 0.2703627943992615, "memory_gb": 7.721559524536133, "step_time_ms": 3365.349769592285, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 08:12:58] (step=0008180) Train Loss: 0.2989, Train Steps/Sec: 0.28, Epoch: 0.15895841430237076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8181, "loss": 0.3142987787723541, "memory_gb": 7.721559524536133, "step_time_ms": 3366.163492202759, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:02] (step=0008181) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.15897784687135638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8182, "loss": 0.27669885754585266, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2170848846436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:05] (step=0008182) Train Loss: 0.3088, Train Steps/Sec: 0.28, Epoch: 0.158997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8183, "loss": 0.2238522320985794, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6343688964844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:09] (step=0008183) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.15901671200932763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8184, "loss": 0.24418896436691284, "memory_gb": 7.721559524536133, "step_time_ms": 3362.976312637329, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:12] (step=0008184) Train Loss: 0.2176, Train Steps/Sec: 0.28, Epoch: 0.15903614457831325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8185, "loss": 0.07609067112207413, "memory_gb": 7.721559524536133, "step_time_ms": 3362.610340118408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:16] (step=0008185) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.15905557714729887, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 08:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8186, "loss": 0.19552569091320038, "memory_gb": 7.721559524536133, "step_time_ms": 3366.945505142212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:20] (step=0008186) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.1590750097162845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8187, "loss": 0.21733582019805908, "memory_gb": 7.721559524536133, "step_time_ms": 3364.203453063965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:23] (step=0008187) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.15909444228527012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8188, "loss": 0.2797394394874573, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9755249023438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:27] (step=0008188) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.15911387485425574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8189, "loss": 0.14948123693466187, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7380599975586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:31] (step=0008189) Train Loss: 0.1482, Train Steps/Sec: 0.28, Epoch: 0.15913330742324136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8190, "loss": 0.24487990140914917, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9255809783936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:34] (step=0008190) Train Loss: 0.2566, Train Steps/Sec: 0.28, Epoch: 0.15915273999222698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8191, "loss": 0.15883716940879822, "memory_gb": 7.721559524536133, "step_time_ms": 
3362.368583679199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:38] (step=0008191) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.1591721725612126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8192, "loss": 0.20384573936462402, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2261543273926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:41] (step=0008192) Train Loss: 0.1676, Train Steps/Sec: 0.28, Epoch: 0.1591916051301982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8193, "loss": 0.2213420569896698, "memory_gb": 7.721559524536133, "step_time_ms": 3362.215518951416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:45] (step=0008193) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.15921103769918382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8194, "loss": 0.35072046518325806, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8270626068115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:48] (step=0008194) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.15923047026816944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8195, "loss": 0.2698058485984802, "memory_gb": 7.721559524536133, "step_time_ms": 3346.2417125701904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:52] (step=0008195) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.15924990283715507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8196, "loss": 0.22881411015987396, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1504306793213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:56] (step=0008196) Train Loss: 0.2098, Train Steps/Sec: 0.28, Epoch: 0.1592693354061407, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:13:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8197, "loss": 0.16787652671337128, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2539463043213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:13:59] (step=0008197) Train Loss: 0.2054, Train Steps/Sec: 0.28, Epoch: 0.1592887679751263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8198, "loss": 0.2954787015914917, "memory_gb": 7.721559524536133, "step_time_ms": 3360.565185546875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:03] (step=0008198) Train Loss: 0.2561, Train Steps/Sec: 0.28, Epoch: 0.15930820054411193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8199, "loss": 0.31541553139686584, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0612621307373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:06] (step=0008199) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.15932763311309756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8200, "loss": 0.3224748373031616, "memory_gb": 7.721559524536133, "step_time_ms": 3369.980573654175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:10] (step=0008200) Train Loss: 0.2911, Train Steps/Sec: 0.28, Epoch: 0.15934706568208318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8201, "loss": 0.22496460378170013, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6981201171875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:14] (step=0008201) Train Loss: 0.1694, Train Steps/Sec: 0.28, Epoch: 0.1593664982510688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8202, "loss": 0.24542103707790375, "memory_gb": 7.721559524536133, 
"step_time_ms": 3364.9232387542725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:17] (step=0008202) Train Loss: 0.2152, Train Steps/Sec: 0.28, Epoch: 0.15938593082005442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8203, "loss": 0.12761640548706055, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2256259918213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:21] (step=0008203) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.15940536338904004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8204, "loss": 0.2474181354045868, "memory_gb": 7.721559524536133, "step_time_ms": 3367.584705352783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:24] (step=0008204) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.15942479595802564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8205, "loss": 0.26803499460220337, "memory_gb": 7.721559524536133, "step_time_ms": 3365.158796310425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:28] (step=0008205) Train Loss: 0.2957, Train Steps/Sec: 0.28, Epoch: 0.15944422852701126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8206, "loss": 0.22784502804279327, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3526096343994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:32] (step=0008206) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.15946366109599688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8207, "loss": 0.2863605320453644, "memory_gb": 7.721559524536133, "step_time_ms": 3505.971908569336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:35] (step=0008207) Train Loss: 0.2878, Train Steps/Sec: 0.28, Epoch: 
0.1594830936649825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8208, "loss": 0.27551180124282837, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3552532196045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:39] (step=0008208) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.15950252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8209, "loss": 0.22487257421016693, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2143573760986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:42] (step=0008209) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.15952195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8210, "loss": 0.1328859031200409, "memory_gb": 7.721559524536133, "step_time_ms": 3358.668804168701, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:46] (step=0008210) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.15954139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8211, "loss": 0.25477734208106995, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4416332244873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:50] (step=0008211) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 0.159560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8212, "loss": 0.233059823513031, "memory_gb": 7.721559524536133, "step_time_ms": 3346.82035446167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:53] (step=0008212) Train Loss: 0.2422, Train Steps/Sec: 0.27, Epoch: 0.15958025650991062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:14:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8213, "loss": 0.23944947123527527, "memory_gb": 
7.721559524536133, "step_time_ms": 3360.208034515381, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:14:57] (step=0008213) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.15959968907889624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8214, "loss": 0.3422767221927643, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2172088623047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:00] (step=0008214) Train Loss: 0.3075, Train Steps/Sec: 0.28, Epoch: 0.15961912164788186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8215, "loss": 0.154946967959404, "memory_gb": 7.721559524536133, "step_time_ms": 3364.333152770996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:04] (step=0008215) Train Loss: 0.2221, Train Steps/Sec: 0.28, Epoch: 0.15963855421686746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8216, "loss": 0.29127585887908936, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0920181274414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:08] (step=0008216) Train Loss: 0.2712, Train Steps/Sec: 0.28, Epoch: 0.15965798678585308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8217, "loss": 0.29969650506973267, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8783740997314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:11] (step=0008217) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.1596774193548387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8218, "loss": 0.25795769691467285, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3997230529785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:15] (step=0008218) Train Loss: 0.2696, Train Steps/Sec: 
0.28, Epoch: 0.15969685192382432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8219, "loss": 0.2202247530221939, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3452701568604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:18] (step=0008219) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.15971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8220, "loss": 0.27876341342926025, "memory_gb": 7.721559524536133, "step_time_ms": 3360.759973526001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:22] (step=0008220) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.15973571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8221, "loss": 0.21953126788139343, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1673583984375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:26] (step=0008221) Train Loss: 0.2828, Train Steps/Sec: 0.28, Epoch: 0.1597551496307812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8222, "loss": 0.21767142415046692, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3741188049316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:29] (step=0008222) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.1597745821997668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8223, "loss": 0.17245322465896606, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1764183044434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:33] (step=0008223) Train Loss: 0.1714, Train Steps/Sec: 0.28, Epoch: 0.15979401476875243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8224, "loss": 
0.2817394435405731, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4371547698975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:36] (step=0008224) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.15981344733773806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8225, "loss": 0.23098498582839966, "memory_gb": 7.721559524536133, "step_time_ms": 3359.780788421631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:40] (step=0008225) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.15983287990672368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8226, "loss": 0.22535143792629242, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8131713867188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:44] (step=0008226) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.1598523124757093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8227, "loss": 0.25127503275871277, "memory_gb": 7.721559524536133, "step_time_ms": 3363.840103149414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:47] (step=0008227) Train Loss: 0.3122, Train Steps/Sec: 0.28, Epoch: 0.1598717450446949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8228, "loss": 0.310519814491272, "memory_gb": 7.721559524536133, "step_time_ms": 3361.794948577881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:51] (step=0008228) Train Loss: 0.3310, Train Steps/Sec: 0.28, Epoch: 0.15989117761368052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8229, "loss": 0.21790984272956848, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6839904785156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:54] (step=0008229) 
Train Loss: 0.1729, Train Steps/Sec: 0.28, Epoch: 0.15991061018266614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:15:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8230, "loss": 0.17345042526721954, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5546951293945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:15:58] (step=0008230) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.15993004275165176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8231, "loss": 0.17977860569953918, "memory_gb": 7.721559524536133, "step_time_ms": 3354.442358016968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:02] (step=0008231) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.15994947532063739, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8232, "loss": 0.2516208589076996, "memory_gb": 7.721559524536133, "step_time_ms": 3359.872579574585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:05] (step=0008232) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.159968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8233, "loss": 0.20331411063671112, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4695796966553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:09] (step=0008233) Train Loss: 0.2350, Train Steps/Sec: 0.28, Epoch: 0.15998834045860863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8234, "loss": 0.34002503752708435, "memory_gb": 7.721559524536133, "step_time_ms": 3361.345052719116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:12] (step=0008234) Train Loss: 0.3070, Train Steps/Sec: 0.28, Epoch: 0.16000777302759425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:16] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 8235, "loss": 0.23417460918426514, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2772274017334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:16] (step=0008235) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.16002720559657987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8236, "loss": 0.2667296826839447, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8202724456787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:19] (step=0008236) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.1600466381655655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8237, "loss": 0.34527865052223206, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5831394195557, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:23] (step=0008237) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.16006607073455112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8238, "loss": 0.24809424579143524, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3237915039062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:27] (step=0008238) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.16008550330353674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8239, "loss": 0.28758376836776733, "memory_gb": 7.721559524536133, "step_time_ms": 3358.865261077881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:30] (step=0008239) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.16010493587252234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8240, "loss": 0.3808666467666626, "memory_gb": 7.721559524536133, "step_time_ms": 3341.0847187042236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
08:16:34] (step=0008240) Train Loss: 0.3092, Train Steps/Sec: 0.28, Epoch: 0.16012436844150796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8241, "loss": 0.1316683143377304, "memory_gb": 7.721559524536133, "step_time_ms": 3350.9361743927, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:37] (step=0008241) Train Loss: 0.1668, Train Steps/Sec: 0.28, Epoch: 0.16014380101049358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8242, "loss": 0.23771974444389343, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6252460479736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:41] (step=0008242) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.1601632335794792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8243, "loss": 0.30306708812713623, "memory_gb": 7.721559524536133, "step_time_ms": 3356.109857559204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:45] (step=0008243) Train Loss: 0.3020, Train Steps/Sec: 0.28, Epoch: 0.16018266614846483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8244, "loss": 0.32039177417755127, "memory_gb": 7.721559524536133, "step_time_ms": 3357.987403869629, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:48] (step=0008244) Train Loss: 0.2746, Train Steps/Sec: 0.28, Epoch: 0.16020209871745045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8245, "loss": 0.2733766734600067, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8031272888184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:52] (step=0008245) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.16022153128643607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:55] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 8246, "loss": 0.3133314251899719, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2278232574463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:55] (step=0008246) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.1602409638554217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:16:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8247, "loss": 0.31158432364463806, "memory_gb": 7.721559524536133, "step_time_ms": 3358.919143676758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:16:59] (step=0008247) Train Loss: 0.2790, Train Steps/Sec: 0.28, Epoch: 0.16026039642440731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8248, "loss": 0.17712005972862244, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8365364074707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:02] (step=0008248) Train Loss: 0.1683, Train Steps/Sec: 0.28, Epoch: 0.16027982899339294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8249, "loss": 0.2931886315345764, "memory_gb": 7.721559524536133, "step_time_ms": 3361.337423324585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:06] (step=0008249) Train Loss: 0.2901, Train Steps/Sec: 0.28, Epoch: 0.16029926156237856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8250, "loss": 0.23738187551498413, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0373573303223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:10] (step=0008250) Train Loss: 0.2505, Train Steps/Sec: 0.28, Epoch: 0.16031869413136415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8251, "loss": 0.14785301685333252, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2513332366943, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 08:17:13] (step=0008251) Train Loss: 0.1801, Train Steps/Sec: 0.28, Epoch: 0.16033812670034978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8252, "loss": 0.23270867764949799, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8547706604004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:17] (step=0008252) Train Loss: 0.2194, Train Steps/Sec: 0.28, Epoch: 0.1603575592693354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8253, "loss": 0.21190382540225983, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7699871063232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:20] (step=0008253) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.16037699183832102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8254, "loss": 0.22975172102451324, "memory_gb": 7.721559524536133, "step_time_ms": 3352.018356323242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:24] (step=0008254) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.16039642440730664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8255, "loss": 0.1108618825674057, "memory_gb": 7.721559524536133, "step_time_ms": 3503.2520294189453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:28] (step=0008255) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.16041585697629226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8256, "loss": 0.10399875044822693, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6694774627686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:31] (step=0008256) Train Loss: 0.1724, Train Steps/Sec: 0.28, Epoch: 0.1604352895452779, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 08:17:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8257, "loss": 0.22359353303909302, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4771366119385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:35] (step=0008257) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.1604547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8258, "loss": 0.2560217082500458, "memory_gb": 7.721559524536133, "step_time_ms": 3352.259635925293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:38] (step=0008258) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.16047415468324913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8259, "loss": 0.19059446454048157, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2164726257324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:42] (step=0008259) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.16049358725223475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8260, "loss": 0.12604479491710663, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0170917510986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:46] (step=0008260) Train Loss: 0.1535, Train Steps/Sec: 0.27, Epoch: 0.16051301982122038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8261, "loss": 0.15130096673965454, "memory_gb": 7.721559524536133, "step_time_ms": 3350.050687789917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:49] (step=0008261) Train Loss: 0.1982, Train Steps/Sec: 0.28, Epoch: 0.160532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8262, "loss": 0.23544344305992126, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7091178894043, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:53] (step=0008262) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.1605518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:17:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8263, "loss": 0.30282020568847656, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2638454437256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:17:56] (step=0008263) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.16057131752817722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8264, "loss": 0.16927677392959595, "memory_gb": 7.721559524536133, "step_time_ms": 3362.196683883667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:00] (step=0008264) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.16059075009716284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8265, "loss": 0.279718279838562, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9609184265137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:03] (step=0008265) Train Loss: 0.3029, Train Steps/Sec: 0.28, Epoch: 0.16061018266614846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8266, "loss": 0.3293445110321045, "memory_gb": 7.721559524536133, "step_time_ms": 3355.186700820923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:07] (step=0008266) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.16062961523513408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8267, "loss": 0.31521373987197876, "memory_gb": 7.721559524536133, "step_time_ms": 3362.077236175537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:11] (step=0008267) Train Loss: 0.3076, Train Steps/Sec: 0.28, Epoch: 0.1606490478041197, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 08:18:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8268, "loss": 0.2997526526451111, "memory_gb": 7.721559524536133, "step_time_ms": 3366.276502609253, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:14] (step=0008268) Train Loss: 0.3245, Train Steps/Sec: 0.28, Epoch: 0.16066848037310533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8269, "loss": 0.1604568362236023, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7490272521973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:18] (step=0008269) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.16068791294209095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8270, "loss": 0.31329208612442017, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8832569122314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:21] (step=0008270) Train Loss: 0.2674, Train Steps/Sec: 0.28, Epoch: 0.16070734551107657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8271, "loss": 0.19338569045066833, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8925552368164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:25] (step=0008271) Train Loss: 0.1718, Train Steps/Sec: 0.28, Epoch: 0.1607267780800622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8272, "loss": 0.27011072635650635, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8173599243164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:29] (step=0008272) Train Loss: 0.2419, Train Steps/Sec: 0.28, Epoch: 0.16074621064904782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8273, "loss": 0.1760912388563156, "memory_gb": 7.721559524536133, "step_time_ms": 
3364.0952110290527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:32] (step=0008273) Train Loss: 0.1996, Train Steps/Sec: 0.28, Epoch: 0.1607656432180334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8274, "loss": 0.1321844905614853, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1037940979004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:36] (step=0008274) Train Loss: 0.1927, Train Steps/Sec: 0.28, Epoch: 0.16078507578701903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8275, "loss": 0.296739786863327, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9000396728516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:39] (step=0008275) Train Loss: 0.2512, Train Steps/Sec: 0.28, Epoch: 0.16080450835600466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8276, "loss": 0.26075950264930725, "memory_gb": 7.721559524536133, "step_time_ms": 3361.318349838257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:43] (step=0008276) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.16082394092499028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8277, "loss": 0.21596506237983704, "memory_gb": 7.715639114379883, "step_time_ms": 3317.1887397766113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:47] (step=0008277) Train Loss: 0.2084, Train Steps/Sec: 0.28, Epoch: 0.1608433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8278, "loss": 0.2213677167892456, "memory_gb": 7.721559524536133, "step_time_ms": 3367.919445037842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:50] (step=0008278) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.16086280606296152, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8279, "loss": 0.21402224898338318, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1179542541504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:54] (step=0008279) Train Loss: 0.2784, Train Steps/Sec: 0.28, Epoch: 0.16088223863194714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8280, "loss": 0.1747981607913971, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1083965301514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:18:57] (step=0008280) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.16090167120093277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8281, "loss": 0.25318604707717896, "memory_gb": 7.721559524536133, "step_time_ms": 3349.4248390197754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:01] (step=0008281) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.1609211037699184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8282, "loss": 0.2166767716407776, "memory_gb": 7.721559524536133, "step_time_ms": 3363.941431045532, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:04] (step=0008282) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.160940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8283, "loss": 0.20639070868492126, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1833534240723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:08] (step=0008283) Train Loss: 0.2011, Train Steps/Sec: 0.28, Epoch: 0.16095996890788963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8284, "loss": 0.11888159811496735, "memory_gb": 7.721559524536133, 
"step_time_ms": 3366.8928146362305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:12] (step=0008284) Train Loss: 0.1697, Train Steps/Sec: 0.28, Epoch: 0.16097940147687526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8285, "loss": 0.2058350145816803, "memory_gb": 7.721559524536133, "step_time_ms": 3362.448215484619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:15] (step=0008285) Train Loss: 0.2084, Train Steps/Sec: 0.28, Epoch: 0.16099883404586085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8286, "loss": 0.15395531058311462, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6832447052, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:19] (step=0008286) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.16101826661484647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8287, "loss": 0.23676033318042755, "memory_gb": 7.721559524536133, "step_time_ms": 3366.793155670166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:22] (step=0008287) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.1610376991838321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8288, "loss": 0.17543645203113556, "memory_gb": 7.721559524536133, "step_time_ms": 3364.028215408325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:26] (step=0008288) Train Loss: 0.2033, Train Steps/Sec: 0.28, Epoch: 0.16105713175281772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8289, "loss": 0.27225521206855774, "memory_gb": 7.721559524536133, "step_time_ms": 3356.377363204956, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:30] (step=0008289) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 
0.16107656432180334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8290, "loss": 0.15788599848747253, "memory_gb": 7.721559524536133, "step_time_ms": 3363.496780395508, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:33] (step=0008290) Train Loss: 0.1472, Train Steps/Sec: 0.28, Epoch: 0.16109599689078896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8291, "loss": 0.2542702555656433, "memory_gb": 7.721559524536133, "step_time_ms": 3361.423969268799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:37] (step=0008291) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.16111542945977458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8292, "loss": 0.335379421710968, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2968196868896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:40] (step=0008292) Train Loss: 0.2770, Train Steps/Sec: 0.28, Epoch: 0.1611348620287602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8293, "loss": 0.18429183959960938, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9813919067383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:44] (step=0008293) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.16115429459774583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8294, "loss": 0.3666308522224426, "memory_gb": 7.721559524536133, "step_time_ms": 3363.77215385437, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:48] (step=0008294) Train Loss: 0.2951, Train Steps/Sec: 0.28, Epoch: 0.16117372716673145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8295, "loss": 0.22656702995300293, "memory_gb": 
7.721559524536133, "step_time_ms": 3498.6257553100586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:51] (step=0008295) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.16119315973571707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8296, "loss": 0.2310313880443573, "memory_gb": 7.721559524536133, "step_time_ms": 3358.158588409424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:55] (step=0008296) Train Loss: 0.2607, Train Steps/Sec: 0.28, Epoch: 0.1612125923047027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:19:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8297, "loss": 0.24851085245609283, "memory_gb": 7.721559524536133, "step_time_ms": 3344.432830810547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:19:58] (step=0008297) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.1612320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8298, "loss": 0.1745930016040802, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5965843200684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:02] (step=0008298) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.1612514574426739, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8299, "loss": 0.20951375365257263, "memory_gb": 7.721559524536133, "step_time_ms": 3366.933584213257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:05] (step=0008299) Train Loss: 0.1824, Train Steps/Sec: 0.28, Epoch: 0.16127089001165953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8300, "loss": 0.121026411652565, "memory_gb": 7.721559524536133, "step_time_ms": 3359.736919403076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:09] (step=0008300) Train Loss: 0.1503, Train Steps/Sec: 
0.27, Epoch: 0.16129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8301, "loss": 0.24385084211826324, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3031635284424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:13] (step=0008301) Train Loss: 0.2839, Train Steps/Sec: 0.28, Epoch: 0.16130975514963078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8302, "loss": 0.24917030334472656, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7045669555664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:16] (step=0008302) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.1613291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8303, "loss": 0.31296783685684204, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1396083831787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:20] (step=0008303) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.16134862028760202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8304, "loss": 0.2136378437280655, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6322746276855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:24] (step=0008304) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.16136805285658765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8305, "loss": 0.17275211215019226, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9905967712402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:27] (step=0008305) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.16138748542557327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8306, "loss": 
0.17511604726314545, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7500324249268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:31] (step=0008306) Train Loss: 0.1645, Train Steps/Sec: 0.28, Epoch: 0.1614069179945589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8307, "loss": 0.2534559369087219, "memory_gb": 7.721559524536133, "step_time_ms": 3357.327699661255, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:34] (step=0008307) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.1614263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8308, "loss": 0.17364491522312164, "memory_gb": 7.721559524536133, "step_time_ms": 3361.294984817505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:38] (step=0008308) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.1614457831325301, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8309, "loss": 0.34108948707580566, "memory_gb": 7.721559524536133, "step_time_ms": 3356.736660003662, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:42] (step=0008309) Train Loss: 0.3208, Train Steps/Sec: 0.28, Epoch: 0.16146521570151573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8310, "loss": 0.2244688719511032, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7850799560547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:45] (step=0008310) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.16148464827050135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8311, "loss": 0.2783493399620056, "memory_gb": 7.721559524536133, "step_time_ms": 3359.003782272339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:49] (step=0008311) Train 
Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.16150408083948697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8312, "loss": 0.20488597452640533, "memory_gb": 7.721559524536133, "step_time_ms": 3360.898733139038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:52] (step=0008312) Train Loss: 0.1932, Train Steps/Sec: 0.28, Epoch: 0.1615235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:20:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8313, "loss": 0.20409192144870758, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0426445007324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:20:56] (step=0008313) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.16154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8314, "loss": 0.17939196527004242, "memory_gb": 7.721559524536133, "step_time_ms": 3356.839418411255, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:00] (step=0008314) Train Loss: 0.1617, Train Steps/Sec: 0.28, Epoch: 0.16156237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8315, "loss": 0.22729569673538208, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8167686462402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:03] (step=0008315) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.16158181111542946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8316, "loss": 0.22149938344955444, "memory_gb": 7.721559524536133, "step_time_ms": 3354.393243789673, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:07] (step=0008316) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.16160124368441509, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 
8317, "loss": 0.25650399923324585, "memory_gb": 7.721559524536133, "step_time_ms": 3355.304956436157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:10] (step=0008317) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.1616206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8318, "loss": 0.22362813353538513, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3148040771484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:14] (step=0008318) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.16164010882238633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8319, "loss": 0.19532370567321777, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9437522888184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:17] (step=0008319) Train Loss: 0.1934, Train Steps/Sec: 0.28, Epoch: 0.16165954139137195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8320, "loss": 0.22823397815227509, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4635047912598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:21] (step=0008320) Train Loss: 0.2676, Train Steps/Sec: 0.28, Epoch: 0.16167897396035755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8321, "loss": 0.2102392315864563, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1859855651855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:25] (step=0008321) Train Loss: 0.2571, Train Steps/Sec: 0.28, Epoch: 0.16169840652934317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8322, "loss": 0.23147466778755188, "memory_gb": 7.721559524536133, "step_time_ms": 3350.475549697876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:28] 
(step=0008322) Train Loss: 0.2620, Train Steps/Sec: 0.28, Epoch: 0.1617178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8323, "loss": 0.23766396939754486, "memory_gb": 7.721559524536133, "step_time_ms": 3353.224992752075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:32] (step=0008323) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.16173727166731441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8324, "loss": 0.16570639610290527, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7660369873047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:35] (step=0008324) Train Loss: 0.2198, Train Steps/Sec: 0.28, Epoch: 0.16175670423630004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8325, "loss": 0.1700565218925476, "memory_gb": 7.721559524536133, "step_time_ms": 3354.26664352417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:39] (step=0008325) Train Loss: 0.1716, Train Steps/Sec: 0.28, Epoch: 0.16177613680528566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8326, "loss": 0.2454032301902771, "memory_gb": 7.721559524536133, "step_time_ms": 3348.245143890381, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:42] (step=0008326) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.16179556937427128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8327, "loss": 0.2059875726699829, "memory_gb": 7.721559524536133, "step_time_ms": 3352.186918258667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:46] (step=0008327) Train Loss: 0.2465, Train Steps/Sec: 0.28, Epoch: 0.1618150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:50] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 8328, "loss": 0.19788941740989685, "memory_gb": 7.721559524536133, "step_time_ms": 3355.250597000122, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:50] (step=0008328) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.16183443451224253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8329, "loss": 0.19863350689411163, "memory_gb": 7.721559524536133, "step_time_ms": 3353.742837905884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:53] (step=0008329) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.16185386708122815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:21:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8330, "loss": 0.3384995460510254, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8316230773926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:21:57] (step=0008330) Train Loss: 0.3255, Train Steps/Sec: 0.28, Epoch: 0.16187329965021377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8331, "loss": 0.16411727666854858, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9414615631104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:00] (step=0008331) Train Loss: 0.1581, Train Steps/Sec: 0.28, Epoch: 0.16189273221919936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8332, "loss": 0.10387752205133438, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5798320770264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:04] (step=0008332) Train Loss: 0.1591, Train Steps/Sec: 0.28, Epoch: 0.161912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8333, "loss": 0.26375335454940796, "memory_gb": 7.721559524536133, "step_time_ms": 3344.1035747528076, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 08:22:08] (step=0008333) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.1619315973571706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8334, "loss": 0.20330817997455597, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1996479034424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:11] (step=0008334) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.16195102992615623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8335, "loss": 0.308626264333725, "memory_gb": 7.721559524536133, "step_time_ms": 3351.594924926758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:15] (step=0008335) Train Loss: 0.3063, Train Steps/Sec: 0.28, Epoch: 0.16197046249514185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8336, "loss": 0.2550867199897766, "memory_gb": 7.721559524536133, "step_time_ms": 3349.489450454712, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:18] (step=0008336) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.16198989506412748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8337, "loss": 0.2797660231590271, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3404388427734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:22] (step=0008337) Train Loss: 0.2096, Train Steps/Sec: 0.28, Epoch: 0.1620093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8338, "loss": 0.24401916563510895, "memory_gb": 7.721559524536133, "step_time_ms": 3341.6340351104736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:25] (step=0008338) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.16202876020209872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:29] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 8339, "loss": 0.2940840721130371, "memory_gb": 7.721559524536133, "step_time_ms": 3354.529619216919, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:29] (step=0008339) Train Loss: 0.3185, Train Steps/Sec: 0.28, Epoch: 0.16204819277108434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8340, "loss": 0.32252562046051025, "memory_gb": 7.721559524536133, "step_time_ms": 3354.727268218994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:33] (step=0008340) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.16206762534006997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8341, "loss": 0.21690276265144348, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5779247283936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:36] (step=0008341) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.1620870579090556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8342, "loss": 0.203882098197937, "memory_gb": 7.721559524536133, "step_time_ms": 3494.8630332946777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:40] (step=0008342) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.1621064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8343, "loss": 0.22240141034126282, "memory_gb": 7.715639114379883, "step_time_ms": 3316.283941268921, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:43] (step=0008343) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.1621259230470268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8344, "loss": 0.20334933698177338, "memory_gb": 7.721559524536133, "step_time_ms": 3348.5708236694336, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 08:22:47] (step=0008344) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.16214535561601243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8345, "loss": 0.1551666259765625, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3357124328613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:50] (step=0008345) Train Loss: 0.1492, Train Steps/Sec: 0.28, Epoch: 0.16216478818499805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8346, "loss": 0.19270551204681396, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7847785949707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:54] (step=0008346) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.16218422075398367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:22:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8347, "loss": 0.13180460035800934, "memory_gb": 7.721559524536133, "step_time_ms": 3354.945182800293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:22:58] (step=0008347) Train Loss: 0.1712, Train Steps/Sec: 0.27, Epoch: 0.1622036533229693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8348, "loss": 0.3006243109703064, "memory_gb": 7.721559524536133, "step_time_ms": 3358.433961868286, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:01] (step=0008348) Train Loss: 0.2657, Train Steps/Sec: 0.28, Epoch: 0.16222308589195492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8349, "loss": 0.14792871475219727, "memory_gb": 7.721559524536133, "step_time_ms": 3339.012622833252, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:05] (step=0008349) Train Loss: 0.1661, Train Steps/Sec: 0.28, Epoch: 0.16224251846094054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 08:23:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8350, "loss": 0.24288415908813477, "memory_gb": 7.721559524536133, "step_time_ms": 3352.301597595215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:08] (step=0008350) Train Loss: 0.2927, Train Steps/Sec: 0.28, Epoch: 0.16226195102992616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8351, "loss": 0.08734691143035889, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7261486053467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:12] (step=0008351) Train Loss: 0.1830, Train Steps/Sec: 0.28, Epoch: 0.16228138359891178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8352, "loss": 0.19882836937904358, "memory_gb": 7.721559524536133, "step_time_ms": 3356.436252593994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:16] (step=0008352) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.1623008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8353, "loss": 0.15465016663074493, "memory_gb": 7.721559524536133, "step_time_ms": 3345.898389816284, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:19] (step=0008353) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.16232024873688303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8354, "loss": 0.2060851752758026, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2541522979736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:23] (step=0008354) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.16233968130586865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8355, "loss": 0.3274770975112915, "memory_gb": 7.721559524536133, "step_time_ms": 3358.365058898926, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:26] (step=0008355) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.16235911387485424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8356, "loss": 0.1178688108921051, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3081798553467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:30] (step=0008356) Train Loss: 0.2015, Train Steps/Sec: 0.28, Epoch: 0.16237854644383987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8357, "loss": 0.1945868283510208, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9700717926025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:33] (step=0008357) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.1623979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8358, "loss": 0.2313007414340973, "memory_gb": 7.721559524536133, "step_time_ms": 3347.0420837402344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:37] (step=0008358) Train Loss: 0.1995, Train Steps/Sec: 0.28, Epoch: 0.1624174115818111, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8359, "loss": 0.25133514404296875, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5824756622314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:41] (step=0008359) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.16243684415079673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8360, "loss": 0.15646326541900635, "memory_gb": 7.721559524536133, "step_time_ms": 3356.87255859375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:44] (step=0008360) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.16245627671978236, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 08:23:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8361, "loss": 0.3041289448738098, "memory_gb": 7.721559524536133, "step_time_ms": 3357.288360595703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:48] (step=0008361) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.16247570928876798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8362, "loss": 0.22765754163265228, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3524951934814, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:51] (step=0008362) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.1624951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8363, "loss": 0.2708585858345032, "memory_gb": 7.721559524536133, "step_time_ms": 3351.29714012146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:55] (step=0008363) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.16251457442673922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:23:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8364, "loss": 0.328036367893219, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9771099090576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:23:58] (step=0008364) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.16253400699572484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8365, "loss": 0.2163623571395874, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5285205841064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:02] (step=0008365) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.16255343956471047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8366, "loss": 0.30467337369918823, "memory_gb": 7.721559524536133, "step_time_ms": 
3357.84912109375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:06] (step=0008366) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.16257287213369606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8367, "loss": 0.31514161825180054, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2122116088867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:09] (step=0008367) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.16259230470268168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8368, "loss": 0.364545077085495, "memory_gb": 7.715639114379883, "step_time_ms": 3316.7998790740967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:13] (step=0008368) Train Loss: 0.3076, Train Steps/Sec: 0.28, Epoch: 0.1626117372716673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8369, "loss": 0.15478670597076416, "memory_gb": 7.721559524536133, "step_time_ms": 3342.880964279175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:16] (step=0008369) Train Loss: 0.1895, Train Steps/Sec: 0.28, Epoch: 0.16263116984065293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8370, "loss": 0.28150174021720886, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3409061431885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:20] (step=0008370) Train Loss: 0.2940, Train Steps/Sec: 0.28, Epoch: 0.16265060240963855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8371, "loss": 0.24914395809173584, "memory_gb": 7.721559524536133, "step_time_ms": 3349.599838256836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:23] (step=0008371) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.16267003497862417, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8372, "loss": 0.29683545231819153, "memory_gb": 7.721559524536133, "step_time_ms": 3353.165864944458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:27] (step=0008372) Train Loss: 0.2616, Train Steps/Sec: 0.28, Epoch: 0.1626894675476098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8373, "loss": 0.25969040393829346, "memory_gb": 7.721559524536133, "step_time_ms": 3357.264518737793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:31] (step=0008373) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.16270890011659542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8374, "loss": 0.19730795919895172, "memory_gb": 7.721559524536133, "step_time_ms": 3357.435703277588, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:34] (step=0008374) Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.16272833268558104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8375, "loss": 0.15367552638053894, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1084995269775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:38] (step=0008375) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.16274776525456666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8376, "loss": 0.28476524353027344, "memory_gb": 7.721559524536133, "step_time_ms": 3354.12859916687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:41] (step=0008376) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.16276719782355228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8377, "loss": 0.18932044506072998, "memory_gb": 7.721559524536133, 
"step_time_ms": 3358.661651611328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:45] (step=0008377) Train Loss: 0.1836, Train Steps/Sec: 0.28, Epoch: 0.1627866303925379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8378, "loss": 0.2206191122531891, "memory_gb": 7.721559524536133, "step_time_ms": 3348.2093811035156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:48] (step=0008378) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.1628060629615235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8379, "loss": 0.292957067489624, "memory_gb": 7.721559524536133, "step_time_ms": 3359.355926513672, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:52] (step=0008379) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.16282549553050912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8380, "loss": 0.21765287220478058, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2372875213623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:56] (step=0008380) Train Loss: 0.1854, Train Steps/Sec: 0.28, Epoch: 0.16284492809949475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:24:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8381, "loss": 0.2504551112651825, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5643978118896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:24:59] (step=0008381) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.16286436066848037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:25:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8382, "loss": 0.20156249403953552, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8475036621094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:25:03] (step=0008382) Train Loss: 0.2512, Train Steps/Sec: 0.28, Epoch: 
0.162883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:25:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8383, "loss": 0.3239694833755493, "memory_gb": 7.721559524536133, "step_time_ms": 3501.396417617798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:25:06] (step=0008383) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.1629032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:25:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8384, "loss": 0.25229960680007935, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7357273101807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:25:10] (step=0008384) Train Loss: 0.2868, Train Steps/Sec: 0.28, Epoch: 0.16292265837543723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:25:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8385, "loss": 0.17569996416568756, "memory_gb": 7.721559524536133, "step_time_ms": 3357.288122177124, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:25:13] (step=0008385) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.16294209094442286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:25:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8386, "loss": 0.2595190703868866, "memory_gb": 7.721559524536133, "step_time_ms": 3358.391523361206, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:25:17] (step=0008386) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.16296152351340848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:25:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8387, "loss": 0.27809926867485046, "memory_gb": 7.721559524536133, "step_time_ms": 3359.639883041382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:25:21] (step=0008387) Train Loss: 0.3204, Train Steps/Sec: 0.28, Epoch: 0.1629809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:25:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8388, "loss": 0.41470104455947876, "memory_gb": 
7.721559524536133, "step_time_ms": 3362.5285625457764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:24] (step=0008388) Train Loss: 0.3851, Train Steps/Sec: 0.27, Epoch: 0.16300038865137972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8389, "loss": 0.1880999505519867, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5825386047363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:28] (step=0008389) Train Loss: 0.1855, Train Steps/Sec: 0.28, Epoch: 0.16301982122036532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8390, "loss": 0.24227869510650635, "memory_gb": 7.721559524536133, "step_time_ms": 3361.983060836792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:31] (step=0008390) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.16303925378935094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8391, "loss": 0.20525094866752625, "memory_gb": 7.721559524536133, "step_time_ms": 3364.914655685425, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:35] (step=0008391) Train Loss: 0.2555, Train Steps/Sec: 0.28, Epoch: 0.16305868635833656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8392, "loss": 0.30387598276138306, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7845001220703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:39] (step=0008392) Train Loss: 0.2855, Train Steps/Sec: 0.28, Epoch: 0.16307811892732219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8393, "loss": 0.3231504559516907, "memory_gb": 7.721559524536133, "step_time_ms": 3369.417905807495, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:42] (step=0008393) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.1630975514963078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8394, "loss": 0.17575781047344208, "memory_gb": 7.721559524536133, "step_time_ms": 3359.666109085083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:46] (step=0008394) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.16311698406529343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8395, "loss": 0.3741317391395569, "memory_gb": 7.721559524536133, "step_time_ms": 3364.938735961914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:49] (step=0008395) Train Loss: 0.3257, Train Steps/Sec: 0.28, Epoch: 0.16313641663427905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8396, "loss": 0.2486395239830017, "memory_gb": 7.721559524536133, "step_time_ms": 3361.037492752075, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:53] (step=0008396) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.16315584920326467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:25:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8397, "loss": 0.1736631989479065, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9871139526367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:25:57] (step=0008397) Train Loss: 0.2029, Train Steps/Sec: 0.28, Epoch: 0.1631752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8398, "loss": 0.22937670350074768, "memory_gb": 7.721559524536133, "step_time_ms": 3358.703851699829, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:00] (step=0008398) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.16319471434123592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8399, "loss": 0.2639104127883911, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0363216400146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:04] (step=0008399) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.16321414691022154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8400, "loss": 0.24473625421524048, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5683784484863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:07] (step=0008400) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.16323357947920716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8401, "loss": 0.24161043763160706, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7662658691406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:11] (step=0008401) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.16325301204819276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8402, "loss": 0.31733113527297974, "memory_gb": 7.721559524536133, "step_time_ms": 3356.025218963623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:14] (step=0008402) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.16327244461717838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8403, "loss": 0.2158716917037964, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9889793395996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:18] (step=0008403) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.163291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8404, "loss": 0.23620116710662842, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6980361938477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:22] (step=0008404) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.16331130975514963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8405, "loss": 0.21643342077732086, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8944206237793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:25] (step=0008405) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.16333074232413525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8406, "loss": 0.2510427236557007, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3498725891113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:29] (step=0008406) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.16335017489312087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8407, "loss": 0.24938522279262543, "memory_gb": 7.721559524536133, "step_time_ms": 3363.969087600708, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:32] (step=0008407) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.1633696074621065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8408, "loss": 0.1646515429019928, "memory_gb": 7.721559524536133, "step_time_ms": 3354.962110519409, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:36] (step=0008408) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.16338904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8409, "loss": 0.16646872460842133, "memory_gb": 7.721559524536133, "step_time_ms": 3359.069347381592, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:40] (step=0008409) Train Loss: 0.1588, Train Steps/Sec: 0.28, Epoch: 0.16340847260007774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8410, "loss": 0.24357545375823975, "memory_gb": 7.721559524536133, "step_time_ms": 3346.8613624572754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:43] (step=0008410) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.16342790516906336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8411, "loss": 0.360099196434021, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0821495056152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:47] (step=0008411) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.16344733773804898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8412, "loss": 0.27880585193634033, "memory_gb": 7.721559524536133, "step_time_ms": 3357.107639312744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:50] (step=0008412) Train Loss: 0.2685, Train Steps/Sec: 0.28, Epoch: 0.1634667703070346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8413, "loss": 0.29049885272979736, "memory_gb": 7.721559524536133, "step_time_ms": 3356.260299682617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:54] (step=0008413) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.1634862028760202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:26:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8414, "loss": 0.20198994874954224, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5585613250732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:26:57] (step=0008414) Train Loss: 0.2296, Train Steps/Sec: 0.28, Epoch: 0.16350563544500582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8415, "loss": 0.2547234296798706, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2288494110107, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:01] (step=0008415) Train Loss: 0.2949, Train Steps/Sec: 0.28, Epoch: 0.16352506801399144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8416, "loss": 0.33823585510253906, "memory_gb": 7.721559524536133, "step_time_ms": 3352.634906768799, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:05] (step=0008416) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.16354450058297706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8417, "loss": 0.23713456094264984, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7566146850586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:08] (step=0008417) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.1635639331519627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8418, "loss": 0.25199073553085327, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1179637908936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:12] (step=0008418) Train Loss: 0.2092, Train Steps/Sec: 0.28, Epoch: 0.1635833657209483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8419, "loss": 0.2501038908958435, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6295166015625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:15] (step=0008419) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.16360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8420, "loss": 0.33970320224761963, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7770042419434, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:19] (step=0008420) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.16362223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8421, "loss": 0.3083800673484802, "memory_gb": 7.721559524536133, "step_time_ms": 3355.344772338867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:22] (step=0008421) Train Loss: 0.2733, Train Steps/Sec: 0.28, Epoch: 0.16364166342790518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8422, "loss": 0.36121389269828796, "memory_gb": 7.721559524536133, "step_time_ms": 3354.671001434326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:26] (step=0008422) Train Loss: 0.2927, Train Steps/Sec: 0.28, Epoch: 0.1636610959968908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8423, "loss": 0.3348207175731659, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0004234313965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:30] (step=0008423) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.16368052856587642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8424, "loss": 0.3075680136680603, "memory_gb": 7.721559524536133, "step_time_ms": 3349.727153778076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:33] (step=0008424) Train Loss: 0.2909, Train Steps/Sec: 0.28, Epoch: 0.16369996113486202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8425, "loss": 0.21428440511226654, "memory_gb": 7.721559524536133, "step_time_ms": 3356.541633605957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:37] (step=0008425) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.16371939370384764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8426, "loss": 0.18012648820877075, "memory_gb": 7.721559524536133, "step_time_ms": 3354.696035385132, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:40] (step=0008426) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.16373882627283326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8427, "loss": 0.20546963810920715, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7820320129395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:44] (step=0008427) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.16375825884181888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8428, "loss": 0.19249548017978668, "memory_gb": 7.721559524536133, "step_time_ms": 3350.52752494812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:47] (step=0008428) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.1637776914108045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8429, "loss": 0.27754122018814087, "memory_gb": 7.721559524536133, "step_time_ms": 3357.999086380005, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:51] (step=0008429) Train Loss: 0.2984, Train Steps/Sec: 0.28, Epoch: 0.16379712397979013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8430, "loss": 0.24573829770088196, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4881343841553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:55] (step=0008430) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.16381655654877575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:27:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8431, "loss": 0.24972790479660034, "memory_gb": 7.721559524536133, "step_time_ms": 3495.6352710723877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:27:58] (step=0008431) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.16383598911776137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8432, "loss": 0.2504180371761322, "memory_gb": 7.721559524536133, "step_time_ms": 3346.618413925171, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:02] (step=0008432) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.163855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8433, "loss": 0.26455017924308777, "memory_gb": 7.721559524536133, "step_time_ms": 3356.292486190796, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:05] (step=0008433) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.16387485425573262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8434, "loss": 0.2552371323108673, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8339233398438, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:09] (step=0008434) Train Loss: 0.2701, Train Steps/Sec: 0.28, Epoch: 0.16389428682471824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8435, "loss": 0.2322477400302887, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5005321502686, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:13] (step=0008435) Train Loss: 0.2168, Train Steps/Sec: 0.28, Epoch: 0.16391371939370386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8436, "loss": 0.15722742676734924, "memory_gb": 7.721559524536133, "step_time_ms": 3357.653856277466, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:16] (step=0008436) Train Loss: 0.2129, Train Steps/Sec: 0.27, Epoch: 0.16393315196268946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8437, "loss": 0.3132319748401642, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7984104156494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:20] (step=0008437) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.16395258453167508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8438, "loss": 0.24764087796211243, "memory_gb": 7.721559524536133, "step_time_ms": 3349.210023880005, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:23] (step=0008438) Train Loss: 0.2062, Train Steps/Sec: 0.28, Epoch: 0.1639720171006607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8439, "loss": 0.09699507057666779, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2143783569336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:27] (step=0008439) Train Loss: 0.1372, Train Steps/Sec: 0.28, Epoch: 0.16399144966964632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8440, "loss": 0.1445745825767517, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0849170684814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:31] (step=0008440) Train Loss: 0.2100, Train Steps/Sec: 0.28, Epoch: 0.16401088223863194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8441, "loss": 0.24954818189144135, "memory_gb": 7.721559524536133, "step_time_ms": 3347.118377685547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:34] (step=0008441) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.16403031480761757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8442, "loss": 0.21794164180755615, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7841777801514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:38] (step=0008442) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.1640497473766032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8443, "loss": 0.1721504032611847, "memory_gb": 7.721559524536133, "step_time_ms": 3360.184669494629, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:41] (step=0008443) Train Loss: 0.1679, Train Steps/Sec: 0.28, Epoch: 0.1640691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8444, "loss": 0.1942613422870636, "memory_gb": 7.721559524536133, "step_time_ms": 3357.304334640503, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:45] (step=0008444) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.16408861251457443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8445, "loss": 0.15129823982715607, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5779457092285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:48] (step=0008445) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.16410804508356006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8446, "loss": 0.1625278890132904, "memory_gb": 7.721559524536133, "step_time_ms": 3342.8244590759277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:52] (step=0008446) Train Loss: 0.1895, Train Steps/Sec: 0.28, Epoch: 0.16412747765254568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8447, "loss": 0.3003285527229309, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8685989379883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:55] (step=0008447) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.1641469102215313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:28:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8448, "loss": 0.37930285930633545, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1933040618896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:28:59] (step=0008448) Train Loss: 0.2828, Train Steps/Sec: 0.28, Epoch: 0.1641663427905169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8449, "loss": 0.22765563428401947, "memory_gb": 7.721559524536133, "step_time_ms": 3350.025177001953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:03] (step=0008449) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.16418577535950252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8450, "loss": 0.27603983879089355, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5292778015137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:06] (step=0008450) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.16420520792848814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8451, "loss": 0.26390892267227173, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2160472869873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:10] (step=0008451) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.16422464049747376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8452, "loss": 0.19742393493652344, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1340827941895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:13] (step=0008452) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.16424407306645938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8453, "loss": 0.26518940925598145, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2303733825684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:17] (step=0008453) Train Loss: 0.2709, Train Steps/Sec: 0.28, Epoch: 0.164263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8454, "loss": 0.22016547620296478, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3708534240723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:20] (step=0008454) Train Loss: 0.2335, Train Steps/Sec: 0.28, Epoch: 0.16428293820443063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8455, "loss": 0.21591615676879883, "memory_gb": 7.721559524536133, "step_time_ms": 3355.923652648926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:24] (step=0008455) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.16430237077341625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8456, "loss": 0.14602118730545044, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9812774658203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:28] (step=0008456) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.16432180334240187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8457, "loss": 0.2309102565050125, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2311820983887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:31] (step=0008457) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.1643412359113875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8458, "loss": 0.15882667899131775, "memory_gb": 7.721559524536133, "step_time_ms": 3352.022647857666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:35] (step=0008458) Train Loss: 0.1554, Train Steps/Sec: 0.28, Epoch: 0.16436066848037312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8459, "loss": 0.24154558777809143, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2964668273926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:38] (step=0008459) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 0.1643801010493587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8460, "loss": 0.2147258222103119, "memory_gb": 7.721559524536133, "step_time_ms": 3358.668327331543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:42] (step=0008460) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.16439953361834433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8461, "loss": 0.20263706147670746, "memory_gb": 7.721559524536133, "step_time_ms": 3357.896327972412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:45] (step=0008461) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.16441896618732996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8462, "loss": 0.22050537168979645, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8900051116943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:49] (step=0008462) Train Loss: 0.2554, Train Steps/Sec: 0.28, Epoch: 0.16443839875631558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8463, "loss": 0.12401637434959412, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0502529144287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:53] (step=0008463) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.1644578313253012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:29:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8464, "loss": 0.3658002018928528, "memory_gb": 7.721559524536133, "step_time_ms": 3361.982583999634, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:29:56] (step=0008464) Train Loss: 0.3474, Train Steps/Sec: 0.28, Epoch: 0.16447726389428682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8465, "loss": 0.27776193618774414, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8317375183105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:00] (step=0008465) Train Loss: 0.3118, Train Steps/Sec: 0.28, Epoch: 0.16449669646327245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8466, "loss": 0.2580576241016388, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1628284454346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:03] (step=0008466) Train Loss: 0.2274, Train Steps/Sec: 0.28, Epoch: 0.16451612903225807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8467, "loss": 0.13047632575035095, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4510765075684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:07] (step=0008467) Train Loss: 0.1489, Train Steps/Sec: 0.28, Epoch: 0.1645355616012437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8468, "loss": 0.1521822214126587, "memory_gb": 7.721559524536133, "step_time_ms": 3358.452558517456, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:10] (step=0008468) Train Loss: 0.1590, Train Steps/Sec: 0.28, Epoch: 0.1645549941702293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8469, "loss": 0.14874638617038727, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2582683563232, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:14] (step=0008469) Train Loss: 0.1364, Train Steps/Sec: 0.28, Epoch: 0.16457442673921494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8470, "loss": 0.19739966094493866, "memory_gb": 7.721559524536133, "step_time_ms": 3361.49001121521, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:18] (step=0008470) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.16459385930820056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8471, "loss": 0.16324695944786072, "memory_gb": 7.721559524536133, "step_time_ms": 3499.948024749756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:21] (step=0008471) Train Loss: 0.1656, Train Steps/Sec: 0.28, Epoch: 0.16461329187718615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8472, "loss": 0.2823202908039093, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5997562408447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:25] (step=0008472) Train Loss: 0.2737, Train Steps/Sec: 0.28, Epoch: 0.16463272444617177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8473, "loss": 0.1479683518409729, "memory_gb": 7.721559524536133, "step_time_ms": 3359.782934188843, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:28] (step=0008473) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.1646521570151574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8474, "loss": 0.21036991477012634, "memory_gb": 7.721559524536133, "step_time_ms": 3351.949691772461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:32] (step=0008474) Train Loss: 0.1717, Train Steps/Sec: 0.28, Epoch: 0.16467158958414302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8475, "loss": 0.26722633838653564, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7329387664795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:35] (step=0008475) Train Loss: 0.2867, Train Steps/Sec: 0.28, Epoch: 0.16469102215312864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8476, "loss": 0.2895164489746094, "memory_gb": 7.721559524536133, "step_time_ms": 3363.840341567993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:39] (step=0008476) Train Loss: 0.2801, Train Steps/Sec: 0.27, Epoch: 0.16471045472211426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8477, "loss": 0.15820103883743286, "memory_gb": 7.721559524536133, "step_time_ms": 3347.353219985962, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:43] (step=0008477) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.16472988729109989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8478, "loss": 0.10594705492258072, "memory_gb": 7.721559524536133, "step_time_ms": 3357.898950576782, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:46] (step=0008478) Train Loss: 0.1637, Train Steps/Sec: 0.28, Epoch: 0.1647493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8479, "loss": 0.1989016830921173, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4237308502197, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:50] (step=0008479) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.16476875242907113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8480, "loss": 0.19534295797348022, "memory_gb": 7.721559524536133, "step_time_ms": 3362.403631210327, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:53] (step=0008480) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.16478818499805675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:30:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8481, "loss": 0.29531943798065186, "memory_gb": 7.721559524536133, "step_time_ms": 3353.816032409668, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:30:57] (step=0008481) Train Loss: 0.2714, Train Steps/Sec: 0.28, Epoch: 0.16480761756704237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8482, "loss": 0.2377171367406845, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8266582489014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:01] (step=0008482) Train Loss: 0.2391, Train Steps/Sec: 0.28, Epoch: 0.16482705013602797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8483, "loss": 0.34692150354385376, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7145080566406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:04] (step=0008483) Train Loss: 0.3105, Train Steps/Sec: 0.28, Epoch: 0.1648464827050136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8484, "loss": 0.3516724407672882, "memory_gb": 7.721559524536133, "step_time_ms": 3363.307237625122, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:08] (step=0008484) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.16486591527399921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8485, "loss": 0.296614408493042, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0754737854004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:11] (step=0008485) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.16488534784298484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8486, "loss": 0.2590485215187073, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2726669311523, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:15] (step=0008486) Train Loss: 0.2692, Train Steps/Sec: 0.28, Epoch: 0.16490478041197046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8487, "loss": 0.23409630358219147, "memory_gb": 7.721559524536133, "step_time_ms": 3356.580972671509, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:18] (step=0008487) Train Loss: 0.1692, Train Steps/Sec: 0.28, Epoch: 0.16492421298095608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8488, "loss": 0.15533563494682312, "memory_gb": 7.721559524536133, "step_time_ms": 3362.934350967407, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:22] (step=0008488) Train Loss: 0.1895, Train Steps/Sec: 0.28, Epoch: 0.1649436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8489, "loss": 0.16356392204761505, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0993156433105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:26] (step=0008489) Train Loss: 0.1405, Train Steps/Sec: 0.28, Epoch: 0.16496307811892733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8490, "loss": 0.30451568961143494, "memory_gb": 7.721559524536133, "step_time_ms": 3364.006280899048, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:29] (step=0008490) Train Loss: 0.2860, Train Steps/Sec: 0.28, Epoch: 0.16498251068791295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8491, "loss": 0.2470950186252594, "memory_gb": 7.721559524536133, "step_time_ms": 3363.751173019409, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:33] (step=0008491) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.16500194325689857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8492, "loss": 0.3450453281402588, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3390197753906, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:36] (step=0008492) Train Loss: 0.2880, Train Steps/Sec: 0.28, Epoch: 0.1650213758258842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8493, "loss": 0.16647382080554962, "memory_gb": 7.721559524536133, "step_time_ms": 3358.896017074585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:40] (step=0008493) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.16504080839486981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8494, "loss": 0.20312254130840302, "memory_gb": 7.715639114379883, "step_time_ms": 3328.263521194458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:44] (step=0008494) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.1650602409638554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8495, "loss": 0.27624982595443726, "memory_gb": 7.721559524536133, "step_time_ms": 3353.104829788208, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:47] (step=0008495) Train Loss: 0.3033, Train Steps/Sec: 0.28, Epoch: 0.16507967353284103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8496, "loss": 0.2709001302719116, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4278564453125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:51] (step=0008496) Train Loss: 0.2956, Train Steps/Sec: 0.28, Epoch: 0.16509910610182665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8497, "loss": 0.25280165672302246, "memory_gb": 7.721559524536133, "step_time_ms": 3358.079195022583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:54] (step=0008497) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.16511853867081228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:31:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8498, "loss": 0.24501341581344604, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6811294555664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:31:58] (step=0008498) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.1651379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8499, "loss": 0.16535428166389465, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3199558258057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:01] (step=0008499) Train Loss: 0.1734, Train Steps/Sec: 0.28, Epoch: 0.16515740380878352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8500, "loss": 0.1871730238199234, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7614040374756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:05] (step=0008500) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.16517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8501, "loss": 0.24588972330093384, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8034172058105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:09] (step=0008501) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.16519626894675477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8502, "loss": 0.16404615342617035, "memory_gb": 7.721559524536133, "step_time_ms": 3363.953113555908, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:12] (step=0008502) Train Loss: 0.1877, Train Steps/Sec: 0.28, Epoch: 0.1652157015157404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8503, "loss": 0.24899379909038544, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9412231445312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:16] (step=0008503) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.165235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8504, "loss": 0.16178102791309357, "memory_gb": 7.721559524536133, "step_time_ms": 3361.75274848938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:19] (step=0008504) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.16525456665371163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8505, "loss": 0.1913183480501175, "memory_gb": 7.721559524536133, "step_time_ms": 3367.59614944458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:23] (step=0008505) Train Loss: 0.1797, Train Steps/Sec: 0.28, Epoch: 0.16527399922269725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8506, "loss": 0.3424963653087616, "memory_gb": 7.721559524536133, "step_time_ms": 3349.9820232391357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:26] (step=0008506) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.16529343179168285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8507, "loss": 0.25848478078842163, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3934001922607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:30] (step=0008507) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.16531286436066847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8508, "loss": 0.33154261112213135, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5171909332275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:34] (step=0008508) Train Loss: 0.2843, Train Steps/Sec: 0.28, Epoch: 0.1653322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8509, "loss": 0.2225942611694336, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2658252716064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:37] (step=0008509) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.16535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8510, "loss": 0.2544262111186981, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5050315856934, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:41] (step=0008510) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.16537116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8511, "loss": 0.15169449150562286, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4909229278564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:44] (step=0008511) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.16539059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8512, "loss": 0.23831650614738464, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9747161865234, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:48] (step=0008512) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.16541002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8513, "loss": 0.2167692482471466, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4510765075684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:51] (step=0008513) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.1654294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8514, "loss": 0.3149980306625366, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5918884277344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:55] (step=0008514) Train Loss: 0.2962, Train Steps/Sec: 0.28, Epoch: 0.16544889234356783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:32:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8515, "loss": 0.19032739102840424, "memory_gb": 7.721559524536133, "step_time_ms": 3345.775365829468, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:32:59] (step=0008515) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.16546832491255345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8516, "loss": 0.3333435356616974, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9766750335693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:02] (step=0008516) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.16548775748153907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8517, "loss": 0.179490327835083, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6901893615723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:06] (step=0008517) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 0.16550719005052467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8518, "loss": 0.22347865998744965, "memory_gb": 7.721559524536133, "step_time_ms": 3504.7616958618164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:09] (step=0008518) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.1655266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8519, "loss": 0.2272062450647354, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5807552337646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:13] (step=0008519) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.1655460551884959, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8520, "loss": 0.24840164184570312, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5713844299316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:16] (step=0008520) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.16556548775748153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8521, "loss": 0.2481725811958313, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7328453063965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:20] (step=0008521) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.16558492032646716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8522, "loss": 0.2552434504032135, "memory_gb": 7.721559524536133, "step_time_ms": 3356.682300567627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:24] (step=0008522) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.16560435289545278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8523, "loss": 0.16753935813903809, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6555976867676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:27] (step=0008523) Train Loss: 0.2405, Train Steps/Sec: 0.27, Epoch: 0.1656237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8524, "loss": 0.23681142926216125, "memory_gb": 7.715639114379883, "step_time_ms": 3322.681427001953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:31] (step=0008524) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.16564321803342402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8525, "loss": 0.19294726848602295, "memory_gb": 7.721559524536133, "step_time_ms": 3349.4713306427, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:34] (step=0008525) Train Loss: 0.2055, Train Steps/Sec: 0.28, Epoch: 0.16566265060240964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8526, "loss": 0.24466365575790405, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1201095581055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:38] (step=0008526) Train Loss: 0.2834, Train Steps/Sec: 0.28, Epoch: 0.16568208317139527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8527, "loss": 0.253816694021225, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6155948638916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:42] (step=0008527) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.1657015157403809, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8528, "loss": 0.18862546980381012, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4518432617188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:45] (step=0008528) Train Loss: 0.2711, Train Steps/Sec: 0.28, Epoch: 0.1657209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8529, "loss": 0.28526532649993896, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7702770233154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:49] (step=0008529) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.1657403808783521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8530, "loss": 0.18299850821495056, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0659370422363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:52] (step=0008530) Train Loss: 0.1611, Train Steps/Sec: 0.28, Epoch: 0.16575981344733773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8531, "loss": 0.22855180501937866, "memory_gb": 7.721559524536133, "step_time_ms": 3352.147340774536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:56] (step=0008531) Train Loss: 0.2638, Train Steps/Sec: 0.28, Epoch: 0.16577924601632335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:33:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8532, "loss": 0.21220162510871887, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2513847351074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:33:59] (step=0008532) Train Loss: 0.1720, Train Steps/Sec: 0.28, Epoch: 0.16579867858530897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8533, "loss": 0.3341241478919983, "memory_gb": 7.721559524536133, "step_time_ms": 3357.015609741211, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:03] (step=0008533) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.1658181111542946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8534, "loss": 0.23597702383995056, "memory_gb": 7.721559524536133, "step_time_ms": 3357.177495956421, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:07] (step=0008534) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.16583754372328022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8535, "loss": 0.23631049692630768, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7751178741455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:10] (step=0008535) Train Loss: 0.3019, Train Steps/Sec: 0.28, Epoch: 0.16585697629226584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8536, "loss": 0.16120342910289764, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0423126220703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:14] (step=0008536) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.16587640886125146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8537, "loss": 0.17672906816005707, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9443950653076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:17] (step=0008537) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.16589584143023708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8538, "loss": 0.15827447175979614, "memory_gb": 7.721559524536133, "step_time_ms": 3350.898027420044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:21] (step=0008538) Train Loss: 0.1596, Train Steps/Sec: 0.28, Epoch: 0.1659152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8539, "loss": 0.28335994482040405, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6645431518555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:24] (step=0008539) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.16593470656820833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8540, "loss": 0.29821479320526123, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9090156555176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:28] (step=0008540) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.16595413913719392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8541, "loss": 0.3022582530975342, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2233448028564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:32] (step=0008541) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.16597357170617955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8542, "loss": 0.3581870198249817, "memory_gb": 7.721559524536133, "step_time_ms": 3357.880115509033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:35] (step=0008542) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.16599300427516517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8543, "loss": 0.1251012086868286, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2282581329346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:39] (step=0008543) Train Loss: 0.1510, Train Steps/Sec: 0.28, Epoch: 0.1660124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8544, "loss": 0.1656695306301117, "memory_gb": 7.721559524536133, "step_time_ms": 3359.623670578003, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:42] (step=0008544) Train Loss: 0.1934, Train Steps/Sec: 0.28, Epoch: 0.1660318694131364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8545, "loss": 0.2321208119392395, "memory_gb": 7.721559524536133, "step_time_ms": 3357.321262359619, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:46] (step=0008545) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.16605130198212203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8546, "loss": 0.3215292990207672, "memory_gb": 7.721559524536133, "step_time_ms": 3347.7087020874023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:49] (step=0008546) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.16607073455110766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8547, "loss": 0.2795233428478241, "memory_gb": 7.721559524536133, "step_time_ms": 3351.658344268799, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:53] (step=0008547) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.16609016712009328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:34:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8548, "loss": 0.23992517590522766, "memory_gb": 7.721559524536133, "step_time_ms": 3352.830171585083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:34:57] (step=0008548) Train Loss: 0.1832, Train Steps/Sec: 0.28, Epoch: 0.1661095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8549, "loss": 0.2989974319934845, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7937774658203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:00] (step=0008549) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.16612903225806452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8550, "loss": 0.1803129017353058, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8334789276123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:04] (step=0008550) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.16614846482705015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8551, "loss": 0.32331281900405884, "memory_gb": 7.721559524536133, "step_time_ms": 3357.058525085449, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:07] (step=0008551) Train Loss: 0.2761, Train Steps/Sec: 0.28, Epoch: 0.16616789739603577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8552, "loss": 0.24929173290729523, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4280014038086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:11] (step=0008552) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.16618732996502136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8553, "loss": 0.24922361969947815, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3504219055176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:14] (step=0008553) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.16620676253400699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8554, "loss": 0.1489933580160141, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7255687713623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:18] (step=0008554) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.1662261951029926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8555, "loss": 0.19829729199409485, "memory_gb": 7.721559524536133, "step_time_ms": 3344.7163105010986, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:21] (step=0008555) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.16624562767197823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8556, "loss": 0.2382161021232605, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1105003356934, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:25] (step=0008556) Train Loss: 0.2588, Train Steps/Sec: 0.28, Epoch: 0.16626506024096385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8557, "loss": 0.28231972455978394, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9562225341797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:29] (step=0008557) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.16628449280994947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8558, "loss": 0.19033114612102509, "memory_gb": 7.721559524536133, "step_time_ms": 3361.586809158325, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:32] (step=0008558) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.1663039253789351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8559, "loss": 0.28817984461784363, "memory_gb": 7.721559524536133, "step_time_ms": 3500.1370906829834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:36] (step=0008559) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.16632335794792072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8560, "loss": 0.2409680187702179, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1815280914307, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:39] (step=0008560) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.16634279051690634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8561, "loss": 0.2005665898323059, "memory_gb": 7.721559524536133, "step_time_ms": 3358.842372894287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:43] (step=0008561) Train Loss: 0.1641, Train Steps/Sec: 0.28, Epoch: 0.16636222308589196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8562, "loss": 0.21527376770973206, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4161014556885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:46] (step=0008562) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.16638165565487759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8563, "loss": 0.23255681991577148, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3900299072266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:50] (step=0008563) Train Loss: 0.2816, Train Steps/Sec: 0.28, Epoch: 0.1664010882238632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8564, "loss": 0.33945512771606445, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8275184631348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:54] (step=0008564) Train Loss: 0.2865, Train Steps/Sec: 0.28, Epoch: 0.1664205207928488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:35:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8565, "loss": 0.22519409656524658, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1273555755615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:35:57] (step=0008565) Train Loss: 0.2419, Train Steps/Sec: 0.28, Epoch: 0.16643995336183443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8566, "loss": 0.2233361303806305, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4274311065674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:01] (step=0008566) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.16645938593082005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8567, "loss": 0.21713891625404358, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5124015808105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:04] (step=0008567) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.16647881849980567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8568, "loss": 0.2070547640323639, "memory_gb": 7.721559524536133, "step_time_ms": 3358.469247817993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:08] (step=0008568) Train Loss: 0.2711, Train Steps/Sec: 0.28, Epoch: 0.1664982510687913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8569, "loss": 0.10050851106643677, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8031787872314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:11] (step=0008569) Train Loss: 0.1363, Train Steps/Sec: 0.28, Epoch: 0.16651768363777691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8570, "loss": 0.3202550411224365, "memory_gb": 7.715639114379883, "step_time_ms": 3325.2971172332764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:15] (step=0008570) Train Loss: 0.2894, Train Steps/Sec: 0.28, Epoch: 0.16653711620676254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8571, "loss": 0.24935004115104675, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4495010375977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:19] (step=0008571) Train Loss: 0.2052, Train Steps/Sec: 0.27, Epoch: 0.16655654877574816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8572, "loss": 0.29805946350097656, "memory_gb": 7.721559524536133, "step_time_ms": 3358.653783798218, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:22] (step=0008572) Train Loss: 0.2999, Train Steps/Sec: 0.28, Epoch: 0.16657598134473378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8573, "loss": 0.3367857336997986, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8646392822266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:26] (step=0008573) Train Loss: 0.2844, Train Steps/Sec: 0.28, Epoch: 0.1665954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8574, "loss": 0.14929251372814178, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9648990631104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:29] (step=0008574) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.16661484648270503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8575, "loss": 0.23341074585914612, "memory_gb": 7.721559524536133, "step_time_ms": 3359.480857849121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:33] (step=0008575) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.16663427905169062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8576, "loss": 0.1661664843559265, "memory_gb": 7.721559524536133, "step_time_ms": 3354.966640472412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:37] (step=0008576) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.16665371162067624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8577, "loss": 0.19965365529060364, "memory_gb": 7.721559524536133, "step_time_ms": 3355.821132659912, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:40] (step=0008577) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.16667314418966186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8578, "loss": 0.20481248199939728, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1878204345703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:44] (step=0008578) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.1666925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8579, "loss": 0.19456274807453156, "memory_gb": 7.721559524536133, "step_time_ms": 3357.262134552002, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:47] (step=0008579) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.1667120093276331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8580, "loss": 0.23169848322868347, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8200759887695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:51] (step=0008580) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.16673144189661873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8581, "loss": 0.3385429084300995, "memory_gb": 7.721559524536133, "step_time_ms": 3361.182451248169, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:54] (step=0008581) Train Loss: 0.2921, Train Steps/Sec: 0.28, Epoch: 0.16675087446560435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:36:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8582, "loss": 0.24228695034980774, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6583347320557, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:36:58] (step=0008582) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.16677030703458998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:37:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8583, "loss": 0.3975173830986023, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5030517578125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:37:01] (step=0008583) Train Loss: 0.3360, Train Steps/Sec: 0.28, Epoch: 0.1667897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:37:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8584, "loss": 0.1704026311635971, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5893173217773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:37:05] (step=0008584) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.16680917217256122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:37:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8585, "loss": 0.22324492037296295, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2170009613037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:09] (step=0008585) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.16682860474154684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8586, "loss": 0.23465806245803833, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5759353637695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:12] (step=0008586) Train Loss: 0.2330, Train Steps/Sec: 0.28, Epoch: 0.16684803731053247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8587, "loss": 0.29941344261169434, "memory_gb": 7.721559524536133, "step_time_ms": 3352.348566055298, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:16] (step=0008587) Train Loss: 0.2745, Train Steps/Sec: 0.28, Epoch: 0.16686746987951806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8588, "loss": 0.22806957364082336, "memory_gb": 7.721559524536133, "step_time_ms": 3356.388568878174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:19] (step=0008588) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.16688690244850368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8589, "loss": 0.1679871529340744, "memory_gb": 7.721559524536133, "step_time_ms": 3348.5705852508545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:23] (step=0008589) Train Loss: 0.2194, Train Steps/Sec: 0.28, Epoch: 0.1669063350174893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8590, "loss": 0.24491019546985626, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8767986297607, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:26] (step=0008590) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.16692576758647493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8591, "loss": 0.23995621502399445, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2000942230225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:30] (step=0008591) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.16694520015546055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8592, "loss": 0.25234758853912354, "memory_gb": 7.721559524536133, "step_time_ms": 3359.318494796753, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:34] (step=0008592) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.16696463272444617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8593, "loss": 0.15623952448368073, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2499237060547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:37] (step=0008593) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.1669840652934318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8594, "loss": 0.2380913943052292, "memory_gb": 7.721559524536133, "step_time_ms": 3359.376907348633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:41] (step=0008594) Train Loss: 0.2887, Train Steps/Sec: 0.28, Epoch: 0.16700349786241742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8595, "loss": 0.2571429908275604, "memory_gb": 7.721559524536133, "step_time_ms": 3362.187623977661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:44] (step=0008595) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.16702293043140304, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 08:37:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8596, "loss": 0.26364362239837646, "memory_gb": 7.721559524536133, "step_time_ms": 3365.589380264282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:48] (step=0008596) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.16704236300038866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8597, "loss": 0.23682671785354614, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4890365600586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:51] (step=0008597) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.16706179556937428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8598, "loss": 0.22831660509109497, "memory_gb": 7.721559524536133, "step_time_ms": 3364.670991897583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:55] (step=0008598) Train Loss: 0.2254, Train Steps/Sec: 0.28, Epoch: 0.16708122813835988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:37:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8599, "loss": 0.2911989986896515, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4657649993896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:37:59] (step=0008599) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.1671006607073455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8600, "loss": 0.20480003952980042, "memory_gb": 7.721559524536133, "step_time_ms": 3498.541831970215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:02] (step=0008600) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.16712009327633112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8601, "loss": 0.3006596565246582, "memory_gb": 7.721559524536133, "step_time_ms": 
3366.7373657226562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:06] (step=0008601) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.16713952584531674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8602, "loss": 0.12518490850925446, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2216453552246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:09] (step=0008602) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.16715895841430237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8603, "loss": 0.16792315244674683, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0049228668213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:13] (step=0008603) Train Loss: 0.2015, Train Steps/Sec: 0.28, Epoch: 0.167178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8604, "loss": 0.2909669876098633, "memory_gb": 7.721559524536133, "step_time_ms": 3362.785577774048, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:17] (step=0008604) Train Loss: 0.2703, Train Steps/Sec: 0.28, Epoch: 0.1671978235522736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8605, "loss": 0.2793727517127991, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3154678344727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:20] (step=0008605) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.16721725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8606, "loss": 0.29372042417526245, "memory_gb": 7.721559524536133, "step_time_ms": 3363.178014755249, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:24] (step=0008606) Train Loss: 0.2870, Train Steps/Sec: 0.28, Epoch: 0.16723668869024486, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8607, "loss": 0.32482534646987915, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4281158447266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:27] (step=0008607) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 0.16725612125923048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8608, "loss": 0.23920023441314697, "memory_gb": 7.721559524536133, "step_time_ms": 3345.7186222076416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:31] (step=0008608) Train Loss: 0.2870, Train Steps/Sec: 0.28, Epoch: 0.1672755538282161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8609, "loss": 0.231429785490036, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3053092956543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:34] (step=0008609) Train Loss: 0.2773, Train Steps/Sec: 0.28, Epoch: 0.16729498639720172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8610, "loss": 0.16326917707920074, "memory_gb": 7.721559524536133, "step_time_ms": 3360.511064529419, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:38] (step=0008610) Train Loss: 0.1702, Train Steps/Sec: 0.28, Epoch: 0.16731441896618732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8611, "loss": 0.1730169653892517, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6720695495605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:41] (step=0008611) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.16733385153517294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8612, "loss": 0.21500283479690552, "memory_gb": 7.721559524536133, 
"step_time_ms": 3364.898681640625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:45] (step=0008612) Train Loss: 0.2485, Train Steps/Sec: 0.27, Epoch: 0.16735328410415856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8613, "loss": 0.3239288926124573, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8167476654053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:49] (step=0008613) Train Loss: 0.2765, Train Steps/Sec: 0.28, Epoch: 0.16737271667314418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8614, "loss": 0.21485862135887146, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2521209716797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:52] (step=0008614) Train Loss: 0.2138, Train Steps/Sec: 0.28, Epoch: 0.1673921492421298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8615, "loss": 0.22980332374572754, "memory_gb": 7.721559524536133, "step_time_ms": 3359.257698059082, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:56] (step=0008615) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.16741158181111543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:38:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8616, "loss": 0.3191871643066406, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6108684539795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:38:59] (step=0008616) Train Loss: 0.2791, Train Steps/Sec: 0.28, Epoch: 0.16743101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8617, "loss": 0.27865809202194214, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1861934661865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:03] (step=0008617) Train Loss: 0.2389, Train Steps/Sec: 0.28, Epoch: 
0.16745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8618, "loss": 0.23023083806037903, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0732345581055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:07] (step=0008618) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.1674698795180723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8619, "loss": 0.21125391125679016, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4489612579346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:10] (step=0008619) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.16748931208705792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8620, "loss": 0.20983298122882843, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5217723846436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:14] (step=0008620) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.16750874465604354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8621, "loss": 0.14938846230506897, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2257289886475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:17] (step=0008621) Train Loss: 0.1854, Train Steps/Sec: 0.28, Epoch: 0.16752817722502916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8622, "loss": 0.14049743115901947, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6837310791016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:21] (step=0008622) Train Loss: 0.1882, Train Steps/Sec: 0.28, Epoch: 0.16754760979401476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8623, "loss": 0.2900386154651642, 
"memory_gb": 7.721559524536133, "step_time_ms": 3354.567527770996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:24] (step=0008623) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.16756704236300038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8624, "loss": 0.2531052231788635, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8829250335693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:28] (step=0008624) Train Loss: 0.2407, Train Steps/Sec: 0.28, Epoch: 0.167586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8625, "loss": 0.30286410450935364, "memory_gb": 7.721559524536133, "step_time_ms": 3344.920873641968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:32] (step=0008625) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.16760590750097162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8626, "loss": 0.21006879210472107, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4329872131348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:35] (step=0008626) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.16762534006995725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8627, "loss": 0.28769415616989136, "memory_gb": 7.715639114379883, "step_time_ms": 3322.1051692962646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:39] (step=0008627) Train Loss: 0.2930, Train Steps/Sec: 0.28, Epoch: 0.16764477263894287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8628, "loss": 0.2294907569885254, "memory_gb": 7.721559524536133, "step_time_ms": 3347.3141193389893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:42] (step=0008628) Train Loss: 0.2166, 
Train Steps/Sec: 0.28, Epoch: 0.1676642052079285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8629, "loss": 0.2264106273651123, "memory_gb": 7.721559524536133, "step_time_ms": 3355.372190475464, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:46] (step=0008629) Train Loss: 0.2716, Train Steps/Sec: 0.28, Epoch: 0.1676836377769141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8630, "loss": 0.23299023509025574, "memory_gb": 7.721559524536133, "step_time_ms": 3350.515604019165, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:49] (step=0008630) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.16770307034589974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8631, "loss": 0.10178008675575256, "memory_gb": 7.721559524536133, "step_time_ms": 3352.597951889038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:53] (step=0008631) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.16772250291488536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8632, "loss": 0.14923207461833954, "memory_gb": 7.721559524536133, "step_time_ms": 3360.713243484497, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:39:57] (step=0008632) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.16774193548387098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8633, "loss": 0.1289822906255722, "memory_gb": 7.721559524536133, "step_time_ms": 3352.674722671509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:00] (step=0008633) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.16776136805285657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8634, "loss": 
0.26657986640930176, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9058227539062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:04] (step=0008634) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.1677808006218422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8635, "loss": 0.2664286494255066, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6808700561523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:07] (step=0008635) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.16780023319082782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8636, "loss": 0.22500716149806976, "memory_gb": 7.721559524536133, "step_time_ms": 3357.179880142212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:11] (step=0008636) Train Loss: 0.2568, Train Steps/Sec: 0.28, Epoch: 0.16781966575981344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8637, "loss": 0.2495276927947998, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9748401641846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:14] (step=0008637) Train Loss: 0.3088, Train Steps/Sec: 0.28, Epoch: 0.16783909832879906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8638, "loss": 0.27894964814186096, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9985885620117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:18] (step=0008638) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.16785853089778469, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8639, "loss": 0.2914423942565918, "memory_gb": 7.721559524536133, "step_time_ms": 3349.787473678589, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:21] (step=0008639) 
Train Loss: 0.2804, Train Steps/Sec: 0.28, Epoch: 0.1678779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8640, "loss": 0.2870371639728546, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9584617614746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:25] (step=0008640) Train Loss: 0.2629, Train Steps/Sec: 0.28, Epoch: 0.16789739603575593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8641, "loss": 0.26890385150909424, "memory_gb": 7.721559524536133, "step_time_ms": 3358.478784561157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:29] (step=0008641) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.16791682860474155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8642, "loss": 0.1963619887828827, "memory_gb": 7.721559524536133, "step_time_ms": 3345.376968383789, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:32] (step=0008642) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.16793626117372717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8643, "loss": 0.19560331106185913, "memory_gb": 7.721559524536133, "step_time_ms": 3354.517698287964, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:36] (step=0008643) Train Loss: 0.2412, Train Steps/Sec: 0.28, Epoch: 0.1679556937427128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8644, "loss": 0.2547610402107239, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9670448303223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:39] (step=0008644) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.16797512631169842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:43] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 8645, "loss": 0.1551450788974762, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6387424468994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:43] (step=0008645) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.16799455888068401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8646, "loss": 0.22970399260520935, "memory_gb": 7.721559524536133, "step_time_ms": 3351.2706756591797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:46] (step=0008646) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.16801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8647, "loss": 0.2756502330303192, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3074131011963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:50] (step=0008647) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.16803342401865526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8648, "loss": 0.24333971738815308, "memory_gb": 7.721559524536133, "step_time_ms": 3501.276969909668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:54] (step=0008648) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.16805285658764088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:40:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8649, "loss": 0.30692481994628906, "memory_gb": 7.721559524536133, "step_time_ms": 3356.76646232605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:40:57] (step=0008649) Train Loss: 0.3341, Train Steps/Sec: 0.28, Epoch: 0.1680722891566265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8650, "loss": 0.18056491017341614, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8707237243652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
08:41:01] (step=0008650) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.16809172172561213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8651, "loss": 0.3221338987350464, "memory_gb": 7.721559524536133, "step_time_ms": 3358.311176300049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:04] (step=0008651) Train Loss: 0.2685, Train Steps/Sec: 0.28, Epoch: 0.16811115429459775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8652, "loss": 0.16259467601776123, "memory_gb": 7.721559524536133, "step_time_ms": 3349.85089302063, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:08] (step=0008652) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.16813058686358337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8653, "loss": 0.269844114780426, "memory_gb": 7.721559524536133, "step_time_ms": 3354.781150817871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:11] (step=0008653) Train Loss: 0.2329, Train Steps/Sec: 0.28, Epoch: 0.168150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8654, "loss": 0.24235732853412628, "memory_gb": 7.721559524536133, "step_time_ms": 3354.649543762207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:15] (step=0008654) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.16816945200155461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8655, "loss": 0.2599782347679138, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8947315216064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:18] (step=0008655) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.16818888457054024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:22] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 8656, "loss": 0.32631003856658936, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9584617614746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:22] (step=0008656) Train Loss: 0.2985, Train Steps/Sec: 0.28, Epoch: 0.16820831713952586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8657, "loss": 0.1773463785648346, "memory_gb": 7.721559524536133, "step_time_ms": 3349.5588302612305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:26] (step=0008657) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.16822774970851145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8658, "loss": 0.206402987241745, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7467250823975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:29] (step=0008658) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.16824718227749708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8659, "loss": 0.2035522162914276, "memory_gb": 7.721559524536133, "step_time_ms": 3354.259967803955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:33] (step=0008659) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.1682666148464827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8660, "loss": 0.34929072856903076, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9679985046387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:36] (step=0008660) Train Loss: 0.2765, Train Steps/Sec: 0.27, Epoch: 0.16828604741546832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8661, "loss": 0.3159051537513733, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6249656677246, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 08:41:40] (step=0008661) Train Loss: 0.2856, Train Steps/Sec: 0.28, Epoch: 0.16830547998445394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8662, "loss": 0.15480802953243256, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3595542907715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:44] (step=0008662) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.16832491255343957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8663, "loss": 0.22145357728004456, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6970825195312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:47] (step=0008663) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.1683443451224252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8664, "loss": 0.19343829154968262, "memory_gb": 7.721559524536133, "step_time_ms": 3356.05788230896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:51] (step=0008664) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.1683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8665, "loss": 0.19757401943206787, "memory_gb": 7.721559524536133, "step_time_ms": 3356.048345565796, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:54] (step=0008665) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.16838321026039643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:41:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8666, "loss": 0.11905007064342499, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9090366363525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:41:58] (step=0008666) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.16840264282938205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 08:42:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8667, "loss": 0.2168038785457611, "memory_gb": 7.721559524536133, "step_time_ms": 3357.940673828125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:01] (step=0008667) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.16842207539836768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8668, "loss": 0.14330774545669556, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6085357666016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:05] (step=0008668) Train Loss: 0.1784, Train Steps/Sec: 0.28, Epoch: 0.16844150796735327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8669, "loss": 0.3374485671520233, "memory_gb": 7.721559524536133, "step_time_ms": 3357.149362564087, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:09] (step=0008669) Train Loss: 0.3289, Train Steps/Sec: 0.28, Epoch: 0.1684609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8670, "loss": 0.1933136284351349, "memory_gb": 7.721559524536133, "step_time_ms": 3340.6524658203125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:12] (step=0008670) Train Loss: 0.2002, Train Steps/Sec: 0.29, Epoch: 0.16848037310532452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8671, "loss": 0.2100578248500824, "memory_gb": 7.721559524536133, "step_time_ms": 3354.013204574585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:16] (step=0008671) Train Loss: 0.2090, Train Steps/Sec: 0.28, Epoch: 0.16849980567431014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8672, "loss": 0.20592552423477173, "memory_gb": 7.721559524536133, "step_time_ms": 3353.647470474243, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:19] (step=0008672) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.16851923824329576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8673, "loss": 0.3517929017543793, "memory_gb": 7.721559524536133, "step_time_ms": 3360.193967819214, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:23] (step=0008673) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.16853867081228138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8674, "loss": 0.21316209435462952, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5621376037598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:26] (step=0008674) Train Loss: 0.1918, Train Steps/Sec: 0.28, Epoch: 0.168558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8675, "loss": 0.20570023357868195, "memory_gb": 7.721559524536133, "step_time_ms": 3360.593318939209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:30] (step=0008675) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.16857753595025263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8676, "loss": 0.2741508185863495, "memory_gb": 7.721559524536133, "step_time_ms": 3353.032350540161, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:33] (step=0008676) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.16859696851923825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8677, "loss": 0.22207975387573242, "memory_gb": 7.721559524536133, "step_time_ms": 3343.045711517334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:37] (step=0008677) Train Loss: 0.2088, Train Steps/Sec: 0.28, Epoch: 0.16861640108822387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8678, "loss": 0.20419663190841675, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1320095062256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:41] (step=0008678) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.1686358336572095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8679, "loss": 0.16934306919574738, "memory_gb": 7.721559524536133, "step_time_ms": 3354.26926612854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:44] (step=0008679) Train Loss: 0.1995, Train Steps/Sec: 0.28, Epoch: 0.16865526622619512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8680, "loss": 0.23864352703094482, "memory_gb": 7.721559524536133, "step_time_ms": 3356.823682785034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:48] (step=0008680) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.1686746987951807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8681, "loss": 0.3247281312942505, "memory_gb": 7.721559524536133, "step_time_ms": 3342.4572944641113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:51] (step=0008681) Train Loss: 0.2670, Train Steps/Sec: 0.28, Epoch: 0.16869413136416633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8682, "loss": 0.30995500087738037, "memory_gb": 7.721559524536133, "step_time_ms": 3357.951879501343, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:55] (step=0008682) Train Loss: 0.2957, Train Steps/Sec: 0.28, Epoch: 0.16871356393315196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:42:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8683, "loss": 0.19583266973495483, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0452671051025, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:42:58] (step=0008683) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.16873299650213758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8684, "loss": 0.15388286113739014, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6272468566895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:02] (step=0008684) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.1687524290711232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8685, "loss": 0.3295210301876068, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5808277130127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:05] (step=0008685) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.16877186164010882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8686, "loss": 0.2409828007221222, "memory_gb": 7.721559524536133, "step_time_ms": 3353.041172027588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:09] (step=0008686) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.16879129420909444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8687, "loss": 0.2717348039150238, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2616786956787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:13] (step=0008687) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.16881072677808007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8688, "loss": 0.21320918202400208, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6115322113037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:16] (step=0008688) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.1688301593470657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8689, "loss": 0.19006574153900146, "memory_gb": 7.721559524536133, "step_time_ms": 3499.262809753418, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:20] (step=0008689) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.1688495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8690, "loss": 0.24801132082939148, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6181659698486, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:23] (step=0008690) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.16886902448503693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8691, "loss": 0.2861165404319763, "memory_gb": 7.721559524536133, "step_time_ms": 3358.834743499756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:27] (step=0008691) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.16888845705402253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8692, "loss": 0.24819566309452057, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3102951049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:30] (step=0008692) Train Loss: 0.2455, Train Steps/Sec: 0.28, Epoch: 0.16890788962300815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8693, "loss": 0.3006936013698578, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0265254974365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:34] (step=0008693) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.16892732219199377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8694, "loss": 0.30601638555526733, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0103855133057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:38] (step=0008694) Train Loss: 0.3055, Train Steps/Sec: 0.28, Epoch: 0.1689467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8695, "loss": 0.1298603117465973, "memory_gb": 7.721559524536133, "step_time_ms": 3359.987497329712, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:41] (step=0008695) Train Loss: 0.1777, Train Steps/Sec: 0.28, Epoch: 0.16896618732996502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8696, "loss": 0.18649058043956757, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4024181365967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:45] (step=0008696) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.16898561989895064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8697, "loss": 0.1904963254928589, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2189807891846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:48] (step=0008697) Train Loss: 0.2711, Train Steps/Sec: 0.28, Epoch: 0.16900505246793626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8698, "loss": 0.2376391589641571, "memory_gb": 7.721559524536133, "step_time_ms": 3361.922264099121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:52] (step=0008698) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.16902448503692188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8699, "loss": 0.3123229742050171, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8103523254395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:55] (step=0008699) Train Loss: 0.3090, Train Steps/Sec: 0.28, Epoch: 0.1690439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:43:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8700, "loss": 0.22050584852695465, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1549911499023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:43:59] (step=0008700) Train Loss: 0.2394, Train Steps/Sec: 0.27, Epoch: 0.16906335017489313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8701, "loss": 0.25013118982315063, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4462871551514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:03] (step=0008701) Train Loss: 0.2727, Train Steps/Sec: 0.28, Epoch: 0.16908278274387875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8702, "loss": 0.3541039228439331, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8030548095703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:06] (step=0008702) Train Loss: 0.3411, Train Steps/Sec: 0.28, Epoch: 0.16910221531286437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8703, "loss": 0.32560697197914124, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3244552612305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:10] (step=0008703) Train Loss: 0.2882, Train Steps/Sec: 0.28, Epoch: 0.16912164788184997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8704, "loss": 0.20051482319831848, "memory_gb": 7.715639114379883, "step_time_ms": 3323.8093852996826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:13] (step=0008704) Train Loss: 0.1754, Train Steps/Sec: 0.28, Epoch: 0.1691410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8705, "loss": 0.3345804214477539, "memory_gb": 7.721559524536133, "step_time_ms": 3352.698802947998, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:17] (step=0008705) Train Loss: 0.2783, Train Steps/Sec: 0.28, Epoch: 0.1691605130198212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8706, "loss": 0.20210552215576172, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7427654266357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:21] (step=0008706) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.16917994558880683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8707, "loss": 0.277985155582428, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0896129608154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:24] (step=0008707) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.16919937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8708, "loss": 0.2661503255367279, "memory_gb": 7.721559524536133, "step_time_ms": 3359.529733657837, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:28] (step=0008708) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.16921881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8709, "loss": 0.17043446004390717, "memory_gb": 7.721559524536133, "step_time_ms": 3344.909906387329, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:31] (step=0008709) Train Loss: 0.1649, Train Steps/Sec: 0.28, Epoch: 0.1692382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8710, "loss": 0.32895636558532715, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4395179748535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:35] (step=0008710) Train Loss: 0.2685, Train Steps/Sec: 0.28, Epoch: 0.16925767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8711, "loss": 0.22971373796463013, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2487831115723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:38] (step=0008711) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.16927710843373495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8712, "loss": 0.20878933370113373, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9508533477783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:42] (step=0008712) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.16929654100272057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8713, "loss": 0.2919307053089142, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9773902893066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:46] (step=0008713) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.1693159735717062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8714, "loss": 0.2202562391757965, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1906814575195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:49] (step=0008714) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.1693354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8715, "loss": 0.24649767577648163, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3929538726807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:53] (step=0008715) Train Loss: 0.3250, Train Steps/Sec: 0.28, Epoch: 0.1693548387096774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:44:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8716, "loss": 0.2545284032821655, "memory_gb": 7.721559524536133, "step_time_ms": 3359.606981277466, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:44:56] (step=0008716) Train Loss: 0.2618, Train Steps/Sec: 0.28, Epoch: 0.16937427127866303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8717, "loss": 0.18771830201148987, "memory_gb": 7.721559524536133, "step_time_ms": 3355.224370956421, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:00] (step=0008717) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.16939370384764865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8718, "loss": 0.25640392303466797, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1698150634766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:03] (step=0008718) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.16941313641663427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8719, "loss": 0.20295213162899017, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6946983337402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:07] (step=0008719) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.1694325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8720, "loss": 0.33954912424087524, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0264835357666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:11] (step=0008720) Train Loss: 0.2848, Train Steps/Sec: 0.28, Epoch: 0.16945200155460552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8721, "loss": 0.22247272729873657, "memory_gb": 7.721559524536133, "step_time_ms": 3346.4159965515137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:14] (step=0008721) Train Loss: 0.2735, Train Steps/Sec: 0.28, Epoch: 0.16947143412359114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8722, "loss": 0.18086746335029602, "memory_gb": 7.721559524536133, "step_time_ms": 3346.8520641326904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:18] (step=0008722) Train Loss: 0.2977, Train Steps/Sec: 0.28, Epoch: 0.16949086669257676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8723, "loss": 0.26706239581108093, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4863204956055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:21] (step=0008723) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.16951029926156239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8724, "loss": 0.24885427951812744, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3144721984863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:25] (step=0008724) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.169529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8725, "loss": 0.17230935394763947, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4162254333496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:28] (step=0008725) Train Loss: 0.1975, Train Steps/Sec: 0.28, Epoch: 0.16954916439953363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8726, "loss": 0.21246004104614258, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9695434570312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:32] (step=0008726) Train Loss: 0.2538, Train Steps/Sec: 0.28, Epoch: 0.16956859696851923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8727, "loss": 0.26387283205986023, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7840843200684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:36] (step=0008727) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 0.16958802953750485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8728, "loss": 0.2819128930568695, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3291606903076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:39] (step=0008728) Train Loss: 0.2901, Train Steps/Sec: 0.28, Epoch: 0.16960746210649047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8729, "loss": 0.19392108917236328, "memory_gb": 7.721559524536133, "step_time_ms": 3351.3479232788086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:43] (step=0008729) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.1696268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8730, "loss": 0.19098034501075745, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4060459136963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:46] (step=0008730) Train Loss: 0.2329, Train Steps/Sec: 0.28, Epoch: 0.16964632724446171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8731, "loss": 0.18258439004421234, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5923023223877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:50] (step=0008731) Train Loss: 0.1906, Train Steps/Sec: 0.28, Epoch: 0.16966575981344734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8732, "loss": 0.25918033719062805, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1862239837646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:53] (step=0008732) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.16968519238243296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:45:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8733, "loss": 0.24943917989730835, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7680377960205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:45:57] (step=0008733) Train Loss: 0.2474, Train Steps/Sec: 0.28, Epoch: 0.16970462495141858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8734, "loss": 0.23347724974155426, "memory_gb": 7.721559524536133, "step_time_ms": 3349.6527671813965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:01] (step=0008734) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.1697240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8735, "loss": 0.16976475715637207, "memory_gb": 7.721559524536133, "step_time_ms": 3346.25244140625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:04] (step=0008735) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.16974349008938983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8736, "loss": 0.1428288072347641, "memory_gb": 7.721559524536133, "step_time_ms": 3496.8037605285645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:08] (step=0008736) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.16976292265837545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8737, "loss": 0.3340095579624176, "memory_gb": 7.721559524536133, "step_time_ms": 3353.66153717041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:11] (step=0008737) Train Loss: 0.3266, Train Steps/Sec: 0.28, Epoch: 0.16978235522736107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8738, "loss": 0.26388829946517944, "memory_gb": 7.721559524536133, "step_time_ms": 3354.949951171875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:15] (step=0008738) Train Loss: 0.2759, Train Steps/Sec: 0.28, Epoch: 0.16980178779634666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8739, "loss": 0.2991334795951843, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5051555633545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:18] (step=0008739) Train Loss: 0.2493, Train Steps/Sec: 0.28, Epoch: 0.1698212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8740, "loss": 0.18602603673934937, "memory_gb": 7.721559524536133, "step_time_ms": 3358.161211013794, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:22] (step=0008740) Train Loss: 0.1895, Train Steps/Sec: 0.28, Epoch: 0.1698406529343179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8741, "loss": 0.19653743505477905, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0756702423096, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:26] (step=0008741) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.16986008550330353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8742, "loss": 0.19632048904895782, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3154468536377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:29] (step=0008742) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.16987951807228915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8743, "loss": 0.24337679147720337, "memory_gb": 7.721559524536133, "step_time_ms": 3357.947826385498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:33] (step=0008743) Train Loss: 0.2015, Train Steps/Sec: 0.28, Epoch: 0.16989895064127478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8744, "loss": 0.23559626936912537, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3029041290283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:36] (step=0008744) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.1699183832102604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8745, "loss": 0.2222457230091095, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7218055725098, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:40] (step=0008745) Train Loss: 0.2076, Train Steps/Sec: 0.28, Epoch: 0.16993781577924602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8746, "loss": 0.27990415692329407, "memory_gb": 7.721559524536133, "step_time_ms": 3351.68194770813, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:43] (step=0008746) Train Loss: 0.2207, Train Steps/Sec: 0.28, Epoch: 0.16995724834823164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8747, "loss": 0.22213990986347198, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4301261901855, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:47] (step=0008747) Train Loss: 0.1761, Train Steps/Sec: 0.27, Epoch: 0.16997668091721727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8748, "loss": 0.30117255449295044, "memory_gb": 7.721559524536133, "step_time_ms": 3356.039524078369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:51] (step=0008748) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.1699961134862029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8749, "loss": 0.2788159251213074, "memory_gb": 7.721559524536133, "step_time_ms": 3354.367256164551, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:54] (step=0008749) Train Loss: 0.2923, Train Steps/Sec: 0.28, Epoch: 0.17001554605518848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:46:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8750, "loss": 0.19206540286540985, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0427894592285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:46:58] (step=0008750) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.1700349786241741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8751, "loss": 0.24548494815826416, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0144901275635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:01] (step=0008751) Train Loss: 0.2607, Train Steps/Sec: 0.28, Epoch: 0.17005441119315973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8752, "loss": 0.2757791578769684, "memory_gb": 7.721559524536133, "step_time_ms": 3353.64031791687, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:05] (step=0008752) Train Loss: 0.3198, Train Steps/Sec: 0.28, Epoch: 0.17007384376214535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8753, "loss": 0.24954679608345032, "memory_gb": 7.721559524536133, "step_time_ms": 3356.257438659668, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:08] (step=0008753) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.17009327633113097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8754, "loss": 0.20833005011081696, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1040840148926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:12] (step=0008754) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.1701127089001166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8755, "loss": 0.2681151032447815, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2333583831787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:16] (step=0008755) Train Loss: 0.2367, Train Steps/Sec: 0.28, Epoch: 0.17013214146910222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8756, "loss": 0.18168818950653076, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2703132629395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:19] (step=0008756) Train Loss: 0.2315, Train Steps/Sec: 0.28, Epoch: 0.17015157403808784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8757, "loss": 0.32105177640914917, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4588298797607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:23] (step=0008757) Train Loss: 0.2988, Train Steps/Sec: 0.28, Epoch: 0.17017100660707346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8758, "loss": 0.3101099133491516, "memory_gb": 7.721559524536133, "step_time_ms": 3357.492685317993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:26] (step=0008758) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.17019043917605908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8759, "loss": 0.20901884138584137, "memory_gb": 7.721559524536133, "step_time_ms": 3354.125499725342, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:30] (step=0008759) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.1702098717450447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8760, "loss": 0.2194908708333969, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6697578430176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:33] (step=0008760) Train Loss: 0.2807, Train Steps/Sec: 0.28, Epoch: 0.17022930431403033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8761, "loss": 0.2681979238986969, "memory_gb": 7.721559524536133, "step_time_ms": 3357.178211212158, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:37] (step=0008761) Train Loss: 0.2367, Train Steps/Sec: 0.28, Epoch: 0.17024873688301592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8762, "loss": 0.300676554441452, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1363430023193, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:41] (step=0008762) Train Loss: 0.2994, Train Steps/Sec: 0.28, Epoch: 0.17026816945200154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8763, "loss": 0.21975354850292206, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4209213256836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:44] (step=0008763) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.17028760202098717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8764, "loss": 0.14860276877880096, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4463596343994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:48] (step=0008764) Train Loss: 0.2105, Train Steps/Sec: 0.28, Epoch: 0.1703070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8765, "loss": 0.19806048274040222, "memory_gb": 7.715639114379883, "step_time_ms": 3322.222948074341, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:51] (step=0008765) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.1703264671589584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8766, "loss": 0.24533183872699738, "memory_gb": 7.721559524536133, "step_time_ms": 3352.536678314209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:55] (step=0008766) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.17034589972794403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:47:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8767, "loss": 0.16750970482826233, "memory_gb": 7.721559524536133, "step_time_ms": 3353.776216506958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:47:58] (step=0008767) Train Loss: 0.1689, Train Steps/Sec: 0.28, Epoch: 0.17036533229692966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8768, "loss": 0.3329254388809204, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1088829040527, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:02] (step=0008768) Train Loss: 0.2232, Train Steps/Sec: 0.28, Epoch: 0.17038476486591528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8769, "loss": 0.24884852766990662, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8498363494873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:06] (step=0008769) Train Loss: 0.2664, Train Steps/Sec: 0.28, Epoch: 0.1704041974349009, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8770, "loss": 0.23150736093521118, "memory_gb": 7.721559524536133, "step_time_ms": 3352.731943130493, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:09] (step=0008770) Train Loss: 0.2996, Train Steps/Sec: 0.28, Epoch: 0.17042363000388652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8771, "loss": 0.26590245962142944, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0472469329834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:13] (step=0008771) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.17044306257287214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8772, "loss": 0.12750279903411865, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1668605804443, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:16] (step=0008772) Train Loss: 0.1639, Train Steps/Sec: 0.28, Epoch: 0.17046249514185777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8773, "loss": 0.3013657033443451, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7609272003174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:20] (step=0008773) Train Loss: 0.3088, Train Steps/Sec: 0.28, Epoch: 0.17048192771084336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8774, "loss": 0.1459418684244156, "memory_gb": 7.721559524536133, "step_time_ms": 3355.863332748413, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:23] (step=0008774) Train Loss: 0.1637, Train Steps/Sec: 0.28, Epoch: 0.17050136027982898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8775, "loss": 0.19127094745635986, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2753505706787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:48:27] (step=0008775) Train Loss: 0.1952, Train Steps/Sec: 0.28, Epoch: 0.1705207928488146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:48:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8776, "loss": 0.2855287194252014, "memory_gb": 7.721559524536133,
"step_time_ms": 3360.835075378418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:30] (step=0008776) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.17054022541780023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8777, "loss": 0.2488642781972885, "memory_gb": 7.721559524536133, "step_time_ms": 3500.2567768096924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:34] (step=0008777) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.17055965798678585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8778, "loss": 0.26772433519363403, "memory_gb": 7.721559524536133, "step_time_ms": 3357.412338256836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:38] (step=0008778) Train Loss: 0.2529, Train Steps/Sec: 0.28, Epoch: 0.17057909055577147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8779, "loss": 0.13112632930278778, "memory_gb": 7.721559524536133, "step_time_ms": 3357.670783996582, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:41] (step=0008779) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.1705985231247571, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8780, "loss": 0.24961379170417786, "memory_gb": 7.721559524536133, "step_time_ms": 3356.504201889038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:45] (step=0008780) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.17061795569374272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8781, "loss": 0.2629874348640442, "memory_gb": 7.721559524536133, "step_time_ms": 3358.020067214966, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:48] (step=0008781) Train Loss: 0.2622, Train Steps/Sec: 0.28, Epoch: 
0.17063738826272834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8782, "loss": 0.30855727195739746, "memory_gb": 7.721559524536133, "step_time_ms": 3339.2269611358643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:52] (step=0008782) Train Loss: 0.2882, Train Steps/Sec: 0.28, Epoch: 0.17065682083171396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8783, "loss": 0.3028733730316162, "memory_gb": 7.715639114379883, "step_time_ms": 3310.5578422546387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:55] (step=0008783) Train Loss: 0.3117, Train Steps/Sec: 0.28, Epoch: 0.17067625340069958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:48:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8784, "loss": 0.24551664292812347, "memory_gb": 7.721559524536133, "step_time_ms": 3351.123809814453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:48:59] (step=0008784) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.17069568596968518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8785, "loss": 0.20675727725028992, "memory_gb": 7.721559524536133, "step_time_ms": 3354.867935180664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:03] (step=0008785) Train Loss: 0.2329, Train Steps/Sec: 0.28, Epoch: 0.1707151185386708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8786, "loss": 0.2564377784729004, "memory_gb": 7.721559524536133, "step_time_ms": 3356.466293334961, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:06] (step=0008786) Train Loss: 0.2656, Train Steps/Sec: 0.28, Epoch: 0.17073455110765642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8787, "loss": 0.257071852684021, "memory_gb": 
7.721559524536133, "step_time_ms": 3350.3952026367188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:10] (step=0008787) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.17075398367664205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8788, "loss": 0.23332589864730835, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3430309295654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:13] (step=0008788) Train Loss: 0.2214, Train Steps/Sec: 0.27, Epoch: 0.17077341624562767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8789, "loss": 0.3691028654575348, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3860187530518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:17] (step=0008789) Train Loss: 0.2667, Train Steps/Sec: 0.28, Epoch: 0.1707928488146133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8790, "loss": 0.1384272277355194, "memory_gb": 7.721559524536133, "step_time_ms": 3362.245798110962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:21] (step=0008790) Train Loss: 0.1986, Train Steps/Sec: 0.28, Epoch: 0.1708122813835989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8791, "loss": 0.17826353013515472, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3337326049805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:24] (step=0008791) Train Loss: 0.1849, Train Steps/Sec: 0.28, Epoch: 0.17083171395258454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8792, "loss": 0.31568053364753723, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0027561187744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:28] (step=0008792) Train Loss: 0.2762, Train 
Steps/Sec: 0.28, Epoch: 0.17085114652157016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8793, "loss": 0.29409289360046387, "memory_gb": 7.721559524536133, "step_time_ms": 3358.274459838867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:31] (step=0008793) Train Loss: 0.2937, Train Steps/Sec: 0.28, Epoch: 0.17087057909055578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8794, "loss": 0.20758187770843506, "memory_gb": 7.721559524536133, "step_time_ms": 3360.676050186157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:35] (step=0008794) Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.1708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8795, "loss": 0.24670827388763428, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2148036956787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:38] (step=0008795) Train Loss: 0.2848, Train Steps/Sec: 0.28, Epoch: 0.17090944422852702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8796, "loss": 0.3660084307193756, "memory_gb": 7.715639114379883, "step_time_ms": 3315.0181770324707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:42] (step=0008796) Train Loss: 0.3200, Train Steps/Sec: 0.28, Epoch: 0.17092887679751262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8797, "loss": 0.12892618775367737, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3123683929443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:46] (step=0008797) Train Loss: 0.1534, Train Steps/Sec: 0.28, Epoch: 0.17094830936649824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8798, "loss": 
0.2509830594062805, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5974445343018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:49] (step=0008798) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.17096774193548386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8799, "loss": 0.24135540425777435, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0124588012695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:53] (step=0008799) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.17098717450446949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:49:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8800, "loss": 0.21802183985710144, "memory_gb": 7.721559524536133, "step_time_ms": 3357.68723487854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:49:56] (step=0008800) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.1710066070734551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8801, "loss": 0.21173560619354248, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7617359161377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:00] (step=0008801) Train Loss: 0.2788, Train Steps/Sec: 0.28, Epoch: 0.17102603964244073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8802, "loss": 0.24873730540275574, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3872108459473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:03] (step=0008802) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.17104547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8803, "loss": 0.2775290310382843, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8147888183594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:07] (step=0008803) 
Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.17106490478041197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8804, "loss": 0.2632223069667816, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5920543670654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:11] (step=0008804) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.1710843373493976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8805, "loss": 0.21708205342292786, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2756309509277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:14] (step=0008805) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.17110376991838322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8806, "loss": 0.22331129014492035, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4834594726562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:18] (step=0008806) Train Loss: 0.2742, Train Steps/Sec: 0.28, Epoch: 0.17112320248736884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8807, "loss": 0.22404012084007263, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0708084106445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:21] (step=0008807) Train Loss: 0.1724, Train Steps/Sec: 0.28, Epoch: 0.17114263505635444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8808, "loss": 0.337807834148407, "memory_gb": 7.715639114379883, "step_time_ms": 3321.5599060058594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:25] (step=0008808) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.17116206762534006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:28] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 8809, "loss": 0.16529369354248047, "memory_gb": 7.721559524536133, "step_time_ms": 3360.978126525879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:28] (step=0008809) Train Loss: 0.1735, Train Steps/Sec: 0.28, Epoch: 0.17118150019432568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8810, "loss": 0.19521448016166687, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1695556640625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:32] (step=0008810) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.1712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8811, "loss": 0.25396275520324707, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5780601501465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:36] (step=0008811) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.17122036533229693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8812, "loss": 0.3658443093299866, "memory_gb": 7.721559524536133, "step_time_ms": 3353.139877319336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:39] (step=0008812) Train Loss: 0.3024, Train Steps/Sec: 0.28, Epoch: 0.17123979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8813, "loss": 0.2447170466184616, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1298332214355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:43] (step=0008813) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.17125923047026817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8814, "loss": 0.22050145268440247, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2425842285156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
08:50:46] (step=0008814) Train Loss: 0.2272, Train Steps/Sec: 0.28, Epoch: 0.1712786630392538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8815, "loss": 0.278664231300354, "memory_gb": 7.721559524536133, "step_time_ms": 3350.182294845581, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:50] (step=0008815) Train Loss: 0.2819, Train Steps/Sec: 0.28, Epoch: 0.17129809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8816, "loss": 0.18793511390686035, "memory_gb": 7.721559524536133, "step_time_ms": 3359.234094619751, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:54] (step=0008816) Train Loss: 0.1816, Train Steps/Sec: 0.28, Epoch: 0.17131752817722504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:50:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8817, "loss": 0.32370680570602417, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5869121551514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:50:57] (step=0008817) Train Loss: 0.2746, Train Steps/Sec: 0.28, Epoch: 0.17133696074621066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8818, "loss": 0.2379375398159027, "memory_gb": 7.721559524536133, "step_time_ms": 3357.353448867798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:01] (step=0008818) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.17135639331519628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8819, "loss": 0.26172542572021484, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5609035491943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:04] (step=0008819) Train Loss: 0.1907, Train Steps/Sec: 0.28, Epoch: 0.17137582588418188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:08] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 8820, "loss": 0.1506725251674652, "memory_gb": 7.721559524536133, "step_time_ms": 3357.889413833618, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:08] (step=0008820) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.1713952584531675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8821, "loss": 0.21781332790851593, "memory_gb": 7.721559524536133, "step_time_ms": 3356.175184249878, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:11] (step=0008821) Train Loss: 0.2430, Train Steps/Sec: 0.28, Epoch: 0.17141469102215312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8822, "loss": 0.29526346921920776, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8851013183594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:15] (step=0008822) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.17143412359113874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8823, "loss": 0.34267741441726685, "memory_gb": 7.721559524536133, "step_time_ms": 3346.715211868286, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:19] (step=0008823) Train Loss: 0.3217, Train Steps/Sec: 0.28, Epoch: 0.17145355616012437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8824, "loss": 0.25165319442749023, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5196056365967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:22] (step=0008824) Train Loss: 0.2429, Train Steps/Sec: 0.28, Epoch: 0.17147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8825, "loss": 0.12468470633029938, "memory_gb": 7.721559524536133, "step_time_ms": 3495.1462745666504, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 08:51:26] (step=0008825) Train Loss: 0.1732, Train Steps/Sec: 0.28, Epoch: 0.1714924212980956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8826, "loss": 0.16807112097740173, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4140796661377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:29] (step=0008826) Train Loss: 0.1607, Train Steps/Sec: 0.28, Epoch: 0.17151185386708123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8827, "loss": 0.27598118782043457, "memory_gb": 7.721559524536133, "step_time_ms": 3357.067823410034, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:33] (step=0008827) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.17153128643606685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8828, "loss": 0.2596280574798584, "memory_gb": 7.721559524536133, "step_time_ms": 3357.463836669922, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:36] (step=0008828) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.17155071900505248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8829, "loss": 0.24337534606456757, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4578552246094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:40] (step=0008829) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.1715701515740381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8830, "loss": 0.28988635540008545, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0753593444824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:44] (step=0008830) Train Loss: 0.2888, Train Steps/Sec: 0.28, Epoch: 0.17158958414302372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 08:51:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8831, "loss": 0.28112685680389404, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1453819274902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:47] (step=0008831) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.17160901671200932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8832, "loss": 0.22903209924697876, "memory_gb": 7.721559524536133, "step_time_ms": 3358.443021774292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:51] (step=0008832) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.17162844928099494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8833, "loss": 0.18060633540153503, "memory_gb": 7.721559524536133, "step_time_ms": 3351.867437362671, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:54] (step=0008833) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.17164788184998056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:51:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8834, "loss": 0.12410760670900345, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2401790618896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:51:58] (step=0008834) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.17166731441896618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8835, "loss": 0.18777161836624146, "memory_gb": 7.721559524536133, "step_time_ms": 3357.966899871826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:01] (step=0008835) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.1716867469879518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8836, "loss": 0.2065240889787674, "memory_gb": 7.721559524536133, "step_time_ms": 3358.557939529419, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:05] (step=0008836) Train Loss: 0.2553, Train Steps/Sec: 0.27, Epoch: 0.17170617955693743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8837, "loss": 0.2029288113117218, "memory_gb": 7.721559524536133, "step_time_ms": 3350.226402282715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:09] (step=0008837) Train Loss: 0.2407, Train Steps/Sec: 0.28, Epoch: 0.17172561212592305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8838, "loss": 0.34523847699165344, "memory_gb": 7.721559524536133, "step_time_ms": 3352.41961479187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:12] (step=0008838) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.17174504469490867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8839, "loss": 0.3867638111114502, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1627349853516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:16] (step=0008839) Train Loss: 0.3087, Train Steps/Sec: 0.28, Epoch: 0.1717644772638943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8840, "loss": 0.2713879942893982, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9886474609375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:19] (step=0008840) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.17178390983287992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8841, "loss": 0.28558483719825745, "memory_gb": 7.721559524536133, "step_time_ms": 3353.098154067993, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:23] (step=0008841) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.17180334240186554, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 08:52:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8842, "loss": 0.13427230715751648, "memory_gb": 7.721559524536133, "step_time_ms": 3355.147361755371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:26] (step=0008842) Train Loss: 0.1891, Train Steps/Sec: 0.28, Epoch: 0.17182277497085113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8843, "loss": 0.21005752682685852, "memory_gb": 7.721559524536133, "step_time_ms": 3356.03666305542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:30] (step=0008843) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.17184220753983676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8844, "loss": 0.14231249690055847, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1838397979736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:34] (step=0008844) Train Loss: 0.1953, Train Steps/Sec: 0.28, Epoch: 0.17186164010882238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8845, "loss": 0.2904394865036011, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0892086029053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:37] (step=0008845) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.171881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8846, "loss": 0.24534007906913757, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7375717163086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:41] (step=0008846) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.17190050524679362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8847, "loss": 0.269517719745636, "memory_gb": 7.721559524536133, "step_time_ms": 
3354.497194290161, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:44] (step=0008847) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.17191993781577924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8848, "loss": 0.34174033999443054, "memory_gb": 7.721559524536133, "step_time_ms": 3345.6175327301025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:48] (step=0008848) Train Loss: 0.3205, Train Steps/Sec: 0.28, Epoch: 0.17193937038476487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8849, "loss": 0.0943584144115448, "memory_gb": 7.721559524536133, "step_time_ms": 3354.552745819092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:51] (step=0008849) Train Loss: 0.1401, Train Steps/Sec: 0.28, Epoch: 0.1719588029537505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8850, "loss": 0.2866194248199463, "memory_gb": 7.721559524536133, "step_time_ms": 3353.466033935547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:55] (step=0008850) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.1719782355227361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:52:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8851, "loss": 0.3248140513896942, "memory_gb": 7.721559524536133, "step_time_ms": 3360.368251800537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:52:58] (step=0008851) Train Loss: 0.3472, Train Steps/Sec: 0.28, Epoch: 0.17199766809172173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8852, "loss": 0.1687697023153305, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7024936676025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:02] (step=0008852) Train Loss: 0.1598, Train Steps/Sec: 0.28, Epoch: 0.17201710066070736, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8853, "loss": 0.26066696643829346, "memory_gb": 7.721559524536133, "step_time_ms": 3344.0043926239014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:06] (step=0008853) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.17203653322969298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8854, "loss": 0.20490366220474243, "memory_gb": 7.721559524536133, "step_time_ms": 3351.2845039367676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:09] (step=0008854) Train Loss: 0.2326, Train Steps/Sec: 0.28, Epoch: 0.17205596579867857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8855, "loss": 0.23716947436332703, "memory_gb": 7.721559524536133, "step_time_ms": 3354.374408721924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:13] (step=0008855) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.1720753983676642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8856, "loss": 0.23570989072322845, "memory_gb": 7.721559524536133, "step_time_ms": 3356.661319732666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:16] (step=0008856) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.17209483093664982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8857, "loss": 0.2391858994960785, "memory_gb": 7.721559524536133, "step_time_ms": 3353.703022003174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:20] (step=0008857) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.17211426350563544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8858, "loss": 0.2692980170249939, "memory_gb": 7.721559524536133, 
"step_time_ms": 3350.6925106048584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:23] (step=0008858) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.17213369607462106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8859, "loss": 0.20561383664608002, "memory_gb": 7.721559524536133, "step_time_ms": 3357.03444480896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:27] (step=0008859) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.17215312864360668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8860, "loss": 0.3541876971721649, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8849563598633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:31] (step=0008860) Train Loss: 0.3543, Train Steps/Sec: 0.28, Epoch: 0.1721725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8861, "loss": 0.12940102815628052, "memory_gb": 7.721559524536133, "step_time_ms": 3351.834774017334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:34] (step=0008861) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.17219199378157793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8862, "loss": 0.27972209453582764, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6617546081543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:38] (step=0008862) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.17221142635056355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8863, "loss": 0.09824954718351364, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0550212860107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:41] (step=0008863) Train Loss: 0.1710, Train Steps/Sec: 0.28, Epoch: 
0.17223085891954917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8864, "loss": 0.33211708068847656, "memory_gb": 7.715639114379883, "step_time_ms": 3325.3347873687744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:45] (step=0008864) Train Loss: 0.2928, Train Steps/Sec: 0.28, Epoch: 0.1722502914885348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8865, "loss": 0.30066514015197754, "memory_gb": 7.721559524536133, "step_time_ms": 3503.826141357422, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:48] (step=0008865) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.17226972405752042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8866, "loss": 0.25368884205818176, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6288738250732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:52] (step=0008866) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.172289156626506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8867, "loss": 0.14444288611412048, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2850017547607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:56] (step=0008867) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.17230858919549163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:53:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8868, "loss": 0.32414522767066956, "memory_gb": 7.721559524536133, "step_time_ms": 3356.092691421509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:53:59] (step=0008868) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.17232802176447726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8869, "loss": 0.2883853614330292, 
"memory_gb": 7.721559524536133, "step_time_ms": 3358.5078716278076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:03] (step=0008869) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.17234745433346288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8870, "loss": 0.18013882637023926, "memory_gb": 7.721559524536133, "step_time_ms": 3359.370231628418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:06] (step=0008870) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.1723668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8871, "loss": 0.21682484447956085, "memory_gb": 7.721559524536133, "step_time_ms": 3357.405185699463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:10] (step=0008871) Train Loss: 0.3148, Train Steps/Sec: 0.28, Epoch: 0.17238631947143412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8872, "loss": 0.28193193674087524, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8569679260254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:13] (step=0008872) Train Loss: 0.2752, Train Steps/Sec: 0.28, Epoch: 0.17240575204041975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8873, "loss": 0.30009549856185913, "memory_gb": 7.721559524536133, "step_time_ms": 3358.794689178467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:17] (step=0008873) Train Loss: 0.3074, Train Steps/Sec: 0.28, Epoch: 0.17242518460940537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8874, "loss": 0.25317853689193726, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9155673980713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:21] (step=0008874) Train Loss: 0.2608, 
Train Steps/Sec: 0.28, Epoch: 0.172444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8875, "loss": 0.2057553380727768, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5690307617188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:24] (step=0008875) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.1724640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8876, "loss": 0.23057332634925842, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4110736846924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:28] (step=0008876) Train Loss: 0.2121, Train Steps/Sec: 0.27, Epoch: 0.17248348231636224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8877, "loss": 0.13655252754688263, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8524589538574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:31] (step=0008877) Train Loss: 0.1815, Train Steps/Sec: 0.28, Epoch: 0.17250291488534783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8878, "loss": 0.20332351326942444, "memory_gb": 7.721559524536133, "step_time_ms": 3356.05788230896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:35] (step=0008878) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.17252234745433345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8879, "loss": 0.3252715468406677, "memory_gb": 7.721559524536133, "step_time_ms": 3356.762409210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:39] (step=0008879) Train Loss: 0.3186, Train Steps/Sec: 0.28, Epoch: 0.17254178002331907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8880, "loss": 
0.20373883843421936, "memory_gb": 7.721559524536133, "step_time_ms": 3357.274293899536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:42] (step=0008880) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.1725612125923047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8881, "loss": 0.24909088015556335, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6787033081055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:46] (step=0008881) Train Loss: 0.2966, Train Steps/Sec: 0.28, Epoch: 0.17258064516129032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8882, "loss": 0.19507041573524475, "memory_gb": 7.721559524536133, "step_time_ms": 3358.009099960327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:49] (step=0008882) Train Loss: 0.1813, Train Steps/Sec: 0.28, Epoch: 0.17260007773027594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8883, "loss": 0.20305046439170837, "memory_gb": 7.721559524536133, "step_time_ms": 3360.914707183838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:53] (step=0008883) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.17261951029926156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:54:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8884, "loss": 0.15152204036712646, "memory_gb": 7.721559524536133, "step_time_ms": 3360.941171646118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:54:56] (step=0008884) Train Loss: 0.1765, Train Steps/Sec: 0.28, Epoch: 0.17263894286824719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8885, "loss": 0.1768205612897873, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3318462371826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:00] (step=0008885) 
Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.1726583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8886, "loss": 0.27184391021728516, "memory_gb": 7.715639114379883, "step_time_ms": 3323.219060897827, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:03] (step=0008886) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.17267780800621843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8887, "loss": 0.2272799015045166, "memory_gb": 7.721559524536133, "step_time_ms": 3364.596366882324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:07] (step=0008887) Train Loss: 0.2588, Train Steps/Sec: 0.28, Epoch: 0.17269724057520405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8888, "loss": 0.2517108619213104, "memory_gb": 7.721559524536133, "step_time_ms": 3361.603260040283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:11] (step=0008888) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.17271667314418968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8889, "loss": 0.25162094831466675, "memory_gb": 7.721559524536133, "step_time_ms": 3364.899158477783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:14] (step=0008889) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.17273610571317527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8890, "loss": 0.3245985209941864, "memory_gb": 7.721559524536133, "step_time_ms": 3344.5310592651367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:18] (step=0008890) Train Loss: 0.2778, Train Steps/Sec: 0.28, Epoch: 0.1727555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 
8891, "loss": 0.3279339075088501, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8834228515625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:21] (step=0008891) Train Loss: 0.2527, Train Steps/Sec: 0.28, Epoch: 0.17277497085114651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8892, "loss": 0.1417970359325409, "memory_gb": 7.721559524536133, "step_time_ms": 3361.30952835083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:25] (step=0008892) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.17279440342013214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8893, "loss": 0.24175012111663818, "memory_gb": 7.721559524536133, "step_time_ms": 3356.855630874634, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:28] (step=0008893) Train Loss: 0.2793, Train Steps/Sec: 0.28, Epoch: 0.17281383598911776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8894, "loss": 0.28978273272514343, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4414672851562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:32] (step=0008894) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.17283326855810338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8895, "loss": 0.25883907079696655, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5314960479736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:36] (step=0008895) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.172852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8896, "loss": 0.24601677060127258, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6275062561035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:39] 
(step=0008896) Train Loss: 0.2871, Train Steps/Sec: 0.28, Epoch: 0.17287213369607463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8897, "loss": 0.18633058667182922, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3834381103516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:43] (step=0008897) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.17289156626506025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8898, "loss": 0.4068624973297119, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9636554718018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:46] (step=0008898) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.17291099883404587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8899, "loss": 0.29783594608306885, "memory_gb": 7.721559524536133, "step_time_ms": 3362.87522315979, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:50] (step=0008899) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.1729304314030315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8900, "loss": 0.3341897130012512, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2145652770996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:53] (step=0008900) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.1729498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:55:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8901, "loss": 0.29546666145324707, "memory_gb": 7.721559524536133, "step_time_ms": 3353.06978225708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:55:57] (step=0008901) Train Loss: 0.2856, Train Steps/Sec: 0.28, Epoch: 0.1729692965410027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:01] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 8902, "loss": 0.3109780550003052, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7188720703125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:01] (step=0008902) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.17298872910998833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8903, "loss": 0.2696305513381958, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1649017333984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:04] (step=0008903) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.17300816167897395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8904, "loss": 0.26661980152130127, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2418899536133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:08] (step=0008904) Train Loss: 0.1964, Train Steps/Sec: 0.28, Epoch: 0.17302759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8905, "loss": 0.2717704176902771, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8523349761963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:11] (step=0008905) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.1730470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8906, "loss": 0.15121379494667053, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7424755096436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:15] (step=0008906) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.17306645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8907, "loss": 0.17004989087581635, "memory_gb": 7.721559524536133, "step_time_ms": 3499.2802143096924, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 08:56:19] (step=0008907) Train Loss: 0.1886, Train Steps/Sec: 0.28, Epoch: 0.17308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8908, "loss": 0.15423870086669922, "memory_gb": 7.721559524536133, "step_time_ms": 3360.37015914917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:22] (step=0008908) Train Loss: 0.1964, Train Steps/Sec: 0.28, Epoch: 0.17310532452390207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8909, "loss": 0.19614693522453308, "memory_gb": 7.721559524536133, "step_time_ms": 3354.440689086914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:26] (step=0008909) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.1731247570928877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8910, "loss": 0.23634716868400574, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9230308532715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:29] (step=0008910) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 0.1731441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8911, "loss": 0.17697976529598236, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2590866088867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:33] (step=0008911) Train Loss: 0.2316, Train Steps/Sec: 0.28, Epoch: 0.17316362223085893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8912, "loss": 0.25214052200317383, "memory_gb": 7.721559524536133, "step_time_ms": 3360.720634460449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:36] (step=0008912) Train Loss: 0.1942, Train Steps/Sec: 0.28, Epoch: 0.17318305479984453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:40] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 8913, "loss": 0.3458542227745056, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8813800811768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:40] (step=0008913) Train Loss: 0.3097, Train Steps/Sec: 0.28, Epoch: 0.17320248736883015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8914, "loss": 0.22290073335170746, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8265647888184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:44] (step=0008914) Train Loss: 0.1707, Train Steps/Sec: 0.28, Epoch: 0.17322191993781577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8915, "loss": 0.1838037073612213, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9700717926025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:47] (step=0008915) Train Loss: 0.1631, Train Steps/Sec: 0.28, Epoch: 0.1732413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 8916, "loss": 0.1833503544330597, "memory_gb": 7.721559524536133, "step_time_ms": 3357.172727584839, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:51] (step=0008916) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.17326078507578702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8917, "loss": 0.17717725038528442, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4816246032715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:56:54] (step=0008917) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.17328021764477264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:56:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 8918, "loss": 0.2749541401863098, "memory_gb": 7.721559524536133, "step_time_ms": 3353.076696395874, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 08:56:58] (step=0008918) Train Loss: 0.3087, Train Steps/Sec: 0.28, Epoch: 0.17329965021375826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8919, "loss": 0.27540987730026245, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4302406311035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:01] (step=0008919) Train Loss: 0.3070, Train Steps/Sec: 0.28, Epoch: 0.17331908278274388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8920, "loss": 0.18250250816345215, "memory_gb": 7.721559524536133, "step_time_ms": 3349.494457244873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:05] (step=0008920) Train Loss: 0.1741, Train Steps/Sec: 0.28, Epoch: 0.1733385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8921, "loss": 0.3619617819786072, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5925102233887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:09] (step=0008921) Train Loss: 0.2962, Train Steps/Sec: 0.28, Epoch: 0.17335794792071513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8922, "loss": 0.3396415710449219, "memory_gb": 7.721559524536133, "step_time_ms": 3342.4320220947266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:12] (step=0008922) Train Loss: 0.3222, Train Steps/Sec: 0.28, Epoch: 0.17337738048970075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 8923, "loss": 0.27861258387565613, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8995723724365, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:16] (step=0008923) Train Loss: 0.2346, Train Steps/Sec: 0.27, Epoch: 0.17339681305868637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 08:57:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8924, "loss": 0.28086549043655396, "memory_gb": 7.721559524536133, "step_time_ms": 3354.203701019287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:19] (step=0008924) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.17341624562767197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 8925, "loss": 0.31359073519706726, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3855628967285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:23] (step=0008925) Train Loss: 0.2910, Train Steps/Sec: 0.28, Epoch: 0.1734356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8926, "loss": 0.30005955696105957, "memory_gb": 7.721559524536133, "step_time_ms": 3352.602243423462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:27] (step=0008926) Train Loss: 0.3060, Train Steps/Sec: 0.28, Epoch: 0.1734551107656432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 8927, "loss": 0.26585882902145386, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8313121795654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:30] (step=0008927) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.17347454333462883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8928, "loss": 0.26247844099998474, "memory_gb": 7.721559524536133, "step_time_ms": 3351.97114944458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:34] (step=0008928) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.17349397590361446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8929, "loss": 0.33513450622558594, "memory_gb": 7.721559524536133, "step_time_ms": 3350.221872329712, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:37] (step=0008929) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.17351340847260008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 8930, "loss": 0.1949995458126068, "memory_gb": 7.721559524536133, "step_time_ms": 3351.717472076416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:41] (step=0008930) Train Loss: 0.1779, Train Steps/Sec: 0.28, Epoch: 0.1735328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8931, "loss": 0.29505568742752075, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7980060577393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:44] (step=0008931) Train Loss: 0.3471, Train Steps/Sec: 0.28, Epoch: 0.17355227361057132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 8932, "loss": 0.37338417768478394, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0069541931152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:48] (step=0008932) Train Loss: 0.3011, Train Steps/Sec: 0.28, Epoch: 0.17357170617955694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8933, "loss": 0.16995422542095184, "memory_gb": 7.721559524536133, "step_time_ms": 3352.301597595215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:52] (step=0008933) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.17359113874854257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:57:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 8934, "loss": 0.20335713028907776, "memory_gb": 7.721559524536133, "step_time_ms": 3351.560592651367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:55] (step=0008934) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.1736105713175282, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 08:57:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8935, "loss": 0.16497133672237396, "memory_gb": 7.721559524536133, "step_time_ms": 3348.7515449523926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:57:59] (step=0008935) Train Loss: 0.1745, Train Steps/Sec: 0.28, Epoch: 0.17363000388651378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 8936, "loss": 0.2742900550365448, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2726249694824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:02] (step=0008936) Train Loss: 0.2839, Train Steps/Sec: 0.28, Epoch: 0.1736494364554994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 8937, "loss": 0.16751930117607117, "memory_gb": 7.721559524536133, "step_time_ms": 3345.9420204162598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:06] (step=0008937) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.17366886902448503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 8938, "loss": 0.3070206046104431, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3079109191895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:09] (step=0008938) Train Loss: 0.2783, Train Steps/Sec: 0.28, Epoch: 0.17368830159347065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 8939, "loss": 0.3147350251674652, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6983680725098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:13] (step=0008939) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.17370773416245627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8940, "loss": 0.1911286860704422, "memory_gb": 7.721559524536133, "step_time_ms": 
3351.271152496338, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:17] (step=0008940) Train Loss: 0.2194, Train Steps/Sec: 0.28, Epoch: 0.1737271667314419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 8941, "loss": 0.28388047218322754, "memory_gb": 7.721559524536133, "step_time_ms": 3356.823205947876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:20] (step=0008941) Train Loss: 0.2947, Train Steps/Sec: 0.28, Epoch: 0.17374659930042752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8942, "loss": 0.23052635788917542, "memory_gb": 7.721559524536133, "step_time_ms": 3350.088596343994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:24] (step=0008942) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.17376603186941314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 8943, "loss": 0.29727327823638916, "memory_gb": 7.721559524536133, "step_time_ms": 3346.5425968170166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:27] (step=0008943) Train Loss: 0.3062, Train Steps/Sec: 0.28, Epoch: 0.17378546443839876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 8944, "loss": 0.28174659609794617, "memory_gb": 7.721559524536133, "step_time_ms": 3349.538803100586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:31] (step=0008944) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.17380489700738438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 08:58:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 8945, "loss": 0.24372170865535736, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4304370880127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 08:58:34] (step=0008945) Train Loss: 0.2726, Train Steps/Sec: 0.28, Epoch: 0.17382432957637, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 8946, "loss": 0.2243758887052536, "memory_gb": 7.721559524536133, "step_time_ms": 3357.802391052246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:58:38] (step=0008946) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.17384376214535563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:58:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8947, "loss": 0.17135091125965118, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9202938079834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:58:42] (step=0008947) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.17386319471434122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:58:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 8948, "loss": 0.1944672167301178, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2254600524902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:58:45] (step=0008948) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.17388262728332685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:58:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8949, "loss": 0.3320990204811096, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6858768463135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:58:49] (step=0008949) Train Loss: 0.3295, Train Steps/Sec: 0.28, Epoch: 0.17390205985231247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:58:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 8950, "loss": 0.3658801019191742, "memory_gb": 7.721559524536133, "step_time_ms": 3356.951951980591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:58:52] (step=0008950) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.1739214924212981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:58:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 8951, "loss": 0.24867069721221924, "memory_gb": 7.721559524536133, "step_time_ms": 3347.9435443878174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:58:56] (step=0008951) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.1739409249902837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:58:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 8952, "loss": 0.12364531308412552, "memory_gb": 7.721559524536133, "step_time_ms": 3354.719638824463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:58:59] (step=0008952) Train Loss: 0.1486, Train Steps/Sec: 0.28, Epoch: 0.17396035755926934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 8953, "loss": 0.22311502695083618, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5592041015625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:03] (step=0008953) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.17397979012825496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8954, "loss": 0.2975327670574188, "memory_gb": 7.721559524536133, "step_time_ms": 3500.8883476257324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:07] (step=0008954) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.17399922269724058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 8955, "loss": 0.2338152527809143, "memory_gb": 7.721559524536133, "step_time_ms": 3348.08087348938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:10] (step=0008955) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.1740186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 8956, "loss": 0.19042617082595825, "memory_gb": 7.721559524536133, "step_time_ms": 3355.224132537842, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:14] (step=0008956) Train Loss: 0.1684, Train Steps/Sec: 0.28, Epoch: 0.17403808783521182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 8957, "loss": 0.17710401117801666, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9982051849365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:17] (step=0008957) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.17405752040419745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 8958, "loss": 0.19964918494224548, "memory_gb": 7.721559524536133, "step_time_ms": 3348.1950759887695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:21] (step=0008958) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.17407695297318304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 8959, "loss": 0.18635182082653046, "memory_gb": 7.721559524536133, "step_time_ms": 3353.421211242676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:24] (step=0008959) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.17409638554216866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 8960, "loss": 0.21394068002700806, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1200160980225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:28] (step=0008960) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.17411581811115429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8961, "loss": 0.29440397024154663, "memory_gb": 7.721559524536133, "step_time_ms": 3352.684736251831, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:32] (step=0008961) Train Loss: 0.3023, Train Steps/Sec: 0.28, Epoch: 0.1741352506801399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 8962, "loss": 0.2655479907989502, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0242137908936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:35] (step=0008962) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.17415468324912553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8963, "loss": 0.24059513211250305, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4210872650146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:39] (step=0008963) Train Loss: 0.1958, Train Steps/Sec: 0.28, Epoch: 0.17417411581811115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 8964, "loss": 0.250717431306839, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5544147491455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:42] (step=0008964) Train Loss: 0.2360, Train Steps/Sec: 0.27, Epoch: 0.17419354838709677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 8965, "loss": 0.19138911366462708, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8529148101807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:46] (step=0008965) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.1742129809560824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 8966, "loss": 0.21579837799072266, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6009998321533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:49] (step=0008966) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.17423241352506802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 8967, "loss": 0.26485586166381836, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2122116088867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:53] (step=0008967) Train Loss: 0.2272, Train Steps/Sec: 0.28, Epoch: 0.17425184609405364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 08:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8968, "loss": 0.2644974887371063, "memory_gb": 7.721559524536133, "step_time_ms": 3348.2823371887207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 08:59:57] (step=0008968) Train Loss: 0.3023, Train Steps/Sec: 0.28, Epoch: 0.17427127866303926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 8969, "loss": 0.32205212116241455, "memory_gb": 7.721559524536133, "step_time_ms": 3357.041835784912, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:00] (step=0008969) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.1742907112320249, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 8970, "loss": 0.28180670738220215, "memory_gb": 7.721559524536133, "step_time_ms": 3356.672525405884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:04] (step=0008970) Train Loss: 0.2448, Train Steps/Sec: 0.28, Epoch: 0.17431014380101048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 8971, "loss": 0.2825324833393097, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4041595458984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:07] (step=0008971) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.1743295763699961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 8972, "loss": 0.28378844261169434, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4295768737793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:11] (step=0008972) Train Loss: 0.2563, Train Steps/Sec: 0.28, Epoch: 0.17434900893898173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8973, "loss": 0.23830848932266235, "memory_gb": 7.721559524536133, "step_time_ms": 3354.696273803711, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:15] (step=0008973) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.17436844150796735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 8974, "loss": 0.258097767829895, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4670085906982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:18] (step=0008974) Train Loss: 0.2722, Train Steps/Sec: 0.28, Epoch: 0.17438787407695297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8975, "loss": 0.2614325284957886, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7163944244385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:22] (step=0008975) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.1744073066459386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 8976, "loss": 0.2382611334323883, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3102951049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:25] (step=0008976) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.17442673921492421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8977, "loss": 0.3130427598953247, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3952236175537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:29] (step=0008977) Train Loss: 0.2851, Train Steps/Sec: 0.28, Epoch: 0.17444617178390984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 8978, "loss": 0.3494223356246948, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4068031311035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:32] (step=0008978) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.17446560435289546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 8979, "loss": 0.19138576090335846, "memory_gb": 7.721559524536133, "step_time_ms": 3343.078851699829, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:36] (step=0008979) Train Loss: 0.1901, Train Steps/Sec: 0.28, Epoch: 0.17448503692188108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 8980, "loss": 0.15764224529266357, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9465408325195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:39] (step=0008980) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.1745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 8981, "loss": 0.18527020514011383, "memory_gb": 7.721559524536133, "step_time_ms": 3361.682415008545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:43] (step=0008981) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.17452390205985233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8982, "loss": 0.20604020357131958, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9880046844482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:47] (step=0008982) Train Loss: 0.1833, Train Steps/Sec: 0.28, Epoch: 0.17454333462883792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 8983, "loss": 0.18765145540237427, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2757453918457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:50] (step=0008983) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.17456276719782354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 8984, "loss": 0.1256190836429596, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5431785583496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:54] (step=0008984) Train Loss: 0.1977, Train Steps/Sec: 0.28, Epoch: 0.17458219976680917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:00:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 8985, "loss": 0.2643930912017822, "memory_gb": 7.721559524536133, "step_time_ms": 3360.713481903076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:00:57] (step=0008985) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.1746016323357948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 8986, "loss": 0.10027122497558594, "memory_gb": 7.721559524536133, "step_time_ms": 3358.012914657593, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:01] (step=0008986) Train Loss: 0.1698, Train Steps/Sec: 0.28, Epoch: 0.1746210649047804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 8987, "loss": 0.24047410488128662, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5659008026123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:05] (step=0008987) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.17464049747376603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 8988, "loss": 0.1940060704946518, "memory_gb": 7.721559524536133, "step_time_ms": 3362.905263900757, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:08] (step=0008988) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.17465993004275165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 8989, "loss": 0.2185530811548233, "memory_gb": 7.715639114379883, "step_time_ms": 3326.0128498077393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:12] (step=0008989) Train Loss: 0.2456, Train Steps/Sec: 0.28, Epoch: 0.17467936261173728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 8990, "loss": 0.2546229958534241, "memory_gb": 7.721559524536133, "step_time_ms": 3361.511707305908, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:15] (step=0008990) Train Loss: 0.1984, Train Steps/Sec: 0.28, Epoch: 0.1746987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 8991, "loss": 0.27891474962234497, "memory_gb": 7.721559524536133, "step_time_ms": 3363.605260848999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:19] (step=0008991) Train Loss: 0.2735, Train Steps/Sec: 0.28, Epoch: 0.17471822774970852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 8992, "loss": 0.18961837887763977, "memory_gb": 7.721559524536133, "step_time_ms": 3364.216089248657, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:22] (step=0008992) Train Loss: 0.1912, Train Steps/Sec: 0.28, Epoch: 0.17473766031869414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 8993, "loss": 0.2101111114025116, "memory_gb": 7.721559524536133, "step_time_ms": 3363.516330718994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:26] (step=0008993) Train Loss: 0.2056, Train Steps/Sec: 0.28, Epoch: 0.17475709288767974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 8994, "loss": 0.2191276252269745, "memory_gb": 7.721559524536133, "step_time_ms": 3359.734296798706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:29] (step=0008994) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.17477652545666536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 8995, "loss": 0.207072913646698, "memory_gb": 7.721559524536133, "step_time_ms": 3505.2223205566406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:33] (step=0008995) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.17479595802565098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 8996, "loss": 0.26777368783950806, "memory_gb": 7.721559524536133, "step_time_ms": 3364.589214324951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:37] (step=0008996) Train Loss: 0.3051, Train Steps/Sec: 0.28, Epoch: 0.1748153905946366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 8997, "loss": 0.3312416672706604, "memory_gb": 7.721559524536133, "step_time_ms": 3360.184907913208, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:40] (step=0008997) Train Loss: 0.3594, Train Steps/Sec: 0.28, Epoch: 0.17483482316362223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 8998, "loss": 0.18021929264068604, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6571941375732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:44] (step=0008998) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.17485425573260785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 8999, "loss": 0.2451089471578598, "memory_gb": 7.721559524536133, "step_time_ms": 3346.8399047851562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:47] (step=0008999) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.17487368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9000, "loss": 0.2179076373577118, "memory_gb": 7.721559524536133, "step_time_ms": 3356.036901473999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:51] (step=0009000) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.1748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9001, "loss": 0.326842725276947, "memory_gb": 7.721559524536133, "step_time_ms": 3356.677532196045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:55] (step=0009001) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.17491255343956472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:01:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9002, "loss": 0.24487736821174622, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9746437072754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:01:58] (step=0009002) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.17493198600855034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9003, "loss": 0.2647164463996887, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3623638153076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:02] (step=0009003) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.17495141857753596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9004, "loss": 0.2061455100774765, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6981296539307, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:05] (step=0009004) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.17497085114652158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9005, "loss": 0.14463111758232117, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5823307037354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:09] (step=0009005) Train Loss: 0.1879, Train Steps/Sec: 0.28, Epoch: 0.17499028371550718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9006, "loss": 0.15067553520202637, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5711975097656, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:12] (step=0009006) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.1750097162844928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9007, "loss": 0.23520216345787048, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3873043060303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:16] (step=0009007) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.17502914885347842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9008, "loss": 0.22221244871616364, "memory_gb": 7.721559524536133, "step_time_ms": 3352.095127105713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:20] (step=0009008) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.17504858142246404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9009, "loss": 0.2019520103931427, "memory_gb": 7.721559524536133, "step_time_ms": 3358.431577682495, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:23] (step=0009009) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.17506801399144967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9010, "loss": 0.2643297612667084, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4889850616455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:27] (step=0009010) Train Loss: 0.2939, Train Steps/Sec: 0.28, Epoch: 0.1750874465604353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9011, "loss": 0.26056820154190063, "memory_gb": 7.721559524536133, "step_time_ms": 3356.942653656006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:30] (step=0009011) Train Loss: 0.2107, Train Steps/Sec: 0.28, Epoch: 0.1751068791294209, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9012, "loss": 0.20026302337646484, "memory_gb": 7.721559524536133, "step_time_ms": 3349.6503829956055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:34] (step=0009012) Train Loss: 0.1913, Train Steps/Sec: 0.27, Epoch: 0.17512631169840653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9013, "loss": 0.2787056267261505, "memory_gb": 7.721559524536133, "step_time_ms": 3354.503393173218, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:38] (step=0009013) Train Loss: 0.2873, Train Steps/Sec: 0.28, Epoch: 0.17514574426739216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9014, "loss": 0.17716479301452637, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8600158691406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:41] (step=0009014) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.17516517683637778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9015, "loss": 0.20014262199401855, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0491333007812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:45] (step=0009015) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.1751846094053634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9016, "loss": 0.27657920122146606, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2121601104736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:48] (step=0009016) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.175204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9017, "loss": 0.2662147879600525, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8871746063232, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:52] (step=0009017) Train Loss: 0.2605, Train Steps/Sec: 0.28, Epoch: 0.17522347454333462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9018, "loss": 0.3511008024215698, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2637310028076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:56] (step=0009018) Train Loss: 0.2878, Train Steps/Sec: 0.28, Epoch: 0.17524290711232024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:02:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9019, "loss": 0.22270843386650085, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1762828826904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:02:59] (step=0009019) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.17526233968130586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9020, "loss": 0.2570019066333771, "memory_gb": 7.721559524536133, "step_time_ms": 3356.764554977417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:03] (step=0009020) Train Loss: 0.2678, Train Steps/Sec: 0.28, Epoch: 0.17528177225029148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9021, "loss": 0.20159879326820374, "memory_gb": 7.721559524536133, "step_time_ms": 3347.836971282959, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:06] (step=0009021) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.1753012048192771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9022, "loss": 0.18481019139289856, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2512397766113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:10] (step=0009022) Train Loss: 0.1641, Train Steps/Sec: 0.28, Epoch: 0.17532063738826273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9023, "loss": 0.35227611660957336, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2052249908447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:13] (step=0009023) Train Loss: 0.3053, Train Steps/Sec: 0.28, Epoch: 0.17534006995724835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9024, "loss": 0.24525868892669678, "memory_gb": 7.721559524536133, "step_time_ms": 3352.023124694824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:17] (step=0009024) Train Loss: 0.2014, Train Steps/Sec: 0.28, Epoch: 0.17535950252623397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9025, "loss": 0.1904328316450119, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5356941223145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:21] (step=0009025) Train Loss: 0.1765, Train Steps/Sec: 0.28, Epoch: 0.1753789350952196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9026, "loss": 0.20906852185726166, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6667098999023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:24] (step=0009026) Train Loss: 0.2592, Train Steps/Sec: 0.28, Epoch: 0.17539836766420522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9027, "loss": 0.20408861339092255, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0804595947266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:28] (step=0009027) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.17541780023319084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9028, "loss": 0.23309700191020966, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7541370391846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:31] (step=0009028) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.17543723280217643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9029, "loss": 0.27492907643318176, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4988117218018, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:35] (step=0009029) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.17545666537116206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9030, "loss": 0.24687694013118744, "memory_gb": 7.721559524536133, "step_time_ms": 3345.5443382263184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:38] (step=0009030) Train Loss: 0.2407, Train Steps/Sec: 0.28, Epoch: 0.17547609794014768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9031, "loss": 0.22708065807819366, "memory_gb": 7.721559524536133, "step_time_ms": 3354.316473007202, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:42] (step=0009031) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.1754955305091333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9032, "loss": 0.34257644414901733, "memory_gb": 7.721559524536133, "step_time_ms": 3345.931053161621, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:46] (step=0009032) Train Loss: 0.3030, Train Steps/Sec: 0.28, Epoch: 0.17551496307811892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9033, "loss": 0.27676132321357727, "memory_gb": 7.721559524536133, "step_time_ms": 3337.735414505005, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:49] (step=0009033) Train Loss: 0.2479, Train Steps/Sec: 0.29, Epoch: 0.17553439564710455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9034, "loss": 0.3095178008079529, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3547134399414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:53] (step=0009034) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.17555382821609017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:03:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9035, "loss": 0.18570023775100708, "memory_gb": 7.721559524536133, "step_time_ms": 3342.7863121032715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:03:56] (step=0009035) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.1755732607850758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9036, "loss": 0.17977337539196014, "memory_gb": 7.721559524536133, "step_time_ms": 3349.356174468994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:00] (step=0009036) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.1755926933540614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9037, "loss": 0.21696992218494415, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5537719726562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:03] (step=0009037) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.17561212592304704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9038, "loss": 0.331894189119339, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0925254821777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:07] (step=0009038) Train Loss: 0.2845, Train Steps/Sec: 0.28, Epoch: 0.17563155849203266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9039, "loss": 0.3124619722366333, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8509349823, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:11] (step=0009039) Train Loss: 0.2792, Train Steps/Sec: 0.28, Epoch: 0.17565099106101828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9040, "loss": 0.3155457377433777, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6603870391846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:14] (step=0009040) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.17567042363000387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9041, "loss": 0.17905376851558685, "memory_gb": 7.721559524536133, "step_time_ms": 3347.0070362091064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:18] (step=0009041) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.1756898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9042, "loss": 0.29545170068740845, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9869060516357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:21] (step=0009042) Train Loss: 0.2358, Train Steps/Sec: 0.28, Epoch: 0.17570928876797512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9043, "loss": 0.13052710890769958, "memory_gb": 7.721559524536133, "step_time_ms": 3494.8477745056152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:25] (step=0009043) Train Loss: 0.1821, Train Steps/Sec: 0.28, Epoch: 0.17572872133696074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9044, "loss": 0.1375799924135208, "memory_gb": 7.721559524536133, "step_time_ms": 3349.968671798706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:28] (step=0009044) Train Loss: 0.1716, Train Steps/Sec: 0.28, Epoch: 0.17574815390594636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9045, "loss": 0.3076358139514923, "memory_gb": 7.721559524536133, "step_time_ms": 3351.8526554107666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:32] (step=0009045) Train Loss: 0.3295, Train Steps/Sec: 0.28, Epoch: 0.17576758647493199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9046, "loss": 0.24896690249443054, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6414165496826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:36] (step=0009046) Train Loss: 0.2716, Train Steps/Sec: 0.28, Epoch: 0.1757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9047, "loss": 0.27444273233413696, "memory_gb": 7.721559524536133, "step_time_ms": 3346.5635776519775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:39] (step=0009047) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.17580645161290323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9048, "loss": 0.19039011001586914, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5806407928467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:43] (step=0009048) Train Loss: 0.1775, Train Steps/Sec: 0.28, Epoch: 0.17582588418188885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9049, "loss": 0.18655140697956085, "memory_gb": 7.721559524536133, "step_time_ms": 3351.051092147827, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:46] (step=0009049) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.17584531675087448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9050, "loss": 0.2536337971687317, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1851978302, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:50] (step=0009050) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.1758647493198601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9051, "loss": 0.2536300718784332, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3007888793945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:53] (step=0009051) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.1758841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9052, "loss": 0.335696816444397, "memory_gb": 7.721559524536133, "step_time_ms": 3352.946996688843, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:04:57] (step=0009052) Train Loss: 0.2660, Train Steps/Sec: 0.27, Epoch: 0.17590361445783131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9053, "loss": 0.23841112852096558, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3642807006836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:01] (step=0009053) Train Loss: 0.2553, Train Steps/Sec: 0.28, Epoch: 0.17592304702681694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9054, "loss": 0.19866526126861572, "memory_gb": 7.721559524536133, "step_time_ms": 3353.323221206665, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:04] (step=0009054) Train Loss: 0.1911, Train Steps/Sec: 0.28, Epoch: 0.17594247959580256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9055, "loss": 0.17570710182189941, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2142429351807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:08] (step=0009055) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.17596191216478818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9056, "loss": 0.1955210268497467, "memory_gb": 7.721559524536133, "step_time_ms": 3356.757640838623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:11] (step=0009056) Train Loss: 0.2819, Train Steps/Sec: 0.28, Epoch: 0.1759813447337738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9057, "loss": 0.25830358266830444, "memory_gb": 7.721559524536133, "step_time_ms": 3352.1900177001953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:15] (step=0009057) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.17600077730275943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9058, "loss": 0.36122414469718933, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7211418151855, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:19] (step=0009058) Train Loss: 0.2903, Train Steps/Sec: 0.28, Epoch: 0.17602020987174505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9059, "loss": 0.18282689154148102, "memory_gb": 7.721559524536133, "step_time_ms": 3353.909969329834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:22] (step=0009059) Train Loss: 0.2056, Train Steps/Sec: 0.28, Epoch: 0.17603964244073067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9060, "loss": 0.24115046858787537, "memory_gb": 7.721559524536133, "step_time_ms": 3350.442409515381, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:05:26]
(step=0009060) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.1760590750097163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9061, "loss": 0.21222345530986786, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3330478668213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:29] (step=0009061) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.17607850757870191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9062, "loss": 0.15035155415534973, "memory_gb": 7.721559524536133, "step_time_ms": 3357.280492782593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:33] (step=0009062) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.17609794014768754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9063, "loss": 0.20155277848243713, "memory_gb": 7.721559524536133, "step_time_ms": 3352.494716644287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:36] (step=0009063) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.17611737271667313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9064, "loss": 0.16239817440509796, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4584980010986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:40] (step=0009064) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.17613680528565875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9065, "loss": 0.285072386264801, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4817905426025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:44] (step=0009065) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.17615623785464438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:47] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 9066, "loss": 0.1486324667930603, "memory_gb": 7.721559524536133, "step_time_ms": 3352.123975753784, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:47] (step=0009066) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.17617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9067, "loss": 0.21232977509498596, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1157245635986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:51] (step=0009067) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.17619510299261562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9068, "loss": 0.27119940519332886, "memory_gb": 7.721559524536133, "step_time_ms": 3355.529308319092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:54] (step=0009068) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.17621453556160124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9069, "loss": 0.30602285265922546, "memory_gb": 7.721559524536133, "step_time_ms": 3348.9277362823486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:05:58] (step=0009069) Train Loss: 0.3055, Train Steps/Sec: 0.28, Epoch: 0.17623396813058687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9070, "loss": 0.3424724340438843, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8603687286377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:01] (step=0009070) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.1762534006995725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9071, "loss": 0.3237383961677551, "memory_gb": 7.721559524536133, "step_time_ms": 3354.351043701172, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 09:06:05] (step=0009071) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.1762728332685581, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9072, "loss": 0.19946786761283875, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8808097839355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:09] (step=0009072) Train Loss: 0.1850, Train Steps/Sec: 0.28, Epoch: 0.17629226583754373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9073, "loss": 0.2394292950630188, "memory_gb": 7.721559524536133, "step_time_ms": 3350.084066390991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:12] (step=0009073) Train Loss: 0.2262, Train Steps/Sec: 0.28, Epoch: 0.17631169840652935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9074, "loss": 0.3231591582298279, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1441898345947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:16] (step=0009074) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.17633113097551498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9075, "loss": 0.1866094470024109, "memory_gb": 7.721559524536133, "step_time_ms": 3346.6148376464844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:19] (step=0009075) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.17635056354450057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9076, "loss": 0.2082405984401703, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7028770446777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:23] (step=0009076) Train Loss: 0.2656, Train Steps/Sec: 0.28, Epoch: 0.1763699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:26] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 9077, "loss": 0.3163852095603943, "memory_gb": 7.721559524536133, "step_time_ms": 3347.9416370391846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:26] (step=0009077) Train Loss: 0.2950, Train Steps/Sec: 0.28, Epoch: 0.17638942868247182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9078, "loss": 0.20792818069458008, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0727577209473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:30] (step=0009078) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.17640886125145744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9079, "loss": 0.25140076875686646, "memory_gb": 7.721559524536133, "step_time_ms": 3353.499412536621, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:33] (step=0009079) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.17642829382044306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9080, "loss": 0.20496618747711182, "memory_gb": 7.721559524536133, "step_time_ms": 3339.4546508789062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:37] (step=0009080) Train Loss: 0.2738, Train Steps/Sec: 0.28, Epoch: 0.17644772638942868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9081, "loss": 0.19650650024414062, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2370071411133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:41] (step=0009081) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.1764671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9082, "loss": 0.31312596797943115, "memory_gb": 7.721559524536133, "step_time_ms": 3357.48553276062, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 09:06:44] (step=0009082) Train Loss: 0.2731, Train Steps/Sec: 0.28, Epoch: 0.17648659152739993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9083, "loss": 0.3276183605194092, "memory_gb": 7.721559524536133, "step_time_ms": 3496.6180324554443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:48] (step=0009083) Train Loss: 0.3246, Train Steps/Sec: 0.28, Epoch: 0.17650602409638555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9084, "loss": 0.1893913596868515, "memory_gb": 7.721559524536133, "step_time_ms": 3355.384588241577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:51] (step=0009084) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.17652545666537117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9085, "loss": 0.2682938575744629, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7724952697754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:55] (step=0009085) Train Loss: 0.2784, Train Steps/Sec: 0.28, Epoch: 0.1765448892343568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9086, "loss": 0.1257416307926178, "memory_gb": 7.721559524536133, "step_time_ms": 3358.957290649414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:06:59] (step=0009086) Train Loss: 0.1705, Train Steps/Sec: 0.28, Epoch: 0.1765643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9087, "loss": 0.26204541325569153, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6690006256104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:02] (step=0009087) Train Loss: 0.2050, Train Steps/Sec: 0.28, Epoch: 0.176583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 09:07:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9088, "loss": 0.1908574253320694, "memory_gb": 7.721559524536133, "step_time_ms": 3358.942985534668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:06] (step=0009088) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.17660318694131363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9089, "loss": 0.20587803423404694, "memory_gb": 7.721559524536133, "step_time_ms": 3357.454538345337, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:09] (step=0009089) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.17662261951029926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9090, "loss": 0.12889133393764496, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6034355163574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:13] (step=0009090) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.17664205207928488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9091, "loss": 0.1746503710746765, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1082305908203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:16] (step=0009091) Train Loss: 0.2062, Train Steps/Sec: 0.28, Epoch: 0.1766614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9092, "loss": 0.21171844005584717, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2427196502686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:20] (step=0009092) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.17668091721725612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9093, "loss": 0.20437097549438477, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0526065826416, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:24] (step=0009093) Train Loss: 0.1768, Train Steps/Sec: 0.28, Epoch: 0.17670034978624174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9094, "loss": 0.3282846510410309, "memory_gb": 7.721559524536133, "step_time_ms": 3359.442710876465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:27] (step=0009094) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.17671978235522737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9095, "loss": 0.27627480030059814, "memory_gb": 7.721559524536133, "step_time_ms": 3361.097574234009, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:31] (step=0009095) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.176739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9096, "loss": 0.2547997832298279, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5996112823486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:34] (step=0009096) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.1767586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9097, "loss": 0.30482932925224304, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1207523345947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:38] (step=0009097) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.17677808006218423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9098, "loss": 0.21624237298965454, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5127334594727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:42] (step=0009098) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.17679751263116983, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 09:07:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9099, "loss": 0.2618701159954071, "memory_gb": 7.721559524536133, "step_time_ms": 3362.363576889038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:45] (step=0009099) Train Loss: 0.2322, Train Steps/Sec: 0.27, Epoch: 0.17681694520015545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9100, "loss": 0.14598050713539124, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0285358428955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:49] (step=0009100) Train Loss: 0.1775, Train Steps/Sec: 0.28, Epoch: 0.17683637776914107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9101, "loss": 0.2712651491165161, "memory_gb": 7.721559524536133, "step_time_ms": 3364.990711212158, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:52] (step=0009101) Train Loss: 0.2412, Train Steps/Sec: 0.28, Epoch: 0.1768558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9102, "loss": 0.3536595404148102, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6559715270996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:07:56] (step=0009102) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.17687524290711232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9103, "loss": 0.253984659910202, "memory_gb": 7.715639114379883, "step_time_ms": 3317.491292953491, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:00] (step=0009103) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.17689467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9104, "loss": 0.16278475522994995, "memory_gb": 7.721559524536133, "step_time_ms": 
3352.379322052002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:03] (step=0009104) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.17691410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9105, "loss": 0.19424748420715332, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0122928619385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:07] (step=0009105) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.17693354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9106, "loss": 0.22245314717292786, "memory_gb": 7.721559524536133, "step_time_ms": 3363.994359970093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:10] (step=0009106) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.1769529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9107, "loss": 0.3404957056045532, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4437160491943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:14] (step=0009107) Train Loss: 0.2825, Train Steps/Sec: 0.28, Epoch: 0.17697240575204043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9108, "loss": 0.21947917342185974, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7437915802, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:17] (step=0009108) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.17699183832102605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9109, "loss": 0.32698920369148254, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1485023498535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:21] (step=0009109) Train Loss: 0.2754, Train Steps/Sec: 0.28, Epoch: 0.17701127089001165, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9110, "loss": 0.2479703426361084, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4420890808105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:25] (step=0009110) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.17703070345899727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9111, "loss": 0.3038104772567749, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4980239868164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:28] (step=0009111) Train Loss: 0.2681, Train Steps/Sec: 0.28, Epoch: 0.1770501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9112, "loss": 0.3332165777683258, "memory_gb": 7.721559524536133, "step_time_ms": 3361.539840698242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:32] (step=0009112) Train Loss: 0.2785, Train Steps/Sec: 0.28, Epoch: 0.1770695685969685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9113, "loss": 0.20452310144901276, "memory_gb": 7.721559524536133, "step_time_ms": 3362.333297729492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:35] (step=0009113) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.17708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9114, "loss": 0.1585792601108551, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2733612060547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:39] (step=0009114) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.17710843373493976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9115, "loss": 0.3038557767868042, "memory_gb": 7.721559524536133, 
"step_time_ms": 3359.7545623779297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:43] (step=0009115) Train Loss: 0.2791, Train Steps/Sec: 0.28, Epoch: 0.17712786630392538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9116, "loss": 0.24912062287330627, "memory_gb": 7.721559524536133, "step_time_ms": 3356.433153152466, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:46] (step=0009116) Train Loss: 0.2886, Train Steps/Sec: 0.28, Epoch: 0.177147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9117, "loss": 0.2653745114803314, "memory_gb": 7.721559524536133, "step_time_ms": 3358.219623565674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:50] (step=0009117) Train Loss: 0.2676, Train Steps/Sec: 0.28, Epoch: 0.17716673144189662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9118, "loss": 0.24182209372520447, "memory_gb": 7.721559524536133, "step_time_ms": 3360.920190811157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:53] (step=0009118) Train Loss: 0.2461, Train Steps/Sec: 0.28, Epoch: 0.17718616401088225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:08:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9119, "loss": 0.35182875394821167, "memory_gb": 7.721559524536133, "step_time_ms": 3358.590602874756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:08:57] (step=0009119) Train Loss: 0.2999, Train Steps/Sec: 0.28, Epoch: 0.17720559657986787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9120, "loss": 0.23759442567825317, "memory_gb": 7.721559524536133, "step_time_ms": 3356.60982131958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:00] (step=0009120) Train Loss: 0.2454, Train Steps/Sec: 0.28, Epoch: 
0.1772250291488535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9121, "loss": 0.23479557037353516, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2232723236084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:04] (step=0009121) Train Loss: 0.2012, Train Steps/Sec: 0.28, Epoch: 0.17724446171783909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9122, "loss": 0.34151047468185425, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2448749542236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:08] (step=0009122) Train Loss: 0.2634, Train Steps/Sec: 0.28, Epoch: 0.1772638942868247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9123, "loss": 0.2770645022392273, "memory_gb": 7.721559524536133, "step_time_ms": 3360.139846801758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:11] (step=0009123) Train Loss: 0.2653, Train Steps/Sec: 0.28, Epoch: 0.17728332685581033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9124, "loss": 0.22350192070007324, "memory_gb": 7.721559524536133, "step_time_ms": 3356.501817703247, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:15] (step=0009124) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.17730275942479595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9125, "loss": 0.3384096920490265, "memory_gb": 7.721559524536133, "step_time_ms": 3355.85880279541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:18] (step=0009125) Train Loss: 0.3000, Train Steps/Sec: 0.28, Epoch: 0.17732219199378157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9126, "loss": 0.17693212628364563, "memory_gb": 
7.721559524536133, "step_time_ms": 3357.8758239746094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:22] (step=0009126) Train Loss: 0.1914, Train Steps/Sec: 0.28, Epoch: 0.1773416245627672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9127, "loss": 0.21417059004306793, "memory_gb": 7.721559524536133, "step_time_ms": 3360.51869392395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:26] (step=0009127) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.17736105713175282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9128, "loss": 0.20686016976833344, "memory_gb": 7.721559524536133, "step_time_ms": 3358.173131942749, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:29] (step=0009128) Train Loss: 0.1962, Train Steps/Sec: 0.28, Epoch: 0.17738048970073844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9129, "loss": 0.18553656339645386, "memory_gb": 7.721559524536133, "step_time_ms": 3350.388765335083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:33] (step=0009129) Train Loss: 0.1994, Train Steps/Sec: 0.28, Epoch: 0.17739992226972406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9130, "loss": 0.21986989676952362, "memory_gb": 7.721559524536133, "step_time_ms": 3497.788190841675, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:36] (step=0009130) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.1774193548387097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9131, "loss": 0.2929217219352722, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7699451446533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:40] (step=0009131) Train Loss: 0.2633, Train Steps/Sec: 
0.28, Epoch: 0.1774387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9132, "loss": 0.2833916246891022, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3272018432617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:43] (step=0009132) Train Loss: 0.2929, Train Steps/Sec: 0.28, Epoch: 0.17745821997668093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9133, "loss": 0.29346224665641785, "memory_gb": 7.721559524536133, "step_time_ms": 3354.128837585449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:47] (step=0009133) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.17747765254566653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9134, "loss": 0.14881668984889984, "memory_gb": 7.721559524536133, "step_time_ms": 3358.319044113159, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:51] (step=0009134) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.17749708511465215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9135, "loss": 0.1605886071920395, "memory_gb": 7.721559524536133, "step_time_ms": 3351.020336151123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:54] (step=0009135) Train Loss: 0.1775, Train Steps/Sec: 0.28, Epoch: 0.17751651768363777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:09:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9136, "loss": 0.12240120768547058, "memory_gb": 7.721559524536133, "step_time_ms": 3358.037233352661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:09:58] (step=0009136) Train Loss: 0.1941, Train Steps/Sec: 0.28, Epoch: 0.1775359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9137, "loss": 0.29543963074684143, 
"memory_gb": 7.721559524536133, "step_time_ms": 3355.9648990631104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:01] (step=0009137) Train Loss: 0.2871, Train Steps/Sec: 0.28, Epoch: 0.17755538282160901, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9138, "loss": 0.2273697704076767, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1229705810547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:05] (step=0009138) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.17757481539059464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9139, "loss": 0.2573959231376648, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0303916931152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:09] (step=0009139) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.17759424795958026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9140, "loss": 0.257796049118042, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1998138427734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:12] (step=0009140) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.17761368052856588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9141, "loss": 0.2998579144477844, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8036556243896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:16] (step=0009141) Train Loss: 0.2789, Train Steps/Sec: 0.28, Epoch: 0.1776331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9142, "loss": 0.183018296957016, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8738956451416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:19] (step=0009142) Train Loss: 0.1768, Train 
Steps/Sec: 0.28, Epoch: 0.17765254566653713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9143, "loss": 0.23051801323890686, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8759479522705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:23] (step=0009143) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.17767197823552275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9144, "loss": 0.30914875864982605, "memory_gb": 7.721559524536133, "step_time_ms": 3355.630874633789, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:26] (step=0009144) Train Loss: 0.2838, Train Steps/Sec: 0.28, Epoch: 0.17769141080450834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9145, "loss": 0.20140407979488373, "memory_gb": 7.721559524536133, "step_time_ms": 3351.8478870391846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:30] (step=0009145) Train Loss: 0.1862, Train Steps/Sec: 0.28, Epoch: 0.17771084337349397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9146, "loss": 0.20440185070037842, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0140647888184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:34] (step=0009146) Train Loss: 0.1954, Train Steps/Sec: 0.28, Epoch: 0.1777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9147, "loss": 0.19844144582748413, "memory_gb": 7.721559524536133, "step_time_ms": 3342.434883117676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:37] (step=0009147) Train Loss: 0.1929, Train Steps/Sec: 0.27, Epoch: 0.1777497085114652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9148, "loss": 
0.24555492401123047, "memory_gb": 7.721559524536133, "step_time_ms": 3348.668336868286, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:41] (step=0009148) Train Loss: 0.1922, Train Steps/Sec: 0.28, Epoch: 0.17776914108045083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9149, "loss": 0.1695048213005066, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9489040374756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:44] (step=0009149) Train Loss: 0.1891, Train Steps/Sec: 0.28, Epoch: 0.17778857364943645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9150, "loss": 0.15832854807376862, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8569679260254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:48] (step=0009150) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.17780800621842208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9151, "loss": 0.2004217803478241, "memory_gb": 7.721559524536133, "step_time_ms": 3356.318235397339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:52] (step=0009151) Train Loss: 0.1542, Train Steps/Sec: 0.28, Epoch: 0.1778274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9152, "loss": 0.24603271484375, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6646575927734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:55] (step=0009152) Train Loss: 0.2793, Train Steps/Sec: 0.28, Epoch: 0.17784687135639332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:10:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9153, "loss": 0.20940172672271729, "memory_gb": 7.721559524536133, "step_time_ms": 3358.021020889282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:10:59] (step=0009153) Train 
Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.17786630392537894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9154, "loss": 0.2513883709907532, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2118072509766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:02] (step=0009154) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.17788573649436457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9155, "loss": 0.23584769666194916, "memory_gb": 7.721559524536133, "step_time_ms": 3352.900266647339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:06] (step=0009155) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.1779051690633502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9156, "loss": 0.18299995362758636, "memory_gb": 7.721559524536133, "step_time_ms": 3357.146739959717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:09] (step=0009156) Train Loss: 0.2765, Train Steps/Sec: 0.28, Epoch: 0.17792460163233578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9157, "loss": 0.15352541208267212, "memory_gb": 7.721559524536133, "step_time_ms": 3356.180429458618, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:13] (step=0009157) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.1779440342013214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9158, "loss": 0.16171066462993622, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6415519714355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:17] (step=0009158) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.17796346677030703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 
9159, "loss": 0.21020753681659698, "memory_gb": 7.721559524536133, "step_time_ms": 3358.232021331787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:20] (step=0009159) Train Loss: 0.1850, Train Steps/Sec: 0.28, Epoch: 0.17798289933929265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9160, "loss": 0.3691028952598572, "memory_gb": 7.721559524536133, "step_time_ms": 3360.687732696533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:24] (step=0009160) Train Loss: 0.2793, Train Steps/Sec: 0.28, Epoch: 0.17800233190827827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9161, "loss": 0.26944953203201294, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6274852752686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:27] (step=0009161) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.1780217644772639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9162, "loss": 0.2838146686553955, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4014949798584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:31] (step=0009162) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.17804119704624952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9163, "loss": 0.233353853225708, "memory_gb": 7.721559524536133, "step_time_ms": 3353.433847427368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:34] (step=0009163) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.17806062961523514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9164, "loss": 0.24959714710712433, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6292781829834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:38] 
(step=0009164) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.17808006218422076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9165, "loss": 0.2821519374847412, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9374294281006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:42] (step=0009165) Train Loss: 0.2695, Train Steps/Sec: 0.28, Epoch: 0.17809949475320638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9166, "loss": 0.27889877557754517, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8010330200195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:45] (step=0009166) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.178118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9167, "loss": 0.1979612410068512, "memory_gb": 7.721559524536133, "step_time_ms": 3345.0770378112793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:49] (step=0009167) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.1781383598911776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9168, "loss": 0.22588400542736053, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8972816467285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:52] (step=0009168) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.17815779246016322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9169, "loss": 0.20382148027420044, "memory_gb": 7.721559524536133, "step_time_ms": 3361.537456512451, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:56] (step=0009169) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.17817722502914884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:11:59] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 9170, "loss": 0.2701101005077362, "memory_gb": 7.721559524536133, "step_time_ms": 3344.078302383423, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:11:59] (step=0009170) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.17819665759813447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9171, "loss": 0.22209030389785767, "memory_gb": 7.721559524536133, "step_time_ms": 3498.220443725586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:03] (step=0009171) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.1782160901671201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9172, "loss": 0.35830047726631165, "memory_gb": 7.721559524536133, "step_time_ms": 3357.382297515869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:07] (step=0009172) Train Loss: 0.3654, Train Steps/Sec: 0.28, Epoch: 0.1782355227361057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9173, "loss": 0.1403082013130188, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6249141693115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:10] (step=0009173) Train Loss: 0.2454, Train Steps/Sec: 0.28, Epoch: 0.17825495530509133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9174, "loss": 0.20343734323978424, "memory_gb": 7.715639114379883, "step_time_ms": 3326.5397548675537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:14] (step=0009174) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.17827438787407696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9175, "loss": 0.13209089636802673, "memory_gb": 7.721559524536133, "step_time_ms": 3339.6947383880615, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 09:12:17] (step=0009175) Train Loss: 0.1820, Train Steps/Sec: 0.28, Epoch: 0.17829382044306258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9176, "loss": 0.29655709862709045, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3191890716553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:21] (step=0009176) Train Loss: 0.2807, Train Steps/Sec: 0.28, Epoch: 0.1783132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9177, "loss": 0.2430969476699829, "memory_gb": 7.721559524536133, "step_time_ms": 3361.213445663452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:24] (step=0009177) Train Loss: 0.2657, Train Steps/Sec: 0.28, Epoch: 0.17833268558103382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9178, "loss": 0.2925819158554077, "memory_gb": 7.715639114379883, "step_time_ms": 3322.1423625946045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:28] (step=0009178) Train Loss: 0.2950, Train Steps/Sec: 0.28, Epoch: 0.17835211815001945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9179, "loss": 0.32087141275405884, "memory_gb": 7.721559524536133, "step_time_ms": 3359.217882156372, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:32] (step=0009179) Train Loss: 0.2841, Train Steps/Sec: 0.28, Epoch: 0.17837155071900504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9180, "loss": 0.1825312376022339, "memory_gb": 7.721559524536133, "step_time_ms": 3357.523202896118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:35] (step=0009180) Train Loss: 0.1853, Train Steps/Sec: 0.28, Epoch: 0.17839098328799066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:39] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 9181, "loss": 0.27584308385849, "memory_gb": 7.721559524536133, "step_time_ms": 3360.640048980713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:39] (step=0009181) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.17841041585697628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9182, "loss": 0.23642608523368835, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1674823760986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:42] (step=0009182) Train Loss: 0.2957, Train Steps/Sec: 0.28, Epoch: 0.1784298484259619, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9183, "loss": 0.35361671447753906, "memory_gb": 7.721559524536133, "step_time_ms": 3359.180212020874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:46] (step=0009183) Train Loss: 0.3047, Train Steps/Sec: 0.28, Epoch: 0.17844928099494753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9184, "loss": 0.14407417178153992, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1553440093994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:49] (step=0009184) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.17846871356393315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9185, "loss": 0.2413926124572754, "memory_gb": 7.721559524536133, "step_time_ms": 3362.088441848755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:12:53] (step=0009185) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.17848814613291877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:12:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9186, "loss": 0.2440004199743271, "memory_gb": 7.721559524536133, "step_time_ms": 3356.580972671509, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 09:12:57] (step=0009186) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.1785075787019044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9187, "loss": 0.2327251434326172, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6598377227783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:00] (step=0009187) Train Loss: 0.2903, Train Steps/Sec: 0.28, Epoch: 0.17852701127089002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9188, "loss": 0.22269266843795776, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9867095947266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:04] (step=0009188) Train Loss: 0.2097, Train Steps/Sec: 0.27, Epoch: 0.17854644383987564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9189, "loss": 0.12686946988105774, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6815853118896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:08] (step=0009189) Train Loss: 0.1916, Train Steps/Sec: 0.28, Epoch: 0.17856587640886126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9190, "loss": 0.28205031156539917, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2774143218994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:11] (step=0009190) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.17858530897784688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9191, "loss": 0.15826794505119324, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4960956573486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:15] (step=0009191) Train Loss: 0.1879, Train Steps/Sec: 0.28, Epoch: 0.17860474154683248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 09:13:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9192, "loss": 0.18537858128547668, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8217964172363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:18] (step=0009192) Train Loss: 0.1821, Train Steps/Sec: 0.28, Epoch: 0.1786241741158181, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9193, "loss": 0.24531292915344238, "memory_gb": 7.721559524536133, "step_time_ms": 3356.532096862793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:22] (step=0009193) Train Loss: 0.2626, Train Steps/Sec: 0.28, Epoch: 0.17864360668480372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9194, "loss": 0.2616640329360962, "memory_gb": 7.721559524536133, "step_time_ms": 3361.158847808838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:25] (step=0009194) Train Loss: 0.3131, Train Steps/Sec: 0.28, Epoch: 0.17866303925378935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9195, "loss": 0.15643839538097382, "memory_gb": 7.721559524536133, "step_time_ms": 3356.144905090332, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:29] (step=0009195) Train Loss: 0.1549, Train Steps/Sec: 0.28, Epoch: 0.17868247182277497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9196, "loss": 0.23653215169906616, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9963912963867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:33] (step=0009196) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.1787019043917606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9197, "loss": 0.18436166644096375, "memory_gb": 7.721559524536133, "step_time_ms": 3360.872268676758, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:36] (step=0009197) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.1787213369607462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9198, "loss": 0.24421527981758118, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4049682617188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:40] (step=0009198) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.17874076952973184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9199, "loss": 0.2006072998046875, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1481914520264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:43] (step=0009199) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.17876020209871746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9200, "loss": 0.23031532764434814, "memory_gb": 7.721559524536133, "step_time_ms": 3366.267204284668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:47] (step=0009200) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.17877963466770308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9201, "loss": 0.22054632008075714, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6536598205566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:51] (step=0009201) Train Loss: 0.2024, Train Steps/Sec: 0.28, Epoch: 0.1787990672366887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:13:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9202, "loss": 0.1512519121170044, "memory_gb": 7.721559524536133, "step_time_ms": 3364.654779434204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:54] (step=0009202) Train Loss: 0.1977, Train Steps/Sec: 0.28, Epoch: 0.1788184998056743, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 09:13:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9203, "loss": 0.1234041079878807, "memory_gb": 7.721559524536133, "step_time_ms": 3361.88006401062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:13:58] (step=0009203) Train Loss: 0.1434, Train Steps/Sec: 0.28, Epoch: 0.17883793237465992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9204, "loss": 0.31683802604675293, "memory_gb": 7.721559524536133, "step_time_ms": 3366.121292114258, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:01] (step=0009204) Train Loss: 0.3073, Train Steps/Sec: 0.28, Epoch: 0.17885736494364554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9205, "loss": 0.23057058453559875, "memory_gb": 7.721559524536133, "step_time_ms": 3362.865447998047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:05] (step=0009205) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.17887679751263116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9206, "loss": 0.332034170627594, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9912605285645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:08] (step=0009206) Train Loss: 0.2994, Train Steps/Sec: 0.28, Epoch: 0.17889623008161679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9207, "loss": 0.24560396373271942, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8999462127686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:12] (step=0009207) Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.1789156626506024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9208, "loss": 0.20048055052757263, "memory_gb": 7.721559524536133, "step_time_ms": 
3362.2469902038574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:16] (step=0009208) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.17893509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9209, "loss": 0.12671567499637604, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5411262512207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:19] (step=0009209) Train Loss: 0.1302, Train Steps/Sec: 0.28, Epoch: 0.17895452778857365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9210, "loss": 0.38080981373786926, "memory_gb": 7.715639114379883, "step_time_ms": 3329.807996749878, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:23] (step=0009210) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.17897396035755928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9211, "loss": 0.23907232284545898, "memory_gb": 7.715639114379883, "step_time_ms": 3321.406126022339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:26] (step=0009211) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.1789933929265449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9212, "loss": 0.21308791637420654, "memory_gb": 7.721559524536133, "step_time_ms": 3503.1161308288574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:30] (step=0009212) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.17901282549553052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9213, "loss": 0.3608606457710266, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4676303863525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:34] (step=0009213) Train Loss: 0.2781, Train Steps/Sec: 0.28, Epoch: 
0.17903225806451614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9214, "loss": 0.21426773071289062, "memory_gb": 7.721559524536133, "step_time_ms": 3359.553813934326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:37] (step=0009214) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.17905169063350174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9215, "loss": 0.1859799176454544, "memory_gb": 7.721559524536133, "step_time_ms": 3357.54656791687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:41] (step=0009215) Train Loss: 0.1836, Train Steps/Sec: 0.28, Epoch: 0.17907112320248736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9216, "loss": 0.24494612216949463, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5258769989014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:44] (step=0009216) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.17909055577147298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9217, "loss": 0.26565951108932495, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5408668518066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:48] (step=0009217) Train Loss: 0.2919, Train Steps/Sec: 0.28, Epoch: 0.1791099883404586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9218, "loss": 0.19183458387851715, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9039573669434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:51] (step=0009218) Train Loss: 0.1667, Train Steps/Sec: 0.28, Epoch: 0.17912942090944423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9219, "loss": 0.2006649225950241, 
"memory_gb": 7.721559524536133, "step_time_ms": 3363.980531692505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:55] (step=0009219) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.17914885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:14:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9220, "loss": 0.22450321912765503, "memory_gb": 7.721559524536133, "step_time_ms": 3361.351490020752, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:14:59] (step=0009220) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.17916828604741547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:15:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9221, "loss": 0.2940934896469116, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4959297180176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:15:02] (step=0009221) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.1791877186164011, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:15:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9222, "loss": 0.2932606637477875, "memory_gb": 7.721559524536133, "step_time_ms": 3357.693672180176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:15:06] (step=0009222) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.17920715118538671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:15:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9223, "loss": 0.14134788513183594, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7098331451416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:15:09] (step=0009223) Train Loss: 0.2443, Train Steps/Sec: 0.28, Epoch: 0.17922658375437234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:15:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9224, "loss": 0.2996940016746521, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1748008728027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:15:13] (step=0009224) Train Loss: 0.2975, 
Train Steps/Sec: 0.28, Epoch: 0.17924601632335796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9225, "loss": 0.214620441198349, "memory_gb": 7.721559524536133, "step_time_ms": 3358.853340148926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:17] (step=0009225) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.17926544889234355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9226, "loss": 0.2429216504096985, "memory_gb": 7.721559524536133, "step_time_ms": 3353.973150253296, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:20] (step=0009226) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.17928488146132918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9227, "loss": 0.3344106674194336, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0122203826904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:24] (step=0009227) Train Loss: 0.3337, Train Steps/Sec: 0.28, Epoch: 0.1793043140303148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9228, "loss": 0.2232913374900818, "memory_gb": 7.721559524536133, "step_time_ms": 3361.116886138916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:27] (step=0009228) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.17932374659930042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9229, "loss": 0.23698967695236206, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3456745147705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:31] (step=0009229) Train Loss: 0.2140, Train Steps/Sec: 0.28, Epoch: 0.17934317916828604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9230, "loss": 0.22163543105125427, "memory_gb": 7.721559524536133, "step_time_ms": 3349.989652633667, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:34] (step=0009230) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.17936261173727167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9231, "loss": 0.25483304262161255, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3454360961914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:38] (step=0009231) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.1793820443062573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9232, "loss": 0.23756664991378784, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9768199920654, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:42] (step=0009232) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.1794014768752429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9233, "loss": 0.3076525330543518, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4461936950684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:45] (step=0009233) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.17942090944422853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9234, "loss": 0.2097422033548355, "memory_gb": 7.721559524536133, "step_time_ms": 3355.231523513794, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:49] (step=0009234) Train Loss: 0.2048, Train Steps/Sec: 0.28, Epoch: 0.17944034201321415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9235, "loss": 0.2431795448064804, "memory_gb": 7.721559524536133, "step_time_ms": 3356.689691543579, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:52] (step=0009235) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.17945977458219978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:15:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9236, "loss": 0.2522268295288086, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2406253814697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:15:56] (step=0009236) Train Loss: 0.2841, Train Steps/Sec: 0.27, Epoch: 0.1794792071511854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9237, "loss": 0.2658335566520691, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6954650878906, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:00] (step=0009237) Train Loss: 0.2899, Train Steps/Sec: 0.28, Epoch: 0.179498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9238, "loss": 0.2249181568622589, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2583198547363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:03] (step=0009238) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.17951807228915662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9239, "loss": 0.29140329360961914, "memory_gb": 7.721559524536133, "step_time_ms": 3354.288101196289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:07] (step=0009239) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.17953750485814224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9240, "loss": 0.3576406240463257, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8293113708496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:10] (step=0009240) Train Loss: 0.2769, Train Steps/Sec: 0.28, Epoch: 0.17955693742712786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9241, "loss": 0.2985510528087616, "memory_gb": 7.721559524536133, "step_time_ms": 3346.377372741699, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:14] (step=0009241) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.17957636999611348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9242, "loss": 0.22786206007003784, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5830459594727, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:17] (step=0009242) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 0.1795958025650991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9243, "loss": 0.23044352233409882, "memory_gb": 7.721559524536133, "step_time_ms": 3353.975772857666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:21] (step=0009243) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.17961523513408473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9244, "loss": 0.2562931478023529, "memory_gb": 7.721559524536133, "step_time_ms": 3356.717109680176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:25] (step=0009244) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.17963466770307035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9245, "loss": 0.2694387435913086, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7097911834717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:28] (step=0009245) Train Loss: 0.2991, Train Steps/Sec: 0.28, Epoch: 0.17965410027205597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9246, "loss": 0.22964878380298615, "memory_gb": 7.715639114379883, "step_time_ms": 3321.7709064483643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:32] (step=0009246) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.1796735328410416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9247, "loss": 0.30395805835723877, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5124320983887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:35] (step=0009247) Train Loss: 0.3116, Train Steps/Sec: 0.28, Epoch: 0.17969296541002722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9248, "loss": 0.27072033286094666, "memory_gb": 7.721559524536133, "step_time_ms": 3353.29270362854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:39] (step=0009248) Train Loss: 0.2358, Train Steps/Sec: 0.28, Epoch: 0.17971239797901284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9249, "loss": 0.14491792023181915, "memory_gb": 7.721559524536133, "step_time_ms": 3349.4105339050293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:43] (step=0009249) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.17973183054799843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9250, "loss": 0.2105514109134674, "memory_gb": 7.721559524536133, "step_time_ms": 3350.588321685791, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:46] (step=0009250) Train Loss: 0.1946, Train Steps/Sec: 0.28, Epoch: 0.17975126311698406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9251, "loss": 0.23335841298103333, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4653911590576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:50] (step=0009251) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.17977069568596968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9252, "loss": 0.28718817234039307, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5710105895996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:53] (step=0009252) Train Loss: 0.2725, Train Steps/Sec: 0.28, Epoch: 0.1797901282549553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:16:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9253, "loss": 0.29127174615859985, "memory_gb": 7.721559524536133, "step_time_ms": 3345.7441329956055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:16:57] (step=0009253) Train Loss: 0.2965, Train Steps/Sec: 0.28, Epoch: 0.17980956082394092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9254, "loss": 0.26613062620162964, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2711219787598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:00] (step=0009254) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.17982899339292654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9255, "loss": 0.23326361179351807, "memory_gb": 7.721559524536133, "step_time_ms": 3357.769012451172, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:04] (step=0009255) Train Loss: 0.2809, Train Steps/Sec: 0.28, Epoch: 0.17984842596191217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9256, "loss": 0.27879083156585693, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7169227600098, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:08] (step=0009256) Train Loss: 0.2494, Train Steps/Sec: 0.28, Epoch: 0.1798678585308978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9257, "loss": 0.23793882131576538, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6814403533936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:11] (step=0009257) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.1798872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9258, "loss": 0.17324575781822205, "memory_gb": 7.721559524536133, "step_time_ms": 3352.008819580078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:15] (step=0009258) Train Loss: 0.1819, Train Steps/Sec: 0.28, Epoch: 0.17990672366886903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9259, "loss": 0.2697277069091797, "memory_gb": 7.721559524536133, "step_time_ms": 3347.368001937866, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:18] (step=0009259) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.17992615623785466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9260, "loss": 0.24361707270145416, "memory_gb": 7.721559524536133, "step_time_ms": 3499.43470954895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:22] (step=0009260) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.17994558880684025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9261, "loss": 0.22916917502880096, "memory_gb": 7.721559524536133, "step_time_ms": 3356.898784637451, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:25] (step=0009261) Train Loss: 0.2523, Train Steps/Sec: 0.28, Epoch: 0.17996502137582587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9262, "loss": 0.19810912013053894, "memory_gb": 7.721559524536133, "step_time_ms": 3359.300374984741, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:29] (step=0009262) Train Loss: 0.2062, Train Steps/Sec: 0.28, Epoch: 0.1799844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9263, "loss": 0.27456605434417725, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9559326171875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:33] (step=0009263) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.18000388651379712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9264, "loss": 0.25888118147850037, "memory_gb": 7.715639114379883, "step_time_ms": 3318.1538581848145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:36] (step=0009264) Train Loss: 0.2381, Train Steps/Sec: 0.28, Epoch: 0.18002331908278274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9265, "loss": 0.1597663015127182, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7704944610596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:40] (step=0009265) Train Loss: 0.2255, Train Steps/Sec: 0.28, Epoch: 0.18004275165176836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9266, "loss": 0.2157631814479828, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5196266174316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:43] (step=0009266) Train Loss: 0.2315, Train Steps/Sec: 0.28, Epoch: 0.18006218422075398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9267, "loss": 0.2427264004945755, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7260971069336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:47] (step=0009267) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.1800816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9268, "loss": 0.1835007667541504, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3369464874268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:50] (step=0009268) Train Loss: 0.1935, Train Steps/Sec: 0.28, Epoch: 0.18010104935872523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9269, "loss": 0.2954389452934265, "memory_gb": 7.721559524536133, "step_time_ms": 3356.513023376465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:54] (step=0009269) Train Loss: 0.2849, Train Steps/Sec: 0.28, Epoch: 0.18012048192771085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:17:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9270, "loss": 0.34214872121810913, "memory_gb": 7.721559524536133, "step_time_ms": 3360.302209854126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:17:58] (step=0009270) Train Loss: 0.3106, Train Steps/Sec: 0.28, Epoch: 0.18013991449669647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9271, "loss": 0.19697654247283936, "memory_gb": 7.721559524536133, "step_time_ms": 3357.016086578369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:01] (step=0009271) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.1801593470656821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9272, "loss": 0.2374027967453003, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1861000061035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:05] (step=0009272) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.1801787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9273, "loss": 0.31858503818511963, "memory_gb": 7.721559524536133, "step_time_ms": 3356.614589691162, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:08] (step=0009273) Train Loss: 0.2620, Train Steps/Sec: 0.28, Epoch: 0.1801982122036533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9274, "loss": 0.2265051007270813, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5243949890137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:12] (step=0009274) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.18021764477263894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9275, "loss": 0.07679177820682526, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9622764587402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:15] (step=0009275) Train Loss: 0.1648, Train Steps/Sec: 0.28, Epoch: 0.18023707734162456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9276, "loss": 0.2935580611228943, "memory_gb": 7.715639114379883, "step_time_ms": 3322.9289054870605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:19] (step=0009276) Train Loss: 0.3167, Train Steps/Sec: 0.27, Epoch: 0.18025650991061018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9277, "loss": 0.30026519298553467, "memory_gb": 7.721559524536133, "step_time_ms": 3355.647087097168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:23] (step=0009277) Train Loss: 0.3407, Train Steps/Sec: 0.28, Epoch: 0.1802759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9278, "loss": 0.17935603857040405, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3568077087402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:26] (step=0009278) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.18029537504858142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9279, "loss": 0.2725372612476349, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2772274017334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:30] (step=0009279) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.18031480761756705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9280, "loss": 0.27692151069641113, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1221103668213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:33] (step=0009280) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.18033424018655267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9281, "loss": 0.22786618769168854, "memory_gb": 7.721559524536133, "step_time_ms": 3356.666326522827, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:37] (step=0009281) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.1803536727555383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9282, "loss": 0.1563662886619568, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9818992614746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:41] (step=0009282) Train Loss: 0.1557, Train Steps/Sec: 0.28, Epoch: 0.1803731053245239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9283, "loss": 0.23563507199287415, "memory_gb": 7.715639114379883, "step_time_ms": 3318.78924369812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:44] (step=0009283) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.18039253789350954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9284, "loss": 0.2982995808124542, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1597595214844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:48] (step=0009284) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.18041197046249513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9285, "loss": 0.2471550554037094, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9718132019043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:51] (step=0009285) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.18043140303148075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9286, "loss": 0.22224801778793335, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5919399261475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:55] (step=0009286) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.18045083560046637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:18:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9287, "loss": 0.2429945319890976, "memory_gb": 7.721559524536133, "step_time_ms": 3353.148937225342, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:18:59] (step=0009287) Train Loss: 0.2381, Train Steps/Sec: 0.28, Epoch: 0.180470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9288, "loss": 0.20993274450302124, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6977977752686, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:02] (step=0009288) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.18048970073843762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9289, "loss": 0.27307918667793274, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4932136535645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:06] (step=0009289) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.18050913330742324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9290, "loss": 0.15765400230884552, "memory_gb": 7.721559524536133, "step_time_ms": 3345.7915782928467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:09] (step=0009290) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.18052856587640886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9291, "loss": 0.17149943113327026, "memory_gb": 7.721559524536133, "step_time_ms": 3358.783483505249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:13] (step=0009291) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.1805479984453945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9292, "loss": 0.24563932418823242, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1325492858887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:16] (step=0009292) Train Loss: 0.2125, Train Steps/Sec: 0.28, Epoch: 0.1805674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9293, "loss": 0.31832432746887207, "memory_gb": 7.721559524536133, "step_time_ms": 3358.150005340576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:20] (step=0009293) Train Loss: 0.2890, Train Steps/Sec: 0.28, Epoch: 0.18058686358336573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9294, "loss": 0.15142597258090973, "memory_gb": 7.721559524536133, "step_time_ms": 3359.698534011841, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:24] (step=0009294) Train Loss: 0.2054, Train Steps/Sec: 0.28, Epoch: 0.18060629615235135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9295, "loss": 0.1741596758365631, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2687797546387, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:27] (step=0009295) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.18062572872133695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9296, "loss": 0.2480524331331253, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1321239471436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:31] (step=0009296) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.18064516129032257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9297, "loss": 0.1542498767375946, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1721992492676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:34] (step=0009297) Train Loss: 0.1846, Train Steps/Sec: 0.28, Epoch: 0.1806645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9298, "loss": 0.23339472711086273, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0382289886475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:38] (step=0009298) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.18068402642829381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9299, "loss": 0.1930910348892212, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4862899780273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:42] (step=0009299) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.18070345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9300, "loss": 0.22182981669902802, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1697216033936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:45] (step=0009300) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.18072289156626506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9301, "loss": 0.16323800384998322, "memory_gb": 7.721559524536133, "step_time_ms": 3495.943069458008, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:49] (step=0009301) Train Loss: 0.1911, Train Steps/Sec: 0.28, Epoch: 0.18074232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9302, "loss": 0.26875799894332886, "memory_gb": 7.721559524536133, "step_time_ms": 3354.726552963257, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:52] (step=0009302) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.1807617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9303, "loss": 0.23786912858486176, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3956184387207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:56] (step=0009303) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.18078118927322193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:19:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9304, "loss": 0.2779496908187866, "memory_gb": 7.721559524536133, "step_time_ms": 3364.363670349121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:19:59] (step=0009304) Train Loss: 0.3225, Train Steps/Sec: 0.28, Epoch: 0.18080062184220755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9305, "loss": 0.21783384680747986, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6929473876953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:03] (step=0009305) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.18082005441119317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9306, "loss": 0.2668950855731964, "memory_gb": 7.721559524536133, "step_time_ms": 3367.816686630249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:07] (step=0009306) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.1808394869801788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9307, "loss": 0.28439825773239136, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4816455841064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:10] (step=0009307) Train Loss: 0.2915, Train Steps/Sec: 0.28, Epoch: 0.1808589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9308, "loss": 0.24775168299674988, "memory_gb": 7.715639114379883, "step_time_ms": 3337.756395339966, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:14] (step=0009308) Train Loss: 0.3016, Train Steps/Sec: 0.28, Epoch: 0.18087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9309, "loss": 0.2873873710632324, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5871410369873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:17] (step=0009309) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.18089778468713563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9310, "loss": 0.2792036533355713, "memory_gb": 7.721559524536133, "step_time_ms": 3368.584632873535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:21] (step=0009310) Train Loss: 0.2927, Train Steps/Sec: 0.28, Epoch: 0.18091721725612125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9311, "loss": 0.3108513653278351, "memory_gb": 7.721559524536133, "step_time_ms": 3362.096071243286, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:25] (step=0009311) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.18093664982510688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9312, "loss": 0.3692604899406433, "memory_gb": 7.721559524536133, "step_time_ms": 3360.013484954834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:28] (step=0009312) Train Loss: 0.2677, Train Steps/Sec: 0.28, Epoch: 0.1809560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9313, "loss": 0.2870404124259949, "memory_gb": 7.721559524536133, "step_time_ms": 3364.396095275879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:32] (step=0009313) Train Loss: 0.2696, Train Steps/Sec: 0.28, Epoch: 0.18097551496307812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9314, "loss": 0.15273714065551758, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7940883636475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:35] (step=0009314) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.18099494753206374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9315, "loss": 0.1821260154247284, "memory_gb": 7.721559524536133, "step_time_ms": 3364.617347717285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:39] (step=0009315) Train Loss: 0.1929, Train Steps/Sec: 0.28, Epoch: 0.18101438010104937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9316, "loss": 0.21482819318771362, "memory_gb": 7.721559524536133, "step_time_ms": 3359.375238418579, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:42] (step=0009316) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.181033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9317, "loss": 0.20568135380744934, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2921028137207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:46] (step=0009317) Train Loss: 0.1734, Train Steps/Sec: 0.28, Epoch: 0.1810532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9318, "loss": 0.22045300900936127, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6537322998047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:50] (step=0009318) Train Loss: 0.1739, Train Steps/Sec: 0.28, Epoch: 0.1810726778080062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9319, "loss": 0.2055465131998062, "memory_gb": 7.721559524536133, "step_time_ms": 3359.901189804077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:53] (step=0009319) Train Loss: 0.2854, Train Steps/Sec: 0.28, Epoch: 0.18109211037699183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:20:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9320, "loss": 0.29433751106262207, "memory_gb": 7.721559524536133, "step_time_ms": 3359.731674194336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:20:57] (step=0009320) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.18111154294597745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9321, "loss": 0.33825039863586426, "memory_gb": 7.715639114379883, "step_time_ms": 3329.1893005371094, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:00] (step=0009321) Train Loss: 0.2987, Train Steps/Sec: 0.28, Epoch: 0.18113097551496307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9322, "loss": 0.31000640988349915, "memory_gb": 7.721559524536133, "step_time_ms": 3365.611791610718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:04] (step=0009322) Train Loss: 0.3233, Train Steps/Sec: 0.28, Epoch: 0.1811504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9323, "loss": 0.16507339477539062, "memory_gb": 7.721559524536133, "step_time_ms": 3361.738681793213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:08] (step=0009323) Train Loss: 0.2317, Train Steps/Sec: 0.27, Epoch: 0.18116984065293432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9324, "loss": 0.30228835344314575, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9882850646973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:11] (step=0009324) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.18118927322191994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9325, "loss": 0.23243218660354614, "memory_gb": 7.721559524536133, "step_time_ms": 3362.157106399536, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:15] (step=0009325) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.18120870579090556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9326, "loss": 0.2067405879497528, "memory_gb": 7.721559524536133, "step_time_ms": 3348.1521606445312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:18] (step=0009326) Train Loss: 0.1911, Train Steps/Sec: 0.28, Epoch: 0.18122813835989118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9327, "loss": 0.27660173177719116, "memory_gb": 7.721559524536133, "step_time_ms": 3360.118865966797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:22] (step=0009327) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.1812475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9328, "loss": 0.26785415410995483, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2567234039307, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:26] (step=0009328) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.18126700349786243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9329, "loss": 0.3318745493888855, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5425357818604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:29] (step=0009329) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.18128643606684805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9330, "loss": 0.19356736540794373, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7680168151855, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:33] (step=0009330) Train Loss: 0.1595, Train Steps/Sec: 0.28, Epoch: 0.18130586863583364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9331, "loss": 0.25787675380706787, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8243255615234, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:36] (step=0009331) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.18132530120481927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9332, "loss": 0.22445231676101685, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0339374542236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:40] (step=0009332) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.1813447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9333, "loss": 0.22251483798027039, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9771728515625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:21:44] (step=0009333) Train Loss: 0.1898, Train Steps/Sec: 0.28, Epoch: 0.1813641663427905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:21:47] EFFICIENCY_METRICS:
{"epoch": 0, "step": 9334, "loss": 0.16168949007987976, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0409030914307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:21:47] (step=0009334) Train Loss: 0.1418, Train Steps/Sec: 0.28, Epoch: 0.18138359891177613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:21:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9335, "loss": 0.16723696887493134, "memory_gb": 7.721559524536133, "step_time_ms": 3358.002185821533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:21:51] (step=0009335) Train Loss: 0.2161, Train Steps/Sec: 0.28, Epoch: 0.18140303148076176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:21:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9336, "loss": 0.361854612827301, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4071655273438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:21:54] (step=0009336) Train Loss: 0.2849, Train Steps/Sec: 0.28, Epoch: 0.18142246404974738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:21:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9337, "loss": 0.2672354578971863, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2962284088135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:21:58] (step=0009337) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.181441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9338, "loss": 0.22537653148174286, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7136058807373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:01] (step=0009338) Train Loss: 0.2739, Train Steps/Sec: 0.28, Epoch: 0.18146132918771862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9339, "loss": 0.32709240913391113, "memory_gb": 7.721559524536133, "step_time_ms": 3362.231492996216, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 09:22:05] (step=0009339) Train Loss: 0.3394, Train Steps/Sec: 0.28, Epoch: 0.18148076175670425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9340, "loss": 0.20852017402648926, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0475788116455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:09] (step=0009340) Train Loss: 0.2120, Train Steps/Sec: 0.28, Epoch: 0.18150019432568987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9341, "loss": 0.2496470808982849, "memory_gb": 7.721559524536133, "step_time_ms": 3363.386392593384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:12] (step=0009341) Train Loss: 0.2554, Train Steps/Sec: 0.28, Epoch: 0.1815196268946755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9342, "loss": 0.37907522916793823, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2986125946045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:16] (step=0009342) Train Loss: 0.3481, Train Steps/Sec: 0.28, Epoch: 0.18153905946366108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9343, "loss": 0.2278093844652176, "memory_gb": 7.721559524536133, "step_time_ms": 3362.185478210449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:19] (step=0009343) Train Loss: 0.2865, Train Steps/Sec: 0.28, Epoch: 0.1815584920326467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9344, "loss": 0.15233109891414642, "memory_gb": 7.721559524536133, "step_time_ms": 3357.959747314453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:23] (step=0009344) Train Loss: 0.1439, Train Steps/Sec: 0.28, Epoch: 0.18157792460163233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:27] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 9345, "loss": 0.32768094539642334, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2429370880127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:27] (step=0009345) Train Loss: 0.2703, Train Steps/Sec: 0.28, Epoch: 0.18159735717061795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9346, "loss": 0.21576648950576782, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4492721557617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:30] (step=0009346) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.18161678973960357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9347, "loss": 0.2019907832145691, "memory_gb": 7.721559524536133, "step_time_ms": 3359.03263092041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:34] (step=0009347) Train Loss: 0.2553, Train Steps/Sec: 0.28, Epoch: 0.1816362223085892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9348, "loss": 0.1707392930984497, "memory_gb": 7.721559524536133, "step_time_ms": 3498.281478881836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:37] (step=0009348) Train Loss: 0.1819, Train Steps/Sec: 0.28, Epoch: 0.18165565487757482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9349, "loss": 0.18609970808029175, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2305908203125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:41] (step=0009349) Train Loss: 0.1936, Train Steps/Sec: 0.28, Epoch: 0.18167508744656044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9350, "loss": 0.2537943720817566, "memory_gb": 7.721559524536133, "step_time_ms": 3350.552797317505, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 09:22:44] (step=0009350) Train Loss: 0.1929, Train Steps/Sec: 0.28, Epoch: 0.18169452001554606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9351, "loss": 0.19981622695922852, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7287197113037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:48] (step=0009351) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.18171395258453168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9352, "loss": 0.20594069361686707, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9956340789795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:52] (step=0009352) Train Loss: 0.2036, Train Steps/Sec: 0.28, Epoch: 0.1817333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9353, "loss": 0.20988142490386963, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7284088134766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:55] (step=0009353) Train Loss: 0.2051, Train Steps/Sec: 0.28, Epoch: 0.1817528177225029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:22:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9354, "loss": 0.24266254901885986, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7089309692383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:22:59] (step=0009354) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.18177225029148852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9355, "loss": 0.19515052437782288, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6597232818604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:02] (step=0009355) Train Loss: 0.1724, Train Steps/Sec: 0.28, Epoch: 0.18179168286047415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 09:23:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9356, "loss": 0.2584571838378906, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5756340026855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:06] (step=0009356) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.18181111542945977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9357, "loss": 0.21578538417816162, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8398838043213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:10] (step=0009357) Train Loss: 0.1856, Train Steps/Sec: 0.28, Epoch: 0.1818305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9358, "loss": 0.14526115357875824, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7211112976074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:13] (step=0009358) Train Loss: 0.1507, Train Steps/Sec: 0.28, Epoch: 0.181849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9359, "loss": 0.22862425446510315, "memory_gb": 7.721559524536133, "step_time_ms": 3359.060525894165, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:17] (step=0009359) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.18186941313641664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9360, "loss": 0.22370365262031555, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8593940734863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:20] (step=0009360) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.18188884570540226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9361, "loss": 0.21009483933448792, "memory_gb": 7.721559524536133, "step_time_ms": 3361.196517944336, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:24] (step=0009361) Train Loss: 0.1969, Train Steps/Sec: 0.28, Epoch: 0.18190827827438788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9362, "loss": 0.22576949000358582, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7613830566406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:27] (step=0009362) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.1819277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9363, "loss": 0.24338898062705994, "memory_gb": 7.721559524536133, "step_time_ms": 3361.506938934326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:31] (step=0009363) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.18194714341235912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9364, "loss": 0.24263323843479156, "memory_gb": 7.715639114379883, "step_time_ms": 3320.941925048828, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:35] (step=0009364) Train Loss: 0.2411, Train Steps/Sec: 0.27, Epoch: 0.18196657598134475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9365, "loss": 0.29215723276138306, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8200130462646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:38] (step=0009365) Train Loss: 0.3076, Train Steps/Sec: 0.28, Epoch: 0.18198600855033034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9366, "loss": 0.24699094891548157, "memory_gb": 7.721559524536133, "step_time_ms": 3344.3357944488525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:42] (step=0009366) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.18200544111931596, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9367, "loss": 0.27471864223480225, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2512607574463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:45] (step=0009367) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.18202487368830159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9368, "loss": 0.21583780646324158, "memory_gb": 7.721559524536133, "step_time_ms": 3358.017921447754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:49] (step=0009368) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.1820443062572872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9369, "loss": 0.21269717812538147, "memory_gb": 7.721559524536133, "step_time_ms": 3358.245372772217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:53] (step=0009369) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.18206373882627283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:23:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9370, "loss": 0.18213430047035217, "memory_gb": 7.721559524536133, "step_time_ms": 3357.816457748413, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:23:56] (step=0009370) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.18208317139525845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9371, "loss": 0.21200020611286163, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9890518188477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:00] (step=0009371) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.18210260396424408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9372, "loss": 0.25442540645599365, "memory_gb": 7.721559524536133, 
"step_time_ms": 3356.180191040039, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:03] (step=0009372) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.1821220365332297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9373, "loss": 0.2745036482810974, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7965755462646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:07] (step=0009373) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.18214146910221532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9374, "loss": 0.25047051906585693, "memory_gb": 7.721559524536133, "step_time_ms": 3358.668565750122, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:11] (step=0009374) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.18216090167120094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9375, "loss": 0.20256313681602478, "memory_gb": 7.721559524536133, "step_time_ms": 3359.907627105713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:14] (step=0009375) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.18218033424018656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9376, "loss": 0.2700619697570801, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3142127990723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:18] (step=0009376) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.18219976680917216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9377, "loss": 0.29685017466545105, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8686714172363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:21] (step=0009377) Train Loss: 0.3120, Train Steps/Sec: 0.28, Epoch: 
0.18221919937815778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9378, "loss": 0.24450203776359558, "memory_gb": 7.721559524536133, "step_time_ms": 3352.344036102295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:25] (step=0009378) Train Loss: 0.2674, Train Steps/Sec: 0.28, Epoch: 0.1822386319471434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9379, "loss": 0.35122063755989075, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3013076782227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:28] (step=0009379) Train Loss: 0.2988, Train Steps/Sec: 0.28, Epoch: 0.18225806451612903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9380, "loss": 0.3449779152870178, "memory_gb": 7.721559524536133, "step_time_ms": 3354.166269302368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:32] (step=0009380) Train Loss: 0.2733, Train Steps/Sec: 0.28, Epoch: 0.18227749708511465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9381, "loss": 0.20931163430213928, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8104763031006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:36] (step=0009381) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.18229692965410027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9382, "loss": 0.25458091497421265, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1278228759766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:39] (step=0009382) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.1823163622230859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9383, "loss": 0.14654585719108582, 
"memory_gb": 7.721559524536133, "step_time_ms": 3364.3736839294434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:43] (step=0009383) Train Loss: 0.1737, Train Steps/Sec: 0.28, Epoch: 0.18233579479207151, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9384, "loss": 0.24242755770683289, "memory_gb": 7.721559524536133, "step_time_ms": 3367.276430130005, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:46] (step=0009384) Train Loss: 0.1785, Train Steps/Sec: 0.28, Epoch: 0.18235522736105714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9385, "loss": 0.19267699122428894, "memory_gb": 7.721559524536133, "step_time_ms": 3363.46435546875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:50] (step=0009385) Train Loss: 0.1947, Train Steps/Sec: 0.28, Epoch: 0.18237465993004276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9386, "loss": 0.1900855004787445, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3129806518555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:54] (step=0009386) Train Loss: 0.2829, Train Steps/Sec: 0.28, Epoch: 0.18239409249902838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:24:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9387, "loss": 0.14950545132160187, "memory_gb": 7.721559524536133, "step_time_ms": 3365.626096725464, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:24:57] (step=0009387) Train Loss: 0.1679, Train Steps/Sec: 0.28, Epoch: 0.182413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9388, "loss": 0.20265129208564758, "memory_gb": 7.721559524536133, "step_time_ms": 3365.426540374756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:01] (step=0009388) Train Loss: 0.2424, Train 
Steps/Sec: 0.28, Epoch: 0.1824329576369996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9389, "loss": 0.22896328568458557, "memory_gb": 7.721559524536133, "step_time_ms": 3508.695125579834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:04] (step=0009389) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.18245239020598522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9390, "loss": 0.2752230763435364, "memory_gb": 7.721559524536133, "step_time_ms": 3365.962028503418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:08] (step=0009390) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.18247182277497084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9391, "loss": 0.26356369256973267, "memory_gb": 7.721559524536133, "step_time_ms": 3359.055995941162, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:11] (step=0009391) Train Loss: 0.2969, Train Steps/Sec: 0.28, Epoch: 0.18249125534395647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9392, "loss": 0.2758270502090454, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2830333709717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:15] (step=0009392) Train Loss: 0.3233, Train Steps/Sec: 0.28, Epoch: 0.1825106879129421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9393, "loss": 0.22962069511413574, "memory_gb": 7.721559524536133, "step_time_ms": 3367.669105529785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:19] (step=0009393) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.1825301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9394, "loss": 
0.24397999048233032, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3111248016357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:22] (step=0009394) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.18254955305091333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9395, "loss": 0.22989217936992645, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5806522369385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:26] (step=0009395) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.18256898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9396, "loss": 0.24576160311698914, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2631397247314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:29] (step=0009396) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.18258841818888458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9397, "loss": 0.19131456315517426, "memory_gb": 7.721559524536133, "step_time_ms": 3370.171546936035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:33] (step=0009397) Train Loss: 0.1816, Train Steps/Sec: 0.28, Epoch: 0.1826078507578702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9398, "loss": 0.24229368567466736, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9769554138184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:37] (step=0009398) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.18262728332685582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9399, "loss": 0.2769603133201599, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0316677093506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:40] (step=0009399) 
Train Loss: 0.2914, Train Steps/Sec: 0.28, Epoch: 0.18264671589584144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9400, "loss": 0.14485934376716614, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5777492523193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:44] (step=0009400) Train Loss: 0.1887, Train Steps/Sec: 0.28, Epoch: 0.18266614846482704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9401, "loss": 0.29929983615875244, "memory_gb": 7.721559524536133, "step_time_ms": 3370.668649673462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:47] (step=0009401) Train Loss: 0.2885, Train Steps/Sec: 0.28, Epoch: 0.18268558103381266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9402, "loss": 0.19153112173080444, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2241954803467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:51] (step=0009402) Train Loss: 0.2939, Train Steps/Sec: 0.28, Epoch: 0.18270501360279828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9403, "loss": 0.2377309948205948, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9744262695312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:54] (step=0009403) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.1827244461717839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:25:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9404, "loss": 0.1370355188846588, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4978790283203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:25:58] (step=0009404) Train Loss: 0.1753, Train Steps/Sec: 0.28, Epoch: 0.18274387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:26:02] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 9405, "loss": 0.20923364162445068, "memory_gb": 7.721559524536133, "step_time_ms": 3371.779441833496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:26:02] (step=0009405) Train Loss: 0.2964, Train Steps/Sec: 0.28, Epoch: 0.18276331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:26:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9406, "loss": 0.28304266929626465, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2805767059326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:26:05] (step=0009406) Train Loss: 0.2917, Train Steps/Sec: 0.28, Epoch: 0.18278274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:26:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9407, "loss": 0.16382470726966858, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9235401153564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:26:09] (step=0009407) Train Loss: 0.1527, Train Steps/Sec: 0.28, Epoch: 0.1828021764477264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9408, "loss": 0.16299808025360107, "memory_gb": 7.721559524536133, "step_time_ms": 3367.151975631714, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:26:12] (step=0009408) Train Loss: 0.1904, Train Steps/Sec: 0.28, Epoch: 0.18282160901671202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9409, "loss": 0.16123409569263458, "memory_gb": 7.721559524536133, "step_time_ms": 3368.936538696289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:26:16] (step=0009409) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.18284104158569764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:26:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9410, "loss": 0.23787474632263184, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4772510528564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
09:26:20] (step=0009410) Train Loss: 0.2611, Train Steps/Sec: 0.28, Epoch: 0.18286047415468326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9411, "loss": 0.3174406886100769, "memory_gb": 7.715639114379883, "step_time_ms": 3335.430145263672, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:23] (step=0009411) Train Loss: 0.3266, Train Steps/Sec: 0.28, Epoch: 0.18287990672366886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9412, "loss": 0.1603585034608841, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8804473876953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:27] (step=0009412) Train Loss: 0.1695, Train Steps/Sec: 0.27, Epoch: 0.18289933929265448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9413, "loss": 0.1947263479232788, "memory_gb": 7.721559524536133, "step_time_ms": 3359.86328125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:31] (step=0009413) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.1829187718616401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9414, "loss": 0.2406003326177597, "memory_gb": 7.721559524536133, "step_time_ms": 3364.633560180664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:34] (step=0009414) Train Loss: 0.2739, Train Steps/Sec: 0.28, Epoch: 0.18293820443062572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9415, "loss": 0.2825121581554413, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2026653289795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:38] (step=0009415) Train Loss: 0.2661, Train Steps/Sec: 0.28, Epoch: 0.18295763699961134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9416, "loss": 0.19787462055683136, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2912635803223, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:41] (step=0009416) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.18297706956859697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9417, "loss": 0.21663036942481995, "memory_gb": 7.721559524536133, "step_time_ms": 3362.863063812256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:45] (step=0009417) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.1829965021375826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9418, "loss": 0.239889457821846, "memory_gb": 7.721559524536133, "step_time_ms": 3364.276647567749, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:49] (step=0009418) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.1830159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9419, "loss": 0.14073964953422546, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6954555511475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:52] (step=0009419) Train Loss: 0.1629, Train Steps/Sec: 0.28, Epoch: 0.18303536727555383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9420, "loss": 0.2479599118232727, "memory_gb": 7.721559524536133, "step_time_ms": 3362.182855606079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:56] (step=0009420) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.18305479984453946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9421, "loss": 0.22714312374591827, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7472858428955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:26:59] (step=0009421) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.18307423241352508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9422, "loss": 0.19896046817302704, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8997802734375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:03] (step=0009422) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.1830936649825107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9423, "loss": 0.25910401344299316, "memory_gb": 7.721559524536133, "step_time_ms": 3365.187644958496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:06] (step=0009423) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.1831130975514963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9424, "loss": 0.2569194436073303, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5262718200684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:10] (step=0009424) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.18313253012048192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9425, "loss": 0.31280797719955444, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6150875091553, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:14] (step=0009425) Train Loss: 0.3163, Train Steps/Sec: 0.28, Epoch: 0.18315196268946754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9426, "loss": 0.21072259545326233, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2938652038574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:17] (step=0009426) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.18317139525845316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9427, "loss": 0.29920077323913574, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4462146759033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:21] (step=0009427) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.18319082782743878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9428, "loss": 0.1850309818983078, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1576766967773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:24] (step=0009428) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.1832102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9429, "loss": 0.2285527139902115, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1463775634766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:28] (step=0009429) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.18322969296541003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9430, "loss": 0.23806504905223846, "memory_gb": 7.721559524536133, "step_time_ms": 3357.450485229492, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:32] (step=0009430) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.18324912553439565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9431, "loss": 0.18756835162639618, "memory_gb": 7.721559524536133, "step_time_ms": 3362.213134765625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:35] (step=0009431) Train Loss: 0.1964, Train Steps/Sec: 0.28, Epoch: 0.18326855810338127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9432, "loss": 0.17576941847801208, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7001514434814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:39] (step=0009432) Train Loss: 0.1967, Train Steps/Sec: 0.28, Epoch: 0.1832879906723669, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9433, "loss": 0.2851572334766388, "memory_gb": 7.721559524536133, "step_time_ms": 3362.264394760132, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:42] (step=0009433) Train Loss: 0.3032, Train Steps/Sec: 0.28, Epoch: 0.18330742324135252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9434, "loss": 0.1567976325750351, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4284057617188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:46] (step=0009434) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.1833268558103381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9435, "loss": 0.16720180213451385, "memory_gb": 7.721559524536133, "step_time_ms": 3357.792854309082, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:50] (step=0009435) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.18334628837932374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9436, "loss": 0.2714245319366455, "memory_gb": 7.721559524536133, "step_time_ms": 3351.790189743042, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:53] (step=0009436) Train Loss: 0.2505, Train Steps/Sec: 0.28, Epoch: 0.18336572094830936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:27:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9437, "loss": 0.1665254682302475, "memory_gb": 7.721559524536133, "step_time_ms": 3499.5250701904297, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:27:57] (step=0009437) Train Loss: 0.1744, Train Steps/Sec: 0.28, Epoch: 0.18338515351729498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9438, "loss": 0.26655304431915283, "memory_gb": 7.721559524536133, "step_time_ms": 3361.504316329956, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:00] (step=0009438) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.1834045860862806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9439, "loss": 0.11257563531398773, "memory_gb": 7.721559524536133, "step_time_ms": 3361.811876296997, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:04] (step=0009439) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.18342401865526622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9440, "loss": 0.2514980137348175, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9371910095215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:08] (step=0009440) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.18344345122425185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9441, "loss": 0.3604621887207031, "memory_gb": 7.721559524536133, "step_time_ms": 3361.504316329956, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:11] (step=0009441) Train Loss: 0.3063, Train Steps/Sec: 0.28, Epoch: 0.18346288379323747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9442, "loss": 0.2804403603076935, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9236011505127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:15] (step=0009442) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.1834823163622231, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9443, "loss": 0.1717326045036316, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5595664978027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:18] (step=0009443) Train Loss: 0.1681, Train Steps/Sec: 0.28, Epoch: 0.1835017489312087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9444, "loss": 0.3049789071083069, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2633476257324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:22] (step=0009444) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.18352118150019434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9445, "loss": 0.260617733001709, "memory_gb": 7.721559524536133, "step_time_ms": 3348.7038612365723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:25] (step=0009445) Train Loss: 0.2979, Train Steps/Sec: 0.28, Epoch: 0.18354061406917996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9446, "loss": 0.24620041251182556, "memory_gb": 7.715639114379883, "step_time_ms": 3318.027973175049, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:29] (step=0009446) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.18356004663816555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9447, "loss": 0.23026815056800842, "memory_gb": 7.721559524536133, "step_time_ms": 3350.30198097229, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:33] (step=0009447) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.18357947920715117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9448, "loss": 0.277626097202301, "memory_gb": 7.721559524536133, "step_time_ms": 3358.245372772217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:36] (step=0009448) Train Loss: 0.1830, Train Steps/Sec: 0.28, Epoch: 0.1835989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9449, "loss": 0.2571756839752197, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4091243743896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:40] (step=0009449) Train Loss: 0.3172, Train Steps/Sec: 0.28, Epoch: 0.18361834434512242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9450, "loss": 0.2302032709121704, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4581146240234, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:43] (step=0009450) Train Loss: 0.2255, Train Steps/Sec: 0.28, Epoch: 0.18363777691410804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9451, "loss": 0.24394086003303528, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6602725982666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:47] (step=0009451) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.18365720948309366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9452, "loss": 0.23133638501167297, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4491481781006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:51] (step=0009452) Train Loss: 0.2309, Train Steps/Sec: 0.27, Epoch: 0.1836766420520793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9453, "loss": 0.21591627597808838, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8527488708496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:54] (step=0009453) Train Loss: 0.2131, Train Steps/Sec: 0.28, Epoch: 0.1836960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:28:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9454, "loss": 0.17409367859363556, "memory_gb": 7.721559524536133, "step_time_ms": 3359.135150909424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:28:58] (step=0009454) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.18371550719005053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9455, "loss": 0.2441994845867157, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4844646453857, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:01] (step=0009455) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.18373493975903615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9456, "loss": 0.1617618203163147, "memory_gb": 7.721559524536133, "step_time_ms": 3358.747720718384, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:05] (step=0009456) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.18375437232802178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9457, "loss": 0.12686684727668762, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6465377807617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:09] (step=0009457) Train Loss: 0.1552, Train Steps/Sec: 0.28, Epoch: 0.1837738048970074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9458, "loss": 0.26538190245628357, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7661514282227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:12] (step=0009458) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.183793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9459, "loss": 0.2005283087491989, "memory_gb": 7.721559524536133, "step_time_ms": 3358.72483253479, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:16] (step=0009459) Train Loss: 0.1778, Train Steps/Sec: 0.28, Epoch: 0.18381267003497861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9460, "loss": 0.21579048037528992, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9954166412354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:19] (step=0009460) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.18383210260396424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9461, "loss": 0.2724573016166687, "memory_gb": 7.721559524536133, "step_time_ms": 3357.271194458008, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:23] (step=0009461) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.18385153517294986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9462, "loss": 0.2195260375738144, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0996055603027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:26] (step=0009462) Train Loss: 0.3011, Train Steps/Sec: 0.28, Epoch: 0.18387096774193548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9463, "loss": 0.3392072021961212, "memory_gb": 7.721559524536133, "step_time_ms": 3357.969284057617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:30] (step=0009463) Train Loss: 0.2895, Train Steps/Sec: 0.28, Epoch: 0.1838904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9464, "loss": 0.11733327805995941, "memory_gb": 7.721559524536133, "step_time_ms": 3356.274127960205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:34] (step=0009464) Train Loss: 0.1888, Train Steps/Sec: 0.28, Epoch: 0.18390983287990673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9465, "loss": 0.2500462532043457, "memory_gb": 7.721559524536133, "step_time_ms": 3358.536720275879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:37] (step=0009465) Train Loss: 0.2829, Train Steps/Sec: 0.28, Epoch: 0.18392926544889235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9466, "loss": 0.2640925347805023, "memory_gb": 7.721559524536133, "step_time_ms": 3357.961654663086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:41] (step=0009466) Train Loss: 0.2800, Train Steps/Sec: 0.28, Epoch: 0.18394869801787797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9467, "loss": 0.33147746324539185, "memory_gb": 7.721559524536133, "step_time_ms": 3361.367702484131, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:44] (step=0009467) Train Loss: 0.2639, Train Steps/Sec: 0.28, Epoch: 0.1839681305868636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9468, "loss": 0.3230013847351074, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3501529693604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:48] (step=0009468) Train Loss: 0.2867, Train Steps/Sec: 0.28, Epoch: 0.18398756315584922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9469, "loss": 0.2630268931388855, "memory_gb": 7.721559524536133, "step_time_ms": 3358.048439025879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:51] (step=0009469) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.1840069957248348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9470, "loss": 0.285316526889801, "memory_gb": 7.721559524536133, "step_time_ms": 3355.706214904785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:55] (step=0009470) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.18402642829382043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:29:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9471, "loss": 0.16703183948993683, "memory_gb": 7.721559524536133, "step_time_ms": 3355.283260345459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:29:59] (step=0009471) Train Loss: 0.2274, Train Steps/Sec: 0.28, Epoch: 0.18404586086280605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9472, "loss": 0.1709706038236618, "memory_gb": 7.721559524536133, "step_time_ms": 3356.999158859253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:02] (step=0009472) Train Loss: 0.1996, Train Steps/Sec: 0.28, Epoch: 0.18406529343179168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9473, "loss": 0.1361398994922638, "memory_gb": 7.721559524536133, "step_time_ms": 3358.668565750122, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:06] (step=0009473) Train Loss: 0.2062, Train Steps/Sec: 0.28, Epoch: 0.1840847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9474, "loss": 0.20501922070980072, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2399101257324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:09] (step=0009474) Train Loss: 0.1877, Train Steps/Sec: 0.28, Epoch: 0.18410415856976292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9475, "loss": 0.23541894555091858, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3543815612793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:13] (step=0009475) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.18412359113874854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9476, "loss": 0.15852433443069458, "memory_gb": 7.721559524536133, "step_time_ms": 3347.7563858032227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:16] (step=0009476) Train Loss: 0.2394, Train Steps/Sec: 0.28, Epoch: 0.18414302370773417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9477, "loss": 0.3900553584098816, "memory_gb": 7.721559524536133, "step_time_ms": 3502.4349689483643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:20] (step=0009477) Train Loss: 0.2970, Train Steps/Sec: 0.28, Epoch: 0.1841624562767198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9478, "loss": 0.233364537358284, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8133792877197, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:24] (step=0009478) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.1841818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9479, "loss": 0.3026362359523773, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0358448028564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:27] (step=0009479) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.18420132141469103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9480, "loss": 0.2967143654823303, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2380962371826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:31] (step=0009480) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.18422075398367665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9481, "loss": 0.2688271999359131, "memory_gb": 7.721559524536133, "step_time_ms": 3356.147050857544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:34] (step=0009481) Train Loss: 0.2634, Train Steps/Sec: 0.28, Epoch: 0.18424018655266225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9482, "loss": 0.21255172789096832, "memory_gb": 7.721559524536133, "step_time_ms": 3363.044500350952, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:38] (step=0009482) Train Loss: 0.1797, Train Steps/Sec: 0.28, Epoch: 0.18425961912164787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9483, "loss": 0.28621789813041687, "memory_gb": 7.721559524536133, "step_time_ms": 3360.215187072754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:42] (step=0009483) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.1842790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9484, "loss": 0.17257796227931976, "memory_gb": 7.721559524536133, "step_time_ms": 3358.33740234375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:45] (step=0009484) Train Loss: 0.1829, Train Steps/Sec: 0.28, Epoch: 0.18429848425961912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9485, "loss": 0.23497948050498962, "memory_gb": 7.721559524536133, "step_time_ms": 3357.527494430542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:49] (step=0009485) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.18431791682860474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9486, "loss": 0.12994682788848877, "memory_gb": 7.721559524536133, "step_time_ms": 3363.574504852295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:52] (step=0009486) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.18433734939759036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:30:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9487, "loss": 0.24922583997249603, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0524196624756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:30:56] (step=0009487) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.18435678196657598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9488, "loss": 0.3120245337486267, "memory_gb": 7.721559524536133, "step_time_ms": 3364.166259765625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:00] (step=0009488) Train Loss: 0.2797, Train Steps/Sec: 0.28, Epoch: 0.1843762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9489, "loss": 0.22037982940673828, "memory_gb": 7.721559524536133, "step_time_ms": 3360.966682434082, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:03] (step=0009489) Train Loss: 0.2845, Train Steps/Sec: 0.28, Epoch: 0.18439564710454723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9490, "loss": 0.16711877286434174, "memory_gb": 7.721559524536133, "step_time_ms": 3362.757682800293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:07] (step=0009490) Train Loss: 0.2009, Train Steps/Sec: 0.28, Epoch: 0.18441507967353285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9491, "loss": 0.18858569860458374, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2230548858643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:10] (step=0009491) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.18443451224251847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9492, "loss": 0.268086314201355, "memory_gb": 7.721559524536133, "step_time_ms": 3359.163761138916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:14] (step=0009492) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.1844539448115041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9493, "loss": 0.2493826001882553, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0638332366943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:17] (step=0009493) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.1844733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9494, "loss": 0.23315781354904175, "memory_gb": 7.721559524536133, "step_time_ms": 3366.490602493286, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:21] (step=0009494) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.1844928099494753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9495, "loss": 0.27578091621398926, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6441535949707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:25] (step=0009495) Train Loss: 0.2918, Train Steps/Sec: 0.28, Epoch: 0.18451224251846093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9496, "loss": 0.2560332715511322, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0828132629395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:28] (step=0009496) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.18453167508744656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9497, "loss": 0.23833267390727997, "memory_gb": 7.721559524536133, "step_time_ms": 3365.222930908203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:32] (step=0009497) Train Loss: 0.1796, Train Steps/Sec: 0.28, Epoch: 0.18455110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9498, "loss": 0.3109501600265503, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1009845733643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:35] (step=0009498) Train Loss: 0.2748, Train Steps/Sec: 0.28, Epoch: 0.1845705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9499, "loss": 0.12984612584114075, "memory_gb": 7.721559524536133, "step_time_ms": 3366.135597229004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:39] (step=0009499) Train Loss: 0.1472, Train Steps/Sec: 0.27, Epoch: 0.18458997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9500, "loss": 0.14210206270217896, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7921085357666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:43] (step=0009500) Train Loss: 0.1643, Train Steps/Sec: 0.28, Epoch: 0.18460940536338905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9501, "loss": 0.16045373678207397, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9864711761475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:46] (step=0009501) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.18462883793237467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9502, "loss": 0.25994834303855896, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1934394836426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:50] (step=0009502) Train Loss: 0.2800, Train Steps/Sec: 0.28, Epoch: 0.1846482705013603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9503, "loss": 0.23703834414482117, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9886589050293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:54] (step=0009503) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.1846677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:31:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9504, "loss": 0.1918925940990448, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4586544036865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:31:57] (step=0009504) Train Loss: 0.1703, Train Steps/Sec: 0.28, Epoch: 0.1846871356393315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9505, "loss": 0.18907299637794495, "memory_gb": 7.721559524536133, "step_time_ms": 3365.865707397461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:01] (step=0009505) Train Loss: 0.1797, Train Steps/Sec: 0.28, Epoch: 0.18470656820831713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9506, "loss": 0.23327210545539856, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1622276306152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:04] (step=0009506) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.18472600077730275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9507, "loss": 0.21988803148269653, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4533882141113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:08] (step=0009507) Train Loss: 0.1919, Train Steps/Sec: 0.28, Epoch: 0.18474543334628837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9508, "loss": 0.2049751579761505, "memory_gb": 7.721559524536133, "step_time_ms": 3365.778684616089, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:12] (step=0009508) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.184764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9509, "loss": 0.17263051867485046, "memory_gb": 7.721559524536133, "step_time_ms": 3367.995023727417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:15] (step=0009509) Train Loss: 0.1652, Train Steps/Sec: 0.28, Epoch: 0.18478429848425962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9510, "loss": 0.21048793196678162, "memory_gb": 7.721559524536133, "step_time_ms": 3366.607189178467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:19] (step=0009510) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.18480373105324524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9511, "loss": 0.25105762481689453, "memory_gb": 7.721559524536133, "step_time_ms": 3362.83016204834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:22] (step=0009511) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.18482316362223086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9512, "loss": 0.2015213966369629, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0365600585938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:26] (step=0009512) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.18484259619121648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9513, "loss": 0.31946900486946106, "memory_gb": 7.721559524536133, "step_time_ms": 3365.018844604492, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:30] (step=0009513) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.1848620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9514, "loss": 0.23317377269268036, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4461631774902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:33] (step=0009514) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.18488146132918773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9515, "loss": 0.18947796523571014, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4586544036865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:37] (step=0009515) Train Loss: 0.1970, Train Steps/Sec: 0.28, Epoch: 0.18490089389817335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9516, "loss": 0.26230156421661377, "memory_gb": 7.721559524536133, "step_time_ms": 3363.351345062256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:40] (step=0009516) Train Loss: 0.1922, Train Steps/Sec: 0.28, Epoch: 0.18492032646715895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9517, "loss": 0.1784801483154297, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7421855926514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:44] (step=0009517) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.18493975903614457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9518, "loss": 0.1756947636604309, "memory_gb": 7.721559524536133, "step_time_ms": 3364.783763885498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:48] (step=0009518) Train Loss: 0.1768, Train Steps/Sec: 0.28, Epoch: 0.1849591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9519, "loss": 0.2993168532848358, "memory_gb": 7.721559524536133, "step_time_ms": 3504.7478675842285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:51] (step=0009519) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.1849786241741158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9520, "loss": 0.21633735299110413, "memory_gb": 7.721559524536133, "step_time_ms": 3356.017589569092, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:55] (step=0009520) Train Loss: 0.1840, Train Steps/Sec: 0.28, Epoch: 0.18499805674310144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:32:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9521, "loss": 0.20465785264968872, "memory_gb": 7.721559524536133, "step_time_ms": 3362.811803817749, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:32:58] (step=0009521) Train Loss: 0.2202, Train Steps/Sec: 0.28, Epoch: 0.18501748931208706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:33:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9522, "loss": 0.17756003141403198, "memory_gb": 7.721559524536133, "step_time_ms": 3365.300178527832, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:33:02] (step=0009522) Train Loss: 0.1878, Train Steps/Sec: 0.28, Epoch: 0.18503692188107268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:33:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9523, "loss": 0.26402488350868225, "memory_gb": 7.721559524536133, "step_time_ms": 3344.266653060913, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:33:06] (step=0009523) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.1850563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:33:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9524, "loss": 0.23440305888652802, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6763401031494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:33:09] (step=0009524) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.18507578701904392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:33:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9525, "loss": 0.15869949758052826, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4816875457764,
"trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:13] (step=0009525) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.18509521958802955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9526, "loss": 0.15889959037303925, "memory_gb": 7.721559524536133, "step_time_ms": 3363.149881362915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:16] (step=0009526) Train Loss: 0.1925, Train Steps/Sec: 0.28, Epoch: 0.18511465215701517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9527, "loss": 0.2638879120349884, "memory_gb": 7.721559524536133, "step_time_ms": 3364.495277404785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:20] (step=0009527) Train Loss: 0.2789, Train Steps/Sec: 0.28, Epoch: 0.18513408472600076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9528, "loss": 0.24827200174331665, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9564304351807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:24] (step=0009528) Train Loss: 0.2860, Train Steps/Sec: 0.28, Epoch: 0.18515351729498639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9529, "loss": 0.31765881180763245, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2164936065674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:27] (step=0009529) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.185172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9530, "loss": 0.24581441283226013, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7688884735107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:31] (step=0009530) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.18519238243295763, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 09:33:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9531, "loss": 0.24752628803253174, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6527576446533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:34] (step=0009531) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.18521181500194325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9532, "loss": 0.2649429440498352, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6322326660156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:38] (step=0009532) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.18523124757092888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9533, "loss": 0.17807789146900177, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8031272888184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:41] (step=0009533) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.1852506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9534, "loss": 0.2687830328941345, "memory_gb": 7.721559524536133, "step_time_ms": 3358.842611312866, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:45] (step=0009534) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.18527011270890012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9535, "loss": 0.22156919538974762, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0027561187744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:49] (step=0009535) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.18528954527788574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9536, "loss": 0.2294023036956787, "memory_gb": 7.721559524536133, "step_time_ms": 
3363.209009170532, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:52] (step=0009536) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.18530897784687136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9537, "loss": 0.1627807319164276, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6667518615723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:56] (step=0009537) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.185328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:33:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9538, "loss": 0.21143457293510437, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0361347198486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:33:59] (step=0009538) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.1853478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9539, "loss": 0.27696162462234497, "memory_gb": 7.721559524536133, "step_time_ms": 3355.462074279785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:03] (step=0009539) Train Loss: 0.3066, Train Steps/Sec: 0.28, Epoch: 0.1853672755538282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9540, "loss": 0.24170342087745667, "memory_gb": 7.721559524536133, "step_time_ms": 3364.362955093384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:07] (step=0009540) Train Loss: 0.2384, Train Steps/Sec: 0.27, Epoch: 0.18538670812281383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9541, "loss": 0.16741345822811127, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7653846740723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:10] (step=0009541) Train Loss: 0.1748, Train Steps/Sec: 0.28, Epoch: 0.18540614069179945, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9542, "loss": 0.23300834000110626, "memory_gb": 7.721559524536133, "step_time_ms": 3345.921277999878, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:14] (step=0009542) Train Loss: 0.1903, Train Steps/Sec: 0.28, Epoch: 0.18542557326078507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9543, "loss": 0.2588059902191162, "memory_gb": 7.721559524536133, "step_time_ms": 3357.219934463501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:17] (step=0009543) Train Loss: 0.2263, Train Steps/Sec: 0.28, Epoch: 0.1854450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9544, "loss": 0.275581032037735, "memory_gb": 7.721559524536133, "step_time_ms": 3361.18745803833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:21] (step=0009544) Train Loss: 0.2209, Train Steps/Sec: 0.28, Epoch: 0.18546443839875631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9545, "loss": 0.2705504894256592, "memory_gb": 7.721559524536133, "step_time_ms": 3363.597869873047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:25] (step=0009545) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.18548387096774194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9546, "loss": 0.13752751052379608, "memory_gb": 7.721559524536133, "step_time_ms": 3358.452081680298, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:28] (step=0009546) Train Loss: 0.1535, Train Steps/Sec: 0.28, Epoch: 0.18550330353672756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9547, "loss": 0.3061214089393616, "memory_gb": 7.721559524536133, 
"step_time_ms": 3360.2335453033447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:32] (step=0009547) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.18552273610571318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9548, "loss": 0.25797897577285767, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0925674438477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:35] (step=0009548) Train Loss: 0.1962, Train Steps/Sec: 0.28, Epoch: 0.1855421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9549, "loss": 0.21909812092781067, "memory_gb": 7.721559524536133, "step_time_ms": 3346.835136413574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:39] (step=0009549) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.18556160124368443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9550, "loss": 0.26075851917266846, "memory_gb": 7.721559524536133, "step_time_ms": 3355.700731277466, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:43] (step=0009550) Train Loss: 0.2222, Train Steps/Sec: 0.28, Epoch: 0.18558103381267005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9551, "loss": 0.2364795058965683, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4061393737793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:46] (step=0009551) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.18560046638165564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9552, "loss": 0.27255308628082275, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3552112579346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:50] (step=0009552) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 
0.18561989895064127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9553, "loss": 0.16802754998207092, "memory_gb": 7.721559524536133, "step_time_ms": 3360.880136489868, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:53] (step=0009553) Train Loss: 0.1684, Train Steps/Sec: 0.28, Epoch: 0.1856393315196269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:34:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9554, "loss": 0.2332860827445984, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2285175323486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:34:57] (step=0009554) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.1856587640886125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9555, "loss": 0.2049371600151062, "memory_gb": 7.721559524536133, "step_time_ms": 3361.621141433716, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:00] (step=0009555) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.18567819665759813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9556, "loss": 0.28186434507369995, "memory_gb": 7.721559524536133, "step_time_ms": 3360.081911087036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:04] (step=0009556) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.18569762922658375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9557, "loss": 0.1884443610906601, "memory_gb": 7.721559524536133, "step_time_ms": 3354.445219039917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:08] (step=0009557) Train Loss: 0.2542, Train Steps/Sec: 0.28, Epoch: 0.18571706179556938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9558, "loss": 0.26107218861579895, "memory_gb": 
7.721559524536133, "step_time_ms": 3360.6274127960205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:11] (step=0009558) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.185736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9559, "loss": 0.2851790189743042, "memory_gb": 7.721559524536133, "step_time_ms": 3347.57661819458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:15] (step=0009559) Train Loss: 0.3282, Train Steps/Sec: 0.28, Epoch: 0.18575592693354062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9560, "loss": 0.21035701036453247, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4024181365967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:18] (step=0009560) Train Loss: 0.1760, Train Steps/Sec: 0.28, Epoch: 0.18577535950252624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9561, "loss": 0.21090851724147797, "memory_gb": 7.721559524536133, "step_time_ms": 3360.368013381958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:22] (step=0009561) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.18579479207151187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9562, "loss": 0.21255116164684296, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5353107452393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:26] (step=0009562) Train Loss: 0.1867, Train Steps/Sec: 0.28, Epoch: 0.18581422464049746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9563, "loss": 0.23016215860843658, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1911792755127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:29] (step=0009563) Train Loss: 0.2175, Train Steps/Sec: 
0.28, Epoch: 0.18583365720948308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9564, "loss": 0.16802731156349182, "memory_gb": 7.715639114379883, "step_time_ms": 3322.9153156280518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:33] (step=0009564) Train Loss: 0.1672, Train Steps/Sec: 0.28, Epoch: 0.1858530897784687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9565, "loss": 0.27064281702041626, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4419326782227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:36] (step=0009565) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.18587252234745433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9566, "loss": 0.26658132672309875, "memory_gb": 7.721559524536133, "step_time_ms": 3506.310224533081, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:40] (step=0009566) Train Loss: 0.2320, Train Steps/Sec: 0.28, Epoch: 0.18589195491643995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9567, "loss": 0.3085933327674866, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9860877990723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:43] (step=0009567) Train Loss: 0.3479, Train Steps/Sec: 0.28, Epoch: 0.18591138748542557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9568, "loss": 0.26323050260543823, "memory_gb": 7.721559524536133, "step_time_ms": 3364.882469177246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:47] (step=0009568) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.1859308200544112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9569, "loss": 
0.13169655203819275, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2301349639893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:51] (step=0009569) Train Loss: 0.1776, Train Steps/Sec: 0.28, Epoch: 0.18595025262339682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9570, "loss": 0.27015602588653564, "memory_gb": 7.721559524536133, "step_time_ms": 3363.168954849243, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:54] (step=0009570) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.18596968519238244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9571, "loss": 0.17276149988174438, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9727363586426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:35:58] (step=0009571) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.18598911776136806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9572, "loss": 0.15382477641105652, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0737113952637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:01] (step=0009572) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.18600855033035368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9573, "loss": 0.30486202239990234, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7019443511963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:05] (step=0009573) Train Loss: 0.3316, Train Steps/Sec: 0.28, Epoch: 0.1860279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9574, "loss": 0.2786097824573517, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7087955474854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:08] (step=0009574) 
Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.1860474154683249, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9575, "loss": 0.28236424922943115, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3171577453613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:12] (step=0009575) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.18606684803731052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9576, "loss": 0.26443979144096375, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2069568634033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:16] (step=0009576) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.18608628060629614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9577, "loss": 0.22400698065757751, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5804653167725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:19] (step=0009577) Train Loss: 0.2012, Train Steps/Sec: 0.28, Epoch: 0.18610571317528177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9578, "loss": 0.26689156889915466, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1949729919434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:23] (step=0009578) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 0.1861251457442674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9579, "loss": 0.19529147446155548, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6046390533447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:26] (step=0009579) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.186144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:30] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 9580, "loss": 0.18121854960918427, "memory_gb": 7.721559524536133, "step_time_ms": 3348.7699031829834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:30] (step=0009580) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.18616401088223863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9581, "loss": 0.15526847541332245, "memory_gb": 7.721559524536133, "step_time_ms": 3360.706329345703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:33] (step=0009581) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.18618344345122426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9582, "loss": 0.22283457219600677, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5083179473877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:37] (step=0009582) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.18620287602020988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9583, "loss": 0.24290145933628082, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8014583587646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:41] (step=0009583) Train Loss: 0.1808, Train Steps/Sec: 0.28, Epoch: 0.1862223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9584, "loss": 0.15732955932617188, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1765117645264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:44] (step=0009584) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.18624174115818112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9585, "loss": 0.1753145158290863, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6694984436035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
09:36:48] (step=0009585) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.18626117372716672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9586, "loss": 0.2697334289550781, "memory_gb": 7.721559524536133, "step_time_ms": 3358.717679977417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:51] (step=0009586) Train Loss: 0.2977, Train Steps/Sec: 0.28, Epoch: 0.18628060629615234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9587, "loss": 0.18988052010536194, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2170429229736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:55] (step=0009587) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.18630003886513796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:36:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9588, "loss": 0.17446166276931763, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3417568206787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:36:59] (step=0009588) Train Loss: 0.2358, Train Steps/Sec: 0.27, Epoch: 0.18631947143412358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9589, "loss": 0.17019817233085632, "memory_gb": 7.721559524536133, "step_time_ms": 3365.123510360718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:02] (step=0009589) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.1863389040031092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9590, "loss": 0.25889986753463745, "memory_gb": 7.721559524536133, "step_time_ms": 3363.71111869812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:06] (step=0009590) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.18635833657209483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:09] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 9591, "loss": 0.2880500853061676, "memory_gb": 7.721559524536133, "step_time_ms": 3363.539457321167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:09] (step=0009591) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.18637776914108045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9592, "loss": 0.21348194777965546, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4921264648438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:13] (step=0009592) Train Loss: 0.2005, Train Steps/Sec: 0.28, Epoch: 0.18639720171006607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9593, "loss": 0.2614538073539734, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2283420562744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:17] (step=0009593) Train Loss: 0.2733, Train Steps/Sec: 0.28, Epoch: 0.1864166342790517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9594, "loss": 0.21767070889472961, "memory_gb": 7.721559524536133, "step_time_ms": 3366.189479827881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:20] (step=0009594) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.18643606684803732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9595, "loss": 0.24168816208839417, "memory_gb": 7.721559524536133, "step_time_ms": 3369.1065311431885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:24] (step=0009595) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.18645549941702294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9596, "loss": 0.2515341639518738, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5124435424805, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 09:37:27] (step=0009596) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.18647493198600856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9597, "loss": 0.16879884898662567, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7660274505615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:31] (step=0009597) Train Loss: 0.1675, Train Steps/Sec: 0.28, Epoch: 0.18649436455499416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9598, "loss": 0.25754088163375854, "memory_gb": 7.721559524536133, "step_time_ms": 3368.786573410034, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:34] (step=0009598) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.18651379712397978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9599, "loss": 0.21298250555992126, "memory_gb": 7.721559524536133, "step_time_ms": 3370.872974395752, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:38] (step=0009599) Train Loss: 0.1789, Train Steps/Sec: 0.28, Epoch: 0.1865332296929654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9600, "loss": 0.25164568424224854, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0999488830566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:42] (step=0009600) Train Loss: 0.2735, Train Steps/Sec: 0.28, Epoch: 0.18655266226195102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9601, "loss": 0.14984160661697388, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4065856933594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:45] (step=0009601) Train Loss: 0.1986, Train Steps/Sec: 0.28, Epoch: 0.18657209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 09:37:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9602, "loss": 0.31426218152046204, "memory_gb": 7.721559524536133, "step_time_ms": 3369.002342224121, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:49] (step=0009602) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.18659152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9603, "loss": 0.2358967363834381, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0011501312256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:52] (step=0009603) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.1866109599689079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:37:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9604, "loss": 0.2409813106060028, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8784160614014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:37:56] (step=0009604) Train Loss: 0.3118, Train Steps/Sec: 0.28, Epoch: 0.1866303925378935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9605, "loss": 0.16373440623283386, "memory_gb": 7.721559524536133, "step_time_ms": 3363.905191421509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:00] (step=0009605) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.18664982510687914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9606, "loss": 0.3812369108200073, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6581382751465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:03] (step=0009606) Train Loss: 0.3056, Train Steps/Sec: 0.28, Epoch: 0.18666925767586476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9607, "loss": 0.26415297389030457, "memory_gb": 7.721559524536133, "step_time_ms": 3507.925510406494, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:07] (step=0009607) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.18668869024485038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9608, "loss": 0.287229061126709, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6269569396973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:10] (step=0009608) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.186708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9609, "loss": 0.20674942433834076, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7074909210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:14] (step=0009609) Train Loss: 0.1621, Train Steps/Sec: 0.28, Epoch: 0.1867275553828216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9610, "loss": 0.22196024656295776, "memory_gb": 7.721559524536133, "step_time_ms": 3363.940715789795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:18] (step=0009610) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.18674698795180722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9611, "loss": 0.20748187601566315, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8783111572266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:21] (step=0009611) Train Loss: 0.2619, Train Steps/Sec: 0.28, Epoch: 0.18676642052079284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9612, "loss": 0.24149483442306519, "memory_gb": 7.721559524536133, "step_time_ms": 3351.4161109924316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:25] (step=0009612) Train Loss: 0.3018, Train Steps/Sec: 0.28, Epoch: 0.18678585308977846, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 09:38:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9613, "loss": 0.2606567144393921, "memory_gb": 7.715639114379883, "step_time_ms": 3335.104465484619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:28] (step=0009613) Train Loss: 0.2991, Train Steps/Sec: 0.28, Epoch: 0.1868052856587641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9614, "loss": 0.2826898694038391, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8227195739746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:32] (step=0009614) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.1868247182277497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9615, "loss": 0.2615774869918823, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2184009552, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:36] (step=0009615) Train Loss: 0.2263, Train Steps/Sec: 0.28, Epoch: 0.18684415079673533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9616, "loss": 0.21318572759628296, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6209964752197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:39] (step=0009616) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.18686358336572095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9617, "loss": 0.22655335068702698, "memory_gb": 7.721559524536133, "step_time_ms": 3363.600015640259, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:43] (step=0009617) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.18688301593470658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9618, "loss": 0.21347472071647644, "memory_gb": 7.721559524536133, "step_time_ms": 
3363.680839538574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:46] (step=0009618) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.1869024485036922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9619, "loss": 0.21022355556488037, "memory_gb": 7.721559524536133, "step_time_ms": 3363.954782485962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:50] (step=0009619) Train Loss: 0.1967, Train Steps/Sec: 0.28, Epoch: 0.18692188107267782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9620, "loss": 0.15703049302101135, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4661903381348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:54] (step=0009620) Train Loss: 0.2062, Train Steps/Sec: 0.28, Epoch: 0.18694131364166341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:38:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9621, "loss": 0.20568490028381348, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4397258758545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:38:57] (step=0009621) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.18696074621064904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9622, "loss": 0.27811646461486816, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5484657287598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:01] (step=0009622) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.18698017877963466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9623, "loss": 0.16282840073108673, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1975860595703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:04] (step=0009623) Train Loss: 0.1845, Train Steps/Sec: 0.28, Epoch: 
0.18699961134862028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9624, "loss": 0.29173463582992554, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8291358947754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:08] (step=0009624) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.1870190439176059, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9625, "loss": 0.3076833486557007, "memory_gb": 7.721559524536133, "step_time_ms": 3364.454507827759, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:12] (step=0009625) Train Loss: 0.2494, Train Steps/Sec: 0.28, Epoch: 0.18703847648659153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9626, "loss": 0.16381900012493134, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7170066833496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:15] (step=0009626) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.18705790905557715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9627, "loss": 0.1572219282388687, "memory_gb": 7.721559524536133, "step_time_ms": 3361.415147781372, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:19] (step=0009627) Train Loss: 0.1631, Train Steps/Sec: 0.28, Epoch: 0.18707734162456277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9628, "loss": 0.2127676010131836, "memory_gb": 7.721559524536133, "step_time_ms": 3358.217239379883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:22] (step=0009628) Train Loss: 0.1871, Train Steps/Sec: 0.27, Epoch: 0.1870967741935484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9629, "loss": 0.20300325751304626, "memory_gb": 
7.721559524536133, "step_time_ms": 3360.2445125579834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:26] (step=0009629) Train Loss: 0.2010, Train Steps/Sec: 0.28, Epoch: 0.18711620676253402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9630, "loss": 0.21202567219734192, "memory_gb": 7.721559524536133, "step_time_ms": 3357.909917831421, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:30] (step=0009630) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.18713563933151964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9631, "loss": 0.20966321229934692, "memory_gb": 7.721559524536133, "step_time_ms": 3362.332582473755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:33] (step=0009631) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.18715507190050526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9632, "loss": 0.21541926264762878, "memory_gb": 7.721559524536133, "step_time_ms": 3362.586498260498, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:37] (step=0009632) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.18717450446949085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9633, "loss": 0.18426170945167542, "memory_gb": 7.721559524536133, "step_time_ms": 3363.664150238037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:40] (step=0009633) Train Loss: 0.2563, Train Steps/Sec: 0.28, Epoch: 0.18719393703847648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9634, "loss": 0.16858752071857452, "memory_gb": 7.721559524536133, "step_time_ms": 3359.17329788208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:44] (step=0009634) Train Loss: 0.1856, Train Steps/Sec: 
0.28, Epoch: 0.1872133696074621, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9635, "loss": 0.3102334141731262, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1842136383057, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:48] (step=0009635) Train Loss: 0.3178, Train Steps/Sec: 0.28, Epoch: 0.18723280217644772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9636, "loss": 0.20196640491485596, "memory_gb": 7.721559524536133, "step_time_ms": 3360.400438308716, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:51] (step=0009636) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.18725223474543334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9637, "loss": 0.23867949843406677, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9211235046387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:55] (step=0009637) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.18727166731441897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:39:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9638, "loss": 0.21725843846797943, "memory_gb": 7.721559524536133, "step_time_ms": 3359.337568283081, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:39:58] (step=0009638) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.1872910998834046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9639, "loss": 0.18571501970291138, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2916469573975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:02] (step=0009639) Train Loss: 0.1964, Train Steps/Sec: 0.28, Epoch: 0.1873105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9640, "loss": 
0.25194448232650757, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7850074768066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:06] (step=0009640) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.18732996502137583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9641, "loss": 0.23100648820400238, "memory_gb": 7.721559524536133, "step_time_ms": 3356.005907058716, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:09] (step=0009641) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.18734939759036146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9642, "loss": 0.20735161006450653, "memory_gb": 7.721559524536133, "step_time_ms": 3362.515687942505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:13] (step=0009642) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.18736883015934708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9643, "loss": 0.27300766110420227, "memory_gb": 7.721559524536133, "step_time_ms": 3355.628728866577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:16] (step=0009643) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.18738826272833267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9644, "loss": 0.26637545228004456, "memory_gb": 7.721559524536133, "step_time_ms": 3362.306833267212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:20] (step=0009644) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.1874076952973183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9645, "loss": 0.19049493968486786, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1150093078613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:24] (step=0009645) 
Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.18742712786630392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9646, "loss": 0.3027920722961426, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3123168945312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:27] (step=0009646) Train Loss: 0.3486, Train Steps/Sec: 0.28, Epoch: 0.18744656043528954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9647, "loss": 0.1634024679660797, "memory_gb": 7.721559524536133, "step_time_ms": 3356.552839279175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:31] (step=0009647) Train Loss: 0.1613, Train Steps/Sec: 0.28, Epoch: 0.18746599300427516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9648, "loss": 0.2518025040626526, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5453548431396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:34] (step=0009648) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.18748542557326078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9649, "loss": 0.202872633934021, "memory_gb": 7.721559524536133, "step_time_ms": 3348.2630252838135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:38] (step=0009649) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.1875048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9650, "loss": 0.22472652792930603, "memory_gb": 7.721559524536133, "step_time_ms": 3358.55770111084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:42] (step=0009650) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.18752429071123203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 
9651, "loss": 0.1962815523147583, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0799827575684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:45] (step=0009651) Train Loss: 0.2985, Train Steps/Sec: 0.28, Epoch: 0.18754372328021765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9652, "loss": 0.26028770208358765, "memory_gb": 7.721559524536133, "step_time_ms": 3353.469133377075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:49] (step=0009652) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.18756315584920327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9653, "loss": 0.1949732005596161, "memory_gb": 7.721559524536133, "step_time_ms": 3347.6388454437256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:52] (step=0009653) Train Loss: 0.1744, Train Steps/Sec: 0.28, Epoch: 0.1875825884181889, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9654, "loss": 0.2750990390777588, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8084239959717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:56] (step=0009654) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.18760202098717452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:40:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9655, "loss": 0.2772774398326874, "memory_gb": 7.721559524536133, "step_time_ms": 3496.122360229492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:40:59] (step=0009655) Train Loss: 0.2452, Train Steps/Sec: 0.28, Epoch: 0.1876214535561601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9656, "loss": 0.2019772082567215, "memory_gb": 7.721559524536133, "step_time_ms": 3352.885961532593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:03] 
(step=0009656) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.18764088612514573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9657, "loss": 0.30626410245895386, "memory_gb": 7.721559524536133, "step_time_ms": 3353.071451187134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:07] (step=0009657) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.18766031869413136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9658, "loss": 0.22491833567619324, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9603691101074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:10] (step=0009658) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.18767975126311698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9659, "loss": 0.24603356420993805, "memory_gb": 7.721559524536133, "step_time_ms": 3346.17280960083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:14] (step=0009659) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.1876991838321026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9660, "loss": 0.174286350607872, "memory_gb": 7.721559524536133, "step_time_ms": 3357.03444480896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:17] (step=0009660) Train Loss: 0.2100, Train Steps/Sec: 0.28, Epoch: 0.18771861640108822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9661, "loss": 0.21562211215496063, "memory_gb": 7.721559524536133, "step_time_ms": 3356.294631958008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:21] (step=0009661) Train Loss: 0.2616, Train Steps/Sec: 0.28, Epoch: 0.18773804897007385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:24] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 9662, "loss": 0.28090155124664307, "memory_gb": 7.721559524536133, "step_time_ms": 3354.907751083374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:24] (step=0009662) Train Loss: 0.2926, Train Steps/Sec: 0.28, Epoch: 0.18775748153905947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9663, "loss": 0.17074759304523468, "memory_gb": 7.721559524536133, "step_time_ms": 3357.304334640503, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:28] (step=0009663) Train Loss: 0.1684, Train Steps/Sec: 0.28, Epoch: 0.1877769141080451, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9664, "loss": 0.29924511909484863, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3853759765625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:32] (step=0009664) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.1877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9665, "loss": 0.2298499494791031, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8785190582275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:35] (step=0009665) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.18781577924601633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9666, "loss": 0.2343839406967163, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8381843566895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:39] (step=0009666) Train Loss: 0.2590, Train Steps/Sec: 0.28, Epoch: 0.18783521181500196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9667, "loss": 0.3062092661857605, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4818630218506, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 09:41:42] (step=0009667) Train Loss: 0.2316, Train Steps/Sec: 0.28, Epoch: 0.18785464438398755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9668, "loss": 0.20905375480651855, "memory_gb": 7.721559524536133, "step_time_ms": 3351.126194000244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:46] (step=0009668) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.18787407695297317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9669, "loss": 0.22052603960037231, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1623096466064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:50] (step=0009669) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.1878935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9670, "loss": 0.26476001739501953, "memory_gb": 7.715639114379883, "step_time_ms": 3320.864200592041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:53] (step=0009670) Train Loss: 0.2712, Train Steps/Sec: 0.28, Epoch: 0.18791294209094442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:41:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9671, "loss": 0.15785345435142517, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5961589813232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:41:57] (step=0009671) Train Loss: 0.2049, Train Steps/Sec: 0.28, Epoch: 0.18793237465993004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9672, "loss": 0.2813081741333008, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8053035736084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:00] (step=0009672) Train Loss: 0.2585, Train Steps/Sec: 0.28, Epoch: 0.18795180722891566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
09:42:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9673, "loss": 0.37513482570648193, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8239212036133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:04] (step=0009673) Train Loss: 0.3523, Train Steps/Sec: 0.28, Epoch: 0.18797123979790129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9674, "loss": 0.18521687388420105, "memory_gb": 7.721559524536133, "step_time_ms": 3356.930732727051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:07] (step=0009674) Train Loss: 0.1711, Train Steps/Sec: 0.28, Epoch: 0.1879906723668869, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9675, "loss": 0.17993834614753723, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5638790130615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:11] (step=0009675) Train Loss: 0.1626, Train Steps/Sec: 0.27, Epoch: 0.18801010493587253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9676, "loss": 0.27523767948150635, "memory_gb": 7.721559524536133, "step_time_ms": 3349.81107711792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:15] (step=0009676) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.18802953750485815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9677, "loss": 0.22170352935791016, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7821769714355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:18] (step=0009677) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.18804897007384377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9678, "loss": 0.28747090697288513, "memory_gb": 7.721559524536133, "step_time_ms": 3357.300043106079, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 09:42:22] (step=0009678) Train Loss: 0.2696, Train Steps/Sec: 0.28, Epoch: 0.18806840264282937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9679, "loss": 0.30310484766960144, "memory_gb": 7.721559524536133, "step_time_ms": 3357.936382293701, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:25] (step=0009679) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.188087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9680, "loss": 0.1887199878692627, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2867851257324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:29] (step=0009680) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.1881072677808006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9681, "loss": 0.28173011541366577, "memory_gb": 7.721559524536133, "step_time_ms": 3352.882146835327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:33] (step=0009681) Train Loss: 0.3213, Train Steps/Sec: 0.28, Epoch: 0.18812670034978624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9682, "loss": 0.21459992229938507, "memory_gb": 7.721559524536133, "step_time_ms": 3355.942487716675, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:36] (step=0009682) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.18814613291877186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9683, "loss": 0.2796977460384369, "memory_gb": 7.721559524536133, "step_time_ms": 3360.010862350464, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:40] (step=0009683) Train Loss: 0.2987, Train Steps/Sec: 0.28, Epoch: 0.18816556548775748, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 09:42:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9684, "loss": 0.23515476286411285, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5850467681885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:43] (step=0009684) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.1881849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9685, "loss": 0.29578056931495667, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3250980377197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:47] (step=0009685) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 0.18820443062572872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9686, "loss": 0.21063587069511414, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0469665527344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:50] (step=0009686) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.18822386319471435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9687, "loss": 0.28046518564224243, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6513996124268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:54] (step=0009687) Train Loss: 0.3184, Train Steps/Sec: 0.28, Epoch: 0.18824329576369997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:42:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9688, "loss": 0.17036601901054382, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9445819854736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:42:58] (step=0009688) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.1882627283326856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:43:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9689, "loss": 0.22333049774169922, "memory_gb": 7.721559524536133, "step_time_ms": 
3346.072196960449, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:01] (step=0009689) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.18828216090167121, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9690, "loss": 0.14363586902618408, "memory_gb": 7.721559524536133, "step_time_ms": 3355.595588684082, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:05] (step=0009690) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.1883015934706568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9691, "loss": 0.29658186435699463, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2004261016846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:08] (step=0009691) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.18832102603964243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9692, "loss": 0.20287713408470154, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9137535095215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:12] (step=0009692) Train Loss: 0.2262, Train Steps/Sec: 0.28, Epoch: 0.18834045860862805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9693, "loss": 0.29000186920166016, "memory_gb": 7.721559524536133, "step_time_ms": 3359.213352203369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:16] (step=0009693) Train Loss: 0.2024, Train Steps/Sec: 0.28, Epoch: 0.18835989117761368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9694, "loss": 0.21425171196460724, "memory_gb": 7.721559524536133, "step_time_ms": 3360.593557357788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:19] (step=0009694) Train Loss: 0.1697, Train Steps/Sec: 0.28, Epoch: 0.1883793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9695, "loss": 0.27431541681289673, "memory_gb": 7.721559524536133, "step_time_ms": 3501.9383430480957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:23] (step=0009695) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.18839875631558492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9696, "loss": 0.33185476064682007, "memory_gb": 7.721559524536133, "step_time_ms": 3362.133026123047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:26] (step=0009696) Train Loss: 0.2960, Train Steps/Sec: 0.28, Epoch: 0.18841818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9697, "loss": 0.2472163736820221, "memory_gb": 7.721559524536133, "step_time_ms": 3364.211320877075, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:30] (step=0009697) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.18843762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9698, "loss": 0.25007471442222595, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8787784576416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:33] (step=0009698) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.1884570540225418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9699, "loss": 0.2701278328895569, "memory_gb": 7.721559524536133, "step_time_ms": 3352.780342102051, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:37] (step=0009699) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.1884764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9700, "loss": 0.1963653564453125, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9492053985596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:41] (step=0009700) Train Loss: 0.1839, Train Steps/Sec: 0.28, Epoch: 0.18849591916051303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9701, "loss": 0.15942512452602386, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6414375305176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:44] (step=0009701) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.18851535172949863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9702, "loss": 0.245533287525177, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2965087890625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:48] (step=0009702) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.18853478429848425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9703, "loss": 0.251974880695343, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8096675872803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:51] (step=0009703) Train Loss: 0.2956, Train Steps/Sec: 0.28, Epoch: 0.18855421686746987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9704, "loss": 0.20797695219516754, "memory_gb": 7.721559524536133, "step_time_ms": 3364.001750946045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:55] (step=0009704) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.1885736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:43:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9705, "loss": 0.19549885392189026, "memory_gb": 7.721559524536133, "step_time_ms": 3362.711191177368, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:43:59] (step=0009705) Train Loss: 0.1769, Train Steps/Sec: 0.28, Epoch: 0.18859308200544112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9706, "loss": 0.23456916213035583, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7304096221924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:02] (step=0009706) Train Loss: 0.2023, Train Steps/Sec: 0.28, Epoch: 0.18861251457442674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9707, "loss": 0.3185218274593353, "memory_gb": 7.721559524536133, "step_time_ms": 3363.131284713745, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:06] (step=0009707) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.18863194714341236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9708, "loss": 0.24638496339321136, "memory_gb": 7.721559524536133, "step_time_ms": 3365.86332321167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:09] (step=0009708) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.18865137971239798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9709, "loss": 0.2612481117248535, "memory_gb": 7.721559524536133, "step_time_ms": 3369.469404220581, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:13] (step=0009709) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.1886708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9710, "loss": 0.15968307852745056, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6650733947754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:16] (step=0009710) Train Loss: 0.1742, Train Steps/Sec: 0.28, Epoch: 0.18869024485036923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9711, "loss": 0.26907289028167725, "memory_gb": 7.721559524536133, "step_time_ms": 3363.438844680786, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:20] (step=0009711) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.18870967741935485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9712, "loss": 0.18678772449493408, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0788021087646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:24] (step=0009712) Train Loss: 0.2000, Train Steps/Sec: 0.28, Epoch: 0.18872910998834047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9713, "loss": 0.14181514084339142, "memory_gb": 7.721559524536133, "step_time_ms": 3368.826389312744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:27] (step=0009713) Train Loss: 0.1935, Train Steps/Sec: 0.28, Epoch: 0.18874854255732607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9714, "loss": 0.16941800713539124, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9309616088867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:31] (step=0009714) Train Loss: 0.1768, Train Steps/Sec: 0.28, Epoch: 0.1887679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9715, "loss": 0.30945897102355957, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5503311157227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:34] (step=0009715) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.1887874076952973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9716, "loss": 0.29496437311172485, "memory_gb": 7.721559524536133, "step_time_ms": 3366.183280944824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:38] (step=0009716) Train Loss: 0.2504, Train Steps/Sec: 0.28, Epoch: 0.18880684026428293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9717, "loss": 0.26846638321876526, "memory_gb": 7.721559524536133, "step_time_ms": 3364.124536514282, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:42] (step=0009717) Train Loss: 0.2619, Train Steps/Sec: 0.28, Epoch: 0.18882627283326855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9718, "loss": 0.24533772468566895, "memory_gb": 7.721559524536133, "step_time_ms": 3359.374523162842, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:45] (step=0009718) Train Loss: 0.2921, Train Steps/Sec: 0.28, Epoch: 0.18884570540225418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9719, "loss": 0.1985422968864441, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9801273345947, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:49] (step=0009719) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.1888651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9720, "loss": 0.1859644651412964, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3012771606445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:52] (step=0009720) Train Loss: 0.1948, Train Steps/Sec: 0.28, Epoch: 0.18888457054022542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:44:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9721, "loss": 0.182455912232399, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8764877319336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:44:56] (step=0009721) Train Loss: 0.2461, Train Steps/Sec: 0.28, Epoch: 0.18890400310921104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9722, "loss": 0.33462464809417725, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9657497406006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:00] (step=0009722) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.18892343567819667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9723, "loss": 0.15947315096855164, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2970581054688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:03] (step=0009723) Train Loss: 0.2079, Train Steps/Sec: 0.27, Epoch: 0.1889428682471823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9724, "loss": 0.2408965528011322, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1179332733154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:07] (step=0009724) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.1889623008161679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9725, "loss": 0.23723942041397095, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5334968566895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:11] (step=0009725) Train Loss: 0.2707, Train Steps/Sec: 0.28, Epoch: 0.1889817333851535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9726, "loss": 0.29449284076690674, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4117164611816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:14] (step=0009726) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.18900116595413913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9727, "loss": 0.2510298490524292, "memory_gb": 7.721559524536133, "step_time_ms": 3361.687183380127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:18] (step=0009727) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.18902059852312475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9728, "loss": 0.14451420307159424, "memory_gb": 7.721559524536133, "step_time_ms": 3361.690044403076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:21] (step=0009728) Train Loss: 0.1909, Train Steps/Sec: 0.28, Epoch: 0.18904003109211037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9729, "loss": 0.31108227372169495, "memory_gb": 7.721559524536133, "step_time_ms": 3349.5583534240723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:25] (step=0009729) Train Loss: 0.2997, Train Steps/Sec: 0.28, Epoch: 0.189059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9730, "loss": 0.19161081314086914, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7731075286865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:29] (step=0009730) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.18907889623008162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9731, "loss": 0.1497947871685028, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4067306518555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:32] (step=0009731) Train Loss: 0.1950, Train Steps/Sec: 0.28, Epoch: 0.18909832879906724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9732, "loss": 0.1874040961265564, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3497276306152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:36] (step=0009732) Train Loss: 0.2875, Train Steps/Sec: 0.28, Epoch: 0.18911776136805286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9733, "loss": 0.20807687938213348, "memory_gb": 7.721559524536133, "step_time_ms": 3359.785556793213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:39] (step=0009733) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.18913719393703848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9734, "loss": 0.14851683378219604, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6926879882812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:43] (step=0009734) Train Loss: 0.1559, Train Steps/Sec: 0.28, Epoch: 0.1891566265060241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9735, "loss": 0.19826549291610718, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5947494506836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:47] (step=0009735) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.18917605907500973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9736, "loss": 0.20365408062934875, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6038818359375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:50] (step=0009736) Train Loss: 0.2229, Train Steps/Sec: 0.28, Epoch: 0.18919549164399532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9737, "loss": 0.2034265697002411, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5692176818848, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:54] (step=0009737) Train Loss: 0.1757, Train Steps/Sec: 0.28, Epoch: 0.18921492421298094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:45:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9738, "loss": 0.226791113615036, "memory_gb": 7.721559524536133, "step_time_ms": 3358.844995498657, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:45:57] (step=0009738) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.18923435678196657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9739, "loss": 0.21692612767219543, "memory_gb": 7.721559524536133, "step_time_ms": 3355.346441268921, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:01] (step=0009739) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.1892537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9740, "loss": 0.2279345840215683, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1064682006836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:05] (step=0009740) Train Loss: 0.2077, Train Steps/Sec: 0.28, Epoch: 0.1892732219199378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9741, "loss": 0.14802487194538116, "memory_gb": 7.721559524536133, "step_time_ms": 3355.726480484009, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:08] (step=0009741) Train Loss: 0.1935, Train Steps/Sec: 0.28, Epoch: 0.18929265448892343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9742, "loss": 0.23004227876663208, "memory_gb": 7.721559524536133, "step_time_ms": 3507.572889328003, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:12] (step=0009742) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.18931208705790906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9743, "loss": 0.21718212962150574, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7446937561035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:15] (step=0009743) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.18933151962689468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9744, "loss": 0.18315598368644714, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0212078094482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:19] (step=0009744) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.1893509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9745, "loss": 0.17520397901535034, "memory_gb": 7.721559524536133, "step_time_ms": 3360.743284225464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:22] (step=0009745) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.18937038476486592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9746, "loss": 0.21624913811683655, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6060485839844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:26] (step=0009746) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.18938981733385155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9747, "loss": 0.29392367601394653, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0309410095215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:30] (step=0009747) Train Loss: 0.3271, Train Steps/Sec: 0.28, Epoch: 0.18940924990283717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9748, "loss": 0.33908480405807495, "memory_gb": 7.721559524536133, "step_time_ms": 3344.917058944702, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:33] (step=0009748) Train Loss: 0.3208, Train Steps/Sec: 0.28, Epoch: 0.18942868247182276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9749, "loss": 0.2885081470012665, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8239002227783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:37] (step=0009749) Train Loss: 0.2886, Train Steps/Sec: 0.28, Epoch: 0.18944811504080838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9750, "loss": 0.25697264075279236, "memory_gb": 7.721559524536133, "step_time_ms": 3352.57625579834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:40] (step=0009750) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.189467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9751, "loss": 0.1407623440027237, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5703887939453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:44] (step=0009751) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.18948698017877963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9752, "loss": 0.2933439016342163, "memory_gb": 7.721559524536133, "step_time_ms": 3349.4925498962402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:47] (step=0009752) Train Loss: 0.2726, Train Steps/Sec: 0.28, Epoch: 0.18950641274776525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9753, "loss": 0.3064177632331848, "memory_gb": 7.721559524536133, "step_time_ms": 3354.806661605835, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:51] (step=0009753) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.18952584531675087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9754, "loss": 0.1908784806728363, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5023460388184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:55] (step=0009754) Train Loss: 0.1848, Train Steps/Sec: 0.28, Epoch: 0.1895452778857365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:46:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9755, "loss": 0.2445134073495865, "memory_gb": 7.721559524536133, "step_time_ms": 3352.802276611328, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:46:58] (step=0009755) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.18956471045472212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9756, "loss": 0.3753383457660675, "memory_gb": 7.721559524536133, "step_time_ms": 3352.548599243164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:02] (step=0009756) Train Loss: 0.3196, Train Steps/Sec: 0.28, Epoch: 0.18958414302370774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9757, "loss": 0.25511032342910767, "memory_gb": 7.721559524536133, "step_time_ms": 3358.058214187622, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:05] (step=0009757) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.18960357559269336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9758, "loss": 0.2061280906200409, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7794303894043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:09] (step=0009758) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.18962300816167899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9759, "loss": 0.2866007387638092, "memory_gb": 7.721559524536133, "step_time_ms": 3348.510980606079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:13] (step=0009759) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.1896424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9760, "loss": 0.2567017078399658, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9799194335938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:16] (step=0009760) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.1896618732996502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9761, "loss": 0.20487934350967407, "memory_gb": 7.721559524536133, "step_time_ms": 3346.4672565460205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:20] (step=0009761) Train Loss: 0.1907, Train Steps/Sec: 0.28, Epoch: 0.18968130586863582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9762, "loss": 0.18552140891551971, "memory_gb": 7.721559524536133, "step_time_ms": 3349.902391433716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:23] (step=0009762) Train Loss: 0.1718, Train Steps/Sec: 0.28, Epoch: 0.18970073843762145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9763, "loss": 0.19255244731903076, "memory_gb": 7.721559524536133, "step_time_ms": 3356.667995452881, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:27] (step=0009763) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.18972017100660707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9764, "loss": 0.20648333430290222, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6582927703857, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:31] (step=0009764) Train Loss: 0.1967, Train Steps/Sec: 0.28, Epoch: 0.1897396035755927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9765, "loss": 0.17128244042396545, "memory_gb": 7.721559524536133, "step_time_ms": 3356.071710586548, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:34] (step=0009765) Train Loss: 0.2412, Train Steps/Sec: 0.28, Epoch: 0.1897590361445783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9766, "loss": 0.2940865457057953, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0590019226074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:38] (step=0009766) Train Loss: 0.2935, Train Steps/Sec: 0.28, Epoch: 0.18977846871356394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9767, "loss": 0.22094394266605377, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4928302764893, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:41] (step=0009767) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.18979790128254956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9768, "loss": 0.17832183837890625, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0853729248047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:45] (step=0009768) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.18981733385153518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9769, "loss": 0.2660447359085083, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6012592315674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:48] (step=0009769) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.1898367664205208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9770, "loss": 0.21657223999500275, "memory_gb": 7.721559524536133, "step_time_ms": 3354.132890701294, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:52] (step=0009770) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.18985619898950643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9771, "loss": 0.2697182893753052, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6738109588623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:56] (step=0009771) Train Loss: 0.3257, Train Steps/Sec: 0.28, Epoch: 0.18987563155849202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:47:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9772, "loss": 0.20103290677070618, "memory_gb": 7.721559524536133, "step_time_ms": 3359.772205352783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:47:59] (step=0009772) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.18989506412747764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9773, "loss": 0.1665973961353302, "memory_gb": 7.721559524536133, "step_time_ms": 3360.778331756592, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:03] (step=0009773) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.18991449669646326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9774, "loss": 0.3062065839767456, "memory_gb": 7.721559524536133, "step_time_ms": 3358.771562576294, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:06] (step=0009774) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.1899339292654489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9775, "loss": 0.18118077516555786, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3500385284424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:10] (step=0009775) Train Loss: 0.2335, Train Steps/Sec: 0.28, Epoch: 0.1899533618344345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9776, "loss": 0.24246442317962646, "memory_gb": 7.721559524536133, "step_time_ms": 3355.541706085205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:13] (step=0009776) Train Loss: 0.2645, Train Steps/Sec: 0.28, Epoch: 0.18997279440342013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9777, "loss": 0.2947538197040558, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0915203094482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:17] (step=0009777) Train Loss: 0.3275, Train Steps/Sec: 0.28, Epoch: 0.18999222697240575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9778, "loss": 0.2953844666481018, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8976860046387, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:21] (step=0009778) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.19001165954139138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9779, "loss": 0.20446696877479553, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9717407226562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:24] (step=0009779) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.190031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9780, "loss": 0.26182931661605835, "memory_gb": 7.715639114379883, "step_time_ms": 3323.2431411743164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:28] (step=0009780) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.19005052467936262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9781, "loss": 0.2616722583770752, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2914295196533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:31] (step=0009781) Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.19006995724834824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9782, "loss": 0.32248997688293457, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9909591674805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:35] (step=0009782) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.19008938981733386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9783, "loss": 0.1989021897315979, "memory_gb": 7.721559524536133, "step_time_ms": 3497.225046157837, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:38] (step=0009783) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.19010882238631946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9784, "loss": 0.2511966824531555, "memory_gb": 7.721559524536133, "step_time_ms": 3357.706069946289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:42] (step=0009784) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.19012825495530508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9785, "loss": 0.3245357871055603, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2704277038574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:46] (step=0009785) Train Loss: 0.3095, Train Steps/Sec: 0.28, Epoch: 0.1901476875242907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9786, "loss": 0.17743682861328125, "memory_gb": 7.721559524536133, "step_time_ms": 3355.679988861084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:49] (step=0009786) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.19016712009327633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9787, "loss": 0.22253285348415375, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7281188964844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:53] (step=0009787) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.19018655266226195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:48:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9788, "loss": 0.18492582440376282, "memory_gb": 7.721559524536133, "step_time_ms": 3353.170871734619, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:48:56] (step=0009788) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.19020598523124757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9789, "loss": 0.17701910436153412, "memory_gb": 7.715639114379883, "step_time_ms": 3326.974630355835, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:00] (step=0009789) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.1902254178002332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9790, "loss": 0.2423924058675766, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9207611083984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:03] (step=0009790) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.19024485036921882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9791, "loss": 0.20957158505916595, "memory_gb": 7.721559524536133, "step_time_ms": 3362.029790878296, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:07] (step=0009791) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.19026428293820444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9792, "loss": 0.23500214517116547, "memory_gb": 7.721559524536133, "step_time_ms": 3362.856864929199, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:11] (step=0009792) Train Loss: 0.1919, Train Steps/Sec: 0.28, Epoch: 0.19028371550719006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9793, "loss": 0.1830592304468155, "memory_gb": 7.721559524536133, "step_time_ms": 3361.008882522583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:14] (step=0009793) Train Loss: 0.1751, Train Steps/Sec: 0.28, Epoch: 0.19030314807617568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9794, "loss": 0.23961493372917175, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9441471099854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:18] (step=0009794) Train Loss: 0.2274, Train Steps/Sec: 0.28, Epoch: 0.19032258064516128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9795, "loss": 0.23806831240653992, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2496433258057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:21] (step=0009795) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.1903420132141469, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9796, "loss": 0.23824742436408997, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2280101776123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:25] (step=0009796) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.19036144578313252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9797, "loss": 0.2330285906791687, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9534244537354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:29] (step=0009797) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.19038087835211814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9798, "loss": 0.2398512214422226, "memory_gb": 7.721559524536133, "step_time_ms": 3364.701271057129, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:32] (step=0009798) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.19040031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9799, "loss": 0.19482621550559998, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0752563476562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:36] (step=0009799) Train Loss: 0.2795, Train Steps/Sec: 0.28, Epoch: 0.1904197434900894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9800, "loss": 0.21672219038009644, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9650344848633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:39] (step=0009800) Train Loss: 0.1674, Train Steps/Sec: 0.28, Epoch: 0.190439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9801, "loss": 0.318569540977478, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7072315216064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:43] (step=0009801) Train Loss: 0.3130, Train Steps/Sec: 0.28, Epoch: 0.19045860862806063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9802, "loss": 0.2196352481842041, "memory_gb": 7.721559524536133, "step_time_ms": 3365.731716156006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:46] (step=0009802) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.19047804119704626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9803, "loss": 0.1955236792564392, "memory_gb": 7.721559524536133, "step_time_ms": 3354.637384414673, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:49:50] (step=0009803) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.19049747376603188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9804, "loss": 
0.23725461959838867, "memory_gb": 7.721559524536133, "step_time_ms": 3369.0028190612793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:49:54] (step=0009804) Train Loss: 0.2175, Train Steps/Sec: 0.28, Epoch: 0.1905169063350175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:49:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9805, "loss": 0.2238190770149231, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2759323120117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:49:57] (step=0009805) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.19053633890400312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9806, "loss": 0.3651711046695709, "memory_gb": 7.721559524536133, "step_time_ms": 3358.259439468384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:01] (step=0009806) Train Loss: 0.3258, Train Steps/Sec: 0.28, Epoch: 0.19055577147298872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9807, "loss": 0.3379555940628052, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5170459747314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:04] (step=0009807) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.19057520404197434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9808, "loss": 0.25252753496170044, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4094047546387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:08] (step=0009808) Train Loss: 0.2639, Train Steps/Sec: 0.28, Epoch: 0.19059463661095996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9809, "loss": 0.24620425701141357, "memory_gb": 7.721559524536133, "step_time_ms": 3364.832878112793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:12] (step=0009809) 
Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.19061406917994558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9810, "loss": 0.377789169549942, "memory_gb": 7.721559524536133, "step_time_ms": 3368.129253387451, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:15] (step=0009810) Train Loss: 0.3070, Train Steps/Sec: 0.28, Epoch: 0.1906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9811, "loss": 0.23873810470104218, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1088733673096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:19] (step=0009811) Train Loss: 0.2394, Train Steps/Sec: 0.28, Epoch: 0.19065293431791683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9812, "loss": 0.2979103624820709, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3784942626953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:22] (step=0009812) Train Loss: 0.2845, Train Steps/Sec: 0.27, Epoch: 0.19067236688690245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9813, "loss": 0.1963132619857788, "memory_gb": 7.721559524536133, "step_time_ms": 3364.338159561157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:26] (step=0009813) Train Loss: 0.1757, Train Steps/Sec: 0.28, Epoch: 0.19069179945588807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9814, "loss": 0.3117898106575012, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1158599853516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:30] (step=0009814) Train Loss: 0.2461, Train Steps/Sec: 0.28, Epoch: 0.1907112320248737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 
9815, "loss": 0.1969284564256668, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2639904022217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:33] (step=0009815) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.19073066459385932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9816, "loss": 0.26772332191467285, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4635467529297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:37] (step=0009816) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.19075009716284494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9817, "loss": 0.3289428949356079, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7830486297607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:40] (step=0009817) Train Loss: 0.3070, Train Steps/Sec: 0.28, Epoch: 0.19076952973183056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9818, "loss": 0.2505723834037781, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0575618743896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:44] (step=0009818) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.19078896230081616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9819, "loss": 0.20383861660957336, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5119667053223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:48] (step=0009819) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.19080839486980178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9820, "loss": 0.2756842374801636, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4075603485107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:51] 
(step=0009820) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.1908278274387874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9821, "loss": 0.16122323274612427, "memory_gb": 7.721559524536133, "step_time_ms": 3368.075132369995, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:55] (step=0009821) Train Loss: 0.1767, Train Steps/Sec: 0.28, Epoch: 0.19084726000777302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:50:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9822, "loss": 0.2532563805580139, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6890182495117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:50:58] (step=0009822) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.19086669257675865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9823, "loss": 0.1415019929409027, "memory_gb": 7.721559524536133, "step_time_ms": 3366.00923538208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:02] (step=0009823) Train Loss: 0.2049, Train Steps/Sec: 0.28, Epoch: 0.19088612514574427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9824, "loss": 0.28471535444259644, "memory_gb": 7.721559524536133, "step_time_ms": 3506.011724472046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:06] (step=0009824) Train Loss: 0.2872, Train Steps/Sec: 0.28, Epoch: 0.1909055577147299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9825, "loss": 0.38922977447509766, "memory_gb": 7.721559524536133, "step_time_ms": 3356.879234313965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:09] (step=0009825) Train Loss: 0.3080, Train Steps/Sec: 0.28, Epoch: 0.1909249902837155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:13] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 9826, "loss": 0.2723196744918823, "memory_gb": 7.721559524536133, "step_time_ms": 3361.971616744995, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:13] (step=0009826) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.19094442285270113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9827, "loss": 0.2488349974155426, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2427196502686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:16] (step=0009827) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.19096385542168676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9828, "loss": 0.20680952072143555, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2958965301514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:20] (step=0009828) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.19098328799067238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9829, "loss": 0.2854948341846466, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6692810058594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:24] (step=0009829) Train Loss: 0.2792, Train Steps/Sec: 0.28, Epoch: 0.19100272055965797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9830, "loss": 0.2541385889053345, "memory_gb": 7.721559524536133, "step_time_ms": 3362.245798110962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:27] (step=0009830) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.1910221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9831, "loss": 0.2701053023338318, "memory_gb": 7.721559524536133, "step_time_ms": 3364.78590965271, "trainable_params": 4718592, "method": "lora"} 
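The `EFFICIENCY_METRICS` entries interleaved with the step logs above are flat JSON payloads, so the per-step series (loss, memory, step time) can be recovered programmatically. A minimal sketch, assuming one record per line as emitted by the trainer; the regex and the two-record sample text are illustrative, not part of the original tooling:

```python
import json
import re
from statistics import mean

# Matches the flat JSON object following the EFFICIENCY_METRICS tag.
# Non-greedy "{...}" is safe here because the payload has no nested braces.
METRIC_RE = re.compile(r"EFFICIENCY_METRICS: (\{.*?\})")

def parse_efficiency_metrics(log_text: str) -> list[dict]:
    """Extract every EFFICIENCY_METRICS JSON payload from raw log text."""
    return [json.loads(m.group(1)) for m in METRIC_RE.finditer(log_text)]

# Illustrative sample copied from two records in the log; in practice,
# read the whole log file instead.
sample = (
    '[2025-07-29 09:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9789, '
    '"loss": 0.17701910436153412, "memory_gb": 7.715639114379883, '
    '"step_time_ms": 3326.974630355835, "trainable_params": 4718592, "method": "lora"}\n'
    '[2025-07-29 09:49:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9790, '
    '"loss": 0.2423924058675766, "memory_gb": 7.721559524536133, '
    '"step_time_ms": 3362.9207611083984, "trainable_params": 4718592, "method": "lora"}\n'
)

records = parse_efficiency_metrics(sample)
avg_loss = mean(r["loss"] for r in records)
# Throughput implied by the recorded per-step wall time.
steps_per_sec = 1000.0 / mean(r["step_time_ms"] for r in records)
```

Note that the derived throughput (roughly 0.3 steps/sec for step times near 3360 ms) is consistent with the `Train Steps/Sec: 0.28` figures the logger prints alongside each record.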
[2025-07-29 09:51:31] (step=0009831) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.19104158569762922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9832, "loss": 0.14907342195510864, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1973056793213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:34] (step=0009832) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.19106101826661484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9833, "loss": 0.13352596759796143, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2519035339355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:38] (step=0009833) Train Loss: 0.1767, Train Steps/Sec: 0.28, Epoch: 0.19108045083560046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9834, "loss": 0.3226752281188965, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6832237243652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:42] (step=0009834) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.19109988340458609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9835, "loss": 0.2864450216293335, "memory_gb": 7.721559524536133, "step_time_ms": 3356.12416267395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:45] (step=0009835) Train Loss: 0.2467, Train Steps/Sec: 0.28, Epoch: 0.1911193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9836, "loss": 0.2698054313659668, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9885654449463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:49] (step=0009836) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.19113874854255733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:52] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 9837, "loss": 0.1415870040655136, "memory_gb": 7.721559524536133, "step_time_ms": 3361.961841583252, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:52] (step=0009837) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.19115818111154295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:51:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9838, "loss": 0.21277347207069397, "memory_gb": 7.721559524536133, "step_time_ms": 3360.891819000244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:51:56] (step=0009838) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.19117761368052857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9839, "loss": 0.2822890281677246, "memory_gb": 7.721559524536133, "step_time_ms": 3358.290910720825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:00] (step=0009839) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.1911970462495142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9840, "loss": 0.333830326795578, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4493865966797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:03] (step=0009840) Train Loss: 0.2872, Train Steps/Sec: 0.28, Epoch: 0.19121647881849982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9841, "loss": 0.2712517976760864, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5939407348633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:07] (step=0009841) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.1912359113874854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9842, "loss": 0.25996294617652893, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2674522399902, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 09:52:10] (step=0009842) Train Loss: 0.3086, Train Steps/Sec: 0.28, Epoch: 0.19125534395647104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9843, "loss": 0.22671887278556824, "memory_gb": 7.721559524536133, "step_time_ms": 3351.3734340667725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:14] (step=0009843) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.19127477652545666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9844, "loss": 0.2276088297367096, "memory_gb": 7.721559524536133, "step_time_ms": 3351.628303527832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:18] (step=0009844) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.19129420909444228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9845, "loss": 0.2738455533981323, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5592250823975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:21] (step=0009845) Train Loss: 0.2722, Train Steps/Sec: 0.28, Epoch: 0.1913136416634279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9846, "loss": 0.2741996645927429, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1198711395264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:25] (step=0009846) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.19133307423241352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9847, "loss": 0.1639806181192398, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7025451660156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:28] (step=0009847) Train Loss: 0.2105, Train Steps/Sec: 0.28, Epoch: 0.19135250680139915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 09:52:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9848, "loss": 0.18754622340202332, "memory_gb": 7.721559524536133, "step_time_ms": 3356.204032897949, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:32] (step=0009848) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.19137193937038477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9849, "loss": 0.24567025899887085, "memory_gb": 7.721559524536133, "step_time_ms": 3356.187343597412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:35] (step=0009849) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.1913913719393704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9850, "loss": 0.23525334894657135, "memory_gb": 7.721559524536133, "step_time_ms": 3339.3759727478027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:39] (step=0009850) Train Loss: 0.2889, Train Steps/Sec: 0.28, Epoch: 0.19141080450835601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9851, "loss": 0.28724610805511475, "memory_gb": 7.721559524536133, "step_time_ms": 3350.522518157959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:43] (step=0009851) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.19143023707734164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9852, "loss": 0.25152263045310974, "memory_gb": 7.721559524536133, "step_time_ms": 3359.497308731079, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:46] (step=0009852) Train Loss: 0.2767, Train Steps/Sec: 0.27, Epoch: 0.19144966964632723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9853, "loss": 0.18756727874279022, "memory_gb": 7.715639114379883, "step_time_ms": 3314.3959045410156, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:50] (step=0009853) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.19146910221531285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9854, "loss": 0.3166569471359253, "memory_gb": 7.721559524536133, "step_time_ms": 3342.862844467163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:53] (step=0009854) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.19148853478429848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:52:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9855, "loss": 0.25428199768066406, "memory_gb": 7.721559524536133, "step_time_ms": 3357.428550720215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:52:57] (step=0009855) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.1915079673532841, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9856, "loss": 0.23206783831119537, "memory_gb": 7.721559524536133, "step_time_ms": 3352.12779045105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:01] (step=0009856) Train Loss: 0.1888, Train Steps/Sec: 0.28, Epoch: 0.19152739992226972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9857, "loss": 0.3629748821258545, "memory_gb": 7.721559524536133, "step_time_ms": 3357.454538345337, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:04] (step=0009857) Train Loss: 0.2881, Train Steps/Sec: 0.28, Epoch: 0.19154683249125534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9858, "loss": 0.23490330576896667, "memory_gb": 7.721559524536133, "step_time_ms": 3348.7236499786377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:08] (step=0009858) Train Loss: 0.2604, Train Steps/Sec: 0.28, Epoch: 0.19156626506024096, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 09:53:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9859, "loss": 0.1971888542175293, "memory_gb": 7.721559524536133, "step_time_ms": 3353.80220413208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:11] (step=0009859) Train Loss: 0.1867, Train Steps/Sec: 0.28, Epoch: 0.1915856976292266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9860, "loss": 0.25911766290664673, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2417030334473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:15] (step=0009860) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.1916051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9861, "loss": 0.1623300015926361, "memory_gb": 7.721559524536133, "step_time_ms": 3352.097749710083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:19] (step=0009861) Train Loss: 0.2389, Train Steps/Sec: 0.28, Epoch: 0.19162456276719783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9862, "loss": 0.15465541183948517, "memory_gb": 7.721559524536133, "step_time_ms": 3354.964017868042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:22] (step=0009862) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.19164399533618345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9863, "loss": 0.2666594386100769, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6222705841064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:26] (step=0009863) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.19166342790516908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9864, "loss": 0.2564069330692291, "memory_gb": 7.721559524536133, "step_time_ms": 
3354.762077331543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:29] (step=0009864) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.19168286047415467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9865, "loss": 0.18922242522239685, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0383529663086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:33] (step=0009865) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.1917022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9866, "loss": 0.22752907872200012, "memory_gb": 7.721559524536133, "step_time_ms": 3354.536771774292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:36] (step=0009866) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.19172172561212592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9867, "loss": 0.30067455768585205, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2626628875732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:40] (step=0009867) Train Loss: 0.2738, Train Steps/Sec: 0.28, Epoch: 0.19174115818111154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9868, "loss": 0.2205260545015335, "memory_gb": 7.721559524536133, "step_time_ms": 3353.031635284424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:44] (step=0009868) Train Loss: 0.1713, Train Steps/Sec: 0.28, Epoch: 0.19176059075009716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9869, "loss": 0.21987077593803406, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2122116088867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:47] (step=0009869) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.19178002331908278, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9870, "loss": 0.20250287652015686, "memory_gb": 7.721559524536133, "step_time_ms": 3352.68497467041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:51] (step=0009870) Train Loss: 0.2221, Train Steps/Sec: 0.28, Epoch: 0.1917994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9871, "loss": 0.24483558535575867, "memory_gb": 7.721559524536133, "step_time_ms": 3346.6403484344482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:54] (step=0009871) Train Loss: 0.2561, Train Steps/Sec: 0.28, Epoch: 0.19181888845705403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:53:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9872, "loss": 0.18831831216812134, "memory_gb": 7.721559524536133, "step_time_ms": 3498.6371994018555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:53:58] (step=0009872) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.19183832102603965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9873, "loss": 0.28672364354133606, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8744144439697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:54:01] (step=0009873) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.19185775359502527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:54:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9874, "loss": 0.23279176652431488, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3387603759766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 09:54:05] (step=0009874) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.1918771861640109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 09:54:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9875, "loss": 0.22681380808353424, "memory_gb": 
7.721559524536133, "step_time_ms": 3358.8151931762695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:09] (step=0009875) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.19189661873299652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9876, "loss": 0.30016210675239563, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3199768066406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:12] (step=0009876) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.1919160513019821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9877, "loss": 0.20879563689231873, "memory_gb": 7.721559524536133, "step_time_ms": 3359.516143798828, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:16] (step=0009877) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.19193548387096773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9878, "loss": 0.2667137384414673, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6768169403076, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:19] (step=0009878) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.19195491643995335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9879, "loss": 0.23639340698719025, "memory_gb": 7.721559524536133, "step_time_ms": 3359.287738800049, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:23] (step=0009879) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.19197434900893898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9880, "loss": 0.2804144620895386, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3058586120605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:26] (step=0009880) Train Loss: 0.2946, Train Steps/Sec: 0.28, Epoch: 0.1919937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9881, "loss": 0.19415700435638428, "memory_gb": 7.721559524536133, "step_time_ms": 3358.168125152588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:30] (step=0009881) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.19201321414691022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9882, "loss": 0.3112979531288147, "memory_gb": 7.721559524536133, "step_time_ms": 3354.853630065918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:34] (step=0009882) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.19203264671589584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9883, "loss": 0.24219168722629547, "memory_gb": 7.721559524536133, "step_time_ms": 3348.6270904541016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:37] (step=0009883) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.19205207928488147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 9884, "loss": 0.20000457763671875, "memory_gb": 7.721559524536133, "step_time_ms": 3361.896276473999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:41] (step=0009884) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.1920715118538671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9885, "loss": 0.19056479632854462, "memory_gb": 7.721559524536133, "step_time_ms": 3358.60538482666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:44] (step=0009885) Train Loss: 0.2002, Train Steps/Sec: 0.28, Epoch: 0.1920909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9886, "loss": 0.18481045961380005, "memory_gb": 7.721559524536133, "step_time_ms": 3359.581232070923, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:48] (step=0009886) Train Loss: 0.1868, Train Steps/Sec: 0.28, Epoch: 0.19211037699183833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9887, "loss": 0.2669305205345154, "memory_gb": 7.721559524536133, "step_time_ms": 3360.457420349121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:52] (step=0009887) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.19212980956082393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9888, "loss": 0.21588531136512756, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6930503845215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:55] (step=0009888) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.19214924212980955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:54:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 9889, "loss": 0.2279176115989685, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4439544677734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:54:59] (step=0009889) Train Loss: 0.2769, Train Steps/Sec: 0.28, Epoch: 0.19216867469879517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9890, "loss": 0.16384801268577576, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0969104766846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:02] (step=0009890) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.1921881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9891, "loss": 0.23804518580436707, "memory_gb": 7.721559524536133, "step_time_ms": 3352.341651916504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:06] (step=0009891) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.19220753983676642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9892, "loss": 0.20999592542648315, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0399494171143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:09] (step=0009892) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.19222697240575204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9893, "loss": 0.23371927440166473, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1464080810547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:13] (step=0009893) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.19224640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9894, "loss": 0.2897399365901947, "memory_gb": 7.721559524536133, "step_time_ms": 3362.623453140259, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:17] (step=0009894) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.19226583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9895, "loss": 0.22614286839962006, "memory_gb": 7.721559524536133, "step_time_ms": 3346.8375205993652, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:20] (step=0009895) Train Loss: 0.2215, Train Steps/Sec: 0.28, Epoch: 0.1922852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9896, "loss": 0.1459749937057495, "memory_gb": 7.721559524536133, "step_time_ms": 3358.795404434204, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:24] (step=0009896) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.19230470268169453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9897, "loss": 0.16582079231739044, "memory_gb": 7.721559524536133, "step_time_ms": 3362.16402053833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:27] (step=0009897) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.19232413525068015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9898, "loss": 0.34611448645591736, "memory_gb": 7.721559524536133, "step_time_ms": 3360.334634780884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:31] (step=0009898) Train Loss: 0.2677, Train Steps/Sec: 0.28, Epoch: 0.19234356781966577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9899, "loss": 0.2311374694108963, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5660877227783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:35] (step=0009899) Train Loss: 0.2093, Train Steps/Sec: 0.27, Epoch: 0.19236300038865137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9900, "loss": 0.3275495767593384, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5990200042725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:38] (step=0009900) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9901, "loss": 0.25909775495529175, "memory_gb": 7.721559524536133, "step_time_ms": 3361.966371536255, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:42] (step=0009901) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.1924018655266226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9902, "loss": 0.20449841022491455, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2543392181396, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:45] (step=0009902) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.19242129809560823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9903, "loss": 0.23845797777175903, "memory_gb": 7.721559524536133, "step_time_ms": 3361.786365509033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:49] (step=0009903) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.19244073066459386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9904, "loss": 0.21403345465660095, "memory_gb": 7.721559524536133, "step_time_ms": 3352.07462310791, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:53] (step=0009904) Train Loss: 0.1850, Train Steps/Sec: 0.28, Epoch: 0.19246016323357948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:55:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9905, "loss": 0.2109011858701706, "memory_gb": 7.721559524536133, "step_time_ms": 3365.175724029541, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:55:56] (step=0009905) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.1924795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9906, "loss": 0.22608056664466858, "memory_gb": 7.721559524536133, "step_time_ms": 3358.743906021118, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:00] (step=0009906) Train Loss: 0.1839, Train Steps/Sec: 0.28, Epoch: 0.19249902837155072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9907, "loss": 0.2585470974445343, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3453845977783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:03] (step=0009907) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.19251846094053635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9908, "loss": 0.3409935235977173, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0823669433594, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:07] (step=0009908) Train Loss: 0.3001, Train Steps/Sec: 0.28, Epoch: 0.19253789350952197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9909, "loss": 0.2347540259361267, "memory_gb": 7.721559524536133, "step_time_ms": 3353.543996810913, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:11] (step=0009909) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.1925573260785076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9910, "loss": 0.185689315199852, "memory_gb": 7.721559524536133, "step_time_ms": 3362.165689468384, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:14] (step=0009910) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.19257675864749318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9911, "loss": 0.24336250126361847, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8453273773193, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:18] (step=0009911) Train Loss: 0.2954, Train Steps/Sec: 0.28, Epoch: 0.1925961912164788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9912, "loss": 0.2206793874502182, "memory_gb": 7.721559524536133, "step_time_ms": 3344.0988063812256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:21] (step=0009912) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.19261562378546443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9913, "loss": 0.2310323715209961, "memory_gb": 7.721559524536133, "step_time_ms": 3500.8468627929688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:25] (step=0009913) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.19263505635445005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9914, "loss": 0.20885753631591797, "memory_gb": 7.721559524536133, "step_time_ms": 3359.370231628418, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:28] (step=0009914) Train Loss: 0.1962, Train Steps/Sec: 0.28, Epoch: 0.19265448892343567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9915, "loss": 0.20643509924411774, "memory_gb": 7.721559524536133, "step_time_ms": 3363.095760345459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:32] (step=0009915) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.1926739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9916, "loss": 0.1795252561569214, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5266132354736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:36] (step=0009916) Train Loss: 0.1698, Train Steps/Sec: 0.28, Epoch: 0.19269335406140692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9917, "loss": 0.3407228887081146, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6554737091064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:39] (step=0009917) Train Loss: 0.3093, Train Steps/Sec: 0.28, Epoch: 0.19271278663039254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9918, "loss": 0.2684639096260071, "memory_gb": 7.721559524536133, "step_time_ms": 3365.993022918701, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:43] (step=0009918) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.19273221919937816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9919, "loss": 0.21781113743782043, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1224212646484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:46] (step=0009919) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.19275165176836379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9920, "loss": 0.2452593743801117, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6339435577393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:50] (step=0009920) Train Loss: 0.2889, Train Steps/Sec: 0.28, Epoch: 0.1927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9921, "loss": 0.2614952325820923, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8953437805176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:54] (step=0009921) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.19279051690633503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:56:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9922, "loss": 0.24192434549331665, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6197834014893, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:56:57] (step=0009922) Train Loss: 0.2447, Train Steps/Sec: 0.28, Epoch: 0.19280994947532062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 9923, "loss": 0.18337473273277283, "memory_gb": 7.721559524536133, "step_time_ms": 3361.466884613037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:01] (step=0009923) Train Loss: 0.2063, Train Steps/Sec: 0.28, Epoch: 0.19282938204430625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9924, "loss": 0.18806254863739014, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4352684020996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:04] (step=0009924) Train Loss: 0.1677, Train Steps/Sec: 0.28, Epoch: 0.19284881461329187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9925, "loss": 0.20371001958847046, "memory_gb": 7.721559524536133, "step_time_ms": 3363.718032836914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:08] (step=0009925) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.1928682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9926, "loss": 0.20269833505153656, "memory_gb": 7.721559524536133, "step_time_ms": 3361.42897605896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:12] (step=0009926) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.1928876797512631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9927, "loss": 0.22307296097278595, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4371128082275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:15] (step=0009927) Train Loss: 0.2043, Train Steps/Sec: 0.28, Epoch: 0.19290711232024874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9928, "loss": 0.10509443283081055, "memory_gb": 7.721559524536133, "step_time_ms": 3364.915370941162, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:19] (step=0009928) Train Loss: 0.1907, Train Steps/Sec: 0.28, Epoch: 0.19292654488923436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9929, "loss": 0.2106483429670334, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7141036987305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:22] (step=0009929) Train Loss: 0.2588, Train Steps/Sec: 0.28, Epoch: 0.19294597745821998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9930, "loss": 0.15153557062149048, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2075157165527, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:26] (step=0009930) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.1929654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9931, "loss": 0.24876099824905396, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6860427856445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:30] (step=0009931) Train Loss: 0.2857, Train Steps/Sec: 0.28, Epoch: 0.19298484259619123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9932, "loss": 0.31645795702934265, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2020740509033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:33] (step=0009932) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.19300427516517685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 9933, "loss": 0.22616595029830933, "memory_gb": 7.721559524536133, "step_time_ms": 3358.279228210449, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:37] (step=0009933) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.19302370773416247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9934, "loss": 0.1885688751935959, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9165210723877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:40] (step=0009934) Train Loss: 0.1852, Train Steps/Sec: 0.28, Epoch: 0.19304314030314806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 9935, "loss": 0.2319609522819519, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2283000946045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:44] (step=0009935) Train Loss: 0.1902, Train Steps/Sec: 0.28, Epoch: 0.1930625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 9936, "loss": 0.2125777006149292, "memory_gb": 7.721559524536133, "step_time_ms": 3360.737085342407, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:48] (step=0009936) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.1930820054411193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 9937, "loss": 0.2803259491920471, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4257106781006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:51] (step=0009937) Train Loss: 0.2032, Train Steps/Sec: 0.28, Epoch: 0.19310143801010493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 9938, "loss": 0.1746264398097992, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1184616088867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:55] (step=0009938) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.19312087057909055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:57:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9939, "loss": 0.1543866991996765, "memory_gb": 7.721559524536133, "step_time_ms": 3355.752944946289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:57:58] (step=0009939) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.19314030314807618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 9940, "loss": 0.1799365133047104, "memory_gb": 7.721559524536133, "step_time_ms": 3354.940414428711, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:02] (step=0009940) Train Loss: 0.2766, Train Steps/Sec: 0.27, Epoch: 0.1931597357170618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 9941, "loss": 0.254632830619812, "memory_gb": 7.721559524536133, "step_time_ms": 3345.973253250122, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:06] (step=0009941) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.19317916828604742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 9942, "loss": 0.29988893866539, "memory_gb": 7.721559524536133, "step_time_ms": 3352.1881103515625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:09] (step=0009942) Train Loss: 0.2901, Train Steps/Sec: 0.28, Epoch: 0.19319860085503304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 9943, "loss": 0.15252777934074402, "memory_gb": 7.721559524536133, "step_time_ms": 3357.492685317993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:13] (step=0009943) Train Loss: 0.1997, Train Steps/Sec: 0.28, Epoch: 0.19321803342401866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9944, "loss": 0.30562740564346313, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3010692596436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:16] (step=0009944) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.1932374659930043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 9945, "loss": 0.2510138750076294, "memory_gb": 7.721559524536133, "step_time_ms": 3352.558135986328, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:20] (step=0009945) Train Loss: 0.3299, Train Steps/Sec: 0.28, Epoch: 0.19325689856198988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 9946, "loss": 0.22316806018352509, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5306663513184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:24] (step=0009946) Train Loss: 0.2455, Train Steps/Sec: 0.28, Epoch: 0.1932763311309755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 9947, "loss": 0.24708758294582367, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3362617492676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:27] (step=0009947) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.19329576369996113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 9948, "loss": 0.20558525621891022, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8692512512207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:31] (step=0009948) Train Loss: 0.1868, Train Steps/Sec: 0.28, Epoch: 0.19331519626894675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 9949, "loss": 0.17571790516376495, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8403511047363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:34] (step=0009949) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.19333462883793237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 9950, "loss": 0.21021413803100586, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2562770843506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:38] (step=0009950) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.193354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 9951, "loss": 0.24732355773448944, "memory_gb": 7.721559524536133, "step_time_ms": 3357.949733734131, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:42] (step=0009951) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.19337349397590362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 9952, "loss": 0.10892637819051743, "memory_gb": 7.721559524536133, "step_time_ms": 3357.187271118164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:45] (step=0009952) Train Loss: 0.1887, Train Steps/Sec: 0.28, Epoch: 0.19339292654488924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 9953, "loss": 0.3116023540496826, "memory_gb": 7.721559524536133, "step_time_ms": 3355.104684829712, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:49] (step=0009953) Train Loss: 0.2320, Train Steps/Sec: 0.28, Epoch: 0.19341235911387486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 9954, "loss": 0.30147257447242737, "memory_gb": 7.721559524536133, "step_time_ms": 3362.860918045044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:52] (step=0009954) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.19343179168286048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:58:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 9955, "loss": 0.3326413929462433, "memory_gb": 7.721559524536133, "step_time_ms": 3359.346389770508, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:58:56] (step=0009955) Train Loss: 0.3254, Train Steps/Sec: 0.28, Epoch: 0.1934512242518461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9956, "loss": 0.2345620095729828, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7982139587402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:00] (step=0009956) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.19347065682083173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 9957, "loss": 0.15408165752887726, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2324771881104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:03] (step=0009957) Train Loss: 0.1776, Train Steps/Sec: 0.28, Epoch: 0.19349008938981732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 9958, "loss": 0.2619459629058838, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1833114624023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:07] (step=0009958) Train Loss: 0.2910, Train Steps/Sec: 0.28, Epoch: 0.19350952195880294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 9959, "loss": 0.1939058005809784, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7327518463135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:10] (step=0009959) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.19352895452778857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 9960, "loss": 0.23669949173927307, "memory_gb": 7.721559524536133, "step_time_ms": 3502.898931503296, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:14] (step=0009960) Train Loss: 0.2215, Train Steps/Sec: 0.28, Epoch: 0.1935483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 9961, "loss": 0.26500171422958374, "memory_gb": 7.721559524536133, "step_time_ms": 3353.809595108032, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:17] (step=0009961) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.1935678196657598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 9962, "loss": 0.2371228188276291, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2873344421387, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:21] (step=0009962) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.19358725223474543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9963, "loss": 0.2039950042963028, "memory_gb": 7.721559524536133, "step_time_ms": 3360.818862915039, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:25] (step=0009963) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.19360668480373106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 9964, "loss": 0.1579345017671585, "memory_gb": 7.721559524536133, "step_time_ms": 3361.143112182617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:28] (step=0009964) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.19362611737271668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 9965, "loss": 0.26129841804504395, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7333431243896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:32] (step=0009965) Train Loss: 0.2430, Train Steps/Sec: 0.28, Epoch: 0.1936455499417023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 9966, "loss": 0.18969105184078217, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7276935577393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:35] (step=0009966) Train Loss: 0.1921, Train Steps/Sec: 0.28, Epoch: 0.19366498251068792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 9967, "loss": 0.2975684404373169, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5216064453125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:39] (step=0009967) Train Loss: 0.2645, Train Steps/Sec: 0.28, Epoch: 0.19368441507967354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9968, "loss": 0.19838698208332062, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5517406463623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:43] (step=0009968) Train Loss: 0.1741, Train Steps/Sec: 0.28, Epoch: 0.19370384764865917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 9969, "loss": 0.2520240545272827, "memory_gb": 7.721559524536133, "step_time_ms": 3359.398603439331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:46] (step=0009969) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.19372328021764476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9970, "loss": 0.1706792712211609, "memory_gb": 7.721559524536133, "step_time_ms": 3356.818437576294, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:50] (step=0009970) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.19374271278663038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 9971, "loss": 0.2930787205696106, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4000339508057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:53] (step=0009971) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.193762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 09:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 9972, "loss": 0.1570213884115219, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8693141937256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 09:59:57] (step=0009972) Train Loss: 0.1717, Train Steps/Sec: 0.28, Epoch: 0.19378157792460163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 9973, "loss": 0.26789745688438416, "memory_gb": 7.721559524536133, "step_time_ms": 3358.872175216675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:00] (step=0009973) Train Loss: 0.2656, Train Steps/Sec: 0.28, Epoch: 0.19380101049358725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 9974, "loss": 0.21129962801933289, "memory_gb": 7.721559524536133, "step_time_ms": 3359.466791152954, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:04] (step=0009974) Train Loss: 0.1851, Train Steps/Sec: 0.28, Epoch: 0.19382044306257287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9975, "loss": 0.22399184107780457, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0238819122314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:08] (step=0009975) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.1938398756315585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 9976, "loss": 0.16443538665771484, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8945350646973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:11] (step=0009976) Train Loss: 0.1898, Train Steps/Sec: 0.28, Epoch: 0.19385930820054412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 9977, "loss": 0.19279566407203674, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5194606781006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:15] (step=0009977) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.19387874076952974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 9978, "loss": 0.1696285754442215, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2278957366943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:18] (step=0009978) Train Loss: 0.1506, Train Steps/Sec: 0.28, Epoch: 0.19389817333851536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 9979, "loss": 0.2187008112668991, "memory_gb": 7.721559524536133, "step_time_ms": 3358.41703414917, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:22] (step=0009979) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.19391760590750098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 9980, "loss": 0.3090032637119293, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6514205932617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:25] (step=0009980) Train Loss: 0.2746, Train Steps/Sec: 0.28, Epoch: 0.19393703847648658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 9981, "loss": 0.16854670643806458, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7473678588867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:29] (step=0009981) Train Loss: 0.2603, Train Steps/Sec: 0.28, Epoch: 0.1939564710454722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9982, "loss": 0.17977383732795715, "memory_gb": 7.721559524536133, "step_time_ms": 3361.147403717041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:33] (step=0009982) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.19397590361445782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 9983, "loss": 0.28051549196243286, "memory_gb": 7.721559524536133, "step_time_ms": 3361.623525619507, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:36] (step=0009983) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.19399533618344345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 9984, "loss": 0.2453317642211914, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4206829071045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:40] (step=0009984) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.19401476875242907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 9985, "loss": 0.18041324615478516, "memory_gb": 7.721559524536133, "step_time_ms": 3362.415313720703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:43] (step=0009985) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.1940342013214147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 9986, "loss": 0.24750100076198578, "memory_gb": 7.721559524536133, "step_time_ms": 3347.7017879486084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:47] (step=0009986) Train Loss: 0.2785, Train Steps/Sec: 0.28, Epoch: 0.1940536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 9987, "loss": 0.2616696357727051, "memory_gb": 7.721559524536133, "step_time_ms": 3363.323926925659, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:50] (step=0009987) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.19407306645938593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 9988, "loss": 0.14136779308319092, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5381717681885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:54] (step=0009988) Train Loss: 0.2095, Train Steps/Sec: 0.27, Epoch: 0.19409249902837156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:00:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 9989, "loss": 0.18475109338760376, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5362434387207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:00:58] (step=0009989) Train Loss: 0.2123, Train Steps/Sec: 0.28, Epoch: 0.19411193159735718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:01:01] EFFICIENCY_METRICS:
{"epoch": 0, "step": 9990, "loss": 0.222667396068573, "memory_gb": 7.721559524536133, "step_time_ms": 3357.84649848938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:01] (step=0009990) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.1941313641663428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 9991, "loss": 0.24639134109020233, "memory_gb": 7.721559524536133, "step_time_ms": 3361.830234527588, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:05] (step=0009991) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.19415079673532842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 9992, "loss": 0.2760266363620758, "memory_gb": 7.721559524536133, "step_time_ms": 3358.377695083618, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:08] (step=0009992) Train Loss: 0.2803, Train Steps/Sec: 0.28, Epoch: 0.19417022930431402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 9993, "loss": 0.3042505979537964, "memory_gb": 7.721559524536133, "step_time_ms": 3348.3617305755615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:12] (step=0009993) Train Loss: 0.2456, Train Steps/Sec: 0.28, Epoch: 0.19418966187329964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 9994, "loss": 0.30467575788497925, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6253604888916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:16] (step=0009994) Train Loss: 0.2700, Train Steps/Sec: 0.28, Epoch: 0.19420909444228526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 9995, "loss": 0.26770898699760437, "memory_gb": 7.721559524536133, "step_time_ms": 3344.7694778442383, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 10:01:19] (step=0009995) Train Loss: 0.3087, Train Steps/Sec: 0.29, Epoch: 0.19422852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 9996, "loss": 0.1806187629699707, "memory_gb": 7.721559524536133, "step_time_ms": 3362.281322479248, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:23] (step=0009996) Train Loss: 0.1748, Train Steps/Sec: 0.28, Epoch: 0.1942479595802565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 9997, "loss": 0.24987982213497162, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3217811584473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:26] (step=0009997) Train Loss: 0.3093, Train Steps/Sec: 0.28, Epoch: 0.19426739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 9998, "loss": 0.24387997388839722, "memory_gb": 7.721559524536133, "step_time_ms": 3364.590644836426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:30] (step=0009998) Train Loss: 0.2524, Train Steps/Sec: 0.28, Epoch: 0.19428682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 9999, "loss": 0.21986258029937744, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6840534210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:33] (step=0009999) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.19430625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10000, "loss": 0.24249187111854553, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2540187835693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:37] (step=0010000) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.194325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:37] 
Saved checkpoint to /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0010000/ [2025-07-29 10:01:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10001, "loss": 0.26976844668388367, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6033115386963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:41] (step=0010001) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.19434512242518462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10002, "loss": 0.2581811249256134, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8383922576904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:44] (step=0010002) Train Loss: 0.2555, Train Steps/Sec: 0.28, Epoch: 0.19436455499417024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10003, "loss": 0.2059197574853897, "memory_gb": 7.721559524536133, "step_time_ms": 3510.25652885437, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:48] (step=0010003) Train Loss: 0.1674, Train Steps/Sec: 0.28, Epoch: 0.19438398756315584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10004, "loss": 0.2111295461654663, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0847930908203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:51] (step=0010004) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.19440342013214146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10005, "loss": 0.2723211646080017, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7333126068115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:55] (step=0010005) Train Loss: 0.2785, Train Steps/Sec: 0.28, Epoch: 0.19442285270112708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:01:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10006, 
"loss": 0.3478652834892273, "memory_gb": 7.715639114379883, "step_time_ms": 3234.3668937683105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:01:58] (step=0010006) Train Loss: 0.3576, Train Steps/Sec: 0.29, Epoch: 0.1944422852701127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10007, "loss": 0.21059760451316833, "memory_gb": 7.721559524536133, "step_time_ms": 3367.161989212036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:02] (step=0010007) Train Loss: 0.1740, Train Steps/Sec: 0.28, Epoch: 0.19446171783909832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10008, "loss": 0.23891940712928772, "memory_gb": 7.721559524536133, "step_time_ms": 3364.121675491333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:06] (step=0010008) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.19448115040808395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10009, "loss": 0.26081138849258423, "memory_gb": 7.721559524536133, "step_time_ms": 3362.389087677002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:09] (step=0010009) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.19450058297706957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10010, "loss": 0.21969816088676453, "memory_gb": 7.721559524536133, "step_time_ms": 3359.980583190918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:13] (step=0010010) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.1945200155460552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10011, "loss": 0.34232693910598755, "memory_gb": 7.721559524536133, "step_time_ms": 3372.058391571045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:16] 
(step=0010011) Train Loss: 0.2980, Train Steps/Sec: 0.28, Epoch: 0.19453944811504081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10012, "loss": 0.3130628764629364, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6960468292236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:20] (step=0010012) Train Loss: 0.2845, Train Steps/Sec: 0.28, Epoch: 0.19455888068402644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10013, "loss": 0.21733228862285614, "memory_gb": 7.721559524536133, "step_time_ms": 3367.314577102661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:23] (step=0010013) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.19457831325301206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10014, "loss": 0.23006589710712433, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8291778564453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:27] (step=0010014) Train Loss: 0.1910, Train Steps/Sec: 0.28, Epoch: 0.19459774582199768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10015, "loss": 0.13574175536632538, "memory_gb": 7.721559524536133, "step_time_ms": 3368.380546569824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:31] (step=0010015) Train Loss: 0.1477, Train Steps/Sec: 0.28, Epoch: 0.19461717839098328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10016, "loss": 0.294601708650589, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3650493621826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:34] (step=0010016) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.1946366109599689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:38] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 10017, "loss": 0.2030625343322754, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4797592163086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:38] (step=0010017) Train Loss: 0.2922, Train Steps/Sec: 0.28, Epoch: 0.19465604352895452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10018, "loss": 0.2859979271888733, "memory_gb": 7.721559524536133, "step_time_ms": 3372.7545738220215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:41] (step=0010018) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.19467547609794014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10019, "loss": 0.2037055790424347, "memory_gb": 7.721559524536133, "step_time_ms": 3372.286319732666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:45] (step=0010019) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.19469490866692576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10020, "loss": 0.20943282544612885, "memory_gb": 7.721559524536133, "step_time_ms": 3365.952730178833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:49] (step=0010020) Train Loss: 0.1514, Train Steps/Sec: 0.28, Epoch: 0.1947143412359114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10021, "loss": 0.2898332476615906, "memory_gb": 7.721559524536133, "step_time_ms": 3370.967149734497, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:52] (step=0010021) Train Loss: 0.3414, Train Steps/Sec: 0.28, Epoch: 0.194733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10022, "loss": 0.20345494151115417, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7637577056885, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 10:02:56] (step=0010022) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.19475320637388263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:02:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10023, "loss": 0.20101331174373627, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8793907165527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:02:59] (step=0010023) Train Loss: 0.2028, Train Steps/Sec: 0.28, Epoch: 0.19477263894286825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10024, "loss": 0.28362026810646057, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2045936584473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:03] (step=0010024) Train Loss: 0.2784, Train Steps/Sec: 0.28, Epoch: 0.19479207151185388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10025, "loss": 0.1671750843524933, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1977519989014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:07] (step=0010025) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.1948115040808395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10026, "loss": 0.23511061072349548, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9542541503906, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:10] (step=0010026) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.19483093664982512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10027, "loss": 0.3017413914203644, "memory_gb": 7.721559524536133, "step_time_ms": 3369.255781173706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:14] (step=0010027) Train Loss: 0.2843, Train Steps/Sec: 0.28, Epoch: 0.19485036921881072, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 10:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10028, "loss": 0.1466793417930603, "memory_gb": 7.721559524536133, "step_time_ms": 3366.957664489746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:17] (step=0010028) Train Loss: 0.1956, Train Steps/Sec: 0.28, Epoch: 0.19486980178779634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10029, "loss": 0.3298962414264679, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2168254852295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:21] (step=0010029) Train Loss: 0.2933, Train Steps/Sec: 0.27, Epoch: 0.19488923435678196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10030, "loss": 0.27581238746643066, "memory_gb": 7.715639114379883, "step_time_ms": 3335.7667922973633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:25] (step=0010030) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.19490866692576758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10031, "loss": 0.19739466905593872, "memory_gb": 7.721559524536133, "step_time_ms": 3365.708589553833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:28] (step=0010031) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.1949280994947532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10032, "loss": 0.2115282565355301, "memory_gb": 7.721559524536133, "step_time_ms": 3368.81685256958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:32] (step=0010032) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.19494753206373883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10033, "loss": 0.242367222905159, "memory_gb": 7.715639114379883, "step_time_ms": 
3331.718683242798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:36] (step=0010033) Train Loss: 0.2616, Train Steps/Sec: 0.28, Epoch: 0.19496696463272445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10034, "loss": 0.2215200960636139, "memory_gb": 7.721559524536133, "step_time_ms": 3360.650062561035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:39] (step=0010034) Train Loss: 0.2096, Train Steps/Sec: 0.28, Epoch: 0.19498639720171007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10035, "loss": 0.2587375044822693, "memory_gb": 7.721559524536133, "step_time_ms": 3360.421895980835, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:43] (step=0010035) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.1950058297706957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10036, "loss": 0.22290128469467163, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6456260681152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:46] (step=0010036) Train Loss: 0.2749, Train Steps/Sec: 0.28, Epoch: 0.19502526233968132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10037, "loss": 0.19090071320533752, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8429431915283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:50] (step=0010037) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.19504469490866694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10038, "loss": 0.2831110954284668, "memory_gb": 7.721559524536133, "step_time_ms": 3360.501766204834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:54] (step=0010038) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 
0.19506412747765253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:03:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10039, "loss": 0.20676793158054352, "memory_gb": 7.721559524536133, "step_time_ms": 3350.710153579712, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:03:57] (step=0010039) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.19508356004663815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10040, "loss": 0.3215916156768799, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3668003082275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:01] (step=0010040) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.19510299261562378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10041, "loss": 0.23929822444915771, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6929988861084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:04] (step=0010041) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.1951224251846094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10042, "loss": 0.17487798631191254, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9184284210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:08] (step=0010042) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.19514185775359502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10043, "loss": 0.28498294949531555, "memory_gb": 7.721559524536133, "step_time_ms": 3499.239206314087, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:12] (step=0010043) Train Loss: 0.2813, Train Steps/Sec: 0.28, Epoch: 0.19516129032258064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10044, "loss": 0.23445045948028564, 
"memory_gb": 7.721559524536133, "step_time_ms": 3361.4039421081543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:15] (step=0010044) Train Loss: 0.2326, Train Steps/Sec: 0.28, Epoch: 0.19518072289156627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10045, "loss": 0.3371109366416931, "memory_gb": 7.721559524536133, "step_time_ms": 3359.602212905884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:19] (step=0010045) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.1952001554605519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10046, "loss": 0.18586191534996033, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9157638549805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:22] (step=0010046) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.1952195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10047, "loss": 0.2907786965370178, "memory_gb": 7.721559524536133, "step_time_ms": 3357.877731323242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:26] (step=0010047) Train Loss: 0.2034, Train Steps/Sec: 0.28, Epoch: 0.19523902059852313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10048, "loss": 0.14126412570476532, "memory_gb": 7.721559524536133, "step_time_ms": 3361.581802368164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:30] (step=0010048) Train Loss: 0.1481, Train Steps/Sec: 0.28, Epoch: 0.19525845316750876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10049, "loss": 0.32019931077957153, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9345989227295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:33] (step=0010049) Train Loss: 0.3026, 
Train Steps/Sec: 0.28, Epoch: 0.19527788573649438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10050, "loss": 0.26702043414115906, "memory_gb": 7.721559524536133, "step_time_ms": 3359.973907470703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:37] (step=0010050) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.19529731830547997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10051, "loss": 0.1365613043308258, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8717918395996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:40] (step=0010051) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.1953167508744656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10052, "loss": 0.20037227869033813, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7659435272217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:44] (step=0010052) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.19533618344345122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10053, "loss": 0.26841527223587036, "memory_gb": 7.721559524536133, "step_time_ms": 3361.617088317871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:47] (step=0010053) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.19535561601243684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10054, "loss": 0.21893537044525146, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8474521636963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:51] (step=0010054) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.19537504858142246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10055, 
"loss": 0.23209533095359802, "memory_gb": 7.721559524536133, "step_time_ms": 3360.426902770996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:55] (step=0010055) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.19539448115040808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:04:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10056, "loss": 0.27836474776268005, "memory_gb": 7.721559524536133, "step_time_ms": 3357.822895050049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:04:58] (step=0010056) Train Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.1954139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10057, "loss": 0.20703673362731934, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3246307373047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:02] (step=0010057) Train Loss: 0.2153, Train Steps/Sec: 0.28, Epoch: 0.19543334628837933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10058, "loss": 0.2326878160238266, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8172664642334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:05] (step=0010058) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.19545277885736495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10059, "loss": 0.1862567514181137, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1483154296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:09] (step=0010059) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.19547221142635057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10060, "loss": 0.2816474139690399, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9892177581787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:13] 
(step=0010060) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.1954916439953362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10061, "loss": 0.1962660253047943, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7308654785156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:16] (step=0010061) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.1955110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10062, "loss": 0.1662997305393219, "memory_gb": 7.721559524536133, "step_time_ms": 3362.001657485962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:20] (step=0010062) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.1955305091333074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10063, "loss": 0.2639193832874298, "memory_gb": 7.715639114379883, "step_time_ms": 3322.9072093963623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:23] (step=0010063) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.19554994170229303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10064, "loss": 0.17380988597869873, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7676753997803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:27] (step=0010064) Train Loss: 0.2538, Train Steps/Sec: 0.28, Epoch: 0.19556937427127866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10065, "loss": 0.1730957329273224, "memory_gb": 7.721559524536133, "step_time_ms": 3346.5003967285156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:31] (step=0010065) Train Loss: 0.1555, Train Steps/Sec: 0.28, Epoch: 0.19558880684026428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:34] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 10066, "loss": 0.1603766232728958, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8578491210938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:34] (step=0010066) Train Loss: 0.1659, Train Steps/Sec: 0.28, Epoch: 0.1956082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10067, "loss": 0.3094235062599182, "memory_gb": 7.721559524536133, "step_time_ms": 3346.2677001953125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:38] (step=0010067) Train Loss: 0.2592, Train Steps/Sec: 0.28, Epoch: 0.19562767197823552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10068, "loss": 0.23025864362716675, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2463779449463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:41] (step=0010068) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.19564710454722115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10069, "loss": 0.27813994884490967, "memory_gb": 7.721559524536133, "step_time_ms": 3354.973793029785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:45] (step=0010069) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.19566653711620677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10070, "loss": 0.15390174090862274, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8672409057617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:48] (step=0010070) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.1956859696851924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10071, "loss": 0.235476553440094, "memory_gb": 7.721559524536133, "step_time_ms": 3355.027198791504, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 10:05:52] (step=0010071) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.195705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10072, "loss": 0.206819549202919, "memory_gb": 7.721559524536133, "step_time_ms": 3358.861207962036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:56] (step=0010072) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.19572483482316363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:05:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10073, "loss": 0.21602022647857666, "memory_gb": 7.721559524536133, "step_time_ms": 3362.091064453125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:05:59] (step=0010073) Train Loss: 0.2015, Train Steps/Sec: 0.28, Epoch: 0.19574426739214923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10074, "loss": 0.2008890062570572, "memory_gb": 7.721559524536133, "step_time_ms": 3354.834794998169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:03] (step=0010074) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.19576369996113485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10075, "loss": 0.3025485873222351, "memory_gb": 7.721559524536133, "step_time_ms": 3357.658624649048, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:06] (step=0010075) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.19578313253012047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10076, "loss": 0.24988383054733276, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9445610046387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:10] (step=0010076) Train Loss: 0.2020, Train Steps/Sec: 0.27, Epoch: 0.1958025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:14] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 10077, "loss": 0.20851950347423553, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4507961273193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:14] (step=0010077) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.19582199766809172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10078, "loss": 0.1825542002916336, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4766387939453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:17] (step=0010078) Train Loss: 0.2142, Train Steps/Sec: 0.28, Epoch: 0.19584143023707734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10079, "loss": 0.15725325047969818, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7959537506104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:21] (step=0010079) Train Loss: 0.1335, Train Steps/Sec: 0.28, Epoch: 0.19586086280606296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10080, "loss": 0.23831726610660553, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9725494384766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:24] (step=0010080) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.19588029537504859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10081, "loss": 0.293252170085907, "memory_gb": 7.721559524536133, "step_time_ms": 3358.982563018799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:28] (step=0010081) Train Loss: 0.2194, Train Steps/Sec: 0.28, Epoch: 0.1958997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10082, "loss": 0.20839859545230865, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2372665405273, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 10:06:31] (step=0010082) Train Loss: 0.2021, Train Steps/Sec: 0.28, Epoch: 0.19591916051301983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10083, "loss": 0.25443047285079956, "memory_gb": 7.721559524536133, "step_time_ms": 3352.729320526123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:35] (step=0010083) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.19593859308200545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10084, "loss": 0.2379266321659088, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3051223754883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:39] (step=0010084) Train Loss: 0.2456, Train Steps/Sec: 0.28, Epoch: 0.19595802565099107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10085, "loss": 0.23173610866069794, "memory_gb": 7.721559524536133, "step_time_ms": 3354.290723800659, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:42] (step=0010085) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.19597745821997667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10086, "loss": 0.3053015470504761, "memory_gb": 7.721559524536133, "step_time_ms": 3357.783317565918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:46] (step=0010086) Train Loss: 0.2223, Train Steps/Sec: 0.28, Epoch: 0.1959968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10087, "loss": 0.27639472484588623, "memory_gb": 7.721559524536133, "step_time_ms": 3356.360673904419, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:49] (step=0010087) Train Loss: 0.2189, Train Steps/Sec: 0.28, Epoch: 0.1960163233579479, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 10:06:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10088, "loss": 0.1869959831237793, "memory_gb": 7.721559524536133, "step_time_ms": 3358.022928237915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:53] (step=0010088) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.19603575592693354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:06:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10089, "loss": 0.22961443662643433, "memory_gb": 7.721559524536133, "step_time_ms": 3357.893705368042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:06:56] (step=0010089) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.19605518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10090, "loss": 0.19263410568237305, "memory_gb": 7.721559524536133, "step_time_ms": 3493.1955337524414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:00] (step=0010090) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.19607462106490478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10091, "loss": 0.22427091002464294, "memory_gb": 7.721559524536133, "step_time_ms": 3357.329845428467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:04] (step=0010091) Train Loss: 0.2049, Train Steps/Sec: 0.28, Epoch: 0.1960940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10092, "loss": 0.24416086077690125, "memory_gb": 7.721559524536133, "step_time_ms": 3356.908082962036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:07] (step=0010092) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.19611348620287603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10093, "loss": 0.18774402141571045, "memory_gb": 7.721559524536133, "step_time_ms": 
3341.033458709717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:11] (step=0010093) Train Loss: 0.2320, Train Steps/Sec: 0.28, Epoch: 0.19613291877186165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10094, "loss": 0.2728234529495239, "memory_gb": 7.721559524536133, "step_time_ms": 3354.010820388794, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:14] (step=0010094) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.19615235134084727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10095, "loss": 0.20200100541114807, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9046001434326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:18] (step=0010095) Train Loss: 0.1846, Train Steps/Sec: 0.28, Epoch: 0.1961717839098329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10096, "loss": 0.21364742517471313, "memory_gb": 7.721559524536133, "step_time_ms": 3358.039140701294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:21] (step=0010096) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.1961912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10097, "loss": 0.3269403576850891, "memory_gb": 7.721559524536133, "step_time_ms": 3356.994867324829, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:25] (step=0010097) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.1962106490478041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10098, "loss": 0.2509617805480957, "memory_gb": 7.721559524536133, "step_time_ms": 3361.459493637085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:29] (step=0010098) Train Loss: 0.2232, Train Steps/Sec: 0.28, Epoch: 0.19623008161678973, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10099, "loss": 0.22809571027755737, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7587184906006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:32] (step=0010099) Train Loss: 0.2023, Train Steps/Sec: 0.28, Epoch: 0.19624951418577535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10100, "loss": 0.2233194261789322, "memory_gb": 7.721559524536133, "step_time_ms": 3358.293294906616, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:36] (step=0010100) Train Loss: 0.2100, Train Steps/Sec: 0.28, Epoch: 0.19626894675476098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10101, "loss": 0.2998048663139343, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6673946380615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:39] (step=0010101) Train Loss: 0.3255, Train Steps/Sec: 0.28, Epoch: 0.1962883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10102, "loss": 0.2234320491552353, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2066555023193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:43] (step=0010102) Train Loss: 0.2732, Train Steps/Sec: 0.28, Epoch: 0.19630781189273222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10103, "loss": 0.24278977513313293, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6466312408447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:46] (step=0010103) Train Loss: 0.2245, Train Steps/Sec: 0.28, Epoch: 0.19632724446171784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10104, "loss": 0.14559242129325867, "memory_gb": 
7.721559524536133, "step_time_ms": 3356.5516471862793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:50] (step=0010104) Train Loss: 0.1756, Train Steps/Sec: 0.28, Epoch: 0.19634667703070346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10105, "loss": 0.24347037076950073, "memory_gb": 7.721559524536133, "step_time_ms": 3360.271692276001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:54] (step=0010105) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.1963661095996891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:07:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10106, "loss": 0.2058521807193756, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3151569366455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:07:57] (step=0010106) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.1963855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10107, "loss": 0.1363140195608139, "memory_gb": 7.721559524536133, "step_time_ms": 3363.550901412964, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:01] (step=0010107) Train Loss: 0.1799, Train Steps/Sec: 0.28, Epoch: 0.19640497473766033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10108, "loss": 0.15372449159622192, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6465587615967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:04] (step=0010108) Train Loss: 0.1584, Train Steps/Sec: 0.28, Epoch: 0.19642440730664593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10109, "loss": 0.30647069215774536, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4469604492188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:08] (step=0010109) Train Loss: 0.2880, Train 
Steps/Sec: 0.28, Epoch: 0.19644383987563155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10110, "loss": 0.31843194365501404, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1342182159424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:12] (step=0010110) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.19646327244461717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10111, "loss": 0.2905133366584778, "memory_gb": 7.721559524536133, "step_time_ms": 3361.912727355957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:15] (step=0010111) Train Loss: 0.2788, Train Steps/Sec: 0.28, Epoch: 0.1964827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10112, "loss": 0.3140330910682678, "memory_gb": 7.721559524536133, "step_time_ms": 3363.635778427124, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:19] (step=0010112) Train Loss: 0.3119, Train Steps/Sec: 0.28, Epoch: 0.19650213758258842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10113, "loss": 0.23937231302261353, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3711128234863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:22] (step=0010113) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.19652157015157404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10114, "loss": 0.21836553514003754, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2489700317383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:26] (step=0010114) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.19654100272055966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10115, "loss": 
0.26694202423095703, "memory_gb": 7.721559524536133, "step_time_ms": 3364.521026611328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:29] (step=0010115) Train Loss: 0.2980, Train Steps/Sec: 0.28, Epoch: 0.19656043528954528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10116, "loss": 0.24242673814296722, "memory_gb": 7.721559524536133, "step_time_ms": 3364.711284637451, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:33] (step=0010116) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.1965798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10117, "loss": 0.27465707063674927, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4329051971436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:37] (step=0010117) Train Loss: 0.1973, Train Steps/Sec: 0.27, Epoch: 0.19659930042751653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10118, "loss": 0.22754111886024475, "memory_gb": 7.721559524536133, "step_time_ms": 3349.184274673462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:40] (step=0010118) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.19661873299650215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10119, "loss": 0.28153303265571594, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6199798583984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:44] (step=0010119) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.19663816556548774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10120, "loss": 0.30118003487586975, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0330772399902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:47] 
(step=0010120) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.19665759813447337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10121, "loss": 0.11754649877548218, "memory_gb": 7.721559524536133, "step_time_ms": 3363.732099533081, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:51] (step=0010121) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.196677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10122, "loss": 0.13579905033111572, "memory_gb": 7.721559524536133, "step_time_ms": 3365.135431289673, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:55] (step=0010122) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.1966964632724446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:08:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10123, "loss": 0.29012811183929443, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6399974823, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:08:58] (step=0010123) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.19671589584143023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10124, "loss": 0.30052971839904785, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0477867126465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:02] (step=0010124) Train Loss: 0.3102, Train Steps/Sec: 0.28, Epoch: 0.19673532841041586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10125, "loss": 0.26342159509658813, "memory_gb": 7.721559524536133, "step_time_ms": 3365.100860595703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:05] (step=0010125) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.19675476097940148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:09] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 10126, "loss": 0.1934378296136856, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4232749938965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:09] (step=0010126) Train Loss: 0.1695, Train Steps/Sec: 0.28, Epoch: 0.1967741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10127, "loss": 0.2705177664756775, "memory_gb": 7.721559524536133, "step_time_ms": 3365.112066268921, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:13] (step=0010127) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.19679362611737272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10128, "loss": 0.2542968988418579, "memory_gb": 7.721559524536133, "step_time_ms": 3361.591577529907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:16] (step=0010128) Train Loss: 0.2797, Train Steps/Sec: 0.28, Epoch: 0.19681305868635834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10129, "loss": 0.2559613883495331, "memory_gb": 7.721559524536133, "step_time_ms": 3363.137722015381, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:20] (step=0010129) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.19683249125534397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10130, "loss": 0.169207364320755, "memory_gb": 7.721559524536133, "step_time_ms": 3357.804536819458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:23] (step=0010130) Train Loss: 0.1852, Train Steps/Sec: 0.28, Epoch: 0.1968519238243296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10131, "loss": 0.2346745729446411, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4493350982666, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 10:09:27] (step=0010131) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.19687135639331518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10132, "loss": 0.29898548126220703, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1271381378174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:31] (step=0010132) Train Loss: 0.2918, Train Steps/Sec: 0.28, Epoch: 0.1968907889623008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10133, "loss": 0.2506038546562195, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2027893066406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:34] (step=0010133) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.19691022153128643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10134, "loss": 0.25637292861938477, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4119033813477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:38] (step=0010134) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.19692965410027205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10135, "loss": 0.27120441198349, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8412017822266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:41] (step=0010135) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.19694908666925767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10136, "loss": 0.16526731848716736, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0551567077637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:45] (step=0010136) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.1969685192382433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
10:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10137, "loss": 0.2220069169998169, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1322174072266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:48] (step=0010137) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.19698795180722892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10138, "loss": 0.25472599267959595, "memory_gb": 7.721559524536133, "step_time_ms": 3503.92746925354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:52] (step=0010138) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.19700738437621454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10139, "loss": 0.20569849014282227, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3763160705566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:56] (step=0010139) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.19702681694520016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:09:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10140, "loss": 0.20610105991363525, "memory_gb": 7.721559524536133, "step_time_ms": 3361.403465270996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:09:59] (step=0010140) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.19704624951418578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10141, "loss": 0.28873133659362793, "memory_gb": 7.721559524536133, "step_time_ms": 3363.920211791992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:03] (step=0010141) Train Loss: 0.2785, Train Steps/Sec: 0.28, Epoch: 0.1970656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10142, "loss": 0.18452909588813782, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2824211120605, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:06] (step=0010142) Train Loss: 0.2096, Train Steps/Sec: 0.28, Epoch: 0.19708511465215703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10143, "loss": 0.14898300170898438, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0760536193848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:10] (step=0010143) Train Loss: 0.1799, Train Steps/Sec: 0.28, Epoch: 0.19710454722114262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10144, "loss": 0.20291969180107117, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8927001953125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:14] (step=0010144) Train Loss: 0.1880, Train Steps/Sec: 0.28, Epoch: 0.19712397979012825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10145, "loss": 0.1863730549812317, "memory_gb": 7.721559524536133, "step_time_ms": 3361.081123352051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:17] (step=0010145) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.19714341235911387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10146, "loss": 0.3017686605453491, "memory_gb": 7.721559524536133, "step_time_ms": 3364.004135131836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:21] (step=0010146) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.1971628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10147, "loss": 0.2808355391025543, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8345050811768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:24] (step=0010147) Train Loss: 0.2785, Train Steps/Sec: 0.28, Epoch: 0.1971822774970851, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10148, "loss": 0.14048387110233307, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7217330932617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:28] (step=0010148) Train Loss: 0.1846, Train Steps/Sec: 0.28, Epoch: 0.19720171006607073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10149, "loss": 0.21501947939395905, "memory_gb": 7.721559524536133, "step_time_ms": 3361.814260482788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:32] (step=0010149) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.19722114263505636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10150, "loss": 0.3115123212337494, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1484508514404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:35] (step=0010150) Train Loss: 0.2489, Train Steps/Sec: 0.28, Epoch: 0.19724057520404198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10151, "loss": 0.3120654225349426, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8015308380127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:39] (step=0010151) Train Loss: 0.3052, Train Steps/Sec: 0.28, Epoch: 0.1972600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10152, "loss": 0.19235889613628387, "memory_gb": 7.721559524536133, "step_time_ms": 3359.849452972412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:10:42] (step=0010152) Train Loss: 0.1582, Train Steps/Sec: 0.28, Epoch: 0.19727944034201322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:10:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10153, "loss": 0.2369174063205719, "memory_gb": 7.721559524536133, 
"step_time_ms": 3356.7941188812256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:10:46] (step=0010153) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.19729887291099885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:10:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10154, "loss": 0.23144745826721191, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5786304473877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:10:49] (step=0010154) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.19731830547998444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:10:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10155, "loss": 0.23726703226566315, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0896339416504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:10:53] (step=0010155) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.19733773804897006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:10:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10156, "loss": 0.14046263694763184, "memory_gb": 7.721559524536133, "step_time_ms": 3360.504388809204, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:10:57] (step=0010156) Train Loss: 0.1752, Train Steps/Sec: 0.28, Epoch: 0.19735717061795569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10157, "loss": 0.2974858283996582, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4605712890625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:00] (step=0010157) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.1973766031869413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10158, "loss": 0.23755982518196106, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4307174682617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:04] (step=0010158) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.19739603575592693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10159, "loss": 0.2596943974494934, "memory_gb": 7.721559524536133, "step_time_ms": 3359.489679336548, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:07] (step=0010159) Train Loss: 0.1982, Train Steps/Sec: 0.28, Epoch: 0.19741546832491255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10160, "loss": 0.2683391571044922, "memory_gb": 7.721559524536133, "step_time_ms": 3360.30650138855, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:11] (step=0010160) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.19743490089389817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10161, "loss": 0.3106568157672882, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9767265319824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:14] (step=0010161) Train Loss: 0.2809, Train Steps/Sec: 0.28, Epoch: 0.1974543334628838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10162, "loss": 0.24062587320804596, "memory_gb": 7.721559524536133, "step_time_ms": 3356.081962585449, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:18] (step=0010162) Train Loss: 0.2765, Train Steps/Sec: 0.28, Epoch: 0.19747376603186942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10163, "loss": 0.29115235805511475, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3810749053955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:22] (step=0010163) Train Loss: 0.3110, Train Steps/Sec: 0.28, Epoch: 0.19749319860085504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10164, "loss": 0.3671918511390686, "memory_gb": 7.721559524536133, "step_time_ms": 3359.666347503662, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:25] (step=0010164) Train Loss: 0.3266, Train Steps/Sec: 0.28, Epoch: 0.19751263116984066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10165, "loss": 0.1825169026851654, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2402420043945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:29] (step=0010165) Train Loss: 0.1926, Train Steps/Sec: 0.27, Epoch: 0.19753206373882629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10166, "loss": 0.1644452065229416, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3337841033936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:33] (step=0010166) Train Loss: 0.1769, Train Steps/Sec: 0.28, Epoch: 0.19755149630781188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10167, "loss": 0.18051934242248535, "memory_gb": 7.721559524536133, "step_time_ms": 3358.696222305298, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:36] (step=0010167) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.1975709288767975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10168, "loss": 0.2100185602903366, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5809211730957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:40] (step=0010168) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.19759036144578312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10169, "loss": 0.2381136119365692, "memory_gb": 7.721559524536133, "step_time_ms": 3360.935688018799, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:43] (step=0010169) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 0.19760979401476875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10170, "loss": 0.2722489833831787, "memory_gb": 7.721559524536133, "step_time_ms": 3361.070156097412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:47] (step=0010170) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.19762922658375437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10171, "loss": 0.2666093111038208, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3579998016357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:50] (step=0010171) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.19764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10172, "loss": 0.23980294167995453, "memory_gb": 7.721559524536133, "step_time_ms": 3360.745906829834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:54] (step=0010172) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.19766809172172561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:11:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10173, "loss": 0.19621151685714722, "memory_gb": 7.721559524536133, "step_time_ms": 3344.9037075042725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:11:58] (step=0010173) Train Loss: 0.2706, Train Steps/Sec: 0.28, Epoch: 0.19768752429071124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10174, "loss": 0.1883232146501541, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5151386260986, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:01] (step=0010174) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.19770695685969686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10175, "loss": 0.34363728761672974, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3317012786865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:05] (step=0010175) Train Loss: 0.2877, Train Steps/Sec: 0.28, Epoch: 0.19772638942868248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10176, "loss": 0.21390961110591888, "memory_gb": 7.721559524536133, "step_time_ms": 3359.931230545044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:08] (step=0010176) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.1977458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10177, "loss": 0.26562783122062683, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1527004241943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:12] (step=0010177) Train Loss: 0.3199, Train Steps/Sec: 0.28, Epoch: 0.19776525456665373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10178, "loss": 0.24217434227466583, "memory_gb": 7.715639114379883, "step_time_ms": 3325.767993927002, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:15] (step=0010178) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.19778468713563932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10179, "loss": 0.29139626026153564, "memory_gb": 7.721559524536133, "step_time_ms": 3502.7740001678467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:19] (step=0010179) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.19780411970462494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10180, "loss": 0.2536452114582062, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3904247283936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:23] (step=0010180) Train Loss: 0.1887, Train Steps/Sec: 0.28, Epoch: 0.19782355227361056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10181, "loss": 0.28018033504486084, "memory_gb": 7.721559524536133, "step_time_ms": 3352.238416671753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:26] (step=0010181) Train Loss: 0.2809, Train Steps/Sec: 0.28, Epoch: 0.1978429848425962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10182, "loss": 0.14484494924545288, "memory_gb": 7.721559524536133, "step_time_ms": 3359.893560409546, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:30] (step=0010182) Train Loss: 0.1734, Train Steps/Sec: 0.28, Epoch: 0.1978624174115818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10183, "loss": 0.1826569139957428, "memory_gb": 7.721559524536133, "step_time_ms": 3359.790802001953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:33] (step=0010183) Train Loss: 0.2065, Train Steps/Sec: 0.28, Epoch: 0.19788184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10184, "loss": 0.259218692779541, "memory_gb": 7.721559524536133, "step_time_ms": 3361.595630645752, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:37] (step=0010184) Train Loss: 0.2626, Train Steps/Sec: 0.28, Epoch: 0.19790128254955305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10185, "loss": 0.29668885469436646, "memory_gb": 7.721559524536133, "step_time_ms": 3360.171318054199, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:40] (step=0010185) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.19792071511853868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10186, "loss": 0.146773099899292, "memory_gb": 7.721559524536133, "step_time_ms": 3359.248399734497, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:44] (step=0010186) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.1979401476875243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10187, "loss": 0.2217659056186676, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3787422180176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:48] (step=0010187) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.19795958025650992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10188, "loss": 0.2186838984489441, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7421855926514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:51] (step=0010188) Train Loss: 0.2034, Train Steps/Sec: 0.28, Epoch: 0.19797901282549554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10189, "loss": 0.3183450698852539, "memory_gb": 7.721559524536133, "step_time_ms": 3360.292434692383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:55] (step=0010189) Train Loss: 0.3027, Train Steps/Sec: 0.28, Epoch: 0.19799844539448114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:12:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10190, "loss": 0.18565058708190918, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6653213500977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:12:58] (step=0010190) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.19801787796346676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10191, "loss": 0.3058146834373474, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8853092193604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:02] (step=0010191) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.19803731053245238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10192, "loss": 0.2815041244029999, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6740283966064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:05] (step=0010192) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.198056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10193, "loss": 0.2915459871292114, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3429794311523, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:09] (step=0010193) Train Loss: 0.2794, Train Steps/Sec: 0.28, Epoch: 0.19807617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10194, "loss": 0.14890961349010468, "memory_gb": 7.721559524536133, "step_time_ms": 3357.248067855835, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:13] (step=0010194) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.19809560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10195, "loss": 0.24748671054840088, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1985912323, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:16] (step=0010195) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.19811504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10196, "loss": 0.3197900056838989, "memory_gb": 7.715639114379883, "step_time_ms": 3325.9353637695312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:20] (step=0010196) Train Loss: 0.3115, Train Steps/Sec: 0.28, Epoch: 0.1981344733773805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10197, "loss": 0.19931015372276306, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0008697509766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:23] (step=0010197) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.19815390594636612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10198, "loss": 0.2310236096382141, "memory_gb": 7.721559524536133, "step_time_ms": 3352.09059715271, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:27] (step=0010198) Train Loss: 0.2138, Train Steps/Sec: 0.28, Epoch: 0.19817333851535174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10199, "loss": 0.29851120710372925, "memory_gb": 7.721559524536133, "step_time_ms": 3342.615842819214, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:30] (step=0010199) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.19819277108433736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10200, "loss": 0.239990234375, "memory_gb": 7.721559524536133, "step_time_ms": 3360.535144805908, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:34] (step=0010200) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.19821220365332298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10201, "loss": 0.20478855073451996, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1140670776367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:38] (step=0010201) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.19823163622230858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10202, "loss": 0.1859932690858841, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2868061065674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:41] (step=0010202) Train Loss: 0.1498, Train Steps/Sec: 0.28, Epoch: 0.1982510687912942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10203, "loss": 0.2263895571231842, "memory_gb": 7.721559524536133, "step_time_ms": 3361.558198928833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:45] (step=0010203) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.19827050136027982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10204, "loss": 0.2577509880065918, "memory_gb": 7.721559524536133, "step_time_ms": 3364.337205886841, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:48] (step=0010204) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.19828993392926544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10205, "loss": 0.24486619234085083, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4258251190186, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:52] (step=0010205) Train Loss: 0.2357, Train Steps/Sec: 0.27, Epoch: 0.19830936649825107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10206, "loss": 0.12457215040922165, "memory_gb": 7.721559524536133, "step_time_ms": 3360.872745513916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:56] (step=0010206) Train Loss: 0.1929, Train Steps/Sec: 0.28, Epoch: 0.1983287990672367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:13:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10207, "loss": 0.11746467649936676, "memory_gb": 7.721559524536133, "step_time_ms": 3360.83722114563, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:13:59] (step=0010207) Train Loss: 0.1789, Train Steps/Sec: 0.28, Epoch: 0.1983482316362223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10208, "loss": 0.1707988679409027, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6299629211426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:03] (step=0010208) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.19836766420520793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10209, "loss": 0.30360502004623413, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4192218780518, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:06] (step=0010209) Train Loss: 0.2758, Train Steps/Sec: 0.28, Epoch: 0.19838709677419356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10210, "loss": 0.2659026086330414, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9348583221436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:10] (step=0010210) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.19840652934317918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10211, "loss": 0.17734605073928833, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8097820281982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:13] (step=0010211) Train Loss: 0.1571, Train Steps/Sec: 0.28, Epoch: 0.1984259619121648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10212, "loss": 0.13254158198833466, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6022338867188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:17] (step=0010212) Train Loss: 0.1905, Train Steps/Sec: 0.28, Epoch: 0.1984453944811504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10213, "loss": 0.2169203758239746, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5873165130615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:21] (step=0010213) Train Loss: 0.2099, Train Steps/Sec: 0.28, Epoch: 0.19846482705013602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10214, "loss": 0.29094699025154114, "memory_gb": 7.721559524536133, "step_time_ms": 3362.119674682617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:24] (step=0010214) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.19848425961912164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10215, "loss": 0.19441397488117218, "memory_gb": 7.721559524536133, "step_time_ms": 3359.362840652466, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:28] (step=0010215) Train Loss: 0.1684, Train Steps/Sec: 0.28, Epoch: 0.19850369218810726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10216, "loss": 0.2555176615715027, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1745109558105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:31] (step=0010216) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.19852312475709288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10217, "loss": 0.2698137164115906, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2967987060547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:35] (step=0010217) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.1985425573260785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10218, "loss": 0.26913848519325256, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5595569610596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:38] (step=0010218) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.19856198989506413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10219, "loss": 0.26579421758651733, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2220497131348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:42] (step=0010219) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.19858142246404975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10220, "loss": 0.24708521366119385, "memory_gb": 7.721559524536133, "step_time_ms": 3509.418487548828, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:46] (step=0010220) Train Loss: 0.2980, Train Steps/Sec: 0.28, Epoch: 0.19860085503303537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10221, "loss": 0.1502966582775116, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0713901519775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:49] (step=0010221) Train Loss: 0.1576, Train Steps/Sec: 0.28, Epoch: 0.198620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10222, "loss": 0.17373870313167572, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5622520446777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:53] (step=0010222) Train Loss: 0.1987, Train Steps/Sec: 0.28, Epoch: 0.19863972017100662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:14:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10223, "loss": 0.27653375267982483, "memory_gb": 7.721559524536133, "step_time_ms": 3356.071949005127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:14:56] (step=0010223) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.19865915273999224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10224, "loss": 0.24298444390296936, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8273010253906, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:00] (step=0010224) Train Loss: 0.2447, Train Steps/Sec: 0.28, Epoch: 0.19867858530897783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10225, "loss": 0.3118817210197449, "memory_gb": 7.721559524536133, "step_time_ms": 3357.726573944092, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:03] (step=0010225) Train Loss: 0.2696, Train Steps/Sec: 0.28, Epoch: 0.19869801787796346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10226, "loss": 0.28510332107543945, "memory_gb": 7.721559524536133, "step_time_ms": 3348.982810974121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:07] (step=0010226) Train Loss: 0.3022, Train Steps/Sec: 0.28, Epoch: 0.19871745044694908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10227, "loss": 0.274910032749176, "memory_gb": 7.721559524536133, "step_time_ms": 3364.407539367676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:11] (step=0010227) Train Loss: 0.3061, Train Steps/Sec: 0.28, Epoch: 0.1987368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10228, "loss": 0.20614296197891235, "memory_gb": 7.721559524536133, "step_time_ms": 3360.342502593994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:14] (step=0010228) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.19875631558492032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10229, "loss": 0.34347105026245117, "memory_gb": 7.721559524536133, "step_time_ms": 3364.945411682129, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:18] (step=0010229) Train Loss: 0.3044, Train Steps/Sec: 0.28, Epoch: 0.19877574815390595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10230, "loss": 0.2814871668815613, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6314659118652, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:21] (step=0010230) Train Loss: 0.2649, Train Steps/Sec: 0.28, Epoch: 0.19879518072289157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10231, "loss": 0.25072169303894043, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7208213806152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:25] (step=0010231) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.1988146132918772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10232, "loss": 0.2543267011642456, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8745079040527, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:29] (step=0010232) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.1988340458608628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10233, "loss": 0.26976341009140015, "memory_gb": 7.721559524536133, "step_time_ms": 3357.290744781494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:32] (step=0010233) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.19885347842984843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10234, "loss": 0.24334441125392914, "memory_gb": 7.721559524536133, "step_time_ms": 3360.88490486145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:36] (step=0010234) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.19887291099883406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10235, "loss": 0.2508709132671356, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7586879730225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:39] (step=0010235) Train Loss: 0.1953, Train Steps/Sec: 0.28, Epoch: 0.19889234356781968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10236, "loss": 0.2978851795196533, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5704097747803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:43] (step=0010236) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.19891177613680527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10237, "loss": 0.2572433054447174, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4252643585205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:46] (step=0010237) Train Loss: 0.3028, Train Steps/Sec: 0.28, Epoch: 0.1989312087057909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10238, "loss": 0.26227378845214844, "memory_gb": 7.721559524536133, "step_time_ms": 3361.557960510254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:50] (step=0010238) Train Loss: 0.2554, Train Steps/Sec: 0.28, Epoch: 0.19895064127477652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10239, "loss": 0.3527054488658905, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8503131866455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:54] (step=0010239) Train Loss: 0.3082, Train Steps/Sec: 0.28, Epoch: 0.19897007384376214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:15:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10240, "loss": 0.332985520362854, "memory_gb": 7.721559524536133, "step_time_ms": 3361.314535140991, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:15:57] (step=0010240) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.19898950641274776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10241, "loss": 0.2163969725370407, "memory_gb": 7.721559524536133, "step_time_ms": 3348.017692565918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:01] (step=0010241) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.19900893898173339, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10242, "loss": 0.16626960039138794, "memory_gb": 7.721559524536133, "step_time_ms": 3361.280679702759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:04] (step=0010242) Train Loss: 0.2246, Train Steps/Sec: 0.28, Epoch: 0.199028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10243, "loss": 0.21893326938152313, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7628135681152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:08] (step=0010243) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.19904780411970463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10244, "loss": 0.21633797883987427, "memory_gb": 7.721559524536133, "step_time_ms": 3361.837387084961, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:12] (step=0010244) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.19906723668869025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10245, "loss": 0.21236582100391388, "memory_gb": 7.721559524536133, "step_time_ms": 3347.661018371582, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:15] (step=0010245) Train Loss: 0.1967, Train Steps/Sec: 0.28, Epoch: 0.19908666925767587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10246, "loss": 0.3595447540283203, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0853424072266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:19] (step=0010246) Train Loss: 0.2992, Train Steps/Sec: 0.28, Epoch: 0.1991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10247, "loss": 0.23408453166484833, "memory_gb": 7.715639114379883, "step_time_ms": 3319.3488121032715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:22] (step=0010247) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.1991255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10248, "loss": 0.338492751121521, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6651554107666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:26] (step=0010248) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 0.1991449669646327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10249, "loss": 0.181047260761261, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9835891723633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:29] (step=0010249) Train Loss: 0.2178, Train Steps/Sec: 0.28, Epoch: 0.19916439953361834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10250, "loss": 0.32208356261253357, "memory_gb": 7.721559524536133, "step_time_ms": 3364.509105682373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:33] (step=0010250) Train Loss: 0.2765, Train Steps/Sec: 0.28, Epoch: 0.19918383210260396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10251, "loss": 0.16445761919021606, "memory_gb": 7.721559524536133, "step_time_ms": 3362.483024597168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:37] (step=0010251) Train Loss: 0.2330, Train Steps/Sec: 0.28, Epoch: 0.19920326467158958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10252, "loss": 0.2990744411945343, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7871952056885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:40] (step=0010252) Train Loss: 0.2274, Train Steps/Sec: 0.27, Epoch: 0.1992226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10253, "loss": 0.3160836100578308, "memory_gb": 7.721559524536133, "step_time_ms": 3364.328384399414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:44] (step=0010253) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.19924212980956083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10254, "loss": 0.1835760623216629, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8053874969482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:48] (step=0010254) Train Loss: 0.1670, Train Steps/Sec: 0.28, Epoch: 0.19926156237854645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10255, "loss": 0.29046258330345154, "memory_gb": 7.721559524536133, "step_time_ms": 3364.849805831909, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:51] (step=0010255) Train Loss: 0.3006, Train Steps/Sec: 0.28, Epoch: 0.19928099494753207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10256, "loss": 0.27335160970687866, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0026111602783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:55] (step=0010256) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.1993004275165177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:16:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10257, "loss": 0.1375400424003601, "memory_gb": 7.721559524536133, "step_time_ms": 3362.682342529297, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:16:58] (step=0010257) Train Loss: 0.1928, Train Steps/Sec: 0.28, Epoch: 0.19931986008550331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10258, "loss": 0.14781729876995087, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1459732055664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:02] (step=0010258) Train Loss: 0.1765, Train Steps/Sec: 0.28, Epoch: 0.19933929265448894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10259, "loss": 0.2764033079147339, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3477478027344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:06] (step=0010259) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.19935872522347453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10260, "loss": 0.15475739538669586, "memory_gb": 7.721559524536133, "step_time_ms": 3361.647367477417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:09] (step=0010260) Train Loss: 0.1406, Train Steps/Sec: 0.28, Epoch: 0.19937815779246015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10261, "loss": 0.19649800658226013, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2681159973145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:13] (step=0010261) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.19939759036144578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10262, "loss": 0.11510800570249557, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8149032592773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:16] (step=0010262) Train Loss: 0.1338, Train Steps/Sec: 0.28, Epoch: 0.1994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10263, "loss": 0.23423223197460175, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7113666534424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:20] (step=0010263) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.19943645549941702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10264, "loss": 0.32092803716659546, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3297939300537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:23] (step=0010264) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.19945588806840264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10265, "loss": 0.2532259225845337, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5009784698486, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:27] (step=0010265) Train Loss: 0.2353, Train Steps/Sec: 0.28, Epoch: 0.19947532063738826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10266, "loss": 0.19375254213809967, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8599643707275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:31] (step=0010266) Train Loss: 0.1995, Train Steps/Sec: 0.28, Epoch: 0.1994947532063739, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:17:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10267, "loss": 0.16066040098667145, "memory_gb": 7.721559524536133, "step_time_ms": 3355.964422225952, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:17:34] (step=0010267) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.1995141857753595, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 10:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10268, "loss": 0.3177458643913269, "memory_gb": 7.721559524536133, "step_time_ms": 3508.108139038086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:17:38] (step=0010268) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.19953361834434513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:17:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10269, "loss": 0.26259830594062805, "memory_gb": 7.721559524536133, "step_time_ms": 3360.476493835449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:17:41] (step=0010269) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.19955305091333075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10270, "loss": 0.25837379693984985, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9825115203857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:17:45] (step=0010270) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.19957248348231635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:17:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10271, "loss": 0.30344533920288086, "memory_gb": 7.721559524536133, "step_time_ms": 3360.685348510742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:17:49] (step=0010271) Train Loss: 0.3424, Train Steps/Sec: 0.28, Epoch: 0.19959191605130197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:17:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10272, "loss": 0.28266164660453796, "memory_gb": 7.721559524536133, "step_time_ms": 3360.348701477051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:17:52] (step=0010272) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.1996113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:17:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10273, "loss": 0.23919561505317688, "memory_gb": 7.721559524536133, "step_time_ms": 
3357.404947280884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:17:56] (step=0010273) Train Loss: 0.2793, Train Steps/Sec: 0.28, Epoch: 0.19963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:17:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10274, "loss": 0.14588361978530884, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3054027557373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:17:59] (step=0010274) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.19965021375825884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10275, "loss": 0.2690960764884949, "memory_gb": 7.721559524536133, "step_time_ms": 3358.52313041687, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:03] (step=0010275) Train Loss: 0.2745, Train Steps/Sec: 0.28, Epoch: 0.19966964632724446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10276, "loss": 0.14686252176761627, "memory_gb": 7.721559524536133, "step_time_ms": 3354.107141494751, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:06] (step=0010276) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.19968907889623008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10277, "loss": 0.22061166167259216, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2604656219482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:10] (step=0010277) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.1997085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10278, "loss": 0.35992878675460815, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8785400390625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:13] (step=0010278) Train Loss: 0.3191, Train Steps/Sec: 0.28, Epoch: 
0.19972794403420133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10279, "loss": 0.20220047235488892, "memory_gb": 7.721559524536133, "step_time_ms": 3351.2625694274902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:17] (step=0010279) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.19974737660318695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10280, "loss": 0.294847309589386, "memory_gb": 7.721559524536133, "step_time_ms": 3357.785224914551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:21] (step=0010280) Train Loss: 0.2953, Train Steps/Sec: 0.28, Epoch: 0.19976680917217257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10281, "loss": 0.18968279659748077, "memory_gb": 7.721559524536133, "step_time_ms": 3361.253261566162, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:24] (step=0010281) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.1997862417411582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10282, "loss": 0.1661650836467743, "memory_gb": 7.721559524536133, "step_time_ms": 3362.105131149292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:28] (step=0010282) Train Loss: 0.2055, Train Steps/Sec: 0.28, Epoch: 0.1998056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10283, "loss": 0.18433767557144165, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1883182525635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:31] (step=0010283) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.1998251068791294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10284, "loss": 0.3273140490055084, 
"memory_gb": 7.721559524536133, "step_time_ms": 3357.7632904052734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:35] (step=0010284) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.19984453944811503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10285, "loss": 0.24571534991264343, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7825813293457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:39] (step=0010285) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.19986397201710066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10286, "loss": 0.16869699954986572, "memory_gb": 7.721559524536133, "step_time_ms": 3360.163927078247, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:42] (step=0010286) Train Loss: 0.2058, Train Steps/Sec: 0.28, Epoch: 0.19988340458608628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10287, "loss": 0.3301229774951935, "memory_gb": 7.721559524536133, "step_time_ms": 3360.318899154663, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:46] (step=0010287) Train Loss: 0.2999, Train Steps/Sec: 0.28, Epoch: 0.1999028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10288, "loss": 0.2664308249950409, "memory_gb": 7.721559524536133, "step_time_ms": 3348.5052585601807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:49] (step=0010288) Train Loss: 0.2925, Train Steps/Sec: 0.28, Epoch: 0.19992226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10289, "loss": 0.17338643968105316, "memory_gb": 7.721559524536133, "step_time_ms": 3346.290349960327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:53] (step=0010289) Train Loss: 
0.1910, Train Steps/Sec: 0.28, Epoch: 0.19994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:18:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10290, "loss": 0.32954132556915283, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8496704101562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:18:56] (step=0010290) Train Loss: 0.3194, Train Steps/Sec: 0.28, Epoch: 0.19996113486202877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10291, "loss": 0.30131763219833374, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1670265197754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:00] (step=0010291) Train Loss: 0.2742, Train Steps/Sec: 0.28, Epoch: 0.1999805674310144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10292, "loss": 0.21094590425491333, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4298782348633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:03] (step=0010292) Train Loss: 0.2681, Train Steps/Sec: 0.28, Epoch: 0.2, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10293, "loss": 0.21736913919448853, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1956882476807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:07] (step=0010293) Train Loss: 0.1663, Train Steps/Sec: 0.28, Epoch: 0.20001943256898563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10294, "loss": 0.3318948447704315, "memory_gb": 7.721559524536133, "step_time_ms": 3363.008499145508, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:11] (step=0010294) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.20003886513797123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10295, "loss": 
0.23388105630874634, "memory_gb": 7.721559524536133, "step_time_ms": 3361.20343208313, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:14] (step=0010295) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.20005829770695685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10296, "loss": 0.2739028036594391, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2745113372803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:18] (step=0010296) Train Loss: 0.2841, Train Steps/Sec: 0.28, Epoch: 0.20007773027594247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10297, "loss": 0.2870411276817322, "memory_gb": 7.721559524536133, "step_time_ms": 3364.046096801758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:21] (step=0010297) Train Loss: 0.2207, Train Steps/Sec: 0.28, Epoch: 0.2000971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10298, "loss": 0.3843671679496765, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6531105041504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:25] (step=0010298) Train Loss: 0.2964, Train Steps/Sec: 0.28, Epoch: 0.20011659541391372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10299, "loss": 0.18572768568992615, "memory_gb": 7.721559524536133, "step_time_ms": 3360.771417617798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:28] (step=0010299) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.20013602798289934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10300, "loss": 0.25800108909606934, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1443977355957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:32] (step=0010300) 
Train Loss: 0.2748, Train Steps/Sec: 0.27, Epoch: 0.20015546055188496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10301, "loss": 0.1469387412071228, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1315746307373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:36] (step=0010301) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.20017489312087058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10302, "loss": 0.17987391352653503, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5808277130127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:39] (step=0010302) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.2001943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10303, "loss": 0.21690940856933594, "memory_gb": 7.721559524536133, "step_time_ms": 3363.924264907837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:43] (step=0010303) Train Loss: 0.2009, Train Steps/Sec: 0.28, Epoch: 0.20021375825884183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10304, "loss": 0.1811646670103073, "memory_gb": 7.721559524536133, "step_time_ms": 3355.555534362793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:46] (step=0010304) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.20023319082782745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10305, "loss": 0.22935090959072113, "memory_gb": 7.721559524536133, "step_time_ms": 3359.556198120117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:50] (step=0010305) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.20025262339681305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:54] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 10306, "loss": 0.30201536417007446, "memory_gb": 7.721559524536133, "step_time_ms": 3366.940975189209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:54] (step=0010306) Train Loss: 0.2445, Train Steps/Sec: 0.28, Epoch: 0.20027205596579867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:19:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10307, "loss": 0.19246657192707062, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8576526641846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:19:57] (step=0010307) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.2002914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10308, "loss": 0.22118011116981506, "memory_gb": 7.721559524536133, "step_time_ms": 3364.023447036743, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:01] (step=0010308) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.2003109211037699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10309, "loss": 0.1512327343225479, "memory_gb": 7.721559524536133, "step_time_ms": 3508.080244064331, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:04] (step=0010309) Train Loss: 0.1777, Train Steps/Sec: 0.28, Epoch: 0.20033035367275553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10310, "loss": 0.25284528732299805, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1222763061523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:08] (step=0010310) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.20034978624174116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10311, "loss": 0.31639227271080017, "memory_gb": 7.721559524536133, "step_time_ms": 3365.832805633545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
10:20:12] (step=0010311) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.20036921881072678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10312, "loss": 0.22758567333221436, "memory_gb": 7.721559524536133, "step_time_ms": 3363.884210586548, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:15] (step=0010312) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.2003886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10313, "loss": 0.2952830195426941, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6675605773926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:19] (step=0010313) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.20040808394869802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10314, "loss": 0.20617537200450897, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7172241210938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:22] (step=0010314) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.20042751651768365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10315, "loss": 0.2860006093978882, "memory_gb": 7.721559524536133, "step_time_ms": 3342.897891998291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:26] (step=0010315) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.20044694908666927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10316, "loss": 0.1148742139339447, "memory_gb": 7.721559524536133, "step_time_ms": 3362.942934036255, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:29] (step=0010316) Train Loss: 0.1795, Train Steps/Sec: 0.28, Epoch: 0.2004663816556549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:33] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 10317, "loss": 0.26311951875686646, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0643100738525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:33] (step=0010317) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.20048581422464049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10318, "loss": 0.2641223073005676, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3994846343994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:37] (step=0010318) Train Loss: 0.3016, Train Steps/Sec: 0.28, Epoch: 0.2005052467936261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10319, "loss": 0.310698926448822, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9678230285645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:40] (step=0010319) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.20052467936261173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10320, "loss": 0.3086439371109009, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2806282043457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:44] (step=0010320) Train Loss: 0.2946, Train Steps/Sec: 0.28, Epoch: 0.20054411193159735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10321, "loss": 0.2661859393119812, "memory_gb": 7.721559524536133, "step_time_ms": 3364.198684692383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:47] (step=0010321) Train Loss: 0.2320, Train Steps/Sec: 0.28, Epoch: 0.20056354450058297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10322, "loss": 0.2142799347639084, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9792881011963, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 10:20:51] (step=0010322) Train Loss: 0.1990, Train Steps/Sec: 0.28, Epoch: 0.2005829770695686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10323, "loss": 0.2819601595401764, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8760318756104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:55] (step=0010323) Train Loss: 0.3084, Train Steps/Sec: 0.28, Epoch: 0.20060240963855422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:20:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10324, "loss": 0.3029101490974426, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6342964172363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:20:58] (step=0010324) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.20062184220753984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10325, "loss": 0.30083662271499634, "memory_gb": 7.721559524536133, "step_time_ms": 3367.379665374756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:02] (step=0010325) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.20064127477652546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10326, "loss": 0.26789313554763794, "memory_gb": 7.721559524536133, "step_time_ms": 3357.010841369629, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:05] (step=0010326) Train Loss: 0.2144, Train Steps/Sec: 0.28, Epoch: 0.20066070734551109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10327, "loss": 0.25747406482696533, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4474487304688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:09] (step=0010327) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.2006801399144967, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 10:21:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10328, "loss": 0.20080533623695374, "memory_gb": 7.721559524536133, "step_time_ms": 3364.704370498657, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:13] (step=0010328) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.2006995724834823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10329, "loss": 0.24456873536109924, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9098148345947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:16] (step=0010329) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.20071900505246792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10330, "loss": 0.21856138110160828, "memory_gb": 7.721559524536133, "step_time_ms": 3365.802764892578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:20] (step=0010330) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.20073843762145355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10331, "loss": 0.28212112188339233, "memory_gb": 7.721559524536133, "step_time_ms": 3365.311861038208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:23] (step=0010331) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.20075787019043917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10332, "loss": 0.23067957162857056, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6939010620117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:27] (step=0010332) Train Loss: 0.2542, Train Steps/Sec: 0.28, Epoch: 0.2007773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10333, "loss": 0.2175380438566208, "memory_gb": 7.721559524536133, "step_time_ms": 
3365.304708480835, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:30] (step=0010333) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.20079673532841041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10334, "loss": 0.21159647405147552, "memory_gb": 7.721559524536133, "step_time_ms": 3368.271589279175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:34] (step=0010334) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.20081616789739604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10335, "loss": 0.25408735871315, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9318103790283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:38] (step=0010335) Train Loss: 0.3115, Train Steps/Sec: 0.28, Epoch: 0.20083560046638166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10336, "loss": 0.261706680059433, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9063110351562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:41] (step=0010336) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.20085503303536728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10337, "loss": 0.2188383936882019, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2608604431152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:45] (step=0010337) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.2008744656043529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10338, "loss": 0.30653876066207886, "memory_gb": 7.721559524536133, "step_time_ms": 3370.784044265747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:48] (step=0010338) Train Loss: 0.2613, Train Steps/Sec: 0.28, Epoch: 0.20089389817333853, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10339, "loss": 0.24100345373153687, "memory_gb": 7.721559524536133, "step_time_ms": 3368.18265914917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:52] (step=0010339) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.20091333074232415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10340, "loss": 0.16701459884643555, "memory_gb": 7.721559524536133, "step_time_ms": 3364.104986190796, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:56] (step=0010340) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.20093276331130974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:21:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10341, "loss": 0.2292044460773468, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3547973632812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:21:59] (step=0010341) Train Loss: 0.2375, Train Steps/Sec: 0.27, Epoch: 0.20095219588029536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10342, "loss": 0.16160178184509277, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1827716827393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:03] (step=0010342) Train Loss: 0.1664, Train Steps/Sec: 0.28, Epoch: 0.200971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10343, "loss": 0.25824564695358276, "memory_gb": 7.721559524536133, "step_time_ms": 3367.04158782959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:06] (step=0010343) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.2009910610182666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10344, "loss": 0.1815207302570343, "memory_gb": 
7.721559524536133, "step_time_ms": 3364.8290634155273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:10] (step=0010344) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.20101049358725223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10345, "loss": 0.29934030771255493, "memory_gb": 7.721559524536133, "step_time_ms": 3368.485689163208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:14] (step=0010345) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.20102992615623785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10346, "loss": 0.1538850963115692, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1958236694336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:17] (step=0010346) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.20104935872522348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10347, "loss": 0.3054521679878235, "memory_gb": 7.715639114379883, "step_time_ms": 3324.7015476226807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:21] (step=0010347) Train Loss: 0.2759, Train Steps/Sec: 0.28, Epoch: 0.2010687912942091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10348, "loss": 0.1992606222629547, "memory_gb": 7.721559524536133, "step_time_ms": 3363.189935684204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:24] (step=0010348) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.20108822386319472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10349, "loss": 0.23097921907901764, "memory_gb": 7.721559524536133, "step_time_ms": 3358.677864074707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:28] (step=0010349) Train Loss: 0.2325, Train 
Steps/Sec: 0.28, Epoch: 0.20110765643218034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10350, "loss": 0.3121076822280884, "memory_gb": 7.721559524536133, "step_time_ms": 3361.426591873169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:31] (step=0010350) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.20112708900116597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10351, "loss": 0.12701597809791565, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0144386291504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:35] (step=0010351) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.2011465215701516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10352, "loss": 0.3695039749145508, "memory_gb": 7.721559524536133, "step_time_ms": 3361.938238143921, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:39] (step=0010352) Train Loss: 0.3134, Train Steps/Sec: 0.28, Epoch: 0.20116595413913718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10353, "loss": 0.3257817029953003, "memory_gb": 7.715639114379883, "step_time_ms": 3325.080633163452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:42] (step=0010353) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.2011853867081228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10354, "loss": 0.21199864149093628, "memory_gb": 7.721559524536133, "step_time_ms": 3361.495018005371, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:46] (step=0010354) Train Loss: 0.1987, Train Steps/Sec: 0.28, Epoch: 0.20120481927710843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10355, "loss": 
0.30536943674087524, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5568618774414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:49] (step=0010355) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.20122425184609405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10356, "loss": 0.2066631317138672, "memory_gb": 7.721559524536133, "step_time_ms": 3505.7945251464844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:53] (step=0010356) Train Loss: 0.2006, Train Steps/Sec: 0.28, Epoch: 0.20124368441507967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:22:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10357, "loss": 0.2239564210176468, "memory_gb": 7.721559524536133, "step_time_ms": 3358.56556892395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:22:57] (step=0010357) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.2012631169840653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10358, "loss": 0.23151423037052155, "memory_gb": 7.721559524536133, "step_time_ms": 3360.649824142456, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:00] (step=0010358) Train Loss: 0.2096, Train Steps/Sec: 0.28, Epoch: 0.20128254955305092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10359, "loss": 0.28467464447021484, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5096340179443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:04] (step=0010359) Train Loss: 0.2931, Train Steps/Sec: 0.28, Epoch: 0.20130198212203654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10360, "loss": 0.27143996953964233, "memory_gb": 7.721559524536133, "step_time_ms": 3359.506368637085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:07] 
(step=0010360) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.20132141469102216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10361, "loss": 0.20796933770179749, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4324588775635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:11] (step=0010361) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.20134084726000778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10362, "loss": 0.351459264755249, "memory_gb": 7.721559524536133, "step_time_ms": 3359.501838684082, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:14] (step=0010362) Train Loss: 0.2803, Train Steps/Sec: 0.28, Epoch: 0.2013602798289934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10363, "loss": 0.19802549481391907, "memory_gb": 7.721559524536133, "step_time_ms": 3355.417251586914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:18] (step=0010363) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.201379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10364, "loss": 0.25831663608551025, "memory_gb": 7.721559524536133, "step_time_ms": 3359.718084335327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:22] (step=0010364) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.20139914496696462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10365, "loss": 0.2135494202375412, "memory_gb": 7.721559524536133, "step_time_ms": 3340.7845497131348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:25] (step=0010365) Train Loss: 0.2702, Train Steps/Sec: 0.28, Epoch: 0.20141857753595024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:29] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 10366, "loss": 0.25800904631614685, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7820320129395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:29] (step=0010366) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.20143801010493587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10367, "loss": 0.2343394160270691, "memory_gb": 7.721559524536133, "step_time_ms": 3348.656415939331, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:32] (step=0010367) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.2014574426739215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10368, "loss": 0.29039666056632996, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4274520874023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:36] (step=0010368) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.2014768752429071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10369, "loss": 0.14575761556625366, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2098064422607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:39] (step=0010369) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.20149630781189273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10370, "loss": 0.2583502531051636, "memory_gb": 7.721559524536133, "step_time_ms": 3355.224132537842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:43] (step=0010370) Train Loss: 0.2628, Train Steps/Sec: 0.28, Epoch: 0.20151574038087836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10371, "loss": 0.2759350836277008, "memory_gb": 7.721559524536133, "step_time_ms": 3359.189748764038, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 10:23:47] (step=0010371) Train Loss: 0.2871, Train Steps/Sec: 0.28, Epoch: 0.20153517294986398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10372, "loss": 0.14174996316432953, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0574054718018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:50] (step=0010372) Train Loss: 0.1569, Train Steps/Sec: 0.28, Epoch: 0.2015546055188496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10373, "loss": 0.22821977734565735, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4890575408936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:54] (step=0010373) Train Loss: 0.2790, Train Steps/Sec: 0.28, Epoch: 0.20157403808783522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:23:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10374, "loss": 0.274524986743927, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9745082855225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:23:57] (step=0010374) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.20159347065682084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10375, "loss": 0.187079519033432, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1879653930664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:01] (step=0010375) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.20161290322580644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10376, "loss": 0.2135702222585678, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8120727539062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:04] (step=0010376) Train Loss: 0.2430, Train Steps/Sec: 0.28, Epoch: 0.20163233579479206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
10:24:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10377, "loss": 0.26305606961250305, "memory_gb": 7.721559524536133, "step_time_ms": 3356.839179992676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:08] (step=0010377) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.20165176836377768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10378, "loss": 0.2930698096752167, "memory_gb": 7.721559524536133, "step_time_ms": 3341.718912124634, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:12] (step=0010378) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.2016712009327633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10379, "loss": 0.13478052616119385, "memory_gb": 7.721559524536133, "step_time_ms": 3355.285406112671, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:15] (step=0010379) Train Loss: 0.1279, Train Steps/Sec: 0.28, Epoch: 0.20169063350174893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10380, "loss": 0.30831772089004517, "memory_gb": 7.721559524536133, "step_time_ms": 3359.185218811035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:19] (step=0010380) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.20171006607073455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10381, "loss": 0.28817814588546753, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1716289520264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:22] (step=0010381) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.20172949863972017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10382, "loss": 0.2374895066022873, "memory_gb": 7.721559524536133, "step_time_ms": 3343.9688682556152, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:26] (step=0010382) Train Loss: 0.1816, Train Steps/Sec: 0.27, Epoch: 0.2017489312087058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10383, "loss": 0.18902906775474548, "memory_gb": 7.721559524536133, "step_time_ms": 3357.988119125366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:30] (step=0010383) Train Loss: 0.1952, Train Steps/Sec: 0.28, Epoch: 0.20176836377769142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10384, "loss": 0.1928776651620865, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7355403900146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:33] (step=0010384) Train Loss: 0.1847, Train Steps/Sec: 0.28, Epoch: 0.20178779634667704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10385, "loss": 0.15852239727973938, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8931560516357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:37] (step=0010385) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.20180722891566266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10386, "loss": 0.12272943556308746, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4532737731934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:40] (step=0010386) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.20182666148464828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10387, "loss": 0.15778839588165283, "memory_gb": 7.721559524536133, "step_time_ms": 3357.003688812256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:44] (step=0010387) Train Loss: 0.1936, Train Steps/Sec: 0.28, Epoch: 0.20184609405363388, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10388, "loss": 0.2793685793876648, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1599674224854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:47] (step=0010388) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.2018655266226195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10389, "loss": 0.19896391034126282, "memory_gb": 7.721559524536133, "step_time_ms": 3359.424829483032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:51] (step=0010389) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.20188495919160512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10390, "loss": 0.2112732231616974, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2966327667236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:55] (step=0010390) Train Loss: 0.2134, Train Steps/Sec: 0.28, Epoch: 0.20190439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:24:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10391, "loss": 0.17103685438632965, "memory_gb": 7.721559524536133, "step_time_ms": 3359.877347946167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:24:58] (step=0010391) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.20192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10392, "loss": 0.23977862298488617, "memory_gb": 7.721559524536133, "step_time_ms": 3342.681407928467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:02] (step=0010392) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.201943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10393, "loss": 0.28162312507629395, "memory_gb": 7.721559524536133, 
"step_time_ms": 3356.2541007995605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:05] (step=0010393) Train Loss: 0.2747, Train Steps/Sec: 0.28, Epoch: 0.2019626894675476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10394, "loss": 0.26270702481269836, "memory_gb": 7.721559524536133, "step_time_ms": 3354.843854904175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:09] (step=0010394) Train Loss: 0.2716, Train Steps/Sec: 0.28, Epoch: 0.20198212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10395, "loss": 0.1703346073627472, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3219051361084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:12] (step=0010395) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.20200155460551886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10396, "loss": 0.25188711285591125, "memory_gb": 7.721559524536133, "step_time_ms": 3359.839677810669, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:16] (step=0010396) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.20202098717450448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10397, "loss": 0.20273853838443756, "memory_gb": 7.721559524536133, "step_time_ms": 3498.713493347168, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:20] (step=0010397) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.2020404197434901, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10398, "loss": 0.3348660171031952, "memory_gb": 7.721559524536133, "step_time_ms": 3355.133056640625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:23] (step=0010398) Train Loss: 0.2693, Train Steps/Sec: 0.28, Epoch: 
0.2020598523124757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10399, "loss": 0.18653710186481476, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1739501953125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:27] (step=0010399) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.20207928488146132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10400, "loss": 0.24303072690963745, "memory_gb": 7.721559524536133, "step_time_ms": 3362.571954727173, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:30] (step=0010400) Train Loss: 0.2837, Train Steps/Sec: 0.28, Epoch: 0.20209871745044694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10401, "loss": 0.2003592997789383, "memory_gb": 7.721559524536133, "step_time_ms": 3360.015630722046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:34] (step=0010401) Train Loss: 0.1909, Train Steps/Sec: 0.28, Epoch: 0.20211815001943256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10402, "loss": 0.25860047340393066, "memory_gb": 7.721559524536133, "step_time_ms": 3362.251043319702, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:37] (step=0010402) Train Loss: 0.2836, Train Steps/Sec: 0.28, Epoch: 0.20213758258841819, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10403, "loss": 0.29475492238998413, "memory_gb": 7.721559524536133, "step_time_ms": 3358.604669570923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:41] (step=0010403) Train Loss: 0.2689, Train Steps/Sec: 0.28, Epoch: 0.2021570151574038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10404, "loss": 0.18518942594528198, 
"memory_gb": 7.721559524536133, "step_time_ms": 3359.852075576782, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:45] (step=0010404) Train Loss: 0.1800, Train Steps/Sec: 0.28, Epoch: 0.20217644772638943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10405, "loss": 0.20240749418735504, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5170040130615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:48] (step=0010405) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.20219588029537505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10406, "loss": 0.17783817648887634, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3012142181396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:52] (step=0010406) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.20221531286436067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10407, "loss": 0.25356757640838623, "memory_gb": 7.721559524536133, "step_time_ms": 3358.698606491089, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:55] (step=0010407) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.2022347454333463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:25:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10408, "loss": 0.3479054272174835, "memory_gb": 7.721559524536133, "step_time_ms": 3351.715087890625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:25:59] (step=0010408) Train Loss: 0.3262, Train Steps/Sec: 0.28, Epoch: 0.20225417800233192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10409, "loss": 0.23209457099437714, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1129264831543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:02] (step=0010409) Train Loss: 
0.2280, Train Steps/Sec: 0.28, Epoch: 0.20227361057131754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10410, "loss": 0.22991099953651428, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3455295562744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:06] (step=0010410) Train Loss: 0.1995, Train Steps/Sec: 0.28, Epoch: 0.20229304314030314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10411, "loss": 0.2278187870979309, "memory_gb": 7.721559524536133, "step_time_ms": 3358.327627182007, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:10] (step=0010411) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.20231247570928876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10412, "loss": 0.3274655044078827, "memory_gb": 7.715639114379883, "step_time_ms": 3326.3444900512695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:13] (step=0010412) Train Loss: 0.3126, Train Steps/Sec: 0.28, Epoch: 0.20233190827827438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10413, "loss": 0.23101170361042023, "memory_gb": 7.721559524536133, "step_time_ms": 3359.896659851074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:17] (step=0010413) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.20235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10414, "loss": 0.2101868987083435, "memory_gb": 7.721559524536133, "step_time_ms": 3361.438751220703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:20] (step=0010414) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.20237077341624563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10415, 
"loss": 0.25240179896354675, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9629096984863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:24] (step=0010415) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.20239020598523125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10416, "loss": 0.2518610656261444, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6466007232666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:27] (step=0010416) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.20240963855421687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10417, "loss": 0.30695492029190063, "memory_gb": 7.721559524536133, "step_time_ms": 3369.554281234741, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:31] (step=0010417) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.2024290711232025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10418, "loss": 0.33581018447875977, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1062717437744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:35] (step=0010418) Train Loss: 0.2786, Train Steps/Sec: 0.28, Epoch: 0.20244850369218811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10419, "loss": 0.2810703217983246, "memory_gb": 7.721559524536133, "step_time_ms": 3361.454486846924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:38] (step=0010419) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.20246793626117374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10420, "loss": 0.1901722550392151, "memory_gb": 7.721559524536133, "step_time_ms": 3364.882707595825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:42] 
(step=0010420) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.20248736883015936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10421, "loss": 0.2846660614013672, "memory_gb": 7.721559524536133, "step_time_ms": 3365.90838432312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:45] (step=0010421) Train Loss: 0.2889, Train Steps/Sec: 0.28, Epoch: 0.20250680139914495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10422, "loss": 0.2979552447795868, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0238304138184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:49] (step=0010422) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.20252623396813058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10423, "loss": 0.1568349003791809, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6035194396973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:53] (step=0010423) Train Loss: 0.1926, Train Steps/Sec: 0.28, Epoch: 0.2025456665371162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:26:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10424, "loss": 0.20516714453697205, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3996295928955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:26:56] (step=0010424) Train Loss: 0.2794, Train Steps/Sec: 0.28, Epoch: 0.20256509910610182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:27:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10425, "loss": 0.3479050099849701, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9024448394775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:27:00] (step=0010425) Train Loss: 0.2802, Train Steps/Sec: 0.28, Epoch: 0.20258453167508744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:27:03] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 10426, "loss": 0.2841426432132721, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2076511383057, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:27:03] (step=0010426) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.20260396424407306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:27:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10427, "loss": 0.32901716232299805, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3502464294434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:27:07] (step=0010427) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.2026233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:27:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10428, "loss": 0.2098940908908844, "memory_gb": 7.721559524536133, "step_time_ms": 3366.262197494507, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:27:10] (step=0010428) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.2026428293820443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:27:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10429, "loss": 0.3153303861618042, "memory_gb": 7.721559524536133, "step_time_ms": 3367.29097366333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:27:14] (step=0010429) Train Loss: 0.3238, Train Steps/Sec: 0.27, Epoch: 0.20266226195102993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:27:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10430, "loss": 0.19747993350028992, "memory_gb": 7.721559524536133, "step_time_ms": 3359.710931777954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:27:18] (step=0010430) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.20268169452001555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:27:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10431, "loss": 0.18465165793895721, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8089637756348, "trainable_params": 4718592, 
"method": "lora"}
[2025-07-29 10:27:21] (step=0010431) Train Loss: 0.1820, Train Steps/Sec: 0.28, Epoch: 0.20270112708900118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10432, "loss": 0.18214663863182068, "memory_gb": 7.721559524536133, "step_time_ms": 3362.433910369873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:25] (step=0010432) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.2027205596579868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10433, "loss": 0.25480547547340393, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5339736938477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:29] (step=0010433) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.2027399922269724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10434, "loss": 0.18955570459365845, "memory_gb": 7.721559524536133, "step_time_ms": 3357.224225997925, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:32] (step=0010434) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.20275942479595802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10435, "loss": 0.2538508176803589, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5428676605225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:36] (step=0010435) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.20277885736494364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10436, "loss": 0.17116108536720276, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6514205932617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:39] (step=0010436) Train Loss: 0.2043, Train Steps/Sec: 0.28, Epoch: 0.20279828993392926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10437, "loss": 0.15081849694252014, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2559776306152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:43] (step=0010437) Train Loss: 0.1703, Train Steps/Sec: 0.28, Epoch: 0.20281772250291488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10438, "loss": 0.2243412733078003, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4208488464355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:46] (step=0010438) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.2028371550719005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10439, "loss": 0.21425992250442505, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3684692382812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:50] (step=0010439) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.20285658764088613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10440, "loss": 0.13594812154769897, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5451793670654, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:54] (step=0010440) Train Loss: 0.1640, Train Steps/Sec: 0.28, Epoch: 0.20287602020987175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:27:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10441, "loss": 0.15986214578151703, "memory_gb": 7.721559524536133, "step_time_ms": 3359.396457672119, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:27:57] (step=0010441) Train Loss: 0.1479, Train Steps/Sec: 0.28, Epoch: 0.20289545277885737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10442, "loss": 0.32457607984542847, "memory_gb": 7.721559524536133, "step_time_ms": 3356.525182723999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:01] (step=0010442) Train Loss: 0.2837, Train Steps/Sec: 0.28, Epoch: 0.202914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10443, "loss": 0.2513785660266876, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6414070129395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:04] (step=0010443) Train Loss: 0.3089, Train Steps/Sec: 0.28, Epoch: 0.20293431791682862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10444, "loss": 0.37814241647720337, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1979484558105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:08] (step=0010444) Train Loss: 0.2700, Train Steps/Sec: 0.28, Epoch: 0.20295375048581424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10445, "loss": 0.26399073004722595, "memory_gb": 7.721559524536133, "step_time_ms": 3506.4799785614014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:12] (step=0010445) Train Loss: 0.2700, Train Steps/Sec: 0.28, Epoch: 0.20297318305479983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10446, "loss": 0.21847349405288696, "memory_gb": 7.721559524536133, "step_time_ms": 3345.6931114196777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:15] (step=0010446) Train Loss: 0.2316, Train Steps/Sec: 0.28, Epoch: 0.20299261562378546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10447, "loss": 0.2937922775745392, "memory_gb": 7.721559524536133, "step_time_ms": 3361.725330352783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:19] (step=0010447) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.20301204819277108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10448, "loss": 0.2381046712398529, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2357635498047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:22] (step=0010448) Train Loss: 0.1939, Train Steps/Sec: 0.28, Epoch: 0.2030314807617567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10449, "loss": 0.20691357553005219, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1597595214844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:26] (step=0010449) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.20305091333074232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10450, "loss": 0.24888063967227936, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0141792297363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:29] (step=0010450) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.20307034589972794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10451, "loss": 0.2093498557806015, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0493717193604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:33] (step=0010451) Train Loss: 0.2702, Train Steps/Sec: 0.28, Epoch: 0.20308977846871357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10452, "loss": 0.17166240513324738, "memory_gb": 7.721559524536133, "step_time_ms": 3360.16583442688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:37] (step=0010452) Train Loss: 0.1950, Train Steps/Sec: 0.28, Epoch: 0.2031092110376992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10453, "loss": 0.2481236755847931, "memory_gb": 7.721559524536133, "step_time_ms": 3359.794855117798, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:40] (step=0010453) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.2031286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10454, "loss": 0.22134044766426086, "memory_gb": 7.721559524536133, "step_time_ms": 3359.076738357544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:44] (step=0010454) Train Loss: 0.1691, Train Steps/Sec: 0.28, Epoch: 0.20314807617567043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10455, "loss": 0.3638492226600647, "memory_gb": 7.721559524536133, "step_time_ms": 3360.484838485718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:47] (step=0010455) Train Loss: 0.3376, Train Steps/Sec: 0.28, Epoch: 0.20316750874465606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10456, "loss": 0.22320140898227692, "memory_gb": 7.721559524536133, "step_time_ms": 3361.673593521118, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:51] (step=0010456) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.20318694131364165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10457, "loss": 0.1794348508119583, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7915687561035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:54] (step=0010457) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.20320637388262727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:28:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10458, "loss": 0.23242364823818207, "memory_gb": 7.721559524536133, "step_time_ms": 3355.844497680664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:28:58] (step=0010458) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.2032258064516129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10459, "loss": 0.1545410454273224, "memory_gb": 7.721559524536133, "step_time_ms": 3354.374885559082, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:02] (step=0010459) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.20324523902059852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10460, "loss": 0.14045897126197815, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3364391326904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:05] (step=0010460) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.20326467158958414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10461, "loss": 0.21530720591545105, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7020587921143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:09] (step=0010461) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.20328410415856976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10462, "loss": 0.17641320824623108, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2027473449707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:12] (step=0010462) Train Loss: 0.1523, Train Steps/Sec: 0.28, Epoch: 0.20330353672755538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10463, "loss": 0.2777611315250397, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9557876586914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:16] (step=0010463) Train Loss: 0.2504, Train Steps/Sec: 0.28, Epoch: 0.203322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10464, "loss": 0.18710613250732422, "memory_gb": 7.721559524536133, "step_time_ms": 3360.971212387085, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:19] (step=0010464) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.20334240186552663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10465, "loss": 0.22575683891773224, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2302894592285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:23] (step=0010465) Train Loss: 0.1760, Train Steps/Sec: 0.28, Epoch: 0.20336183443451225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10466, "loss": 0.16776379942893982, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5379638671875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:27] (step=0010466) Train Loss: 0.2101, Train Steps/Sec: 0.28, Epoch: 0.20338126700349787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10467, "loss": 0.20184950530529022, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1630878448486, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:30] (step=0010467) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.2034006995724835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10468, "loss": 0.21220161020755768, "memory_gb": 7.721559524536133, "step_time_ms": 3344.001293182373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:34] (step=0010468) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.2034201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10469, "loss": 0.32347971200942993, "memory_gb": 7.721559524536133, "step_time_ms": 3359.495162963867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:37] (step=0010469) Train Loss: 0.3005, Train Steps/Sec: 0.27, Epoch: 0.2034395647104547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10470, "loss": 0.2025032639503479, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2370071411133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:41] (step=0010470) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.20345899727944033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10471, "loss": 0.2939172685146332, "memory_gb": 7.721559524536133, "step_time_ms": 3356.415271759033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:45] (step=0010471) Train Loss: 0.2514, Train Steps/Sec: 0.28, Epoch: 0.20347842984842596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10472, "loss": 0.217523992061615, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3603839874268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:48] (step=0010472) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.20349786241741158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10473, "loss": 0.1999916136264801, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2310886383057, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:52] (step=0010473) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.2035172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10474, "loss": 0.3200908303260803, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6085357666016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:55] (step=0010474) Train Loss: 0.2853, Train Steps/Sec: 0.28, Epoch: 0.20353672755538282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:29:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10475, "loss": 0.25035881996154785, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6160202026367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:29:59] (step=0010475) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.20355616012436845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10476, "loss": 0.28477969765663147, "memory_gb": 7.721559524536133, "step_time_ms": 3363.126277923584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:02] (step=0010476) Train Loss: 0.2633, Train Steps/Sec: 0.28, Epoch: 0.20357559269335407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10477, "loss": 0.22271814942359924, "memory_gb": 7.721559524536133, "step_time_ms": 3354.142189025879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:06] (step=0010477) Train Loss: 0.1877, Train Steps/Sec: 0.28, Epoch: 0.2035950252623397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10478, "loss": 0.17609846591949463, "memory_gb": 7.721559524536133, "step_time_ms": 3360.764265060425, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:10] (step=0010478) Train Loss: 0.1855, Train Steps/Sec: 0.28, Epoch: 0.2036144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10479, "loss": 0.25482624769210815, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2545051574707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:13] (step=0010479) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.2036338904003109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10480, "loss": 0.1543528139591217, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1256141662598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:17] (step=0010480) Train Loss: 0.1895, Train Steps/Sec: 0.28, Epoch: 0.20365332296929653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10481, "loss": 0.2887851297855377, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6672286987305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:20] (step=0010481) Train Loss: 0.2448, Train Steps/Sec: 0.28, Epoch: 0.20367275553828215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10482, "loss": 0.31934332847595215, "memory_gb": 7.721559524536133, "step_time_ms": 3361.814260482788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:24] (step=0010482) Train Loss: 0.2902, Train Steps/Sec: 0.28, Epoch: 0.20369218810726777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10483, "loss": 0.28466883301734924, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9188842773438, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:27] (step=0010483) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.2037116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10484, "loss": 0.16709430515766144, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6528301239014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:31] (step=0010484) Train Loss: 0.1710, Train Steps/Sec: 0.28, Epoch: 0.20373105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10485, "loss": 0.32392334938049316, "memory_gb": 7.721559524536133, "step_time_ms": 3504.709005355835, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:35] (step=0010485) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.20375048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10486, "loss": 0.20224611461162567, "memory_gb": 7.721559524536133, "step_time_ms": 3353.785276412964, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:38] (step=0010486) Train Loss: 0.2175, Train Steps/Sec: 0.28, Epoch: 0.20376991838321026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10487, "loss": 0.3286989629268646, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2605381011963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:42] (step=0010487) Train Loss: 0.3261, Train Steps/Sec: 0.28, Epoch: 0.20378935095219589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10488, "loss": 0.2658914625644684, "memory_gb": 7.721559524536133, "step_time_ms": 3358.546018600464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:45] (step=0010488) Train Loss: 0.2913, Train Steps/Sec: 0.28, Epoch: 0.2038087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10489, "loss": 0.1401950865983963, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5167140960693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:49] (step=0010489) Train Loss: 0.1668, Train Steps/Sec: 0.28, Epoch: 0.20382821609016713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10490, "loss": 0.20204772055149078, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9189777374268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:52] (step=0010490) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.20384764865915275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10491, "loss": 0.3027394413948059, "memory_gb": 7.721559524536133, "step_time_ms": 3356.675624847412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:56] (step=0010491) Train Loss: 0.2989, Train Steps/Sec: 0.28, Epoch: 0.20386708122813835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:30:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10492, "loss": 0.2458423674106598, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7897758483887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:30:59] (step=0010492) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.20388651379712397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10493, "loss": 0.12481184303760529, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2193851470947, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:03] (step=0010493) Train Loss: 0.1512, Train Steps/Sec: 0.28, Epoch: 0.2039059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10494, "loss": 0.19918498396873474, "memory_gb": 7.721559524536133, "step_time_ms": 3348.0567932128906, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:07] (step=0010494) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.20392537893509521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10495, "loss": 0.15698738396167755, "memory_gb": 7.721559524536133, "step_time_ms": 3350.109100341797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:10] (step=0010495) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.20394481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10496, "loss": 0.18184922635555267, "memory_gb": 7.721559524536133, "step_time_ms": 3358.814477920532, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:14] (step=0010496) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 0.20396424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10497, "loss": 0.2670307159423828, "memory_gb": 7.721559524536133, "step_time_ms": 3350.9163856506348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:17] (step=0010497) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.20398367664205208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10498, "loss": 0.17820951342582703, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9282550811768, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:21] (step=0010498) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.2040031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10499, "loss": 0.25307101011276245, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5011234283447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:25] (step=0010499) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.20402254178002333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10500, "loss": 0.2923307418823242, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1956157684326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:28] (step=0010500) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.20404197434900895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10501, "loss": 0.1293509155511856, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0330142974854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:32] (step=0010501) Train Loss: 0.1805, Train Steps/Sec: 0.28, Epoch: 0.20406140691799457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10502, "loss": 0.2725708484649658, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7028980255127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:35] (step=0010502) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.2040808394869802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10503, "loss": 0.19070985913276672, "memory_gb": 7.721559524536133, "step_time_ms": 3360.029458999634, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:39] (step=0010503) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.2041002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10504, "loss": 0.1995621621608734, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7870597839355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:42] (step=0010504) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.2041197046249514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10505, "loss": 0.27997875213623047, "memory_gb": 7.721559524536133, "step_time_ms": 3359.828233718872, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:46] (step=0010505) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.20413913719393703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10506, "loss": 0.16257557272911072, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3201637268066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:50] (step=0010506) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.20415856976292265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10507, "loss": 0.2168591022491455, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3363971710205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:53] (step=0010507) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.20417800233190828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:31:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10508, "loss": 0.21238954365253448, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1065406799316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:31:57] (step=0010508) Train Loss: 0.1619, Train Steps/Sec: 0.28, Epoch: 0.2041974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10509, "loss": 0.272819459438324, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7022552490234, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:00] (step=0010509) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.20421686746987952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10510, "loss": 0.24957424402236938, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7917041778564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:04] (step=0010510) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.20423630003886514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10511, "loss": 0.24642279744148254, "memory_gb": 7.721559524536133, "step_time_ms": 3348.724126815796, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:07] (step=0010511) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.20425573260785077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10512, "loss": 0.33549919724464417, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9302043914795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:11] (step=0010512) Train Loss: 0.2876, Train Steps/Sec: 0.28, Epoch: 0.2042751651768364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10513, "loss": 0.27764931321144104, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1107807159424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:15] (step=0010513) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.204294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10514, "loss": 0.1980562061071396, "memory_gb": 7.721559524536133, "step_time_ms": 3344.5985317230225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:18] (step=0010514) Train Loss: 0.3000, Train Steps/Sec: 0.28, Epoch: 0.2043140303148076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10515, "loss": 0.317618191242218, "memory_gb": 7.721559524536133, "step_time_ms": 3360.013246536255, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:22] (step=0010515) Train Loss: 0.3264, Train Steps/Sec: 0.28, Epoch: 0.20433346288379323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10516, "loss": 0.22928911447525024, "memory_gb": 7.721559524536133, "step_time_ms": 3345.477819442749, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:25] (step=0010516) Train Loss: 0.1912, Train Steps/Sec: 0.28, Epoch: 0.20435289545277885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10517, "loss": 0.2593831419944763, "memory_gb": 7.715639114379883, "step_time_ms": 3327.71635055542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:29] (step=0010517) Train Loss: 0.2180, Train Steps/Sec: 0.27, Epoch: 0.20437232802176447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10518, "loss": 0.19984513521194458, "memory_gb": 7.721559524536133, "step_time_ms": 3355.462074279785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:32] (step=0010518) Train Loss: 0.2746, Train Steps/Sec: 0.28, Epoch: 0.2043917605907501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10519, "loss": 0.10177900642156601, "memory_gb": 7.721559524536133, "step_time_ms": 3354.119300842285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:36] (step=0010519) Train Loss: 0.1510, Train Steps/Sec: 0.28, Epoch: 0.20441119315973572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10520, "loss": 0.27437376976013184, "memory_gb": 7.721559524536133, "step_time_ms": 3362.027645111084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:40] (step=0010520) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.20443062572872134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10521, "loss": 0.26651692390441895, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2941761016846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:43] (step=0010521) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.20445005829770696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10522, "loss": 0.20519472658634186, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7397289276123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:47] (step=0010522) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.20446949086669258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10523, "loss": 0.1622297763824463, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9676151275635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:50] (step=0010523) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.2044889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10524, "loss": 0.1703859567642212, "memory_gb": 7.721559524536133, "step_time_ms": 3352.365493774414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:54] (step=0010524) Train Loss: 0.1971, Train Steps/Sec: 0.28, Epoch: 0.20450835600466383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:32:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10525, "loss": 0.1968209594488144, "memory_gb": 7.721559524536133, "step_time_ms": 3361.177682876587, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:32:57] (step=0010525) Train Loss: 0.2107, Train Steps/Sec: 0.28, Epoch: 0.20452778857364945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10526, "loss": 0.2081642746925354, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0660514831543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:01] (step=0010526) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 0.20454722114263504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10527, "loss": 0.20215266942977905, "memory_gb": 7.721559524536133, "step_time_ms": 3502.147436141968, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:05] (step=0010527) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.20456665371162067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10528, "loss": 0.27895867824554443, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5246028900146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:08] (step=0010528) Train Loss: 0.2754, Train Steps/Sec: 0.28, Epoch: 0.2045860862806063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10529, "loss": 0.19601473212242126, "memory_gb": 7.721559524536133, "step_time_ms": 3358.938455581665, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:12] (step=0010529) Train Loss: 0.1787, Train Steps/Sec: 0.28, Epoch: 0.2046055188495919, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10530, "loss": 0.24674390256404877, "memory_gb": 7.721559524536133, "step_time_ms": 3364.664316177368, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:15] (step=0010530) Train Loss: 0.2747, Train Steps/Sec: 0.28, Epoch: 0.20462495141857753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10531, "loss": 0.26979580521583557, "memory_gb": 7.721559524536133, "step_time_ms": 3362.848997116089, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:19] (step=0010531) Train Loss: 0.2926, Train Steps/Sec: 0.28, Epoch: 0.20464438398756316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10532, "loss": 0.2635389566421509, "memory_gb": 7.721559524536133, "step_time_ms": 3356.428623199463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:23] (step=0010532) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.20466381655654878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10533, "loss": 0.2935405969619751, "memory_gb": 7.721559524536133, "step_time_ms": 3363.913059234619, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:26] (step=0010533) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.2046832491255344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10534, "loss": 0.181089848279953, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2516136169434, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:30] (step=0010534) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.20470268169452002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10535, "loss": 0.18008340895175934, "memory_gb": 7.721559524536133, "step_time_ms": 3365.220785140991, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:33] (step=0010535) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.20472211426350564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10536, "loss": 0.253237783908844, "memory_gb": 7.721559524536133, "step_time_ms": 3353.053092956543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:37] (step=0010536) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.20474154683249127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10537, "loss": 0.32546600699424744, "memory_gb": 7.721559524536133, "step_time_ms": 3359.945774078369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:40] (step=0010537) Train Loss: 0.3126, Train Steps/Sec: 0.28, Epoch: 0.20476097940147686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10538, "loss": 0.27431753277778625, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0566596984863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:44] (step=0010538) Train Loss: 0.2783, Train Steps/Sec: 0.28, Epoch: 0.20478041197046248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10539, "loss": 0.3055364489555359, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8509769439697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:48] (step=0010539) Train Loss: 0.3286, Train Steps/Sec: 0.28, Epoch: 0.2047998445394481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:33:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10540, "loss": 0.23985064029693604, "memory_gb": 7.721559524536133, "step_time_ms": 3352.473735809326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:33:51] (step=0010540) Train Loss: 0.2198, Train Steps/Sec: 0.28, Epoch: 0.20481927710843373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:33:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10541, "loss": 0.19997145235538483, "memory_gb": 7.721559524536133, "step_time_ms": 3365.753650665283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:33:55] (step=0010541) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.20483870967741935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:33:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10542, "loss": 0.34262725710868835, "memory_gb": 7.721559524536133, "step_time_ms": 3360.755681991577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:33:58] (step=0010542) Train Loss: 0.3020, Train Steps/Sec: 0.28, Epoch: 0.20485814224640497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10543, "loss": 0.18122762441635132, "memory_gb": 7.721559524536133, "step_time_ms": 3357.156753540039, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:02] (step=0010543) Train Loss: 0.2153, Train Steps/Sec: 0.28, Epoch: 0.2048775748153906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10544, "loss": 0.2978459596633911, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9514446258545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:05] (step=0010544) Train Loss: 0.2863, Train Steps/Sec: 0.28, Epoch: 0.20489700738437622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10545, "loss": 0.1608978509902954, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3572635650635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:09] (step=0010545) Train Loss: 0.2295, Train Steps/Sec: 0.28, Epoch: 0.20491643995336184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
10:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10546, "loss": 0.14721184968948364, "memory_gb": 7.721559524536133, "step_time_ms": 3361.701250076294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:13] (step=0010546) Train Loss: 0.2150, Train Steps/Sec: 0.28, Epoch: 0.20493587252234746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10547, "loss": 0.3377612233161926, "memory_gb": 7.721559524536133, "step_time_ms": 3359.304666519165, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:16] (step=0010547) Train Loss: 0.2895, Train Steps/Sec: 0.28, Epoch: 0.20495530509133308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10548, "loss": 0.23599594831466675, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5501232147217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:20] (step=0010548) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.2049747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10549, "loss": 0.2795594334602356, "memory_gb": 7.721559524536133, "step_time_ms": 3361.133575439453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:23] (step=0010549) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.2049941702293043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10550, "loss": 0.28502124547958374, "memory_gb": 7.721559524536133, "step_time_ms": 3348.039388656616, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:27] (step=0010550) Train Loss: 0.3037, Train Steps/Sec: 0.28, Epoch: 0.20501360279828992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10551, "loss": 0.2769562602043152, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1842651367188, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:31] (step=0010551) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.20503303536727555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10552, "loss": 0.1896962821483612, "memory_gb": 7.721559524536133, "step_time_ms": 3352.696180343628, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:34] (step=0010552) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.20505246793626117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10553, "loss": 0.2420777678489685, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8601818084717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:38] (step=0010553) Train Loss: 0.2033, Train Steps/Sec: 0.28, Epoch: 0.2050719005052468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10554, "loss": 0.2681080102920532, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8972091674805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:41] (step=0010554) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.2050913330742324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10555, "loss": 0.3247262239456177, "memory_gb": 7.721559524536133, "step_time_ms": 3350.543737411499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:45] (step=0010555) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.20511076564321803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10556, "loss": 0.22517506778240204, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0713996887207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:48] (step=0010556) Train Loss: 0.2358, Train Steps/Sec: 0.28, Epoch: 0.20513019821220366, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10557, "loss": 0.2574724853038788, "memory_gb": 7.721559524536133, "step_time_ms": 3360.65411567688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:52] (step=0010557) Train Loss: 0.2726, Train Steps/Sec: 0.27, Epoch: 0.20514963078118928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10558, "loss": 0.3155960738658905, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1254272460938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:56] (step=0010558) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.2051690633501749, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:34:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10559, "loss": 0.24162554740905762, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1670989990234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:34:59] (step=0010559) Train Loss: 0.1924, Train Steps/Sec: 0.28, Epoch: 0.20518849591916052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10560, "loss": 0.1608758568763733, "memory_gb": 7.721559524536133, "step_time_ms": 3360.166549682617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:03] (step=0010560) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.20520792848814615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10561, "loss": 0.20713868737220764, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7307720184326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:06] (step=0010561) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.20522736105713174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10562, "loss": 0.23368801176548004, "memory_gb": 7.721559524536133, 
"step_time_ms": 3357.6204776763916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:10] (step=0010562) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.20524679362611736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10563, "loss": 0.17487704753875732, "memory_gb": 7.721559524536133, "step_time_ms": 3358.355760574341, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:14] (step=0010563) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.20526622619510299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10564, "loss": 0.25785571336746216, "memory_gb": 7.721559524536133, "step_time_ms": 3341.236114501953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:17] (step=0010564) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.2052856587640886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10565, "loss": 0.263336181640625, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3530445098877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:21] (step=0010565) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.20530509133307423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10566, "loss": 0.19845986366271973, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3544025421143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:24] (step=0010566) Train Loss: 0.2712, Train Steps/Sec: 0.28, Epoch: 0.20532452390205985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10567, "loss": 0.12483590096235275, "memory_gb": 7.721559524536133, "step_time_ms": 3354.083776473999, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:28] (step=0010567) Train Loss: 0.1742, Train Steps/Sec: 0.28, Epoch: 
0.20534395647104547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10568, "loss": 0.14820867776870728, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6418628692627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:31] (step=0010568) Train Loss: 0.2134, Train Steps/Sec: 0.28, Epoch: 0.2053633890400311, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10569, "loss": 0.27709752321243286, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1229190826416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:35] (step=0010569) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.20538282160901672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10570, "loss": 0.25510716438293457, "memory_gb": 7.721559524536133, "step_time_ms": 3356.189966201782, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:39] (step=0010570) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.20540225417800234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10571, "loss": 0.18971402943134308, "memory_gb": 7.721559524536133, "step_time_ms": 3348.2792377471924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:42] (step=0010571) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.20542168674698796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10572, "loss": 0.2863579988479614, "memory_gb": 7.721559524536133, "step_time_ms": 3355.797290802002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:46] (step=0010572) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.20544111931597356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10573, "loss": 0.22128963470458984, 
"memory_gb": 7.721559524536133, "step_time_ms": 3359.951972961426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:49] (step=0010573) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.20546055188495918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10574, "loss": 0.15650343894958496, "memory_gb": 7.721559524536133, "step_time_ms": 3498.6510276794434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:53] (step=0010574) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.2054799844539448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:35:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10575, "loss": 0.30943742394447327, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7490367889404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:35:56] (step=0010575) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.20549941702293043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10576, "loss": 0.1768743395805359, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8693656921387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:00] (step=0010576) Train Loss: 0.1345, Train Steps/Sec: 0.28, Epoch: 0.20551884959191605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10577, "loss": 0.2233784943819046, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5354137420654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:04] (step=0010577) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.20553828216090167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10578, "loss": 0.2822864055633545, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9134006500244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:07] (step=0010578) Train Loss: 
0.2033, Train Steps/Sec: 0.28, Epoch: 0.2055577147298873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10579, "loss": 0.3339241147041321, "memory_gb": 7.721559524536133, "step_time_ms": 3354.478359222412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:11] (step=0010579) Train Loss: 0.3338, Train Steps/Sec: 0.28, Epoch: 0.20557714729887291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10580, "loss": 0.14189749956130981, "memory_gb": 7.721559524536133, "step_time_ms": 3360.079526901245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:14] (step=0010580) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.20559657986785854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10581, "loss": 0.22982840240001678, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5898876190186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:18] (step=0010581) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.20561601243684416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10582, "loss": 0.17744562029838562, "memory_gb": 7.715639114379883, "step_time_ms": 3315.952777862549, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:21] (step=0010582) Train Loss: 0.2011, Train Steps/Sec: 0.28, Epoch: 0.20563544500582978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10583, "loss": 0.26176950335502625, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2727909088135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:25] (step=0010583) Train Loss: 0.2924, Train Steps/Sec: 0.28, Epoch: 0.2056548775748154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 
10584, "loss": 0.289641797542572, "memory_gb": 7.721559524536133, "step_time_ms": 3354.365825653076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:29] (step=0010584) Train Loss: 0.2845, Train Steps/Sec: 0.28, Epoch: 0.205674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10585, "loss": 0.12416107207536697, "memory_gb": 7.721559524536133, "step_time_ms": 3355.362892150879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:32] (step=0010585) Train Loss: 0.1459, Train Steps/Sec: 0.28, Epoch: 0.20569374271278662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10586, "loss": 0.23565725982189178, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7051887512207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:36] (step=0010586) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.20571317528177224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10587, "loss": 0.3009515106678009, "memory_gb": 7.715639114379883, "step_time_ms": 3320.796012878418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:39] (step=0010587) Train Loss: 0.2891, Train Steps/Sec: 0.28, Epoch: 0.20573260785075786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10588, "loss": 0.25235211849212646, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5427322387695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:43] (step=0010588) Train Loss: 0.2912, Train Steps/Sec: 0.28, Epoch: 0.2057520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10589, "loss": 0.2948697805404663, "memory_gb": 7.721559524536133, "step_time_ms": 3356.752634048462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:46] 
(step=0010589) Train Loss: 0.2933, Train Steps/Sec: 0.28, Epoch: 0.2057714729887291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10590, "loss": 0.2408691942691803, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0756912231445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:50] (step=0010590) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.20579090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10591, "loss": 0.2082127332687378, "memory_gb": 7.721559524536133, "step_time_ms": 3357.21492767334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:54] (step=0010591) Train Loss: 0.1939, Train Steps/Sec: 0.28, Epoch: 0.20581033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:36:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10592, "loss": 0.2468917816877365, "memory_gb": 7.721559524536133, "step_time_ms": 3342.7603244781494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:36:57] (step=0010592) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.20582977069568598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10593, "loss": 0.2560625970363617, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1415672302246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:01] (step=0010593) Train Loss: 0.2778, Train Steps/Sec: 0.28, Epoch: 0.2058492032646716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10594, "loss": 0.24811215698719025, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9882431030273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:04] (step=0010594) Train Loss: 0.2748, Train Steps/Sec: 0.28, Epoch: 0.20586863583365722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:08] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 10595, "loss": 0.30827707052230835, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8169555664062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:08] (step=0010595) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.20588806840264284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10596, "loss": 0.3028413653373718, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4722747802734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:11] (step=0010596) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.20590750097162844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10597, "loss": 0.25859859585762024, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5643043518066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:15] (step=0010597) Train Loss: 0.2090, Train Steps/Sec: 0.28, Epoch: 0.20592693354061406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10598, "loss": 0.23670728504657745, "memory_gb": 7.721559524536133, "step_time_ms": 3361.994981765747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:19] (step=0010598) Train Loss: 0.2197, Train Steps/Sec: 0.27, Epoch: 0.20594636610959968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10599, "loss": 0.1545555740594864, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4460487365723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:22] (step=0010599) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.2059657986785853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10600, "loss": 0.1926361322402954, "memory_gb": 7.721559524536133, "step_time_ms": 3357.372760772705, "trainable_params": 4718592, "method": 
"lora"} [2025-07-29 10:37:26] (step=0010600) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.20598523124757093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10601, "loss": 0.18382200598716736, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2463989257812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:29] (step=0010601) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.20600466381655655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10602, "loss": 0.16494333744049072, "memory_gb": 7.721559524536133, "step_time_ms": 3343.069553375244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:33] (step=0010602) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.20602409638554217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10603, "loss": 0.144433856010437, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3007583618164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:36] (step=0010603) Train Loss: 0.1808, Train Steps/Sec: 0.28, Epoch: 0.2060435289545278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10604, "loss": 0.1423879861831665, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0425510406494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:40] (step=0010604) Train Loss: 0.1710, Train Steps/Sec: 0.28, Epoch: 0.20606296152351342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10605, "loss": 0.29051515460014343, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4256687164307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:44] (step=0010605) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.20608239409249904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 10:37:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10606, "loss": 0.29275256395339966, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6907176971436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:47] (step=0010606) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.20610182666148466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10607, "loss": 0.20018118619918823, "memory_gb": 7.721559524536133, "step_time_ms": 3363.851547241211, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:51] (step=0010607) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.20612125923047026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10608, "loss": 0.19987323880195618, "memory_gb": 7.721559524536133, "step_time_ms": 3360.835313796997, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:54] (step=0010608) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.20614069179945588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:37:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10609, "loss": 0.20536653697490692, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1698150634766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:37:58] (step=0010609) Train Loss: 0.2753, Train Steps/Sec: 0.28, Epoch: 0.2061601243684415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10610, "loss": 0.22428445518016815, "memory_gb": 7.721559524536133, "step_time_ms": 3360.732078552246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:01] (step=0010610) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.20617955693742712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10611, "loss": 0.1979471743106842, "memory_gb": 7.721559524536133, "step_time_ms": 3349.989175796509, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:05] (step=0010611) Train Loss: 0.2450, Train Steps/Sec: 0.28, Epoch: 0.20619898950641274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10612, "loss": 0.2493647336959839, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4585704803467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:09] (step=0010612) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.20621842207539837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10613, "loss": 0.3190591633319855, "memory_gb": 7.721559524536133, "step_time_ms": 3363.361358642578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:12] (step=0010613) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.206237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10614, "loss": 0.21719138324260712, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2821617126465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:16] (step=0010614) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.2062572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10615, "loss": 0.23312434554100037, "memory_gb": 7.721559524536133, "step_time_ms": 3493.515729904175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:19] (step=0010615) Train Loss: 0.2153, Train Steps/Sec: 0.28, Epoch: 0.20627671978235523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10616, "loss": 0.1690877079963684, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6609058380127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:23] (step=0010616) Train Loss: 0.2122, Train Steps/Sec: 0.28, Epoch: 0.20629615235134086, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10617, "loss": 0.25422412157058716, "memory_gb": 7.721559524536133, "step_time_ms": 3363.898992538452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:26] (step=0010617) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.20631558492032648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10618, "loss": 0.18977637588977814, "memory_gb": 7.721559524536133, "step_time_ms": 3363.436698913574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:30] (step=0010618) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.2063350174893121, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10619, "loss": 0.19138722121715546, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6116046905518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:34] (step=0010619) Train Loss: 0.2715, Train Steps/Sec: 0.28, Epoch: 0.2063544500582977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10620, "loss": 0.29263293743133545, "memory_gb": 7.721559524536133, "step_time_ms": 3364.279270172119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:37] (step=0010620) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.20637388262728332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10621, "loss": 0.23326635360717773, "memory_gb": 7.721559524536133, "step_time_ms": 3345.376491546631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:41] (step=0010621) Train Loss: 0.2705, Train Steps/Sec: 0.29, Epoch: 0.20639331519626894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10622, "loss": 0.1529402732849121, "memory_gb": 7.721559524536133, 
"step_time_ms": 3360.3715896606445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:44] (step=0010622) Train Loss: 0.1842, Train Steps/Sec: 0.28, Epoch: 0.20641274776525456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10623, "loss": 0.20163027942180634, "memory_gb": 7.721559524536133, "step_time_ms": 3364.593267440796, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:48] (step=0010623) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.20643218033424018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10624, "loss": 0.22837556898593903, "memory_gb": 7.721559524536133, "step_time_ms": 3366.02520942688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:51] (step=0010624) Train Loss: 0.2028, Train Steps/Sec: 0.28, Epoch: 0.2064516129032258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10625, "loss": 0.31999754905700684, "memory_gb": 7.721559524536133, "step_time_ms": 3363.25740814209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:55] (step=0010625) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.20647104547221143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:38:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10626, "loss": 0.35594287514686584, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2547855377197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:38:59] (step=0010626) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.20649047804119705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10627, "loss": 0.2096734642982483, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6557121276855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:02] (step=0010627) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 
0.20650991061018267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10628, "loss": 0.23672854900360107, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0904216766357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:06] (step=0010628) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.2065293431791683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10629, "loss": 0.17587873339653015, "memory_gb": 7.721559524536133, "step_time_ms": 3363.837718963623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:09] (step=0010629) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.20654877574815392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10630, "loss": 0.29422885179519653, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5286769866943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:13] (step=0010630) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.2065682083171395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10631, "loss": 0.2010771930217743, "memory_gb": 7.721559524536133, "step_time_ms": 3364.865779876709, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:17] (step=0010631) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.20658764088612513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10632, "loss": 0.30551379919052124, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9485626220703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:20] (step=0010632) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.20660707345511076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10633, "loss": 0.24608993530273438, 
"memory_gb": 7.721559524536133, "step_time_ms": 3366.539716720581, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:24] (step=0010633) Train Loss: 0.1890, Train Steps/Sec: 0.28, Epoch: 0.20662650602409638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10634, "loss": 0.18968454003334045, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3295764923096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:27] (step=0010634) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.206645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10635, "loss": 0.31838035583496094, "memory_gb": 7.721559524536133, "step_time_ms": 3367.656946182251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:31] (step=0010635) Train Loss: 0.2834, Train Steps/Sec: 0.28, Epoch: 0.20666537116206762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10636, "loss": 0.21768797934055328, "memory_gb": 7.721559524536133, "step_time_ms": 3365.345239639282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:34] (step=0010636) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.20668480373105325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10637, "loss": 0.21686716377735138, "memory_gb": 7.721559524536133, "step_time_ms": 3358.036756515503, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:38] (step=0010637) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.20670423630003887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10638, "loss": 0.2766445279121399, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8812980651855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:42] (step=0010638) Train Loss: 0.2148, 
Train Steps/Sec: 0.28, Epoch: 0.2067236688690245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10639, "loss": 0.2902687191963196, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4249019622803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:45] (step=0010639) Train Loss: 0.2754, Train Steps/Sec: 0.28, Epoch: 0.2067431014380101, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10640, "loss": 0.2929037809371948, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2068214416504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:49] (step=0010640) Train Loss: 0.2873, Train Steps/Sec: 0.28, Epoch: 0.20676253400699574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10641, "loss": 0.14597058296203613, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9116497039795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:52] (step=0010641) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.20678196657598136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:39:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10642, "loss": 0.24654528498649597, "memory_gb": 7.721559524536133, "step_time_ms": 3363.240957260132, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:39:56] (step=0010642) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.20680139914496695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10643, "loss": 0.270857572555542, "memory_gb": 7.721559524536133, "step_time_ms": 3364.074230194092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:00] (step=0010643) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.20682083171395257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10644, 
"loss": 0.29975610971450806, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6369800567627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:03] (step=0010644) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.2068402642829382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10645, "loss": 0.15105730295181274, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2737865448, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:07] (step=0010645) Train Loss: 0.2083, Train Steps/Sec: 0.27, Epoch: 0.20685969685192382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10646, "loss": 0.2503993809223175, "memory_gb": 7.721559524536133, "step_time_ms": 3347.1248149871826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:10] (step=0010646) Train Loss: 0.2930, Train Steps/Sec: 0.28, Epoch: 0.20687912942090944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10647, "loss": 0.2048492133617401, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6995601654053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:14] (step=0010647) Train Loss: 0.1929, Train Steps/Sec: 0.28, Epoch: 0.20689856198989506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10648, "loss": 0.14790961146354675, "memory_gb": 7.721559524536133, "step_time_ms": 3352.70619392395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:18] (step=0010648) Train Loss: 0.2334, Train Steps/Sec: 0.28, Epoch: 0.20691799455888069, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10649, "loss": 0.3798108994960785, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8621826171875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:21] 
(step=0010649) Train Loss: 0.3307, Train Steps/Sec: 0.28, Epoch: 0.2069374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10650, "loss": 0.22767353057861328, "memory_gb": 7.721559524536133, "step_time_ms": 3357.717275619507, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:25] (step=0010650) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.20695685969685193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10651, "loss": 0.18826374411582947, "memory_gb": 7.721559524536133, "step_time_ms": 3361.102819442749, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:28] (step=0010651) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.20697629226583755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10652, "loss": 0.237216055393219, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2730293273926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:32] (step=0010652) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.20699572483482317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10653, "loss": 0.247882679104805, "memory_gb": 7.721559524536133, "step_time_ms": 3361.015796661377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:36] (step=0010653) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.2070151574038088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10654, "loss": 0.2306247353553772, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3483276367188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:39] (step=0010654) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.2070345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:43] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 10655, "loss": 0.2203378677368164, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3498516082764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:43] (step=0010655) Train Loss: 0.1883, Train Steps/Sec: 0.28, Epoch: 0.20705402254178001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10656, "loss": 0.23614484071731567, "memory_gb": 7.721559524536133, "step_time_ms": 3356.429100036621, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:46] (step=0010656) Train Loss: 0.3144, Train Steps/Sec: 0.28, Epoch: 0.20707345511076564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10657, "loss": 0.1171703189611435, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8476696014404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:50] (step=0010657) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.20709288767975126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10658, "loss": 0.21945150196552277, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5899600982666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:53] (step=0010658) Train Loss: 0.2088, Train Steps/Sec: 0.28, Epoch: 0.20711232024873688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:40:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10659, "loss": 0.2618628740310669, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8289489746094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:40:57] (step=0010659) Train Loss: 0.2855, Train Steps/Sec: 0.28, Epoch: 0.2071317528177225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10660, "loss": 0.2906530797481537, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8007831573486, "trainable_params": 4718592, "method": 
"lora"} [2025-07-29 10:41:01] (step=0010660) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.20715118538670813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10661, "loss": 0.2449324131011963, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0047874450684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:04] (step=0010661) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.20717061795569375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10662, "loss": 0.25343677401542664, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9225540161133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:08] (step=0010662) Train Loss: 0.2706, Train Steps/Sec: 0.28, Epoch: 0.20719005052467937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10663, "loss": 0.298092782497406, "memory_gb": 7.721559524536133, "step_time_ms": 3499.274969100952, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:11] (step=0010663) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.207209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10664, "loss": 0.18259933590888977, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0921421051025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:15] (step=0010664) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.20722891566265061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10665, "loss": 0.30373042821884155, "memory_gb": 7.721559524536133, "step_time_ms": 3358.811616897583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:18] (step=0010665) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.2072483482316362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
10:41:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10666, "loss": 0.1644352674484253, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2262268066406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:22] (step=0010666) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.20726778080062183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10667, "loss": 0.28216737508773804, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4157485961914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:26] (step=0010667) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.20728721336960745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10668, "loss": 0.22417965531349182, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3943843841553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:29] (step=0010668) Train Loss: 0.2497, Train Steps/Sec: 0.28, Epoch: 0.20730664593859308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10669, "loss": 0.29046350717544556, "memory_gb": 7.721559524536133, "step_time_ms": 3354.753017425537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:33] (step=0010669) Train Loss: 0.2955, Train Steps/Sec: 0.28, Epoch: 0.2073260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10670, "loss": 0.24528035521507263, "memory_gb": 7.721559524536133, "step_time_ms": 3356.240749359131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:36] (step=0010670) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.20734551107656432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10671, "loss": 0.24922290444374084, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5117378234863, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:40] (step=0010671) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.20736494364554994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10672, "loss": 0.1698056310415268, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4674129486084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:44] (step=0010672) Train Loss: 0.1688, Train Steps/Sec: 0.28, Epoch: 0.20738437621453557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10673, "loss": 0.166632279753685, "memory_gb": 7.721559524536133, "step_time_ms": 3347.0349311828613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:47] (step=0010673) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.2074038087835212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10674, "loss": 0.24541349709033966, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0384769439697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:51] (step=0010674) Train Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.2074232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10675, "loss": 0.1691514104604721, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3924560546875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:54] (step=0010675) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.20744267392149243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:41:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10676, "loss": 0.26424744725227356, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9628047943115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:41:58] (step=0010676) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.20746210649047805, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10677, "loss": 0.16110104322433472, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7019958496094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:01] (step=0010677) Train Loss: 0.1865, Train Steps/Sec: 0.28, Epoch: 0.20748153905946365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10678, "loss": 0.2518952488899231, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1016578674316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:05] (step=0010678) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.20750097162844927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10679, "loss": 0.3461821973323822, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3243618011475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:09] (step=0010679) Train Loss: 0.3261, Train Steps/Sec: 0.28, Epoch: 0.2075204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10680, "loss": 0.2010692059993744, "memory_gb": 7.721559524536133, "step_time_ms": 3352.384090423584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:12] (step=0010680) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.20753983676642052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10681, "loss": 0.23795786499977112, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1390895843506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:16] (step=0010681) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.20755926933540614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10682, "loss": 0.18472012877464294, "memory_gb": 7.721559524536133, 
"step_time_ms": 3350.4106998443604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:19] (step=0010682) Train Loss: 0.1923, Train Steps/Sec: 0.28, Epoch: 0.20757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10683, "loss": 0.23843854665756226, "memory_gb": 7.721559524536133, "step_time_ms": 3353.386402130127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:23] (step=0010683) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.20759813447337738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10684, "loss": 0.16103096306324005, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4954738616943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:26] (step=0010684) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.207617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10685, "loss": 0.3002164959907532, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5281372070312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:30] (step=0010685) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.20763699961134863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10686, "loss": 0.2926490604877472, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4511280059814, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:34] (step=0010686) Train Loss: 0.2231, Train Steps/Sec: 0.27, Epoch: 0.20765643218033425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10687, "loss": 0.2590804994106293, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6125164031982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:37] (step=0010687) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 
0.20767586474931987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10688, "loss": 0.3091273009777069, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6084632873535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:41] (step=0010688) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.20769529731830547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10689, "loss": 0.21443109214305878, "memory_gb": 7.721559524536133, "step_time_ms": 3353.595018386841, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:45] (step=0010689) Train Loss: 0.1814, Train Steps/Sec: 0.28, Epoch: 0.2077147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10690, "loss": 0.2714894413948059, "memory_gb": 7.721559524536133, "step_time_ms": 3357.320785522461, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:48] (step=0010690) Train Loss: 0.3096, Train Steps/Sec: 0.28, Epoch: 0.2077341624562767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10691, "loss": 0.12380366027355194, "memory_gb": 7.721559524536133, "step_time_ms": 3356.17733001709, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:52] (step=0010691) Train Loss: 0.1831, Train Steps/Sec: 0.28, Epoch: 0.20775359502526233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10692, "loss": 0.18111102283000946, "memory_gb": 7.721559524536133, "step_time_ms": 3358.144760131836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:55] (step=0010692) Train Loss: 0.2429, Train Steps/Sec: 0.28, Epoch: 0.20777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:42:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10693, "loss": 0.38208258152008057, 
"memory_gb": 7.721559524536133, "step_time_ms": 3340.960741043091, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:42:59] (step=0010693) Train Loss: 0.3270, Train Steps/Sec: 0.28, Epoch: 0.20779246016323358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10694, "loss": 0.2727394998073578, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0429553985596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:02] (step=0010694) Train Loss: 0.2247, Train Steps/Sec: 0.28, Epoch: 0.2078118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10695, "loss": 0.20056533813476562, "memory_gb": 7.721559524536133, "step_time_ms": 3357.041358947754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:06] (step=0010695) Train Loss: 0.1939, Train Steps/Sec: 0.28, Epoch: 0.20783132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10696, "loss": 0.19130827486515045, "memory_gb": 7.721559524536133, "step_time_ms": 3356.414794921875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:09] (step=0010696) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.20785075787019044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10697, "loss": 0.21310095489025116, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0168743133545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:13] (step=0010697) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.20787019043917607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10698, "loss": 0.16164487600326538, "memory_gb": 7.721559524536133, "step_time_ms": 3356.084108352661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:17] (step=0010698) Train Loss: 
0.1804, Train Steps/Sec: 0.28, Epoch: 0.2078896230081617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10699, "loss": 0.15365815162658691, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8429222106934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:20] (step=0010699) Train Loss: 0.2017, Train Steps/Sec: 0.28, Epoch: 0.2079090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10700, "loss": 0.20572718977928162, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5093746185303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:24] (step=0010700) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.2079284881461329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10701, "loss": 0.14074990153312683, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9188842773438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:27] (step=0010701) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.20794792071511853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10702, "loss": 0.23932155966758728, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1887950897217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:31] (step=0010702) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.20796735328410415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10703, "loss": 0.24182669818401337, "memory_gb": 7.721559524536133, "step_time_ms": 3496.7432022094727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:34] (step=0010703) Train Loss: 0.2738, Train Steps/Sec: 0.28, Epoch: 0.20798678585308977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 
10704, "loss": 0.2757963538169861, "memory_gb": 7.721559524536133, "step_time_ms": 3360.811471939087, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:38] (step=0010704) Train Loss: 0.2455, Train Steps/Sec: 0.28, Epoch: 0.2080062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10705, "loss": 0.2433539479970932, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2938232421875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:42] (step=0010705) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.20802565099106102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10706, "loss": 0.16968777775764465, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5357151031494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:45] (step=0010706) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.20804508356004664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10707, "loss": 0.27379271388053894, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5309467315674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:49] (step=0010707) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.20806451612903226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10708, "loss": 0.2587870955467224, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1081581115723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:43:52] (step=0010708) Train Loss: 0.2523, Train Steps/Sec: 0.28, Epoch: 0.20808394869801788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:43:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10709, "loss": 0.23070289194583893, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7940254211426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
10:43:56] (step=0010709) Train Loss: 0.2024, Train Steps/Sec: 0.28, Epoch: 0.2081033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:43:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10710, "loss": 0.23056581616401672, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8389625549316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:43:59] (step=0010710) Train Loss: 0.2674, Train Steps/Sec: 0.28, Epoch: 0.20812281383598913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10711, "loss": 0.2558891773223877, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7797412872314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:03] (step=0010711) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.20814224640497475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10712, "loss": 0.2505621910095215, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1154346466064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:07] (step=0010712) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.20816167897396035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10713, "loss": 0.1519310176372528, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7077808380127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:10] (step=0010713) Train Loss: 0.1919, Train Steps/Sec: 0.28, Epoch: 0.20818111154294597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10714, "loss": 0.17331674695014954, "memory_gb": 7.721559524536133, "step_time_ms": 3355.975866317749, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:14] (step=0010714) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.2082005441119316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10715, "loss": 0.26953572034835815, "memory_gb": 7.721559524536133, "step_time_ms": 3359.070062637329, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:17] (step=0010715) Train Loss: 0.2889, Train Steps/Sec: 0.28, Epoch: 0.2082199766809172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10716, "loss": 0.33258724212646484, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8621826171875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:21] (step=0010716) Train Loss: 0.2858, Train Steps/Sec: 0.28, Epoch: 0.20823940924990283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10717, "loss": 0.2567296624183655, "memory_gb": 7.721559524536133, "step_time_ms": 3361.907482147217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:25] (step=0010717) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.20825884181888846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10718, "loss": 0.31702277064323425, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3758811950684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:28] (step=0010718) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.20827827438787408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10719, "loss": 0.3078494369983673, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2487621307373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:32] (step=0010719) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.2082977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10720, "loss": 0.27976468205451965, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0689640045166, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:35] (step=0010720) Train Loss: 0.2752, Train Steps/Sec: 0.28, Epoch: 0.20831713952584532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10721, "loss": 0.21450433135032654, "memory_gb": 7.721559524536133, "step_time_ms": 3361.100435256958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:39] (step=0010721) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.20833657209483095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10722, "loss": 0.2342551350593567, "memory_gb": 7.721559524536133, "step_time_ms": 3352.167844772339, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:42] (step=0010722) Train Loss: 0.2142, Train Steps/Sec: 0.28, Epoch: 0.20835600466381657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10723, "loss": 0.23038677871227264, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8004837036133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:46] (step=0010723) Train Loss: 0.2223, Train Steps/Sec: 0.28, Epoch: 0.20837543723280216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10724, "loss": 0.21184931695461273, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5871925354004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:50] (step=0010724) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.20839486980178779, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10725, "loss": 0.23446573317050934, "memory_gb": 7.721559524536133, "step_time_ms": 3358.628749847412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:53] (step=0010725) Train Loss: 0.2800, Train Steps/Sec: 0.28, Epoch: 0.2084143023707734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:44:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10726, "loss": 0.27494317293167114, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1037521362305, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:44:57] (step=0010726) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.20843373493975903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10727, "loss": 0.24630248546600342, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2725524902344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:00] (step=0010727) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.20845316750874465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10728, "loss": 0.3079379200935364, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0525856018066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:04] (step=0010728) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.20847260007773027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10729, "loss": 0.21043738722801208, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3200702667236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:08] (step=0010729) Train Loss: 0.1967, Train Steps/Sec: 0.28, Epoch: 0.2084920326467159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10730, "loss": 0.23907525837421417, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8591346740723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:11] (step=0010730) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.20851146521570152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10731, "loss": 0.2723735570907593, "memory_gb": 7.721559524536133, "step_time_ms": 3359.954357147217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:15] (step=0010731) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.20853089778468714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10732, "loss": 0.228502556681633, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9253005981445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:18] (step=0010732) Train Loss: 0.1851, Train Steps/Sec: 0.28, Epoch: 0.20855033035367276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10733, "loss": 0.2778497636318207, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4049892425537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:22] (step=0010733) Train Loss: 0.2779, Train Steps/Sec: 0.28, Epoch: 0.20856976292265839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10734, "loss": 0.21588599681854248, "memory_gb": 7.721559524536133, "step_time_ms": 3355.742931365967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:26] (step=0010734) Train Loss: 0.2313, Train Steps/Sec: 0.27, Epoch: 0.208589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10735, "loss": 0.28067871928215027, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8974895477295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:29] (step=0010735) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.2086086280606296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10736, "loss": 0.2741852104663849, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1152381896973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:33] (step=0010736) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.20862806062961523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10737, "loss": 0.19400173425674438, "memory_gb": 7.721559524536133, "step_time_ms": 3363.187313079834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:36] (step=0010737) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.20864749319860085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10738, "loss": 0.14867544174194336, "memory_gb": 7.721559524536133, "step_time_ms": 3356.316328048706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:40] (step=0010738) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.20866692576758647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10739, "loss": 0.14845146238803864, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9375743865967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:44] (step=0010739) Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.2086863583365721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10740, "loss": 0.28681525588035583, "memory_gb": 7.721559524536133, "step_time_ms": 3346.8527793884277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:47] (step=0010740) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.20870579090555771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10741, "loss": 0.18930204212665558, "memory_gb": 7.721559524536133, "step_time_ms": 3360.860586166382, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:51] (step=0010741) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.20872522347454334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10742, "loss": 0.16875407099723816, "memory_gb": 7.721559524536133, "step_time_ms": 3361.304759979248, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:54] (step=0010742) Train Loss: 0.2262, Train Steps/Sec: 0.28, Epoch: 0.20874465604352896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:45:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10743, "loss": 0.22222289443016052, "memory_gb": 7.721559524536133, "step_time_ms": 3361.699104309082, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:45:58] (step=0010743) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.20876408861251458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10744, "loss": 0.2873963713645935, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4470024108887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:01] (step=0010744) Train Loss: 0.3208, Train Steps/Sec: 0.28, Epoch: 0.2087835211815002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10745, "loss": 0.2514345645904541, "memory_gb": 7.721559524536133, "step_time_ms": 3349.893569946289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:05] (step=0010745) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.20880295375048583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10746, "loss": 0.3122681975364685, "memory_gb": 7.721559524536133, "step_time_ms": 3361.59610748291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:09] (step=0010746) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.20882238631947142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10747, "loss": 0.28640925884246826, "memory_gb": 7.721559524536133, "step_time_ms": 3361.091136932373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:12] (step=0010747) Train Loss: 0.2175, Train Steps/Sec: 0.28, Epoch: 0.20884181888845704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10748, "loss": 0.19203893840312958, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2829189300537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:16] (step=0010748) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.20886125145744266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10749, "loss": 0.34103864431381226, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7921600341797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:19] (step=0010749) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.2088806840264283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10750, "loss": 0.2611689567565918, "memory_gb": 7.721559524536133, "step_time_ms": 3498.769521713257, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:23] (step=0010750) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.2089001165954139, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10751, "loss": 0.2259424328804016, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8526248931885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:27] (step=0010751) Train Loss: 0.2842, Train Steps/Sec: 0.28, Epoch: 0.20891954916439953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10752, "loss": 0.22588828206062317, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6288738250732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:30] (step=0010752) Train Loss: 0.2391, Train Steps/Sec: 0.28, Epoch: 0.20893898173338515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10753, "loss": 0.11951088160276413, "memory_gb": 7.721559524536133, "step_time_ms": 3360.138177871704, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:34] (step=0010753) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.20895841430237078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10754, "loss": 0.1847684532403946, "memory_gb": 7.721559524536133, "step_time_ms": 3355.238676071167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:37] (step=0010754) Train Loss: 0.2489, Train Steps/Sec: 0.28, Epoch: 0.2089778468713564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10755, "loss": 0.1813032180070877, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5846633911133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:41] (step=0010755) Train Loss: 0.2232, Train Steps/Sec: 0.28, Epoch: 0.20899727944034202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10756, "loss": 0.2586838901042938, "memory_gb": 7.721559524536133, "step_time_ms": 3358.311414718628, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:44] (step=0010756) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.20901671200932764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10757, "loss": 0.20768707990646362, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5995693206787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:48] (step=0010757) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.20903614457831327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10758, "loss": 0.3348221182823181, "memory_gb": 7.715639114379883, "step_time_ms": 3329.6291828155518, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:52] (step=0010758) Train Loss: 0.3014, Train Steps/Sec: 0.28, Epoch: 0.20905557714729886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10759, "loss": 0.2958683371543884, "memory_gb": 7.721559524536133, "step_time_ms": 3361.692428588867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:55] (step=0010759) Train Loss: 0.2780, Train Steps/Sec: 0.28, Epoch: 0.20907500971628448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10760, "loss": 0.19869129359722137, "memory_gb": 7.721559524536133, "step_time_ms": 3347.743511199951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:46:59] (step=0010760) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.2090944422852701, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10761, "loss": 0.1717613935470581, "memory_gb": 7.721559524536133, "step_time_ms": 3354.841947555542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:02] (step=0010761) Train Loss: 0.1919, Train Steps/Sec: 0.28, Epoch: 0.20911387485425573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10762, "loss": 0.16438761353492737, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8096885681152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:06] (step=0010762) Train Loss: 0.1790, Train Steps/Sec: 0.28, Epoch: 0.20913330742324135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10763, "loss": 0.2639043629169464, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9134941101074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:09] (step=0010763) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.20915273999222697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10764, "loss": 0.29055699706077576, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9087982177734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:13] (step=0010764) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.2091721725612126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10765, "loss": 0.24651333689689636, "memory_gb": 7.721559524536133, "step_time_ms": 3346.198558807373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:17] (step=0010765) Train Loss: 0.2662, Train Steps/Sec: 0.28, Epoch: 0.20919160513019822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10766, "loss": 0.24552804231643677, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8972091674805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:20] (step=0010766) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.20921103769918384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10767, "loss": 0.2781313359737396, "memory_gb": 7.721559524536133, "step_time_ms": 3352.619171142578, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:24] (step=0010767) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.20923047026816946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10768, "loss": 0.24182076752185822, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5897941589355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:27] (step=0010768) Train Loss: 0.1907, Train Steps/Sec: 0.28, Epoch: 0.20924990283715508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10769, "loss": 0.24828565120697021, "memory_gb": 7.721559524536133, "step_time_ms": 3357.832908630371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:31] (step=0010769) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.2092693354061407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10770, "loss": 0.17810936272144318, "memory_gb": 7.721559524536133, "step_time_ms": 3357.694625854492, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:34] (step=0010770) Train Loss: 0.2232, Train Steps/Sec: 0.28, Epoch: 0.2092887679751263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10771, "loss": 0.2887139320373535, "memory_gb": 7.715639114379883, "step_time_ms": 3323.4078884124756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:38] (step=0010771) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.20930820054411192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10772, "loss": 0.26801902055740356, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8992614746094, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:42] (step=0010772) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.20932763311309754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10773, "loss": 0.2858245074748993, "memory_gb": 7.721559524536133, "step_time_ms": 3355.350971221924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:45] (step=0010773) Train Loss: 0.2938, Train Steps/Sec: 0.28, Epoch: 0.20934706568208317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10774, "loss": 0.17290137708187103, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4765243530273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:49] (step=0010774) Train Loss: 0.2135, Train Steps/Sec: 0.27, Epoch: 0.2093664982510688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10775, "loss": 0.15304136276245117, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5917015075684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:52] (step=0010775) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.2093859308200544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:47:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10776, "loss": 0.23542729020118713, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4138412475586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:47:56] (step=0010776) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.20940536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10777, "loss": 0.26781097054481506, "memory_gb": 7.721559524536133, "step_time_ms": 3343.7070846557617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:00] (step=0010777) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.20942479595802566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10778, "loss": 0.29677557945251465, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7872257232666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:03] (step=0010778) Train Loss: 0.2656, Train Steps/Sec: 0.28, Epoch: 0.20944422852701128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10779, "loss": 0.22270017862319946, "memory_gb": 7.721559524536133, "step_time_ms": 3355.402708053589, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:07] (step=0010779) Train Loss: 0.2134, Train Steps/Sec: 0.28, Epoch: 0.2094636610959969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10780, "loss": 0.29699912667274475, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6795330047607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:10] (step=0010780) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.20948309366498252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10781, "loss": 0.24538549780845642, "memory_gb": 7.721559524536133, "step_time_ms": 3355.435848236084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:14] (step=0010781) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.20950252623396812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10782, "loss": 0.3207240104675293, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5074882507324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:17] (step=0010782) Train Loss: 0.2848, Train Steps/Sec: 0.28, Epoch: 0.20952195880295374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10783, "loss": 0.18325930833816528, "memory_gb": 7.721559524536133, "step_time_ms": 3355.980634689331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:21] (step=0010783) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.20954139137193936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10784, "loss": 0.13682517409324646, "memory_gb": 7.721559524536133, "step_time_ms": 3349.9157428741455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:25] (step=0010784) Train Loss: 0.2382, Train Steps/Sec: 0.28, Epoch: 0.20956082394092498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10785, "loss": 0.2792712450027466, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0102710723877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:28] (step=0010785) Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.2095802565099106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10786, "loss": 0.204370379447937, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9626808166504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:32] (step=0010786) Train Loss: 0.1760, Train Steps/Sec: 0.28, Epoch: 0.20959968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10787, "loss": 0.18725740909576416, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5326976776123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:35] (step=0010787) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.20961912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10788, "loss": 0.17321816086769104, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6953201293945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:39] (step=0010788) Train Loss: 0.2202, Train Steps/Sec: 0.28, Epoch: 0.20963855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10789, "loss": 0.13197223842144012, "memory_gb": 7.721559524536133, "step_time_ms": 3357.555389404297, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:42] (step=0010789) Train Loss: 0.1683, Train Steps/Sec: 0.28, Epoch: 0.2096579867858531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10790, "loss": 0.3312379717826843, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9485206604004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:46] (step=0010790) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.20967741935483872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10791, "loss": 0.1647346168756485, "memory_gb": 7.721559524536133, "step_time_ms": 3501.0387897491455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:50] (step=0010791) Train Loss: 0.1737, Train Steps/Sec: 0.28, Epoch: 0.20969685192382434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10792, "loss": 0.17522624135017395, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8460941314697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:53] (step=0010792) Train Loss: 0.2176, Train Steps/Sec: 0.28, Epoch: 0.20971628449280996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:48:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10793, "loss": 0.17336803674697876, "memory_gb": 7.721559524536133, "step_time_ms": 3357.693910598755, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:48:57] (step=0010793) Train Loss: 0.1702, Train Steps/Sec: 0.28, Epoch: 0.20973571706179556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10794, "loss": 0.24311447143554688, "memory_gb": 7.721559524536133, "step_time_ms": 3350.353002548218, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:00] (step=0010794) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.20975514963078118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10795, "loss": 0.21540118753910065, "memory_gb": 7.721559524536133, "step_time_ms": 3358.522653579712, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:04] (step=0010795) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.2097745821997668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10796, "loss": 0.35080116987228394, "memory_gb": 7.721559524536133, "step_time_ms": 3356.127977371216, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:07] (step=0010796) Train Loss: 0.2873, Train Steps/Sec: 0.28, Epoch: 0.20979401476875242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10797, "loss": 0.26182684302330017, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8765907287598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:11] (step=0010797) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.20981344733773805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10798, "loss": 0.31481581926345825, "memory_gb": 7.721559524536133, "step_time_ms": 3355.407953262329, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:15] (step=0010798) Train Loss: 0.2734, Train Steps/Sec: 0.28, Epoch: 0.20983287990672367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10799, "loss": 0.2106100469827652, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4159450531006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:18] (step=0010799) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.2098523124757093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10800, "loss": 0.2234322875738144, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8778038024902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:22] (step=0010800) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.2098717450446949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10801, "loss": 0.25804370641708374, "memory_gb": 7.721559524536133, "step_time_ms": 3358.694314956665, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:25] (step=0010801) Train Loss: 0.2877, Train Steps/Sec: 0.28, Epoch: 0.20989117761368054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10802, "loss": 0.17958566546440125, "memory_gb": 7.721559524536133, "step_time_ms": 3354.811668395996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:29] (step=0010802) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.20991061018266616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10803, "loss": 0.2191447615623474, "memory_gb": 7.721559524536133, "step_time_ms": 3353.092670440674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:32] (step=0010803) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.20993004275165178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10804, "loss": 0.22992642223834991, "memory_gb": 7.721559524536133, "step_time_ms": 3354.374885559082, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:36] (step=0010804) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.2099494753206374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10805, "loss": 0.16047470271587372, "memory_gb": 7.721559524536133, "step_time_ms": 3359.692096710205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:40] (step=0010805) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.209968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10806, "loss": 0.1917911171913147, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8102378845215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:43] (step=0010806) Train Loss: 0.2049, Train Steps/Sec: 0.28, Epoch: 0.20998834045860862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10807, "loss": 0.1599917709827423, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4144115448, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:47] (step=0010807) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.21000777302759424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10808, "loss": 0.2483053058385849, "memory_gb": 7.721559524536133, "step_time_ms": 3357.335329055786, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:50] (step=0010808) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.21002720559657986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10809, "loss": 0.34269779920578003, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3129692077637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:54] (step=0010809) Train Loss: 0.3133, Train Steps/Sec: 0.28, Epoch: 0.21004663816556549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:49:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10810, "loss": 0.1649320125579834, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9787998199463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:49:57] (step=0010810) Train Loss: 0.2037, Train Steps/Sec: 0.28, Epoch: 0.2100660707345511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10811, "loss": 0.29071447253227234, "memory_gb": 7.721559524536133, "step_time_ms": 3353.098154067993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:01] (step=0010811) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.21008550330353673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10812, "loss": 0.1470271646976471, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4484119415283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:04] (step=0010812) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.21010493587252235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10813, "loss": 0.14190682768821716, "memory_gb": 7.721559524536133, "step_time_ms": 3359.92431640625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:08] (step=0010813) Train Loss: 0.2033, Train Steps/Sec: 0.28, Epoch: 0.21012436844150797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10814, "loss": 0.16692392528057098, "memory_gb": 7.721559524536133, "step_time_ms": 3358.649730682373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:12] (step=0010814) Train Loss: 0.2474, Train Steps/Sec: 0.28, Epoch: 0.2101438010104936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10815, "loss": 0.2683195471763611, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1034202575684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:15] (step=0010815) Train Loss: 0.2781, Train Steps/Sec: 0.28, Epoch: 0.21016323357947922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10816, "loss": 0.32380539178848267, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0740642547607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:19] (step=0010816) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.21018266614846481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10817, "loss": 0.2157740741968155, "memory_gb": 7.721559524536133, "step_time_ms": 3363.193988800049, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:22] (step=0010817) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.21020209871745044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10818, "loss": 0.29716020822525024, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0584106445312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:26] (step=0010818) Train Loss: 0.3237, Train Steps/Sec: 0.28, Epoch: 0.21022153128643606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10819, "loss": 0.2207319438457489, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7893714904785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:29] (step=0010819) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.21024096385542168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10820, "loss": 0.15214848518371582, "memory_gb": 7.721559524536133, "step_time_ms": 3364.062786102295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:33] (step=0010820) Train Loss: 0.1957, Train Steps/Sec: 0.28, Epoch: 0.2102603964244073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10821, "loss": 0.19070535898208618, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3809814453125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:37] (step=0010821) Train Loss: 0.2575, Train Steps/Sec: 0.27, Epoch: 0.21027982899339293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10822, "loss": 0.10602757334709167, "memory_gb": 7.721559524536133, "step_time_ms": 3361.809015274048, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:40] (step=0010822) Train Loss: 0.2076, Train Steps/Sec: 0.28, Epoch: 0.21029926156237855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:50:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10823, "loss": 0.2926437258720398, "memory_gb": 7.721559524536133, "step_time_ms": 3361.624002456665, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:50:44] (step=0010823) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.21031869413136417, LR: 0.001, Memory: 7.72GB, Params:
4,718,592 [2025-07-29 10:50:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10824, "loss": 0.17588938772678375, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3711853027344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:50:47] (step=0010824) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.2103381267003498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:50:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10825, "loss": 0.2735239863395691, "memory_gb": 7.721559524536133, "step_time_ms": 3365.161895751953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:50:51] (step=0010825) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.21035755926933541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:50:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10826, "loss": 0.21625840663909912, "memory_gb": 7.721559524536133, "step_time_ms": 3346.984386444092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:50:55] (step=0010826) Train Loss: 0.2358, Train Steps/Sec: 0.28, Epoch: 0.21037699183832104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:50:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10827, "loss": 0.36702603101730347, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4440898895264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:50:58] (step=0010827) Train Loss: 0.3035, Train Steps/Sec: 0.28, Epoch: 0.21039642440730666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10828, "loss": 0.21710334718227386, "memory_gb": 7.721559524536133, "step_time_ms": 3364.105701446533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:02] (step=0010828) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.21041585697629225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10829, "loss": 0.2327442467212677, "memory_gb": 7.721559524536133, "step_time_ms": 
3359.7922325134277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:05] (step=0010829) Train Loss: 0.2802, Train Steps/Sec: 0.28, Epoch: 0.21043528954527788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10830, "loss": 0.3221711814403534, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6324615478516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:09] (step=0010830) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.2104547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10831, "loss": 0.21836018562316895, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5522384643555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:12] (step=0010831) Train Loss: 0.2611, Train Steps/Sec: 0.28, Epoch: 0.21047415468324912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10832, "loss": 0.23383083939552307, "memory_gb": 7.721559524536133, "step_time_ms": 3507.3471069335938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:16] (step=0010832) Train Loss: 0.2493, Train Steps/Sec: 0.28, Epoch: 0.21049358725223474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10833, "loss": 0.25011152029037476, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1339588165283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:20] (step=0010833) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.21051301982122037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10834, "loss": 0.24045713245868683, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3133850097656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:23] (step=0010834) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 
0.210532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10835, "loss": 0.36116212606430054, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9908561706543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:27] (step=0010835) Train Loss: 0.2255, Train Steps/Sec: 0.28, Epoch: 0.2105518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10836, "loss": 0.13562238216400146, "memory_gb": 7.721559524536133, "step_time_ms": 3368.650436401367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:30] (step=0010836) Train Loss: 0.1715, Train Steps/Sec: 0.28, Epoch: 0.21057131752817723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10837, "loss": 0.2193998098373413, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3221855163574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:34] (step=0010837) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.21059075009716285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10838, "loss": 0.13809657096862793, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5550689697266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:37] (step=0010838) Train Loss: 0.1654, Train Steps/Sec: 0.28, Epoch: 0.21061018266614848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10839, "loss": 0.2023800164461136, "memory_gb": 7.721559524536133, "step_time_ms": 3362.436532974243, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:41] (step=0010839) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.21062961523513407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10840, "loss": 0.2271692454814911, 
"memory_gb": 7.721559524536133, "step_time_ms": 3362.175703048706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:45] (step=0010840) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.2106490478041197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10841, "loss": 0.3138083815574646, "memory_gb": 7.715639114379883, "step_time_ms": 3337.3794555664062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:48] (step=0010841) Train Loss: 0.3125, Train Steps/Sec: 0.28, Epoch: 0.21066848037310532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10842, "loss": 0.32037585973739624, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6064319610596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:52] (step=0010842) Train Loss: 0.2494, Train Steps/Sec: 0.28, Epoch: 0.21068791294209094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10843, "loss": 0.2428416609764099, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2677841186523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:55] (step=0010843) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.21070734551107656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:51:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10844, "loss": 0.29601600766181946, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3642921447754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:51:59] (step=0010844) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.21072677808006218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10845, "loss": 0.18894876539707184, "memory_gb": 7.721559524536133, "step_time_ms": 3348.4461307525635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:02] (step=0010845) Train Loss: 
0.2702, Train Steps/Sec: 0.28, Epoch: 0.2107462106490478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10846, "loss": 0.26835379004478455, "memory_gb": 7.721559524536133, "step_time_ms": 3364.52579498291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:06] (step=0010846) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.21076564321803343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10847, "loss": 0.13343241810798645, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2099628448486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:10] (step=0010847) Train Loss: 0.1702, Train Steps/Sec: 0.28, Epoch: 0.21078507578701905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10848, "loss": 0.25630930066108704, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8765087127686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:13] (step=0010848) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.21080450835600467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10849, "loss": 0.19429436326026917, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6246662139893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:17] (step=0010849) Train Loss: 0.1860, Train Steps/Sec: 0.28, Epoch: 0.2108239409249903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10850, "loss": 0.21549075841903687, "memory_gb": 7.721559524536133, "step_time_ms": 3363.877534866333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:20] (step=0010850) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.21084337349397592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 
10851, "loss": 0.2990671694278717, "memory_gb": 7.721559524536133, "step_time_ms": 3361.25111579895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:24] (step=0010851) Train Loss: 0.2969, Train Steps/Sec: 0.28, Epoch: 0.2108628060629615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10852, "loss": 0.266011506319046, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4847660064697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:28] (step=0010852) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.21088223863194713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10853, "loss": 0.26527875661849976, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7143630981445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:31] (step=0010853) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.21090167120093276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10854, "loss": 0.12241724133491516, "memory_gb": 7.721559524536133, "step_time_ms": 3348.8998413085938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:35] (step=0010854) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.21092110376991838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10855, "loss": 0.34364962577819824, "memory_gb": 7.721559524536133, "step_time_ms": 3360.714912414551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:38] (step=0010855) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.210940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10856, "loss": 0.25715172290802, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1478385925293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:42] 
(step=0010856) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.21095996890788962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10857, "loss": 0.2442677915096283, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8084449768066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:45] (step=0010857) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.21097940147687524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10858, "loss": 0.27139812707901, "memory_gb": 7.721559524536133, "step_time_ms": 3348.0021953582764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:49] (step=0010858) Train Loss: 0.2823, Train Steps/Sec: 0.28, Epoch: 0.21099883404586087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10859, "loss": 0.2501443326473236, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0056896209717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:53] (step=0010859) Train Loss: 0.2727, Train Steps/Sec: 0.28, Epoch: 0.2110182666148465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:52:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10860, "loss": 0.19936615228652954, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3813133239746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:52:56] (step=0010860) Train Loss: 0.1709, Train Steps/Sec: 0.28, Epoch: 0.2110376991838321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10861, "loss": 0.187770277261734, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6853065490723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:00] (step=0010861) Train Loss: 0.1744, Train Steps/Sec: 0.28, Epoch: 0.21105713175281773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:03] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 10862, "loss": 0.25926029682159424, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4184856414795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:03] (step=0010862) Train Loss: 0.2965, Train Steps/Sec: 0.27, Epoch: 0.21107656432180336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10863, "loss": 0.18711885809898376, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3217391967773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:07] (step=0010863) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.21109599689078895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10864, "loss": 0.2750139534473419, "memory_gb": 7.715639114379883, "step_time_ms": 3323.650598526001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:11] (step=0010864) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.21111542945977457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10865, "loss": 0.38223329186439514, "memory_gb": 7.721559524536133, "step_time_ms": 3359.818935394287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:14] (step=0010865) Train Loss: 0.3205, Train Steps/Sec: 0.28, Epoch: 0.2111348620287602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10866, "loss": 0.35775724053382874, "memory_gb": 7.721559524536133, "step_time_ms": 3350.313901901245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:18] (step=0010866) Train Loss: 0.3264, Train Steps/Sec: 0.28, Epoch: 0.21115429459774582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10867, "loss": 0.2277248054742813, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2604656219482, "trainable_params": 4718592, "method": 
"lora"} [2025-07-29 10:53:21] (step=0010867) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.21117372716673144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10868, "loss": 0.21078690886497498, "memory_gb": 7.721559524536133, "step_time_ms": 3356.795310974121, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:25] (step=0010868) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.21119315973571706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10869, "loss": 0.22990083694458008, "memory_gb": 7.721559524536133, "step_time_ms": 3362.006902694702, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:28] (step=0010869) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.21121259230470268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10870, "loss": 0.31335556507110596, "memory_gb": 7.721559524536133, "step_time_ms": 3352.689266204834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:32] (step=0010870) Train Loss: 0.2189, Train Steps/Sec: 0.28, Epoch: 0.2112320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10871, "loss": 0.23094435036182404, "memory_gb": 7.721559524536133, "step_time_ms": 3358.865737915039, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:36] (step=0010871) Train Loss: 0.1862, Train Steps/Sec: 0.28, Epoch: 0.21125145744267393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10872, "loss": 0.23409610986709595, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5376014709473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:39] (step=0010872) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.21127089001165955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 10:53:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10873, "loss": 0.33062371611595154, "memory_gb": 7.721559524536133, "step_time_ms": 3348.4418392181396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:43] (step=0010873) Train Loss: 0.2999, Train Steps/Sec: 0.28, Epoch: 0.21129032258064517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10874, "loss": 0.17447930574417114, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9140644073486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:46] (step=0010874) Train Loss: 0.1831, Train Steps/Sec: 0.28, Epoch: 0.21130975514963077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10875, "loss": 0.18953575193881989, "memory_gb": 7.721559524536133, "step_time_ms": 3346.2231159210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:50] (step=0010875) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.2113291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10876, "loss": 0.2646944522857666, "memory_gb": 7.721559524536133, "step_time_ms": 3358.680248260498, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:53] (step=0010876) Train Loss: 0.3026, Train Steps/Sec: 0.28, Epoch: 0.211348620287602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:53:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10877, "loss": 0.10776595771312714, "memory_gb": 7.721559524536133, "step_time_ms": 3354.884147644043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:53:57] (step=0010877) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.21136805285658763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10878, "loss": 0.20224204659461975, "memory_gb": 7.721559524536133, "step_time_ms": 3351.083993911743, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:00] (step=0010878) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.21138748542557326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10879, "loss": 0.21111488342285156, "memory_gb": 7.721559524536133, "step_time_ms": 3356.10294342041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:04] (step=0010879) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.21140691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10880, "loss": 0.201386958360672, "memory_gb": 7.721559524536133, "step_time_ms": 3490.931272506714, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:08] (step=0010880) Train Loss: 0.1955, Train Steps/Sec: 0.28, Epoch: 0.2114263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10881, "loss": 0.33627015352249146, "memory_gb": 7.721559524536133, "step_time_ms": 3357.842206954956, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:11] (step=0010881) Train Loss: 0.3297, Train Steps/Sec: 0.28, Epoch: 0.21144578313253012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10882, "loss": 0.30280783772468567, "memory_gb": 7.721559524536133, "step_time_ms": 3358.335494995117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:15] (step=0010882) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.21146521570151575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10883, "loss": 0.25828656554222107, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7355918884277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:18] (step=0010883) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.21148464827050137, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10884, "loss": 0.3012222349643707, "memory_gb": 7.721559524536133, "step_time_ms": 3352.642059326172, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:22] (step=0010884) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.211504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10885, "loss": 0.16185754537582397, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7322750091553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:25] (step=0010885) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.2115235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10886, "loss": 0.28142493963241577, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5896282196045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:29] (step=0010886) Train Loss: 0.2255, Train Steps/Sec: 0.28, Epoch: 0.2115429459774582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10887, "loss": 0.3301931321620941, "memory_gb": 7.721559524536133, "step_time_ms": 3349.076986312866, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:33] (step=0010887) Train Loss: 0.2749, Train Steps/Sec: 0.28, Epoch: 0.21156237854644383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10888, "loss": 0.20383815467357635, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2065620422363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:36] (step=0010888) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.21158181111542945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10889, "loss": 0.18836791813373566, "memory_gb": 7.721559524536133, 
"step_time_ms": 3351.916790008545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:40] (step=0010889) Train Loss: 0.2023, Train Steps/Sec: 0.28, Epoch: 0.21160124368441507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10890, "loss": 0.3477495014667511, "memory_gb": 7.721559524536133, "step_time_ms": 3357.278347015381, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:43] (step=0010890) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.2116206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10891, "loss": 0.18217629194259644, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8789024353027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:47] (step=0010891) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.21164010882238632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10892, "loss": 0.2689363360404968, "memory_gb": 7.721559524536133, "step_time_ms": 3348.324775695801, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:50] (step=0010892) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.21165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10893, "loss": 0.28065454959869385, "memory_gb": 7.721559524536133, "step_time_ms": 3353.877067565918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:54] (step=0010893) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.21167897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 10:54:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10894, "loss": 0.2209339141845703, "memory_gb": 7.721559524536133, "step_time_ms": 3352.1041870117188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 10:54:58] (step=0010894) Train Loss: 0.1952, Train Steps/Sec: 0.28, Epoch: 
0.21169840652934319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10895, "loss": 0.3746439814567566, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4085960388184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:01] (step=0010895) Train Loss: 0.3180, Train Steps/Sec: 0.28, Epoch: 0.2117178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10896, "loss": 0.31111401319503784, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9297580718994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:05] (step=0010896) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.21173727166731443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10897, "loss": 0.10966984927654266, "memory_gb": 7.721559524536133, "step_time_ms": 3350.412368774414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:08] (step=0010897) Train Loss: 0.1545, Train Steps/Sec: 0.28, Epoch: 0.21175670423630003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10898, "loss": 0.1413356512784958, "memory_gb": 7.721559524536133, "step_time_ms": 3340.348720550537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:12] (step=0010898) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.21177613680528565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10899, "loss": 0.1758812516927719, "memory_gb": 7.721559524536133, "step_time_ms": 3355.470657348633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:15] (step=0010899) Train Loss: 0.1723, Train Steps/Sec: 0.28, Epoch: 0.21179556937427127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10900, "loss": 0.21781961619853973, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8823642730713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:19] (step=0010900) Train Loss: 0.1970, Train Steps/Sec: 0.28, Epoch: 0.2118150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10901, "loss": 0.22287294268608093, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2696495056152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:22] (step=0010901) Train Loss: 0.1772, Train Steps/Sec: 0.28, Epoch: 0.21183443451224251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10902, "loss": 0.23925301432609558, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1548252105713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:26] (step=0010902) Train Loss: 0.1931, Train Steps/Sec: 0.28, Epoch: 0.21185386708122814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10903, "loss": 0.21440567076206207, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9602756500244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:30] (step=0010903) Train Loss: 0.2512, Train Steps/Sec: 0.28, Epoch: 0.21187329965021376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10904, "loss": 0.16884447634220123, "memory_gb": 7.721559524536133, "step_time_ms": 3351.4466285705566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:33] (step=0010904) Train Loss: 0.1792, Train Steps/Sec: 0.28, Epoch: 0.21189273221919938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10905, "loss": 0.22992843389511108, "memory_gb": 7.721559524536133, "step_time_ms": 3339.4432067871094, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:37] (step=0010905) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.211912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10906, "loss": 0.337189257144928, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4673919677734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:40] (step=0010906) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.21193159735717063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10907, "loss": 0.184644877910614, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1672954559326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:44] (step=0010907) Train Loss: 0.1791, Train Steps/Sec: 0.28, Epoch: 0.21195102992615625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10908, "loss": 0.29527685046195984, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1127395629883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:47] (step=0010908) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.21197046249514187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10909, "loss": 0.28048795461654663, "memory_gb": 7.721559524536133, "step_time_ms": 3350.505828857422, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:51] (step=0010909) Train Loss: 0.2698, Train Steps/Sec: 0.28, Epoch: 0.21198989506412746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10910, "loss": 0.2717241942882538, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0920696258545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:55] (step=0010910) Train Loss: 0.2453, Train Steps/Sec: 0.27, Epoch: 0.2120093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:55:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10911, "loss": 0.31666719913482666, "memory_gb": 7.721559524536133, "step_time_ms": 3357.445478439331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:55:58] (step=0010911) Train Loss: 0.2593, Train Steps/Sec: 0.28, Epoch: 0.2120287602020987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10912, "loss": 0.26155364513397217, "memory_gb": 7.721559524536133, "step_time_ms": 3355.867624282837, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:02] (step=0010912) Train Loss: 0.2638, Train Steps/Sec: 0.28, Epoch: 0.21204819277108433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10913, "loss": 0.14515221118927002, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6218662261963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:05] (step=0010913) Train Loss: 0.1513, Train Steps/Sec: 0.28, Epoch: 0.21206762534006995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10914, "loss": 0.12725433707237244, "memory_gb": 7.721559524536133, "step_time_ms": 3349.879741668701, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:09] (step=0010914) Train Loss: 0.1807, Train Steps/Sec: 0.28, Epoch: 0.21208705790905558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10915, "loss": 0.2576977610588074, "memory_gb": 7.721559524536133, "step_time_ms": 3360.431671142578, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:12] (step=0010915) Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.2121064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 10916, "loss": 0.10094568878412247, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0311279296875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:16] (step=0010916) Train Loss: 0.1735, Train Steps/Sec: 0.28, Epoch: 0.21212592304702682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10917, "loss": 0.2550591826438904, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2555103302, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:20] (step=0010917) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.21214535561601244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 10918, "loss": 0.1901201605796814, "memory_gb": 7.721559524536133, "step_time_ms": 3355.221748352051, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:23] (step=0010918) Train Loss: 0.1905, Train Steps/Sec: 0.28, Epoch: 0.21216478818499807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10919, "loss": 0.24645310640335083, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8504791259766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:27] (step=0010919) Train Loss: 0.1886, Train Steps/Sec: 0.28, Epoch: 0.2121842207539837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10920, "loss": 0.28352972865104675, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6438636779785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:30] (step=0010920) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.2122036533229693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 10921, "loss": 0.2877734899520874, "memory_gb": 7.721559524536133, "step_time_ms": 3498.5315799713135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:34] (step=0010921) Train Loss: 0.2716, Train Steps/Sec: 0.28, Epoch: 0.2122230858919549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10922, "loss": 0.22950391471385956, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5540523529053, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:37] (step=0010922) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.21224251846094053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 10923, "loss": 0.17018041014671326, "memory_gb": 7.721559524536133, "step_time_ms": 3359.529972076416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:41] (step=0010923) Train Loss: 0.1746, Train Steps/Sec: 0.28, Epoch: 0.21226195102992615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10924, "loss": 0.2498357743024826, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4154167175293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:45] (step=0010924) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.21228138359891177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 10925, "loss": 0.35687243938446045, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3706665039062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:48] (step=0010925) Train Loss: 0.3195, Train Steps/Sec: 0.28, Epoch: 0.2123008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10926, "loss": 0.2245868444442749, "memory_gb": 7.721559524536133, "step_time_ms": 3359.807014465332, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:52] (step=0010926) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.21232024873688302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10927, "loss": 0.3075125217437744, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0162315368652, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:55] (step=0010927) Train Loss: 0.2706, Train Steps/Sec: 0.28, Epoch: 0.21233968130586864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:56:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 10928, "loss": 0.30626893043518066, "memory_gb": 7.721559524536133, "step_time_ms": 3360.443115234375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:56:59] (step=0010928) Train Loss: 0.3026, Train Steps/Sec: 0.28, Epoch: 0.21235911387485426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10929, "loss": 0.26577067375183105, "memory_gb": 7.721559524536133, "step_time_ms": 3357.860803604126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:02] (step=0010929) Train Loss: 0.2055, Train Steps/Sec: 0.28, Epoch: 0.21237854644383988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 10930, "loss": 0.2595702111721039, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7754917144775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:06] (step=0010930) Train Loss: 0.2944, Train Steps/Sec: 0.28, Epoch: 0.2123979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10931, "loss": 0.18988873064517975, "memory_gb": 7.721559524536133, "step_time_ms": 3359.375476837158, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:10] (step=0010931) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.21241741158181113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 10932, "loss": 0.13083197176456451, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3947162628174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:13] (step=0010932) Train Loss: 0.1946, Train Steps/Sec: 0.28, Epoch: 0.21243684415079672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 10933, "loss": 0.16884228587150574, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1300506591797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:17] (step=0010933) Train Loss: 0.1716, Train Steps/Sec: 0.28, Epoch: 0.21245627671978234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 10934, "loss": 0.2716553807258606, "memory_gb": 7.721559524536133, "step_time_ms": 3362.724781036377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:20] (step=0010934) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.21247570928876797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 10935, "loss": 0.188193678855896, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1338863372803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:24] (step=0010935) Train Loss: 0.2514, Train Steps/Sec: 0.28, Epoch: 0.2124951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 10936, "loss": 0.32134443521499634, "memory_gb": 7.721559524536133, "step_time_ms": 3362.774133682251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:27] (step=0010936) Train Loss: 0.2807, Train Steps/Sec: 0.28, Epoch: 0.2125145744267392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 10937, "loss": 0.24829353392124176, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5784130096436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:31] (step=0010937) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.21253400699572483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10938, "loss": 0.25668495893478394, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2030487060547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:35] (step=0010938) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.21255343956471046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 10939, "loss": 0.14693094789981842, "memory_gb": 7.721559524536133, "step_time_ms": 3362.73455619812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:38] (step=0010939) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.21257287213369608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 10940, "loss": 0.16499726474285126, "memory_gb": 7.721559524536133, "step_time_ms": 3358.814239501953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:42] (step=0010940) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.2125923047026817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 10941, "loss": 0.22265858948230743, "memory_gb": 7.721559524536133, "step_time_ms": 3361.501932144165, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:45] (step=0010941) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.21261173727166732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 10942, "loss": 0.23144002258777618, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2143058776855, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:49] (step=0010942) Train Loss: 0.2445, Train Steps/Sec: 0.28, Epoch: 0.21263116984065294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 10943, "loss": 0.11906156688928604, "memory_gb": 7.721559524536133, "step_time_ms": 3358.879566192627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:52] (step=0010943) Train Loss: 0.1369, Train Steps/Sec: 0.28, Epoch: 0.21265060240963857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:57:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 10944, "loss": 0.32033276557922363, "memory_gb": 7.715639114379883, "step_time_ms": 3330.3661346435547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:57:56] (step=0010944) Train Loss: 0.3081, Train Steps/Sec: 0.28, Epoch: 0.21267003497862416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10945, "loss": 0.197648286819458, "memory_gb": 7.721559524536133, "step_time_ms": 3360.152244567871, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:00] (step=0010945) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.21268946754760978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 10946, "loss": 0.21186120808124542, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9843254089355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:03] (step=0010946) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.2127089001165954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 10947, "loss": 0.30688735842704773, "memory_gb": 7.721559524536133, "step_time_ms": 3359.090566635132, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:07] (step=0010947) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.21272833268558103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 10948, "loss": 0.3140370547771454, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2371215820312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:10] (step=0010948) Train Loss: 0.3207, Train Steps/Sec: 0.28, Epoch: 0.21274776525456665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 10949, "loss": 0.2065819650888443, "memory_gb": 7.721559524536133, "step_time_ms": 3356.078624725342, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:14] (step=0010949) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.21276719782355227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10950, "loss": 0.23883774876594543, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3133220672607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:18] (step=0010950) Train Loss: 0.2706, Train Steps/Sec: 0.27, Epoch: 0.2127866303925379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 10951, "loss": 0.30185216665267944, "memory_gb": 7.721559524536133, "step_time_ms": 3358.718156814575, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:21] (step=0010951) Train Loss: 0.2836, Train Steps/Sec: 0.28, Epoch: 0.21280606296152352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10952, "loss": 0.35065165162086487, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8732013702393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:25] (step=0010952) Train Loss: 0.2811, Train Steps/Sec: 0.28, Epoch: 0.21282549553050914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 10953, "loss": 0.25688764452934265, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3389053344727, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:28] (step=0010953) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.21284492809949476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 10954, "loss": 0.2041269689798355, "memory_gb": 7.721559524536133, "step_time_ms": 3359.304189682007, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:32] (step=0010954) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.21286436066848038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 10955, "loss": 0.28082147240638733, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6740703582764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:35] (step=0010955) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.21288379323746598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 10956, "loss": 0.22687366604804993, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6266765594482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:39] (step=0010956) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.2129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10957, "loss": 0.3124793767929077, "memory_gb": 7.721559524536133, "step_time_ms": 3355.442762374878, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:43] (step=0010957) Train Loss: 0.2914, Train Steps/Sec: 0.28, Epoch: 0.21292265837543722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 10958, "loss": 0.4103618264198303, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3495197296143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:46] (step=0010958) Train Loss: 0.3744, Train Steps/Sec: 0.28, Epoch: 0.21294209094442285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10959, "loss": 0.22123503684997559, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7568531036377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:50] (step=0010959) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.21296152351340847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 10960, "loss": 0.25801485776901245, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6884994506836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:53] (step=0010960) Train Loss: 0.2661, Train Steps/Sec: 0.28, Epoch: 0.2129809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:58:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 10961, "loss": 0.2600085735321045, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2123985290527, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:58:57] (step=0010961) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.2130003886513797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 10962, "loss": 0.3439394235610962, "memory_gb": 7.721559524536133, "step_time_ms": 3358.273983001709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:00] (step=0010962) Train Loss: 0.3359, Train Steps/Sec: 0.28, Epoch: 0.21301982122036534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 10963, "loss": 0.3148968517780304, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7101440429688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:04] (step=0010963) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.21303925378935096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10964, "loss": 0.14967933297157288, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0080528259277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:08] (step=0010964) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.21305868635833658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 10965, "loss": 0.24459370970726013, "memory_gb": 7.721559524536133, "step_time_ms": 3342.407703399658, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:11] (step=0010965) Train Loss: 0.1840, Train Steps/Sec: 0.28, Epoch: 0.2130781189273222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10966, "loss": 0.26708078384399414, "memory_gb": 7.721559524536133, "step_time_ms": 3352.1482944488525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:15] (step=0010966) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.21309755149630782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 10967, "loss": 0.2972930669784546, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1314811706543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:18] (step=0010967) Train Loss: 0.2966, Train Steps/Sec: 0.28, Epoch: 0.21311698406529342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10968, "loss": 0.2556382715702057, "memory_gb": 7.721559524536133, "step_time_ms": 3502.2778511047363, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:22] (step=0010968) Train Loss: 0.2419, Train Steps/Sec: 0.28, Epoch: 0.21313641663427904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 10969, "loss": 0.36171168088912964, "memory_gb": 7.721559524536133, "step_time_ms": 3358.705520629883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:25] (step=0010969) Train Loss: 0.3242, Train Steps/Sec: 0.28, Epoch: 0.21315584920326466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 10970, "loss": 0.1862967610359192, "memory_gb": 7.721559524536133, "step_time_ms": 3354.375123977661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:29] (step=0010970) Train Loss: 0.2430, Train Steps/Sec: 0.28, Epoch: 0.21317528177225029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10971, "loss": 0.2893803119659424, "memory_gb": 7.721559524536133, "step_time_ms": 3357.686758041382, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:33] (step=0010971) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.2131947143412359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 10972, "loss": 0.11577825248241425, "memory_gb": 7.721559524536133, "step_time_ms": 3354.635000228882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:36] (step=0010972) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.21321414691022153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10973, "loss": 0.17624486982822418, "memory_gb": 7.721559524536133, "step_time_ms": 3350.0847816467285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:40] (step=0010973) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.21323357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 10974, "loss": 0.3225964307785034, "memory_gb": 7.721559524536133, "step_time_ms": 3352.720022201538, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:43] (step=0010974) Train Loss: 0.3044, Train Steps/Sec: 0.28, Epoch: 0.21325301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10975, "loss": 0.30580267310142517, "memory_gb": 7.721559524536133, "step_time_ms": 3357.703685760498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:47] (step=0010975) Train Loss: 0.3017, Train Steps/Sec: 0.28, Epoch: 0.2132724446171784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 10976, "loss": 0.1664854735136032, "memory_gb": 7.721559524536133, "step_time_ms": 3355.891227722168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:50] (step=0010976) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.21329187718616402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 10977, "loss": 0.2685895264148712, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2301864624023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:54] (step=0010977) Train Loss: 0.2639, Train Steps/Sec: 0.28, Epoch: 0.21331130975514964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 10:59:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10978, "loss": 0.1930357664823532, "memory_gb": 7.721559524536133, "step_time_ms": 3355.008363723755, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 10:59:58] (step=0010978) Train Loss: 0.1782, Train Steps/Sec: 0.28, Epoch: 0.21333074232413526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 10979, "loss": 0.2795881927013397, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2262573242188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:01] (step=0010979) Train Loss: 0.3155, Train Steps/Sec: 0.28, Epoch: 0.21335017489312086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10980, "loss": 0.2261515110731125, "memory_gb": 7.721559524536133, "step_time_ms": 3355.755090713501, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:05] (step=0010980) Train Loss: 0.2367, Train Steps/Sec: 0.28, Epoch: 0.21336960746210648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 10981, "loss": 0.24389904737472534, "memory_gb": 7.721559524536133, "step_time_ms": 3359.971523284912, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:08] (step=0010981) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.2133890400310921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10982, "loss": 0.26213642954826355, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2629947662354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:12] (step=0010982) Train Loss: 0.2895, Train Steps/Sec: 0.28, Epoch: 0.21340847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 10983, "loss": 0.26788949966430664, "memory_gb": 7.721559524536133, "step_time_ms": 3355.402946472168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:15] (step=0010983) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.21342790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 10984, "loss": 0.21839302778244019, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6930084228516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:19] (step=0010984) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.21344733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 10985, "loss": 0.2400994598865509, "memory_gb": 7.721559524536133, "step_time_ms": 3356.580972671509, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:23] (step=0010985) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.2134667703070346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 10986, "loss": 0.2935693562030792, "memory_gb": 7.721559524536133, "step_time_ms": 3360.304117202759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:26] (step=0010986) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.21348620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 10987, "loss": 0.1275091916322708, "memory_gb": 7.721559524536133, "step_time_ms": 3351.8009185791016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:30] (step=0010987) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.21350563544500584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 10988, "loss": 0.25648659467697144, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2916049957275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:33] (step=0010988) Train Loss: 0.2744, Train Steps/Sec: 0.28, Epoch: 0.21352506801399146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 10989, "loss": 0.3097662329673767, "memory_gb": 7.721559524536133, "step_time_ms": 3357.160806655884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:37] (step=0010989) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.21354450058297708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 10990, "loss": 0.1937139630317688, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5620651245117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:40] (step=0010990) Train Loss: 0.1745, Train Steps/Sec: 0.28, Epoch: 0.21356393315196268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 10991, "loss": 0.36184337735176086, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9376468658447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:44] (step=0010991) Train Loss: 0.3137, Train Steps/Sec: 0.28, Epoch: 0.2135833657209483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 10992, "loss": 0.1881151646375656, "memory_gb": 7.721559524536133, "step_time_ms": 3347.7275371551514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:47] (step=0010992) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.21360279828993392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 10993, "loss": 0.20240142941474915, "memory_gb": 7.721559524536133, "step_time_ms": 3353.280782699585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:51] (step=0010993) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.21362223085891954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 10994, "loss": 0.21649441123008728, "memory_gb": 7.721559524536133, "step_time_ms": 3357.646703720093, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:55] (step=0010994) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.21364166342790517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:00:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 10995, "loss": 0.2010284960269928, "memory_gb": 7.721559524536133, "step_time_ms": 3340.1644229888916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:00:58] (step=0010995) Train Loss: 0.1838, Train Steps/Sec: 0.29, Epoch: 0.2136610959968908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 10996, "loss": 0.19891402125358582, "memory_gb": 7.721559524536133, "step_time_ms": 3359.391927719116, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:02] (step=0010996) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.2136805285658764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 10997, "loss": 0.18974286317825317, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5121631622314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:05] (step=0010997) Train Loss: 0.2380, Train Steps/Sec: 0.27, Epoch: 0.21369996113486203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 10998, "loss": 0.23871596157550812, "memory_gb": 7.721559524536133, "step_time_ms": 3358.41965675354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:09] (step=0010998) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.21371939370384765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 10999, "loss": 0.199802964925766, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3524227142334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:12] (step=0010999) Train Loss: 0.2288, Train Steps/Sec: 0.28, Epoch: 0.21373882627283328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11000, "loss": 0.22088919579982758, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6067428588867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:16] (step=0011000) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.2137582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11001, "loss": 0.14676058292388916, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5727939605713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:20] (step=0011001) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.21377769141080452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11002, "loss": 0.2046559900045395, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8851833343506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:23] (step=0011002) Train Loss: 0.2535, Train Steps/Sec: 0.29, Epoch: 0.21379712397979012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11003, "loss": 0.20528927445411682, "memory_gb": 7.721559524536133, "step_time_ms": 3355.774402618408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:27] (step=0011003) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.21381655654877574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11004, "loss": 0.24551090598106384, "memory_gb": 7.721559524536133, "step_time_ms": 3359.851121902466, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:30] (step=0011004) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.21383598911776136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11005, "loss": 0.230861097574234, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8993129730225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:34] (step=0011005) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.21385542168674698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11006, "loss": 0.18506185710430145, "memory_gb": 7.721559524536133, "step_time_ms": 3351.771593093872, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:37] (step=0011006) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.2138748542557326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11007, "loss": 0.2741630971431732, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3969860076904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:41] (step=0011007) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.21389428682471823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11008, "loss": 0.2759700417518616, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9888343811035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:01:44] (step=0011008) Train Loss: 0.2593, Train Steps/Sec: 0.28, Epoch: 0.21391371939370385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:01:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11009, "loss": 0.3151790499687195, "memory_gb": 7.721559524536133,
"step_time_ms": 3502.7003288269043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:01:48] (step=0011009) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.21393315196268947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:01:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11010, "loss": 0.2145741879940033, "memory_gb": 7.721559524536133, "step_time_ms": 3351.046323776245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:01:52] (step=0011010) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.2139525845316751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:01:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11011, "loss": 0.2835121154785156, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6604900360107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:01:55] (step=0011011) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.21397201710066072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:01:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11012, "loss": 0.17434656620025635, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9954681396484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:01:59] (step=0011012) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.21399144966964634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11013, "loss": 0.1601419299840927, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8756160736084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:02] (step=0011013) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.21401088223863196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11014, "loss": 0.2019219547510147, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5378189086914, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:06] (step=0011014) Train Loss: 0.1798, Train Steps/Sec: 0.28, Epoch: 
0.21403031480761756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11015, "loss": 0.20151296257972717, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3694438934326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:09] (step=0011015) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.21404974737660318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11016, "loss": 0.24632218480110168, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7286987304688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:13] (step=0011016) Train Loss: 0.1975, Train Steps/Sec: 0.28, Epoch: 0.2140691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11017, "loss": 0.3074454963207245, "memory_gb": 7.721559524536133, "step_time_ms": 3359.569311141968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:16] (step=0011017) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.21408861251457442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11018, "loss": 0.26251107454299927, "memory_gb": 7.715639114379883, "step_time_ms": 3322.7901458740234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:20] (step=0011018) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.21410804508356004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11019, "loss": 0.2474740743637085, "memory_gb": 7.721559524536133, "step_time_ms": 3357.513904571533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:24] (step=0011019) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.21412747765254567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11020, "loss": 0.2708743214607239, 
"memory_gb": 7.721559524536133, "step_time_ms": 3361.618757247925, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:27] (step=0011020) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 0.2141469102215313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11021, "loss": 0.2563043236732483, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9745292663574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:31] (step=0011021) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.2141663427905169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11022, "loss": 0.19664961099624634, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1928482055664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:34] (step=0011022) Train Loss: 0.2092, Train Steps/Sec: 0.28, Epoch: 0.21418577535950253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11023, "loss": 0.3186529874801636, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0058555603027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:38] (step=0011023) Train Loss: 0.3137, Train Steps/Sec: 0.28, Epoch: 0.21420520792848816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11024, "loss": 0.3097536563873291, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1249198913574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:41] (step=0011024) Train Loss: 0.3066, Train Steps/Sec: 0.28, Epoch: 0.21422464049747378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11025, "loss": 0.26178044080734253, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2722511291504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:45] (step=0011025) Train Loss: 
0.2183, Train Steps/Sec: 0.28, Epoch: 0.21424407306645937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11026, "loss": 0.31632858514785767, "memory_gb": 7.721559524536133, "step_time_ms": 3358.684301376343, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:49] (step=0011026) Train Loss: 0.3282, Train Steps/Sec: 0.28, Epoch: 0.214263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11027, "loss": 0.2333964705467224, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5945835113525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:52] (step=0011027) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.21428293820443062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11028, "loss": 0.2404545247554779, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8319759368896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:56] (step=0011028) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.21430237077341624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:02:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11029, "loss": 0.16226713359355927, "memory_gb": 7.721559524536133, "step_time_ms": 3360.424041748047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:02:59] (step=0011029) Train Loss: 0.2405, Train Steps/Sec: 0.28, Epoch: 0.21432180334240186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11030, "loss": 0.2349257469177246, "memory_gb": 7.721559524536133, "step_time_ms": 3355.961322784424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:03] (step=0011030) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.21434123591138748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 
11031, "loss": 0.21846844255924225, "memory_gb": 7.715639114379883, "step_time_ms": 3325.0224590301514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:06] (step=0011031) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.2143606684803731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11032, "loss": 0.24852630496025085, "memory_gb": 7.721559524536133, "step_time_ms": 3362.914800643921, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:10] (step=0011032) Train Loss: 0.2500, Train Steps/Sec: 0.28, Epoch: 0.21438010104935873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11033, "loss": 0.1820574700832367, "memory_gb": 7.721559524536133, "step_time_ms": 3348.001480102539, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:14] (step=0011033) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.21439953361834435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11034, "loss": 0.25941771268844604, "memory_gb": 7.721559524536133, "step_time_ms": 3347.7230072021484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:17] (step=0011034) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.21441896618732997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11035, "loss": 0.2918366491794586, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3553047180176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:21] (step=0011035) Train Loss: 0.2974, Train Steps/Sec: 0.28, Epoch: 0.2144383987563156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11036, "loss": 0.37599608302116394, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4350299835205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
11:03:24] (step=0011036) Train Loss: 0.3188, Train Steps/Sec: 0.28, Epoch: 0.21445783132530122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11037, "loss": 0.2556218206882477, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0645275115967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:28] (step=0011037) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.2144772638942868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11038, "loss": 0.21237103641033173, "memory_gb": 7.721559524536133, "step_time_ms": 3361.701488494873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:32] (step=0011038) Train Loss: 0.2624, Train Steps/Sec: 0.27, Epoch: 0.21449669646327243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11039, "loss": 0.22339901328086853, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5521450042725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:35] (step=0011039) Train Loss: 0.1984, Train Steps/Sec: 0.28, Epoch: 0.21451612903225806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11040, "loss": 0.1381816416978836, "memory_gb": 7.721559524536133, "step_time_ms": 3360.592842102051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:39] (step=0011040) Train Loss: 0.1842, Train Steps/Sec: 0.28, Epoch: 0.21453556160124368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11041, "loss": 0.2384955734014511, "memory_gb": 7.721559524536133, "step_time_ms": 3357.212781906128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:42] (step=0011041) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.2145549941702293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:46] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11042, "loss": 0.2591358423233032, "memory_gb": 7.721559524536133, "step_time_ms": 3358.288288116455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:46] (step=0011042) Train Loss: 0.2927, Train Steps/Sec: 0.28, Epoch: 0.21457442673921492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11043, "loss": 0.3046132028102875, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2874069213867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:49] (step=0011043) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.21459385930820055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11044, "loss": 0.1814499795436859, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2616786956787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:53] (step=0011044) Train Loss: 0.1980, Train Steps/Sec: 0.28, Epoch: 0.21461329187718617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:03:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11045, "loss": 0.2179451584815979, "memory_gb": 7.721559524536133, "step_time_ms": 3354.867696762085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:03:57] (step=0011045) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.2146327244461718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11046, "loss": 0.17612984776496887, "memory_gb": 7.721559524536133, "step_time_ms": 3361.905336380005, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:00] (step=0011046) Train Loss: 0.1679, Train Steps/Sec: 0.28, Epoch: 0.2146521570151574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11047, "loss": 0.26363247632980347, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7210903167725, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 11:04:04] (step=0011047) Train Loss: 0.2981, Train Steps/Sec: 0.28, Epoch: 0.21467158958414304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11048, "loss": 0.23641780018806458, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5754890441895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:07] (step=0011048) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.21469102215312863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11049, "loss": 0.16727879643440247, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5451374053955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:11] (step=0011049) Train Loss: 0.1719, Train Steps/Sec: 0.28, Epoch: 0.21471045472211425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11050, "loss": 0.31418120861053467, "memory_gb": 7.721559524536133, "step_time_ms": 3355.612277984619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:14] (step=0011050) Train Loss: 0.2790, Train Steps/Sec: 0.28, Epoch: 0.21472988729109987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11051, "loss": 0.17597095668315887, "memory_gb": 7.721559524536133, "step_time_ms": 3358.414888381958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:18] (step=0011051) Train Loss: 0.2156, Train Steps/Sec: 0.28, Epoch: 0.2147493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11052, "loss": 0.31677284836769104, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0121059417725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:21] (step=0011052) Train Loss: 0.3169, Train Steps/Sec: 0.28, Epoch: 0.21476875242907112, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 11:04:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11053, "loss": 0.3012712597846985, "memory_gb": 7.721559524536133, "step_time_ms": 3361.527442932129, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:25] (step=0011053) Train Loss: 0.3090, Train Steps/Sec: 0.28, Epoch: 0.21478818499805674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11054, "loss": 0.3083381652832031, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2652645111084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:29] (step=0011054) Train Loss: 0.2488, Train Steps/Sec: 0.28, Epoch: 0.21480761756704236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11055, "loss": 0.15974034368991852, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1856231689453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:32] (step=0011055) Train Loss: 0.1864, Train Steps/Sec: 0.28, Epoch: 0.21482705013602799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11056, "loss": 0.2799057364463806, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7307929992676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:36] (step=0011056) Train Loss: 0.2136, Train Steps/Sec: 0.28, Epoch: 0.2148464827050136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11057, "loss": 0.22628915309906006, "memory_gb": 7.721559524536133, "step_time_ms": 3493.2925701141357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:39] (step=0011057) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.21486591527399923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11058, "loss": 0.29521650075912476, "memory_gb": 7.721559524536133, "step_time_ms": 
3353.4228801727295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:43] (step=0011058) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.21488534784298485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11059, "loss": 0.15538525581359863, "memory_gb": 7.721559524536133, "step_time_ms": 3343.421220779419, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:46] (step=0011059) Train Loss: 0.2119, Train Steps/Sec: 0.29, Epoch: 0.21490478041197048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11060, "loss": 0.23687458038330078, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3970794677734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:50] (step=0011060) Train Loss: 0.1950, Train Steps/Sec: 0.28, Epoch: 0.21492421298095607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11061, "loss": 0.2812492847442627, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8313121795654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:54] (step=0011061) Train Loss: 0.3065, Train Steps/Sec: 0.28, Epoch: 0.2149436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11062, "loss": 0.2768539786338806, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9347858428955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:04:57] (step=0011062) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.21496307811892731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11063, "loss": 0.22592049837112427, "memory_gb": 7.721559524536133, "step_time_ms": 3354.497194290161, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:01] (step=0011063) Train Loss: 0.1760, Train Steps/Sec: 0.28, Epoch: 
0.21498251068791294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11064, "loss": 0.26885291934013367, "memory_gb": 7.721559524536133, "step_time_ms": 3357.801675796509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:04] (step=0011064) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.21500194325689856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11065, "loss": 0.32734665274620056, "memory_gb": 7.721559524536133, "step_time_ms": 3356.273651123047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:08] (step=0011065) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.21502137582588418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11066, "loss": 0.15956975519657135, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7121543884277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:11] (step=0011066) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.2150408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11067, "loss": 0.22549355030059814, "memory_gb": 7.721559524536133, "step_time_ms": 3346.363306045532, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:15] (step=0011067) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.21506024096385543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11068, "loss": 0.26390010118484497, "memory_gb": 7.721559524536133, "step_time_ms": 3352.1761894226074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:18] (step=0011068) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.21507967353284105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11069, "loss": 0.233729749917984, 
"memory_gb": 7.721559524536133, "step_time_ms": 3354.1953563690186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:22] (step=0011069) Train Loss: 0.3047, Train Steps/Sec: 0.28, Epoch: 0.21509910610182667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11070, "loss": 0.22380155324935913, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9226055145264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:26] (step=0011070) Train Loss: 0.1901, Train Steps/Sec: 0.28, Epoch: 0.2151185386708123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11071, "loss": 0.195212259888649, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4509830474854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:29] (step=0011071) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.21513797123979791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11072, "loss": 0.20828822255134583, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5590381622314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:33] (step=0011072) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.2151574038087835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11073, "loss": 0.17762932181358337, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6487255096436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:36] (step=0011073) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.21517683637776913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11074, "loss": 0.20282599329948425, "memory_gb": 7.721559524536133, "step_time_ms": 3350.186824798584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:40] (step=0011074) Train Loss: 
0.2614, Train Steps/Sec: 0.28, Epoch: 0.21519626894675475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11075, "loss": 0.16822892427444458, "memory_gb": 7.721559524536133, "step_time_ms": 3354.813575744629, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:43] (step=0011075) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.21521570151574038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11076, "loss": 0.3334478437900543, "memory_gb": 7.721559524536133, "step_time_ms": 3356.435537338257, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:47] (step=0011076) Train Loss: 0.3238, Train Steps/Sec: 0.28, Epoch: 0.215235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11077, "loss": 0.2128857970237732, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9467067718506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:51] (step=0011077) Train Loss: 0.1939, Train Steps/Sec: 0.28, Epoch: 0.21525456665371162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11078, "loss": 0.3142011761665344, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1817874908447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:54] (step=0011078) Train Loss: 0.2486, Train Steps/Sec: 0.28, Epoch: 0.21527399922269724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11079, "loss": 0.21937081217765808, "memory_gb": 7.721559524536133, "step_time_ms": 3342.1812057495117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:05:58] (step=0011079) Train Loss: 0.2580, Train Steps/Sec: 0.28, Epoch: 0.21529343179168287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:06:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 
11080, "loss": 0.23066683113574982, "memory_gb": 7.721559524536133, "step_time_ms": 3355.485200881958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:01] (step=0011080) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.2153128643606685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11081, "loss": 0.21351556479930878, "memory_gb": 7.721559524536133, "step_time_ms": 3356.106758117676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:05] (step=0011081) Train Loss: 0.1629, Train Steps/Sec: 0.28, Epoch: 0.2153322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11082, "loss": 0.2910870313644409, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5634956359863, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:08] (step=0011082) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.21535172949863973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11083, "loss": 0.2602437138557434, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9792766571045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:12] (step=0011083) Train Loss: 0.2098, Train Steps/Sec: 0.28, Epoch: 0.21537116206762533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11084, "loss": 0.1999693214893341, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8249473571777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:15] (step=0011084) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.21539059463661095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11085, "loss": 0.26266321539878845, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9485206604004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:19] (step=0011085) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.21541002720559657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11086, "loss": 0.2871583104133606, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6172847747803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:23] (step=0011086) Train Loss: 0.3003, Train Steps/Sec: 0.27, Epoch: 0.2154294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11087, "loss": 0.22678020596504211, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5328636169434, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:26] (step=0011087) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.21544889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11088, "loss": 0.3100377917289734, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3232421875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:30] (step=0011088) Train Loss: 0.2607, Train Steps/Sec: 0.28, Epoch: 0.21546832491255344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11089, "loss": 0.2989434599876404, "memory_gb": 7.721559524536133, "step_time_ms": 3359.52091217041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:33] (step=0011089) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.21548775748153906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11090, "loss": 0.18191379308700562, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7512550354004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:37] (step=0011090) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.21550719005052468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11091, "loss": 0.25282853841781616, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1243286132812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:41] (step=0011091) Train Loss: 0.2712, Train Steps/Sec: 0.28, Epoch: 0.2155266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11092, "loss": 0.22283205389976501, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9980182647705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:44] (step=0011092) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.21554605518849593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11093, "loss": 0.14078277349472046, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9960384368896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:48] (step=0011093) Train Loss: 0.2477, Train Steps/Sec: 0.28, Epoch: 0.21556548775748155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11094, "loss": 0.1994551420211792, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6721935272217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:51] (step=0011094) Train Loss: 0.2335, Train Steps/Sec: 0.28, Epoch: 0.21558492032646717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11095, "loss": 0.16923609375953674, "memory_gb": 7.721559524536133, "step_time_ms": 3355.36527633667, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:55] (step=0011095) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.21560435289545277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:06:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11096, "loss": 0.2142283320426941, "memory_gb": 7.721559524536133, "step_time_ms": 3340.1355743408203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:06:58] (step=0011096) Train Loss: 0.2092, Train Steps/Sec: 0.28, Epoch: 0.2156237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11097, "loss": 0.18277159333229065, "memory_gb": 7.721559524536133, "step_time_ms": 3496.3724613189697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:02] (step=0011097) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.215643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11098, "loss": 0.23965251445770264, "memory_gb": 7.721559524536133, "step_time_ms": 3351.551055908203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:05] (step=0011098) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.21566265060240963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11099, "loss": 0.28679358959198, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2115688323975, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:09] (step=0011099) Train Loss: 0.2866, Train Steps/Sec: 0.28, Epoch: 0.21568208317139526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11100, "loss": 0.20788437128067017, "memory_gb": 7.721559524536133, "step_time_ms": 3354.943037033081, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:13] (step=0011100) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.21570151574038088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11101, "loss": 0.30307042598724365, "memory_gb": 7.715639114379883, "step_time_ms": 3322.202205657959, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:16] (step=0011101) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.2157209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11102, "loss": 0.1677165925502777, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0166149139404, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:20] (step=0011102) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.21574038087835212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11103, "loss": 0.24535156786441803, "memory_gb": 7.721559524536133, "step_time_ms": 3355.677843093872, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:23] (step=0011103) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.21575981344733774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11104, "loss": 0.19931195676326752, "memory_gb": 7.721559524536133, "step_time_ms": 3355.790376663208, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:27] (step=0011104) Train Loss: 0.2391, Train Steps/Sec: 0.28, Epoch: 0.21577924601632337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11105, "loss": 0.26591184735298157, "memory_gb": 7.721559524536133, "step_time_ms": 3358.229398727417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:30] (step=0011105) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.215798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11106, "loss": 0.19821308553218842, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2182445526123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:34] (step=0011106) Train Loss: 0.2735, Train Steps/Sec: 0.28, Epoch: 0.21581811115429458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11107, "loss": 0.29638534784317017, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1741371154785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:38] (step=0011107) Train Loss: 0.3095, Train Steps/Sec: 0.28, Epoch: 0.2158375437232802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11108, "loss": 0.1949283480644226, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0616245269775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:41] (step=0011108) Train Loss: 0.2209, Train Steps/Sec: 0.28, Epoch: 0.21585697629226583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11109, "loss": 0.2860759496688843, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4015369415283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:45] (step=0011109) Train Loss: 0.2445, Train Steps/Sec: 0.28, Epoch: 0.21587640886125145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11110, "loss": 0.21744996309280396, "memory_gb": 7.721559524536133, "step_time_ms": 3356.107473373413, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:48] (step=0011110) Train Loss: 0.1975, Train Steps/Sec: 0.28, Epoch: 0.21589584143023707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11111, "loss": 0.13068270683288574, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9772243499756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:52] (step=0011111) Train Loss: 0.1923, Train Steps/Sec: 0.28, Epoch: 0.2159152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11112, "loss": 0.24506282806396484, "memory_gb": 7.721559524536133, "step_time_ms": 3360.011577606201, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:55] (step=0011112) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.21593470656820832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:07:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11113, "loss": 0.23316524922847748, "memory_gb": 7.721559524536133, "step_time_ms": 3344.2702293395996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:07:59] (step=0011113) Train Loss: 0.2323, Train Steps/Sec: 0.29, Epoch: 0.21595413913719394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11114, "loss": 0.30960091948509216, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7888946533203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:02] (step=0011114) Train Loss: 0.2723, Train Steps/Sec: 0.28, Epoch: 0.21597357170617956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11115, "loss": 0.1834583282470703, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0030460357666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:06] (step=0011115) Train Loss: 0.2105, Train Steps/Sec: 0.28, Epoch: 0.21599300427516518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11116, "loss": 0.17899510264396667, "memory_gb": 7.721559524536133, "step_time_ms": 3358.05082321167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:10] (step=0011116) Train Loss: 0.2164, Train Steps/Sec: 0.28, Epoch: 0.2160124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11117, "loss": 0.3198200762271881, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5618267059326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:13] (step=0011117) Train Loss: 0.3052, Train Steps/Sec: 0.28, Epoch: 0.21603186941313643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11118, "loss": 0.2257438600063324, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9775047302246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:17] (step=0011118) Train Loss: 0.2134, Train Steps/Sec: 0.28, Epoch: 0.21605130198212202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11119, "loss": 0.17485925555229187, "memory_gb": 7.721559524536133, "step_time_ms": 3362.948417663574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:20] (step=0011119) Train Loss: 0.1789, Train Steps/Sec: 0.28, Epoch: 0.21607073455110765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11120, "loss": 0.2583053410053253, "memory_gb": 7.721559524536133, "step_time_ms": 3359.631061553955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:24] (step=0011120) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.21609016712009327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11121, "loss": 0.31360116600990295, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6576919555664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:27] (step=0011121) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.2161095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11122, "loss": 0.1428331732749939, "memory_gb": 7.721559524536133, "step_time_ms": 3347.670793533325, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:31] (step=0011122) Train Loss: 0.1654, Train Steps/Sec: 0.28, Epoch: 0.2161290322580645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11123, "loss": 0.116634301841259, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6835651397705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:34] (step=0011123) Train Loss: 0.1701, Train Steps/Sec: 0.28, Epoch: 0.21614846482705014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11124, "loss": 0.32156693935394287, "memory_gb": 7.721559524536133, "step_time_ms": 3357.652425765991, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:38] (step=0011124) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.21616789739603576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11125, "loss": 0.1971667855978012, "memory_gb": 7.721559524536133, "step_time_ms": 3359.844207763672, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:42] (step=0011125) Train Loss: 0.2133, Train Steps/Sec: 0.28, Epoch: 0.21618732996502138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11126, "loss": 0.20394758880138397, "memory_gb": 7.721559524536133, "step_time_ms": 3361.846923828125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:45] (step=0011126) Train Loss: 0.2176, Train Steps/Sec: 0.27, Epoch: 0.216206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11127, "loss": 0.20535531640052795, "memory_gb": 7.721559524536133, "step_time_ms": 3353.504180908203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:49] (step=0011127) Train Loss: 0.1994, Train Steps/Sec: 0.28, Epoch: 0.21622619510299262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11128, "loss": 0.189028799533844, "memory_gb": 7.721559524536133, "step_time_ms": 3358.138084411621, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:52] (step=0011128) Train Loss: 0.2514, Train Steps/Sec: 0.28, Epoch: 0.21624562767197825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:08:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11129, "loss": 0.211807519197464, "memory_gb": 7.721559524536133, "step_time_ms": 3360.41522026062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:08:56] (step=0011129) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.21626506024096387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11130, "loss": 0.2874411642551422, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3815307617188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:00] (step=0011130) Train Loss: 0.2202, Train Steps/Sec: 0.28, Epoch: 0.21628449280994946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11131, "loss": 0.26624974608421326, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8647842407227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:03] (step=0011131) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.21630392537893509, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11132, "loss": 0.3642290234565735, "memory_gb": 7.721559524536133, "step_time_ms": 3360.51344871521, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:07] (step=0011132) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.2163233579479207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11133, "loss": 0.36291730403900146, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1856746673584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:10] (step=0011133) Train Loss: 0.2645, Train Steps/Sec: 0.28, Epoch: 0.21634279051690633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11134, "loss": 0.2100134789943695, "memory_gb": 7.721559524536133, "step_time_ms": 3356.18257522583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:14] (step=0011134) Train Loss: 0.2769, Train Steps/Sec: 0.28, Epoch: 0.21636222308589195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11135, "loss": 0.15051856637001038, "memory_gb": 7.721559524536133, "step_time_ms": 3355.905532836914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:17] (step=0011135) Train Loss: 0.1833, Train Steps/Sec: 0.28, Epoch: 0.21638165565487757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11136, "loss": 0.3199159801006317, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8433780670166, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:21] (step=0011136) Train Loss: 0.2926, Train Steps/Sec: 0.28, Epoch: 0.2164010882238632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11137, "loss": 0.2872253656387329, "memory_gb": 7.721559524536133, "step_time_ms": 3358.788013458252, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:25] (step=0011137) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.21642052079284882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11138, "loss": 0.16323626041412354, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4583740234375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:28] (step=0011138) Train Loss: 0.1878, Train Steps/Sec: 0.28, Epoch: 0.21643995336183444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11139, "loss": 0.26054781675338745, "memory_gb": 7.721559524536133, "step_time_ms": 3500.270128250122, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:32] (step=0011139) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.21645938593082006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11140, "loss": 0.29171767830848694, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1700325012207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:35] (step=0011140) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.2164788184998057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11141, "loss": 0.21528075635433197, "memory_gb": 7.721559524536133, "step_time_ms": 3342.7672386169434, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:39] (step=0011141) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.21649825106879128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11142, "loss": 0.19234240055084229, "memory_gb": 7.721559524536133, "step_time_ms": 3359.090805053711, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:42] (step=0011142) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.2165176836377769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11143, "loss": 0.2835070788860321, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3764305114746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:46] (step=0011143) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.21653711620676253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11144, "loss": 0.2777709364891052, "memory_gb": 7.721559524536133, "step_time_ms": 3356.894016265869, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:50] (step=0011144) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.21655654877574815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11145, "loss": 0.2314050793647766, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9716472625732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:53] (step=0011145) Train Loss: 0.2516, Train Steps/Sec: 0.28, Epoch: 0.21657598134473377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:09:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11146, "loss": 0.15564121305942535, "memory_gb": 7.721559524536133, "step_time_ms": 3351.377010345459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:09:57] (step=0011146) Train Loss: 0.2178, Train Steps/Sec: 0.28, Epoch: 0.2165954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11147, "loss": 0.2400342971086502, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3575019836426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:00] (step=0011147) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.21661484648270501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11148, "loss": 0.40834933519363403, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2565784454346, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:04] (step=0011148) Train Loss: 0.3398, Train Steps/Sec: 0.28, Epoch: 0.21663427905169064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11149, "loss": 0.2589457333087921, "memory_gb": 7.721559524536133, "step_time_ms": 3360.551357269287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:07] (step=0011149) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.21665371162067626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11150, "loss": 0.17318692803382874, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4751358032227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:11] (step=0011150) Train Loss: 0.2654, Train Steps/Sec: 0.28, Epoch: 0.21667314418966188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11151, "loss": 0.27133283019065857, "memory_gb": 7.721559524536133, "step_time_ms": 3357.01060295105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:15] (step=0011151) Train Loss: 0.2406, Train Steps/Sec: 0.28, Epoch: 0.2166925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11152, "loss": 0.2145753651857376, "memory_gb": 7.721559524536133, "step_time_ms": 3360.684394836426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:18] (step=0011152) Train Loss: 0.2209, Train Steps/Sec: 0.28, Epoch: 0.21671200932763313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11153, "loss": 0.1369193196296692, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9444675445557, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:22] (step=0011153) Train Loss: 0.1441, Train Steps/Sec: 0.28, Epoch: 0.21673144189661872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11154, "loss": 0.17781895399093628, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3967685699463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:25] (step=0011154) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.21675087446560434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11155, "loss": 0.3901759386062622, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1411838531494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:29] (step=0011155) Train Loss: 0.2911, Train Steps/Sec: 0.28, Epoch: 0.21677030703458997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11156, "loss": 0.3296728730201721, "memory_gb": 7.721559524536133, "step_time_ms": 3363.020181655884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:32] (step=0011156) Train Loss: 0.3170, Train Steps/Sec: 0.28, Epoch: 0.2167897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11157, "loss": 0.2328992635011673, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3715686798096, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:36] (step=0011157) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.2168091721725612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11158, "loss": 0.21297553181648254, "memory_gb": 7.721559524536133, "step_time_ms": 3360.990285873413, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:40] (step=0011158) Train Loss: 0.2033, Train Steps/Sec: 0.28, Epoch: 0.21682860474154683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11159, "loss": 0.23592308163642883, "memory_gb": 7.721559524536133, "step_time_ms": 3358.25777053833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:43] (step=0011159) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.21684803731053245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11160, "loss": 0.15434983372688293, "memory_gb": 7.721559524536133, "step_time_ms": 3354.276418685913, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:47] (step=0011160) Train Loss: 0.1438, Train Steps/Sec: 0.28, Epoch: 0.21686746987951808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11161, "loss": 0.30740392208099365, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8599433898926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:50] (step=0011161) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.2168869024485037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11162, "loss": 0.2783204913139343, "memory_gb": 7.721559524536133, "step_time_ms": 3353.907346725464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:54] (step=0011162) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.21690633501748932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:10:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11163, "loss": 0.25030678510665894, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5775928497314, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:10:57] (step=0011163) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.21692576758647494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11164, "loss": 0.2202330231666565, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6389808654785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:01] (step=0011164) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.21694520015546054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11165, "loss": 0.16238462924957275, "memory_gb": 7.721559524536133, "step_time_ms": 3360.046863555908, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:05] (step=0011165) Train Loss: 0.1777, Train Steps/Sec: 0.28, Epoch: 0.21696463272444616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11166, "loss": 0.24180403351783752, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1769981384277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:08] (step=0011166) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.21698406529343178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11167, "loss": 0.26103928685188293, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1839332580566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:12] (step=0011167) Train Loss: 0.2923, Train Steps/Sec: 0.28, Epoch: 0.2170034978624174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11168, "loss": 0.10046790540218353, "memory_gb": 7.721559524536133, "step_time_ms": 3348.844051361084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:15] (step=0011168) Train Loss: 0.1781, Train Steps/Sec: 0.28, Epoch: 0.21702293043140303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11169, "loss": 0.20484040677547455, "memory_gb": 7.721559524536133, "step_time_ms": 3359.989643096924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:19] (step=0011169) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.21704236300038865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11170, "loss": 0.1959470510482788, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3996295928955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:23] (step=0011170) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.21706179556937427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11171, "loss": 0.17287710309028625, "memory_gb": 7.721559524536133, "step_time_ms": 3339.29443359375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:26] (step=0011171) Train Loss: 0.1749, Train Steps/Sec: 0.28, Epoch: 0.2170812281383599, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11172, "loss": 0.2895548641681671, "memory_gb": 7.721559524536133, "step_time_ms": 3356.275796890259, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:30] (step=0011172) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.21710066070734552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11173, "loss": 0.2659165561199188, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2066040039062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:33] (step=0011173) Train Loss: 0.2778, Train Steps/Sec: 0.27, Epoch: 0.21712009327633114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11174, "loss": 0.3121897280216217, "memory_gb": 7.721559524536133, "step_time_ms": 3350.848436355591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:37] (step=0011174) Train Loss: 0.2465, Train Steps/Sec: 0.28, Epoch: 0.21713952584531676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11175, "loss": 0.27237468957901, "memory_gb": 7.721559524536133, "step_time_ms": 3352.60272026062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:40] (step=0011175) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.21715895841430238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11176, "loss": 0.17834359407424927, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9553413391113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:44] (step=0011176) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.21717839098328798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11177, "loss": 0.3195938467979431, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9110679626465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:48] (step=0011177) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.2171978235522736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11178, "loss": 0.1409912407398224, "memory_gb": 7.721559524536133, "step_time_ms": 3350.291967391968, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:51] (step=0011178) Train Loss: 0.1975, Train Steps/Sec: 0.28, Epoch: 0.21721725612125922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11179, "loss": 0.28681379556655884, "memory_gb": 7.721559524536133, "step_time_ms": 3351.8128395080566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:55] (step=0011179) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.21723668869024484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:11:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11180, "loss": 0.2642451524734497, "memory_gb": 7.721559524536133, "step_time_ms": 3347.8078842163086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:11:58] (step=0011180) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.21725612125923047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11181, "loss": 0.18315385282039642, "memory_gb": 7.721559524536133, "step_time_ms": 3346.904993057251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:02] (step=0011181) Train Loss: 0.1591, Train Steps/Sec: 0.28, Epoch: 0.2172755538282161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11182, "loss": 0.24389146268367767, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3171882629395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:05] (step=0011182) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.2172949863972017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11183, "loss": 0.27117443084716797, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0516834259033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:09] (step=0011183) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.21731441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11184, "loss": 0.2107788473367691, "memory_gb": 7.721559524536133, "step_time_ms": 3352.968215942383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:13] (step=0011184) Train Loss: 0.2700, Train Steps/Sec: 0.28, Epoch: 0.21733385153517296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11185, "loss": 0.3189184367656708, "memory_gb": 7.721559524536133, "step_time_ms": 3349.5748043060303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:16] (step=0011185) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.21735328410415858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11186, "loss": 0.20202669501304626, "memory_gb": 7.721559524536133, "step_time_ms": 3504.7497749328613, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:20] (step=0011186) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.2173727166731442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11187, "loss": 0.3042500615119934, "memory_gb": 7.721559524536133, "step_time_ms": 3358.915328979492, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:23] (step=0011187) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.21739214924212982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11188, "loss": 0.32169613242149353, "memory_gb": 7.721559524536133, "step_time_ms": 3347.92160987854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:27] (step=0011188) Train Loss: 0.2778, Train Steps/Sec: 0.28, Epoch: 0.21741158181111542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11189, "loss": 0.2648952901363373, "memory_gb": 7.721559524536133, "step_time_ms": 3356.480121612549, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:30] (step=0011189) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.21743101438010104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11190, "loss": 0.36024442315101624, "memory_gb": 7.721559524536133, "step_time_ms": 3357.191562652588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:34] (step=0011190) Train Loss: 0.3103, Train Steps/Sec: 0.28, Epoch: 0.21745044694908666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11191, "loss": 0.21741703152656555, "memory_gb": 7.721559524536133, "step_time_ms": 3353.872537612915, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:38] (step=0011191) Train Loss: 0.2059, Train Steps/Sec: 0.28, Epoch: 0.21746987951807228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11192, "loss": 0.2597174644470215, "memory_gb": 7.721559524536133, "step_time_ms": 3357.788324356079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:41] (step=0011192) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.2174893120870579, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11193, "loss": 0.29679012298583984, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7239723205566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:45] (step=0011193) Train Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.21750874465604353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:12:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11194, "loss": 0.3273511826992035, "memory_gb": 7.721559524536133, "step_time_ms": 3357.830286026001, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:12:48] (step=0011194) Train Loss: 0.2731,
Train Steps/Sec: 0.28, Epoch: 0.21752817722502915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:12:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11195, "loss": 0.25844255089759827, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7699871063232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:12:52] (step=0011195) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.21754760979401477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:12:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11196, "loss": 0.27739110589027405, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3732376098633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:12:55] (step=0011196) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.2175670423630004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:12:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11197, "loss": 0.3759404420852661, "memory_gb": 7.721559524536133, "step_time_ms": 3359.052896499634, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:12:59] (step=0011197) Train Loss: 0.3330, Train Steps/Sec: 0.28, Epoch: 0.21758647493198602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11198, "loss": 0.11853931844234467, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5100898742676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:03] (step=0011198) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.21760590750097164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11199, "loss": 0.27347618341445923, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7615489959717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:06] (step=0011199) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 0.21762534006995723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11200, 
"loss": 0.2459583580493927, "memory_gb": 7.721559524536133, "step_time_ms": 3346.8849658966064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:10] (step=0011200) Train Loss: 0.2467, Train Steps/Sec: 0.28, Epoch: 0.21764477263894286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11201, "loss": 0.2552894055843353, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4156761169434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:13] (step=0011201) Train Loss: 0.3080, Train Steps/Sec: 0.28, Epoch: 0.21766420520792848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11202, "loss": 0.3471381664276123, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8945140838623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:17] (step=0011202) Train Loss: 0.3190, Train Steps/Sec: 0.28, Epoch: 0.2176836377769141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11203, "loss": 0.2607581317424774, "memory_gb": 7.721559524536133, "step_time_ms": 3360.395669937134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:20] (step=0011203) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.21770307034589972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11204, "loss": 0.3306039869785309, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7517108917236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:24] (step=0011204) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.21772250291488535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11205, "loss": 0.1774398386478424, "memory_gb": 7.721559524536133, "step_time_ms": 3350.041151046753, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:27] 
(step=0011205) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.21774193548387097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11206, "loss": 0.36928418278694153, "memory_gb": 7.721559524536133, "step_time_ms": 3360.562324523926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:31] (step=0011206) Train Loss: 0.2929, Train Steps/Sec: 0.28, Epoch: 0.2177613680528566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11207, "loss": 0.2690460979938507, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2281341552734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:35] (step=0011207) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.2177808006218422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11208, "loss": 0.18971766531467438, "memory_gb": 7.721559524536133, "step_time_ms": 3348.902463912964, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:38] (step=0011208) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.21780023319082784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11209, "loss": 0.17913968861103058, "memory_gb": 7.721559524536133, "step_time_ms": 3354.135513305664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:42] (step=0011209) Train Loss: 0.1574, Train Steps/Sec: 0.28, Epoch: 0.21781966575981346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11210, "loss": 0.15565401315689087, "memory_gb": 7.721559524536133, "step_time_ms": 3361.813545227051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:45] (step=0011210) Train Loss: 0.1650, Train Steps/Sec: 0.28, Epoch: 0.21783909832879908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:49] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11211, "loss": 0.13686421513557434, "memory_gb": 7.721559524536133, "step_time_ms": 3361.760377883911, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:49] (step=0011211) Train Loss: 0.1362, Train Steps/Sec: 0.28, Epoch: 0.21785853089778467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11212, "loss": 0.20845404267311096, "memory_gb": 7.721559524536133, "step_time_ms": 3363.553047180176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:52] (step=0011212) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.2178779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:13:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11213, "loss": 0.17096920311450958, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6558055877686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:13:56] (step=0011213) Train Loss: 0.1950, Train Steps/Sec: 0.28, Epoch: 0.21789739603575592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11214, "loss": 0.2944546639919281, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3009243011475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:00] (step=0011214) Train Loss: 0.2494, Train Steps/Sec: 0.27, Epoch: 0.21791682860474154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11215, "loss": 0.16357874870300293, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3365936279297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:03] (step=0011215) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.21793626117372716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11216, "loss": 0.11377651244401932, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2759113311768, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 11:14:07] (step=0011216) Train Loss: 0.1863, Train Steps/Sec: 0.28, Epoch: 0.21795569374271279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11217, "loss": 0.12186099588871002, "memory_gb": 7.721559524536133, "step_time_ms": 3358.891248703003, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:10] (step=0011217) Train Loss: 0.1591, Train Steps/Sec: 0.28, Epoch: 0.2179751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11218, "loss": 0.16439023613929749, "memory_gb": 7.721559524536133, "step_time_ms": 3357.405424118042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:14] (step=0011218) Train Loss: 0.1797, Train Steps/Sec: 0.28, Epoch: 0.21799455888068403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11219, "loss": 0.2875075936317444, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8694801330566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:18] (step=0011219) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.21801399144966965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11220, "loss": 0.20043541491031647, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4230880737305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:21] (step=0011220) Train Loss: 0.2785, Train Steps/Sec: 0.28, Epoch: 0.21803342401865528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11221, "loss": 0.13644751906394958, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7425899505615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:25] (step=0011221) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.2180528565876409, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 11:14:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11222, "loss": 0.12821707129478455, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0690155029297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:28] (step=0011222) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.21807228915662652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11223, "loss": 0.15732768177986145, "memory_gb": 7.721559524536133, "step_time_ms": 3360.962390899658, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:32] (step=0011223) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.21809172172561211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11224, "loss": 0.1661713719367981, "memory_gb": 7.721559524536133, "step_time_ms": 3362.583637237549, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:35] (step=0011224) Train Loss: 0.2098, Train Steps/Sec: 0.28, Epoch: 0.21811115429459774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11225, "loss": 0.10596118867397308, "memory_gb": 7.721559524536133, "step_time_ms": 3359.699249267578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:39] (step=0011225) Train Loss: 0.1231, Train Steps/Sec: 0.28, Epoch: 0.21813058686358336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11226, "loss": 0.3140786588191986, "memory_gb": 7.721559524536133, "step_time_ms": 3360.302209854126, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:43] (step=0011226) Train Loss: 0.3353, Train Steps/Sec: 0.28, Epoch: 0.21815001943256898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11227, "loss": 0.27652281522750854, "memory_gb": 7.715639114379883, "step_time_ms": 
3471.2531566619873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:46] (step=0011227) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.2181694520015546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11228, "loss": 0.29653507471084595, "memory_gb": 7.721559524536133, "step_time_ms": 3365.461587905884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:50] (step=0011228) Train Loss: 0.2954, Train Steps/Sec: 0.28, Epoch: 0.21818888457054023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11229, "loss": 0.22247576713562012, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9816093444824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:53] (step=0011229) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.21820831713952585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:14:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11230, "loss": 0.21567195653915405, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0311279296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:14:57] (step=0011230) Train Loss: 0.1977, Train Steps/Sec: 0.28, Epoch: 0.21822774970851147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11231, "loss": 0.23953863978385925, "memory_gb": 7.715639114379883, "step_time_ms": 3337.1422290802, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:00] (step=0011231) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.2182471822774971, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11232, "loss": 0.3369307518005371, "memory_gb": 7.721559524536133, "step_time_ms": 3372.2028732299805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:04] (step=0011232) Train Loss: 0.3155, Train Steps/Sec: 0.28, Epoch: 
0.21826661484648271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11233, "loss": 0.2585122585296631, "memory_gb": 7.721559524536133, "step_time_ms": 3364.455461502075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:08] (step=0011233) Train Loss: 0.2803, Train Steps/Sec: 0.28, Epoch: 0.21828604741546834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11234, "loss": 0.235981285572052, "memory_gb": 7.721559524536133, "step_time_ms": 3365.814447402954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:11] (step=0011234) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.21830547998445393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11235, "loss": 0.2442176640033722, "memory_gb": 7.721559524536133, "step_time_ms": 3370.917797088623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:15] (step=0011235) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.21832491255343955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11236, "loss": 0.2887396514415741, "memory_gb": 7.715639114379883, "step_time_ms": 3341.4089679718018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:18] (step=0011236) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.21834434512242518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11237, "loss": 0.27409985661506653, "memory_gb": 7.721559524536133, "step_time_ms": 3364.516019821167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:22] (step=0011237) Train Loss: 0.3110, Train Steps/Sec: 0.28, Epoch: 0.2183637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11238, "loss": 0.3209530711174011, 
"memory_gb": 7.721559524536133, "step_time_ms": 3368.286609649658, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:25] (step=0011238) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.21838321026039642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11239, "loss": 0.2719588577747345, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0972118377686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:29] (step=0011239) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.21840264282938204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11240, "loss": 0.2913808822631836, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7631874084473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:33] (step=0011240) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.21842207539836767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11241, "loss": 0.26517224311828613, "memory_gb": 7.721559524536133, "step_time_ms": 3363.769054412842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:36] (step=0011241) Train Loss: 0.2644, Train Steps/Sec: 0.28, Epoch: 0.2184415079673533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11242, "loss": 0.2889958620071411, "memory_gb": 7.721559524536133, "step_time_ms": 3362.185001373291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:40] (step=0011242) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.2184609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11243, "loss": 0.2145231068134308, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7774925231934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:43] (step=0011243) Train Loss: 0.2411, 
Train Steps/Sec: 0.28, Epoch: 0.21848037310532453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11244, "loss": 0.25672850012779236, "memory_gb": 7.721559524536133, "step_time_ms": 3363.558292388916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:47] (step=0011244) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.21849980567431015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11245, "loss": 0.19435033202171326, "memory_gb": 7.721559524536133, "step_time_ms": 3364.762783050537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:50] (step=0011245) Train Loss: 0.2114, Train Steps/Sec: 0.28, Epoch: 0.21851923824329578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11246, "loss": 0.117055743932724, "memory_gb": 7.721559524536133, "step_time_ms": 3357.177734375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:54] (step=0011246) Train Loss: 0.1263, Train Steps/Sec: 0.28, Epoch: 0.21853867081228137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:15:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11247, "loss": 0.36982816457748413, "memory_gb": 7.721559524536133, "step_time_ms": 3367.401361465454, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:15:58] (step=0011247) Train Loss: 0.3048, Train Steps/Sec: 0.28, Epoch: 0.218558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11248, "loss": 0.245986670255661, "memory_gb": 7.721559524536133, "step_time_ms": 3362.457275390625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:01] (step=0011248) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.21857753595025262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11249, "loss": 
0.20144256949424744, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1884841918945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:05] (step=0011249) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.21859696851923824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11250, "loss": 0.2557853162288666, "memory_gb": 7.721559524536133, "step_time_ms": 3362.410068511963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:08] (step=0011250) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.21861640108822386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11251, "loss": 0.28635409474372864, "memory_gb": 7.721559524536133, "step_time_ms": 3361.814022064209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:12] (step=0011251) Train Loss: 0.2524, Train Steps/Sec: 0.28, Epoch: 0.21863583365720948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11252, "loss": 0.256384015083313, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1115684509277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:16] (step=0011252) Train Loss: 0.2247, Train Steps/Sec: 0.28, Epoch: 0.2186552662261951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11253, "loss": 0.16194352507591248, "memory_gb": 7.721559524536133, "step_time_ms": 3362.812042236328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:19] (step=0011253) Train Loss: 0.1740, Train Steps/Sec: 0.28, Epoch: 0.21867469879518073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11254, "loss": 0.19386933743953705, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7402572631836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:23] 
(step=0011254) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.21869413136416635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11255, "loss": 0.2846447825431824, "memory_gb": 7.721559524536133, "step_time_ms": 3358.078718185425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:26] (step=0011255) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.21871356393315197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11256, "loss": 0.1681056171655655, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9696159362793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:30] (step=0011256) Train Loss: 0.1886, Train Steps/Sec: 0.28, Epoch: 0.2187329965021376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11257, "loss": 0.24549907445907593, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9090881347656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:33] (step=0011257) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.2187524290711232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11258, "loss": 0.2150980532169342, "memory_gb": 7.721559524536133, "step_time_ms": 3357.957601547241, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:37] (step=0011258) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.2187718616401088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11259, "loss": 0.206122025847435, "memory_gb": 7.721559524536133, "step_time_ms": 3353.825569152832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:41] (step=0011259) Train Loss: 0.2032, Train Steps/Sec: 0.28, Epoch: 0.21879129420909443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:44] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 11260, "loss": 0.249131441116333, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2983226776123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:44] (step=0011260) Train Loss: 0.2050, Train Steps/Sec: 0.28, Epoch: 0.21881072677808006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11261, "loss": 0.2536413073539734, "memory_gb": 7.721559524536133, "step_time_ms": 3359.389305114746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:48] (step=0011261) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.21883015934706568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11262, "loss": 0.2730940282344818, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3709678649902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:51] (step=0011262) Train Loss: 0.2720, Train Steps/Sec: 0.27, Epoch: 0.2188495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11263, "loss": 0.18301540613174438, "memory_gb": 7.721559524536133, "step_time_ms": 3359.128475189209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:55] (step=0011263) Train Loss: 0.1923, Train Steps/Sec: 0.28, Epoch: 0.21886902448503692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:16:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11264, "loss": 0.2308984398841858, "memory_gb": 7.721559524536133, "step_time_ms": 3359.78364944458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:16:59] (step=0011264) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.21888845705402254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:17:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11265, "loss": 0.13418427109718323, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7215671539307, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 11:17:02] (step=0011265) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.21890788962300817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:17:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11266, "loss": 0.27434539794921875, "memory_gb": 7.721559524536133, "step_time_ms": 3355.104446411133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:17:06] (step=0011266) Train Loss: 0.2914, Train Steps/Sec: 0.28, Epoch: 0.2189273221919938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:17:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11267, "loss": 0.27290597558021545, "memory_gb": 7.721559524536133, "step_time_ms": 3360.071897506714, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:17:09] (step=0011267) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.2189467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11268, "loss": 0.20230332016944885, "memory_gb": 7.721559524536133, "step_time_ms": 3347.817897796631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:17:13] (step=0011268) Train Loss: 0.1715, Train Steps/Sec: 0.28, Epoch: 0.21896618732996503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:17:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11269, "loss": 0.2899770736694336, "memory_gb": 7.721559524536133, "step_time_ms": 3356.384515762329, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:17:16] (step=0011269) Train Loss: 0.2841, Train Steps/Sec: 0.28, Epoch: 0.21898561989895063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:17:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11270, "loss": 0.3251105546951294, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3435802459717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:17:20] (step=0011270) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.21900505246793625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
11:17:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11271, "loss": 0.23593775928020477, "memory_gb": 7.721559524536133, "step_time_ms": 3357.875347137451, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:24] (step=0011271) Train Loss: 0.2161, Train Steps/Sec: 0.28, Epoch: 0.21902448503692187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11272, "loss": 0.277208149433136, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7210903167725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:27] (step=0011272) Train Loss: 0.2818, Train Steps/Sec: 0.28, Epoch: 0.2190439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11273, "loss": 0.30542001128196716, "memory_gb": 7.721559524536133, "step_time_ms": 3358.137607574463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:31] (step=0011273) Train Loss: 0.2272, Train Steps/Sec: 0.28, Epoch: 0.21906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11274, "loss": 0.232373908162117, "memory_gb": 7.721559524536133, "step_time_ms": 3356.492280960083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:34] (step=0011274) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.21908278274387874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11275, "loss": 0.1281486451625824, "memory_gb": 7.721559524536133, "step_time_ms": 3503.7930011749268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:38] (step=0011275) Train Loss: 0.1739, Train Steps/Sec: 0.28, Epoch: 0.21910221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11276, "loss": 0.2836536765098572, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9231548309326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:41] (step=0011276) Train Loss: 0.2082, Train Steps/Sec: 0.28, Epoch: 0.21912164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11277, "loss": 0.1950250267982483, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8614463806152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:45] (step=0011277) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.2191410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11278, "loss": 0.34511762857437134, "memory_gb": 7.721559524536133, "step_time_ms": 3355.820894241333, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:49] (step=0011278) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 0.21916051301982123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11279, "loss": 0.18423235416412354, "memory_gb": 7.721559524536133, "step_time_ms": 3357.550621032715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:52] (step=0011279) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.21917994558880685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11280, "loss": 0.11234814673662186, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7091693878174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:56] (step=0011280) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.21919937815779247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:17:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11281, "loss": 0.29727473855018616, "memory_gb": 7.721559524536133, "step_time_ms": 3353.034496307373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:17:59] (step=0011281) Train Loss: 0.2811, Train Steps/Sec: 0.28, Epoch: 0.21921881072677807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11282, "loss": 0.14985820651054382, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3600101470947, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:03] (step=0011282) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.2192382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11283, "loss": 0.2585204243659973, "memory_gb": 7.721559524536133, "step_time_ms": 3350.113868713379, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:06] (step=0011283) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.2192576758647493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11284, "loss": 0.2496243566274643, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3714027404785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:10] (step=0011284) Train Loss: 0.2445, Train Steps/Sec: 0.28, Epoch: 0.21927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11285, "loss": 0.20829778909683228, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2121601104736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:14] (step=0011285) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.21929654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11286, "loss": 0.2088341861963272, "memory_gb": 7.715639114379883, "step_time_ms": 3317.5809383392334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:17] (step=0011286) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.21931597357170618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11287, "loss": 0.16714483499526978, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1277389526367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:21] (step=0011287) Train Loss: 0.1910, Train Steps/Sec: 0.28, Epoch: 0.2193354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11288, "loss": 0.2470516711473465, "memory_gb": 7.721559524536133, "step_time_ms": 3350.313663482666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:24] (step=0011288) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.21935483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11289, "loss": 0.2906784117221832, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6034355163574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:28] (step=0011289) Train Loss: 0.2673, Train Steps/Sec: 0.28, Epoch: 0.21937427127866305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11290, "loss": 0.15865129232406616, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5097579956055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:31] (step=0011290) Train Loss: 0.1648, Train Steps/Sec: 0.28, Epoch: 0.21939370384764867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11291, "loss": 0.12388138473033905, "memory_gb": 7.721559524536133, "step_time_ms": 3354.355573654175, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:35] (step=0011291) Train Loss: 0.2174, Train Steps/Sec: 0.28, Epoch: 0.2194131364166343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11292, "loss": 0.2890429198741913, "memory_gb": 7.715639114379883, "step_time_ms": 3319.4024562835693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:39] (step=0011292) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.21943256898561989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11293, "loss": 0.23807929456233978, "memory_gb": 7.721559524536133, "step_time_ms": 3351.998805999756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:42] (step=0011293) Train Loss: 0.1819, Train Steps/Sec: 0.28, Epoch: 0.2194520015546055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11294, "loss": 0.23359693586826324, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8809547424316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:46] (step=0011294) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.21947143412359113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11295, "loss": 0.27310431003570557, "memory_gb": 7.721559524536133, "step_time_ms": 3355.102777481079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:49] (step=0011295) Train Loss: 0.3068, Train Steps/Sec: 0.28, Epoch: 0.21949086669257675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11296, "loss": 0.21088340878486633, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8596420288086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:53] (step=0011296) Train Loss: 0.2114, Train Steps/Sec: 0.28, Epoch: 0.21951029926156237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:18:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11297, "loss": 0.2301214635372162, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5429706573486, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:18:56] (step=0011297) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.219529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11298, "loss": 0.16719567775726318, "memory_gb": 7.721559524536133, "step_time_ms": 3354.499340057373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:00] (step=0011298) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.21954916439953362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11299, "loss": 0.2528102993965149, "memory_gb": 7.721559524536133, "step_time_ms": 3352.391004562378, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:03] (step=0011299) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.21956859696851924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11300, "loss": 0.24060086905956268, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8246364593506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:07] (step=0011300) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.21958802953750486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11301, "loss": 0.3471958637237549, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9495677948, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:11] (step=0011301) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.2196074621064905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11302, "loss": 0.3522191643714905, "memory_gb": 7.715639114379883, "step_time_ms": 3323.469877243042, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:14] (step=0011302) Train Loss: 0.2833, Train Steps/Sec: 0.27, Epoch: 0.2196268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11303, "loss": 0.31923550367355347, "memory_gb": 7.721559524536133, "step_time_ms": 3351.78279876709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:18] (step=0011303) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.21964632724446173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11304, "loss": 0.28204572200775146, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8057498931885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:21] (step=0011304) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.21966575981344733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11305, "loss": 0.24702101945877075, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4235229492188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:25] (step=0011305) Train Loss: 0.2473, Train Steps/Sec: 0.28, Epoch: 0.21968519238243295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11306, "loss": 0.32180851697921753, "memory_gb": 7.721559524536133, "step_time_ms": 3352.438449859619, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:29] (step=0011306) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.21970462495141857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11307, "loss": 0.275422066450119, "memory_gb": 7.721559524536133, "step_time_ms": 3357.886791229248, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:32] (step=0011307) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.2197240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11308, "loss": 0.24061407148838043, "memory_gb": 7.721559524536133, "step_time_ms": 3358.370542526245, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:36] (step=0011308) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.21974349008938981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11309, "loss": 0.11490410566329956, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9364337921143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:39] (step=0011309) Train Loss: 0.1319, Train Steps/Sec: 0.28, Epoch: 0.21976292265837544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11310, "loss": 0.14360147714614868, "memory_gb": 7.721559524536133, "step_time_ms": 3353.37233543396, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:43] (step=0011310) Train Loss: 0.1712, Train Steps/Sec: 0.28, Epoch: 0.21978235522736106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11311, "loss": 0.18554317951202393, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6905937194824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:46] (step=0011311) Train Loss: 0.1957, Train Steps/Sec: 0.28, Epoch: 0.21980178779634668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11312, "loss": 0.250137060880661, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3482246398926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:50] (step=0011312) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.2198212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11313, "loss": 0.2007264941930771, "memory_gb": 7.721559524536133, "step_time_ms": 3359.96413230896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:54] (step=0011313) Train Loss: 0.1743, Train Steps/Sec: 0.28, Epoch: 0.21984065293431793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:19:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11314, "loss": 0.3595580458641052, "memory_gb": 7.721559524536133, "step_time_ms": 3358.532667160034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:19:57] (step=0011314) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.21986008550330355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11315, "loss": 0.19270023703575134, "memory_gb": 7.721559524536133, "step_time_ms": 3502.1746158599854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:01] (step=0011315) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.21987951807228914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11316, "loss": 0.18614394962787628, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2472286224365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:04] (step=0011316) Train Loss: 0.1829, Train Steps/Sec: 0.28, Epoch: 0.21989895064127477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11317, "loss": 0.18607336282730103, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1406860351562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:08] (step=0011317) Train Loss: 0.2374, Train Steps/Sec: 0.28, Epoch: 0.2199183832102604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11318, "loss": 0.298936665058136, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3836040496826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:11] (step=0011318) Train Loss: 0.2911, Train Steps/Sec: 0.28, Epoch: 0.219937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11319, "loss": 0.12069407105445862, "memory_gb": 7.721559524536133, "step_time_ms": 3354.382038116455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:15] (step=0011319) Train Loss: 0.1737, Train Steps/Sec: 0.28, Epoch: 0.21995724834823163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11320, "loss": 0.18808603286743164, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6839179992676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:19] (step=0011320) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.21997668091721725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11321, "loss": 0.26365041732788086, "memory_gb": 7.721559524536133, "step_time_ms": 3346.8751907348633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:22] (step=0011321) Train Loss: 0.2330, Train Steps/Sec: 0.28, Epoch: 0.21999611348620288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11322, "loss": 0.30001112818717957, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3009243011475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:26] (step=0011322) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.2200155460551885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11323, "loss": 0.181609109044075, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1315116882324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:29] (step=0011323) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.22003497862417412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11324, "loss": 0.28760573267936707, "memory_gb": 7.721559524536133, "step_time_ms": 3358.67977142334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:33] (step=0011324) Train Loss: 0.2443, Train Steps/Sec: 0.28, Epoch: 0.22005441119315974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11325, "loss": 0.27198901772499084, "memory_gb": 7.721559524536133, "step_time_ms": 3364.293098449707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:36] (step=0011325) Train Loss: 0.2588, Train Steps/Sec: 0.28, Epoch: 0.22007384376214537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11326, "loss": 0.33062952756881714, "memory_gb": 7.721559524536133, "step_time_ms": 3344.958782196045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:40] (step=0011326) Train Loss: 0.2991, Train Steps/Sec: 0.28, Epoch: 0.220093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11327, "loss": 0.18982508778572083, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1837882995605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:44] (step=0011327) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.22011270890011658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11328, "loss": 0.22141113877296448, "memory_gb": 7.721559524536133, "step_time_ms": 3362.039566040039, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:47] (step=0011328) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.2201321414691022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11329, "loss": 0.12851974368095398, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7803840637207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:51] (step=0011329) Train Loss: 0.2178, Train Steps/Sec: 0.28, Epoch: 0.22015157403808783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11330, "loss": 0.21398577094078064, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2511882781982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:54] (step=0011330) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.22017100660707345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:20:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11331, "loss": 0.13477812707424164, "memory_gb": 7.721559524536133, "step_time_ms": 3357.865810394287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:20:58] (step=0011331) Train Loss: 0.1522, Train Steps/Sec: 0.28, Epoch: 0.22019043917605907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11332, "loss": 0.22544610500335693, "memory_gb": 7.721559524536133, "step_time_ms": 3359.083414077759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:01] (step=0011332) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.2202098717450447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11333, "loss": 0.27469807863235474, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0073795318604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:05] (step=0011333) Train Loss: 0.2725, Train Steps/Sec: 0.28, Epoch: 0.22022930431403032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11334, "loss": 0.2779577672481537, "memory_gb": 7.721559524536133, "step_time_ms": 3347.0091819763184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:09] (step=0011334) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.22024873688301594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11335, "loss": 0.13805559277534485, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6701107025146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:12] (step=0011335) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.22026816945200156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11336, "loss": 0.12975631654262543, "memory_gb": 7.715639114379883, "step_time_ms": 3323.0953216552734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:16] (step=0011336) Train Loss: 0.2077, Train Steps/Sec: 0.28, Epoch: 0.22028760202098718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11337, "loss": 0.22685584425926208, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9959869384766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:19] (step=0011337) Train Loss: 0.2951, Train Steps/Sec: 0.28, Epoch: 0.2203070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11338, "loss": 0.2396388202905655, "memory_gb": 7.721559524536133, "step_time_ms": 3353.879451751709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:23] (step=0011338) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.22032646715895843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11339, "loss": 0.2507878839969635, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1814556121826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:27] (step=0011339) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.22034589972794402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11340, "loss": 0.23998908698558807, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2823696136475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:30] (step=0011340) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.22036533229692964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11341, "loss": 0.32288798689842224, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9366416931152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:34] (step=0011341) Train Loss: 0.2764, Train Steps/Sec: 0.28, Epoch: 0.22038476486591527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11342, "loss": 0.13109637796878815, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5693321228027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:37] (step=0011342) Train Loss: 0.1481, Train Steps/Sec: 0.28, Epoch: 0.2204041974349009, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11343, "loss": 0.271198570728302, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8438968658447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:41] (step=0011343) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.2204236300038865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11344, "loss": 0.25615763664245605, "memory_gb": 7.721559524536133, "step_time_ms": 3364.727258682251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:44] (step=0011344) Train Loss: 0.2209, Train Steps/Sec: 0.28, Epoch: 0.22044306257287213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11345, "loss": 0.25455257296562195, "memory_gb": 7.721559524536133, "step_time_ms": 3359.116315841675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:48] (step=0011345) Train Loss: 0.2254, Train Steps/Sec: 0.28, Epoch: 0.22046249514185776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11346, "loss": 0.2009742558002472, "memory_gb": 7.721559524536133, "step_time_ms": 3360.469102859497, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:52] (step=0011346) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.22048192771084338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11347, "loss": 0.21430560946464539, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2134761810303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:55] (step=0011347) Train Loss: 0.2097, Train Steps/Sec: 0.28, Epoch: 0.220501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:21:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11348, "loss": 0.2769726812839508, "memory_gb": 7.721559524536133, "step_time_ms": 3360.728979110718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:21:59] (step=0011348) Train Loss: 0.2638, Train Steps/Sec: 0.28, Epoch: 0.22052079284881462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11349, "loss": 0.14032290875911713, "memory_gb": 7.721559524536133, "step_time_ms": 3361.971855163574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:03] (step=0011349) Train Loss: 0.1757, Train Steps/Sec: 0.27, Epoch: 0.22054022541780025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11350, "loss": 0.19401915371418, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8929901123047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:06] (step=0011350) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.22055965798678584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11351, "loss": 0.1542368233203888, "memory_gb": 7.721559524536133, "step_time_ms": 3356.949806213379, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:10] (step=0011351) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.22057909055577146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11352, "loss": 0.2925635576248169, "memory_gb": 7.715639114379883, "step_time_ms": 3329.3139934539795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:13] (step=0011352) Train Loss: 0.3177, Train Steps/Sec: 0.28, Epoch: 0.22059852312475708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11353, "loss": 0.20242659747600555, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0467491149902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:17] (step=0011353) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.2206179556937427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11354, "loss": 0.28637397289276123, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5497817993164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:20] (step=0011354) Train Loss: 0.2792, Train Steps/Sec: 0.28, Epoch: 0.22063738826272833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11355, "loss": 0.259737104177475, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1156005859375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:24] (step=0011355) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.22065682083171395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11356, "loss": 0.2494482696056366, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1325492858887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:28] (step=0011356) Train Loss: 0.2136, Train Steps/Sec: 0.28, Epoch: 0.22067625340069957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11357, "loss": 0.1562451869249344, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5122661590576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:31] (step=0011357) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.2206956859696852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11358, "loss": 0.3314213454723358, "memory_gb": 7.721559524536133, "step_time_ms": 3357.21492767334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:35] (step=0011358) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.22071511853867082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11359, "loss": 0.2694973647594452, "memory_gb": 7.721559524536133, "step_time_ms": 3359.821557998657, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:38] (step=0011359) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.22073455110765644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11360, "loss": 0.32680654525756836, "memory_gb": 7.721559524536133, "step_time_ms": 3360.137462615967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:42] (step=0011360) Train Loss: 0.3023, Train Steps/Sec: 0.28, Epoch: 0.22075398367664206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11361, "loss": 0.2903151512145996, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5376014709473, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:45] (step=0011361) Train Loss: 0.2678, Train Steps/Sec: 0.28, Epoch: 0.22077341624562768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11362, "loss": 0.2965293526649475, "memory_gb": 7.721559524536133, "step_time_ms": 3500.3089904785156, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:49] (step=0011362) Train Loss: 0.2662, Train Steps/Sec: 0.28, Epoch: 0.22079284881461328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11363, "loss": 0.2613517642021179, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4214191436768, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:53] (step=0011363) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.2208122813835989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:22:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11364, "loss": 0.1450105905532837, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7921390533447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:22:56] (step=0011364) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.22083171395258452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11365, "loss": 0.15655189752578735, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0705184936523, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:00] (step=0011365) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.22085114652157015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11366, "loss": 0.21520186960697174, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9350967407227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:03] (step=0011366) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.22087057909055577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11367, "loss": 0.26507580280303955, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5267791748047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:07] (step=0011367) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.2208900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11368, "loss": 0.3239850401878357, "memory_gb": 7.721559524536133, "step_time_ms": 3349.921226501465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:11] (step=0011368) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.220909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11369, "loss": 0.18378576636314392, "memory_gb": 7.721559524536133, "step_time_ms": 3353.179693222046, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:14] (step=0011369) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.22092887679751264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11370, "loss": 0.12164906412363052, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6443405151367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:18] (step=0011370) Train Loss: 0.1509, Train Steps/Sec: 0.28, Epoch: 0.22094830936649826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11371, "loss": 0.20787306129932404, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2707901000977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:21] (step=0011371) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.22096774193548388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11372, "loss": 0.3411442041397095, "memory_gb": 7.721559524536133, "step_time_ms": 3357.99503326416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:25] (step=0011372) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.2209871745044695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11373, "loss": 0.159386545419693, "memory_gb": 7.721559524536133, "step_time_ms": 3353.203535079956, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:28] (step=0011373) Train Loss: 0.2152, Train Steps/Sec: 0.28, Epoch: 0.2210066070734551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11374, "loss": 0.18885275721549988, "memory_gb": 7.721559524536133, "step_time_ms": 3349.794864654541, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:32] (step=0011374) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.22102603964244072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11375, "loss": 0.18899396061897278, "memory_gb": 7.721559524536133, "step_time_ms": 3355.268716812134, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:36] (step=0011375) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.22104547221142634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11376, "loss": 0.2679537236690521, "memory_gb": 7.721559524536133, "step_time_ms": 3339.7209644317627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:39] (step=0011376) Train Loss: 0.2887, Train Steps/Sec: 0.28, Epoch: 0.22106490478041196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11377, "loss": 0.22682970762252808, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8965454101562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:43] (step=0011377) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.22108433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11378, "loss": 0.19094756245613098, "memory_gb": 7.721559524536133, "step_time_ms": 3355.666160583496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:46] (step=0011378) Train Loss: 0.2316, Train Steps/Sec: 0.28, Epoch: 0.2211037699183832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11379, "loss": 0.15062913298606873, "memory_gb": 7.721559524536133, "step_time_ms": 3354.661226272583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:23:50] (step=0011379) Train Loss: 0.2620, Train Steps/Sec: 0.28, Epoch: 0.22112320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:23:53] EFFICIENCY_METRICS: {"epoch": 0,
0, "step": 11380, "loss": 0.24869760870933533, "memory_gb": 7.721559524536133, "step_time_ms": 3357.520341873169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:23:53] (step=0011380) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.22114263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:23:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11381, "loss": 0.1774204969406128, "memory_gb": 7.721559524536133, "step_time_ms": 3340.4860496520996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:23:57] (step=0011381) Train Loss: 0.1911, Train Steps/Sec: 0.28, Epoch: 0.22116206762534008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11382, "loss": 0.3393087387084961, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8947105407715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:01] (step=0011382) Train Loss: 0.3138, Train Steps/Sec: 0.28, Epoch: 0.2211815001943257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11383, "loss": 0.25229355692863464, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5633296966553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:04] (step=0011383) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.22120093276331132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11384, "loss": 0.16425199806690216, "memory_gb": 7.721559524536133, "step_time_ms": 3357.326030731201, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:08] (step=0011384) Train Loss: 0.2090, Train Steps/Sec: 0.28, Epoch: 0.22122036533229694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11385, "loss": 0.24299772083759308, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9813289642334, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 11:24:11] (step=0011385) Train Loss: 0.3030, Train Steps/Sec: 0.28, Epoch: 0.22123979790128254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11386, "loss": 0.24801819026470184, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1666011810303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:15] (step=0011386) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.22125923047026816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11387, "loss": 0.2328718602657318, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7188510894775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:18] (step=0011387) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.22127866303925378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11388, "loss": 0.2696431875228882, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3313694000244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:22] (step=0011388) Train Loss: 0.2886, Train Steps/Sec: 0.28, Epoch: 0.2212980956082394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11389, "loss": 0.3102368414402008, "memory_gb": 7.721559524536133, "step_time_ms": 3349.1928577423096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:26] (step=0011389) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.22131752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11390, "loss": 0.16612401604652405, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7049293518066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:29] (step=0011390) Train Loss: 0.1592, Train Steps/Sec: 0.28, Epoch: 0.22133696074621065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
11:24:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11391, "loss": 0.2630775570869446, "memory_gb": 7.721559524536133, "step_time_ms": 3358.595848083496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:33] (step=0011391) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.22135639331519627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11392, "loss": 0.23140069842338562, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5080890655518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:36] (step=0011392) Train Loss: 0.2076, Train Steps/Sec: 0.28, Epoch: 0.2213758258841819, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11393, "loss": 0.23960985243320465, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7213592529297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:40] (step=0011393) Train Loss: 0.2825, Train Steps/Sec: 0.28, Epoch: 0.22139525845316751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11394, "loss": 0.25529152154922485, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0704460144043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:43] (step=0011394) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.22141469102215314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11395, "loss": 0.1734362542629242, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7468605041504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:47] (step=0011395) Train Loss: 0.1801, Train Steps/Sec: 0.28, Epoch: 0.22143412359113876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11396, "loss": 0.3319410979747772, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5328330993652, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:51] (step=0011396) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.22145355616012438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11397, "loss": 0.24750933051109314, "memory_gb": 7.721559524536133, "step_time_ms": 3352.372884750366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:54] (step=0011397) Train Loss: 0.2608, Train Steps/Sec: 0.27, Epoch: 0.22147298872910998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:24:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11398, "loss": 0.20044049620628357, "memory_gb": 7.721559524536133, "step_time_ms": 3343.7294960021973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:24:58] (step=0011398) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.2214924212980956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11399, "loss": 0.20868325233459473, "memory_gb": 7.721559524536133, "step_time_ms": 3359.898567199707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:01] (step=0011399) Train Loss: 0.1873, Train Steps/Sec: 0.28, Epoch: 0.22151185386708122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11400, "loss": 0.29600971937179565, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2793731689453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:05] (step=0011400) Train Loss: 0.2820, Train Steps/Sec: 0.28, Epoch: 0.22153128643606684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11401, "loss": 0.2074982076883316, "memory_gb": 7.721559524536133, "step_time_ms": 3357.727527618408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:08] (step=0011401) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.22155071900505247, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11402, "loss": 0.20894065499305725, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5916500091553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:12] (step=0011402) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.2215701515740381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11403, "loss": 0.1356443464756012, "memory_gb": 7.721559524536133, "step_time_ms": 3504.0555000305176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:16] (step=0011403) Train Loss: 0.1685, Train Steps/Sec: 0.28, Epoch: 0.2215895841430237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11404, "loss": 0.27868181467056274, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3315143585205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:19] (step=0011404) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.22160901671200933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11405, "loss": 0.2957638204097748, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3669662475586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:23] (step=0011405) Train Loss: 0.2500, Train Steps/Sec: 0.28, Epoch: 0.22162844928099495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11406, "loss": 0.21701101958751678, "memory_gb": 7.721559524536133, "step_time_ms": 3355.772018432617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:26] (step=0011406) Train Loss: 0.2141, Train Steps/Sec: 0.28, Epoch: 0.22164788184998058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11407, "loss": 0.2492181658744812, "memory_gb": 7.721559524536133, 
"step_time_ms": 3361.8438243865967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:30] (step=0011407) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.2216673144189662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11408, "loss": 0.27195975184440613, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8385581970215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:33] (step=0011408) Train Loss: 0.3141, Train Steps/Sec: 0.28, Epoch: 0.2216867469879518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11409, "loss": 0.33326488733291626, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7936840057373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:37] (step=0011409) Train Loss: 0.2869, Train Steps/Sec: 0.28, Epoch: 0.22170617955693742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11410, "loss": 0.2971886694431305, "memory_gb": 7.721559524536133, "step_time_ms": 3360.938310623169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:41] (step=0011410) Train Loss: 0.2087, Train Steps/Sec: 0.28, Epoch: 0.22172561212592304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11411, "loss": 0.23642241954803467, "memory_gb": 7.721559524536133, "step_time_ms": 3364.405870437622, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:44] (step=0011411) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.22174504469490866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11412, "loss": 0.266262412071228, "memory_gb": 7.721559524536133, "step_time_ms": 3360.851764678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:48] (step=0011412) Train Loss: 0.2894, Train Steps/Sec: 0.28, Epoch: 
0.22176447726389428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11413, "loss": 0.3067338466644287, "memory_gb": 7.721559524536133, "step_time_ms": 3355.959415435791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:51] (step=0011413) Train Loss: 0.3277, Train Steps/Sec: 0.28, Epoch: 0.2217839098328799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11414, "loss": 0.25297003984451294, "memory_gb": 7.721559524536133, "step_time_ms": 3353.985071182251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:55] (step=0011414) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.22180334240186553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:25:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11415, "loss": 0.1804373413324356, "memory_gb": 7.721559524536133, "step_time_ms": 3362.271308898926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:25:58] (step=0011415) Train Loss: 0.1696, Train Steps/Sec: 0.28, Epoch: 0.22182277497085115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11416, "loss": 0.12560346722602844, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6062355041504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:02] (step=0011416) Train Loss: 0.1747, Train Steps/Sec: 0.28, Epoch: 0.22184220753983677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11417, "loss": 0.20419831573963165, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7876300811768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:06] (step=0011417) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.2218616401088224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11418, "loss": 0.19233642518520355, 
"memory_gb": 7.721559524536133, "step_time_ms": 3364.8674488067627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:09] (step=0011418) Train Loss: 0.1812, Train Steps/Sec: 0.28, Epoch: 0.22188107267780802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11419, "loss": 0.24842774868011475, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4492416381836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:13] (step=0011419) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.22190050524679364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11420, "loss": 0.29580387473106384, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6178245544434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:16] (step=0011420) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.22191993781577923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11421, "loss": 0.23047524690628052, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8876934051514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:20] (step=0011421) Train Loss: 0.2098, Train Steps/Sec: 0.28, Epoch: 0.22193937038476486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11422, "loss": 0.2636684477329254, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7146015167236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:24] (step=0011422) Train Loss: 0.3248, Train Steps/Sec: 0.28, Epoch: 0.22195880295375048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11423, "loss": 0.35352006554603577, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9067668914795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:27] (step=0011423) Train Loss: 
0.2836, Train Steps/Sec: 0.28, Epoch: 0.2219782355227361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11424, "loss": 0.2289431095123291, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3453845977783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:31] (step=0011424) Train Loss: 0.2288, Train Steps/Sec: 0.28, Epoch: 0.22199766809172172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11425, "loss": 0.18981590867042542, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9353046417236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:34] (step=0011425) Train Loss: 0.2076, Train Steps/Sec: 0.28, Epoch: 0.22201710066070734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11426, "loss": 0.32388123869895935, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2819442749023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:38] (step=0011426) Train Loss: 0.3018, Train Steps/Sec: 0.28, Epoch: 0.22203653322969297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11427, "loss": 0.15657994151115417, "memory_gb": 7.721559524536133, "step_time_ms": 3366.776704788208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:41] (step=0011427) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.2220559657986786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11428, "loss": 0.2001442313194275, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2104816436768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:45] (step=0011428) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.2220753983676642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 
11429, "loss": 0.2649546265602112, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0338859558105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:49] (step=0011429) Train Loss: 0.2536, Train Steps/Sec: 0.28, Epoch: 0.22209483093664983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11430, "loss": 0.1815689504146576, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7757720947266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:52] (step=0011430) Train Loss: 0.1890, Train Steps/Sec: 0.28, Epoch: 0.22211426350563546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11431, "loss": 0.2918057441711426, "memory_gb": 7.721559524536133, "step_time_ms": 3359.666347503662, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:56] (step=0011431) Train Loss: 0.3029, Train Steps/Sec: 0.28, Epoch: 0.22213369607462108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11432, "loss": 0.23269924521446228, "memory_gb": 7.721559524536133, "step_time_ms": 3372.4184036254883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:26:59] (step=0011432) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.22215312864360667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11433, "loss": 0.20563095808029175, "memory_gb": 7.721559524536133, "step_time_ms": 3370.103359222412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:03] (step=0011433) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.2221725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11434, "loss": 0.1943299025297165, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6018505096436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
11:27:06] (step=0011434) Train Loss: 0.2164, Train Steps/Sec: 0.28, Epoch: 0.22219199378157792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11435, "loss": 0.16700705885887146, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9473190307617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:10] (step=0011435) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.22221142635056354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11436, "loss": 0.3596845269203186, "memory_gb": 7.721559524536133, "step_time_ms": 3353.686809539795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:14] (step=0011436) Train Loss: 0.2934, Train Steps/Sec: 0.28, Epoch: 0.22223085891954916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11437, "loss": 0.1630897969007492, "memory_gb": 7.721559524536133, "step_time_ms": 3366.743803024292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:17] (step=0011437) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.22225029148853478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11438, "loss": 0.2844458520412445, "memory_gb": 7.721559524536133, "step_time_ms": 3372.380256652832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:21] (step=0011438) Train Loss: 0.2791, Train Steps/Sec: 0.27, Epoch: 0.2222697240575204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11439, "loss": 0.30080556869506836, "memory_gb": 7.721559524536133, "step_time_ms": 3369.500160217285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:24] (step=0011439) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.22228915662650603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:28] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11440, "loss": 0.24821744859218597, "memory_gb": 7.721559524536133, "step_time_ms": 3365.447759628296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:28] (step=0011440) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.22230858919549165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11441, "loss": 0.15100161731243134, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5401096343994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:32] (step=0011441) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.22232802176447727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11442, "loss": 0.25335946679115295, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4369583129883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:35] (step=0011442) Train Loss: 0.2979, Train Steps/Sec: 0.28, Epoch: 0.2223474543334629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11443, "loss": 0.3145507574081421, "memory_gb": 7.721559524536133, "step_time_ms": 3362.534523010254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:39] (step=0011443) Train Loss: 0.2448, Train Steps/Sec: 0.28, Epoch: 0.2223668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11444, "loss": 0.10764244198799133, "memory_gb": 7.721559524536133, "step_time_ms": 3508.079767227173, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:42] (step=0011444) Train Loss: 0.1530, Train Steps/Sec: 0.28, Epoch: 0.2223863194714341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11445, "loss": 0.26523417234420776, "memory_gb": 7.721559524536133, "step_time_ms": 3351.979970932007, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 11:27:46] (step=0011445) Train Loss: 0.2914, Train Steps/Sec: 0.28, Epoch: 0.22240575204041974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11446, "loss": 0.21609735488891602, "memory_gb": 7.721559524536133, "step_time_ms": 3368.271589279175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:50] (step=0011446) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.22242518460940536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11447, "loss": 0.31423771381378174, "memory_gb": 7.721559524536133, "step_time_ms": 3360.537528991699, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:53] (step=0011447) Train Loss: 0.2664, Train Steps/Sec: 0.28, Epoch: 0.22244461717839098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:27:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11448, "loss": 0.22495150566101074, "memory_gb": 7.721559524536133, "step_time_ms": 3364.008903503418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:27:57] (step=0011448) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.2224640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:28:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11449, "loss": 0.2820236086845398, "memory_gb": 7.721559524536133, "step_time_ms": 3367.499828338623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:28:00] (step=0011449) Train Loss: 0.3259, Train Steps/Sec: 0.28, Epoch: 0.22248348231636222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11450, "loss": 0.18479956686496735, "memory_gb": 7.721559524536133, "step_time_ms": 3364.78853225708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:28:04] (step=0011450) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.22250291488534785, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592
[2025-07-29 11:28:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11451, "loss": 0.3322297930717468, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9826259613037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:07] (step=0011451) Train Loss: 0.3617, Train Steps/Sec: 0.28, Epoch: 0.22252234745433347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11452, "loss": 0.25640562176704407, "memory_gb": 7.721559524536133, "step_time_ms": 3369.912624359131, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:11] (step=0011452) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.2225417800233191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11453, "loss": 0.2595784068107605, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6813888549805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:15] (step=0011453) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.2225612125923047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11454, "loss": 0.21575668454170227, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7641201019287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:18] (step=0011454) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.22258064516129034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11455, "loss": 0.32345104217529297, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0524616241455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:22] (step=0011455) Train Loss: 0.3126, Train Steps/Sec: 0.28, Epoch: 0.22260007773027593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11456, "loss": 0.19316613674163818, "memory_gb": 7.721559524536133, "step_time_ms": 3361.231803894043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:25] (step=0011456) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.22261951029926155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11457, "loss": 0.1759820431470871, "memory_gb": 7.721559524536133, "step_time_ms": 3362.393379211426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:29] (step=0011457) Train Loss: 0.1956, Train Steps/Sec: 0.28, Epoch: 0.22263894286824717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11458, "loss": 0.20879146456718445, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0007553100586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:33] (step=0011458) Train Loss: 0.1996, Train Steps/Sec: 0.28, Epoch: 0.2226583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11459, "loss": 0.14642776548862457, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2209300994873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:36] (step=0011459) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.22267780800621842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11460, "loss": 0.27510857582092285, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8664531707764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:40] (step=0011460) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.22269724057520404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11461, "loss": 0.19756077229976654, "memory_gb": 7.721559524536133, "step_time_ms": 3344.1643714904785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:43] (step=0011461) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.22271667314418966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11462, "loss": 0.2142542451620102, "memory_gb": 7.721559524536133, "step_time_ms": 3364.543914794922, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:47] (step=0011462) Train Loss: 0.2604, Train Steps/Sec: 0.28, Epoch: 0.2227361057131753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11463, "loss": 0.2547738552093506, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5917224884033, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:50] (step=0011463) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.2227555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11464, "loss": 0.3083740770816803, "memory_gb": 7.721559524536133, "step_time_ms": 3360.187768936157, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:54] (step=0011464) Train Loss: 0.2326, Train Steps/Sec: 0.28, Epoch: 0.22277497085114653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:28:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11465, "loss": 0.2381172776222229, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7487468719482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:28:58] (step=0011465) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.22279440342013215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11466, "loss": 0.16973023116588593, "memory_gb": 7.715639114379883, "step_time_ms": 3318.6697959899902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:01] (step=0011466) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.22281383598911775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11467, "loss": 0.26938626170158386, "memory_gb": 7.721559524536133, "step_time_ms": 3361.243486404419, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:05] (step=0011467) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.22283326855810337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11468, "loss": 0.30789411067962646, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0263175964355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:08] (step=0011468) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.222852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11469, "loss": 0.24027995765209198, "memory_gb": 7.721559524536133, "step_time_ms": 3353.085994720459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:12] (step=0011469) Train Loss: 0.2742, Train Steps/Sec: 0.28, Epoch: 0.22287213369607461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11470, "loss": 0.277182400226593, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7449741363525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:15] (step=0011470) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.22289156626506024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11471, "loss": 0.24168354272842407, "memory_gb": 7.721559524536133, "step_time_ms": 3359.557867050171, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:19] (step=0011471) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.22291099883404586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11472, "loss": 0.28566691279411316, "memory_gb": 7.721559524536133, "step_time_ms": 3357.367515563965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:23] (step=0011472) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.22293043140303148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11473, "loss": 0.32528001070022583, "memory_gb": 7.721559524536133, "step_time_ms": 3342.9503440856934, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:26] (step=0011473) Train Loss: 0.3048, Train Steps/Sec: 0.28, Epoch: 0.2229498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11474, "loss": 0.2510901987552643, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3991527557373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:30] (step=0011474) Train Loss: 0.3024, Train Steps/Sec: 0.28, Epoch: 0.22296929654100273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11475, "loss": 0.18536494672298431, "memory_gb": 7.721559524536133, "step_time_ms": 3355.642080307007, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:33] (step=0011475) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.22298872910998835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11476, "loss": 0.291537880897522, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6361408233643, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:37] (step=0011476) Train Loss: 0.2871, Train Steps/Sec: 0.28, Epoch: 0.22300816167897397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11477, "loss": 0.18689924478530884, "memory_gb": 7.721559524536133, "step_time_ms": 3354.279041290283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:40] (step=0011477) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.2230275942479596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11478, "loss": 0.3104765713214874, "memory_gb": 7.721559524536133, "step_time_ms": 3355.560302734375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:44] (step=0011478) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.2230470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11479, "loss": 0.22883065044879913, "memory_gb": 7.721559524536133, "step_time_ms": 3354.640245437622, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:48] (step=0011479) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.2230664593859308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11480, "loss": 0.250637948513031, "memory_gb": 7.721559524536133, "step_time_ms": 3357.616186141968, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:51] (step=0011480) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.22308589195491643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11481, "loss": 0.20796748995780945, "memory_gb": 7.721559524536133, "step_time_ms": 3345.529794692993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:55] (step=0011481) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.22310532452390205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:29:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11482, "loss": 0.26596778631210327, "memory_gb": 7.721559524536133, "step_time_ms": 3352.95033454895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:29:58] (step=0011482) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.22312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11483, "loss": 0.36548879742622375, "memory_gb": 7.721559524536133, "step_time_ms": 3355.174779891968, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:02] (step=0011483) Train Loss: 0.3172, Train Steps/Sec: 0.28, Epoch: 0.2231441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11484, "loss": 0.24929556250572205, "memory_gb": 7.721559524536133, "step_time_ms": 3351.2089252471924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:05] (step=0011484) Train Loss: 0.2590, Train Steps/Sec: 0.28, Epoch: 0.22316362223085892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11485, "loss": 0.11030176281929016, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7335815429688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:09] (step=0011485) Train Loss: 0.1880, Train Steps/Sec: 0.28, Epoch: 0.22318305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11486, "loss": 0.22480569779872894, "memory_gb": 7.721559524536133, "step_time_ms": 3347.6662635803223, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:13] (step=0011486) Train Loss: 0.2371, Train Steps/Sec: 0.27, Epoch: 0.22320248736883017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11487, "loss": 0.2039484679698944, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3848266601562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:16] (step=0011487) Train Loss: 0.2381, Train Steps/Sec: 0.28, Epoch: 0.2232219199378158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11488, "loss": 0.32325366139411926, "memory_gb": 7.721559524536133, "step_time_ms": 3356.68683052063, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:20] (step=0011488) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.2232413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11489, "loss": 0.2350645661354065, "memory_gb": 7.721559524536133, "step_time_ms": 3343.0237770080566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:23] (step=0011489) Train Loss: 0.2841, Train Steps/Sec: 0.28, Epoch: 0.22326078507578703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11490, "loss": 0.13668115437030792, "memory_gb": 7.721559524536133, "step_time_ms": 3354.670286178589, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:27] (step=0011490) Train Loss: 0.1710, Train Steps/Sec: 0.28, Epoch: 0.22328021764477263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11491, "loss": 0.28327515721321106, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6346893310547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:31] (step=0011491) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.22329965021375825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11492, "loss": 0.12963029742240906, "memory_gb": 7.721559524536133, "step_time_ms": 3494.377374649048, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:34] (step=0011492) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.22331908278274387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11493, "loss": 0.09552285820245743, "memory_gb": 7.721559524536133, "step_time_ms": 3351.264238357544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:38] (step=0011493) Train Loss: 0.1623, Train Steps/Sec: 0.28, Epoch: 0.2233385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11494, "loss": 0.23542049527168274, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8665771484375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:41] (step=0011494) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.22335794792071512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11495, "loss": 0.17829130589962006, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8314056396484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:45] (step=0011495) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.22337738048970074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11496, "loss": 0.14825257658958435, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0962677001953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:49] (step=0011496) Train Loss: 0.1701, Train Steps/Sec: 0.28, Epoch: 0.22339681305868636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11497, "loss": 0.15928179025650024, "memory_gb": 7.721559524536133, "step_time_ms": 3354.954719543457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:52] (step=0011497) Train Loss: 0.2289, Train Steps/Sec: 0.28, Epoch: 0.22341624562767198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11498, "loss": 0.2015969455242157, "memory_gb": 7.721559524536133, "step_time_ms": 3346.2929725646973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:56] (step=0011498) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.2234356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:30:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11499, "loss": 0.2060261070728302, "memory_gb": 7.721559524536133, "step_time_ms": 3352.436304092407, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:30:59] (step=0011499) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.22345511076564323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11500, "loss": 0.278165340423584, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0896644592285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:03] (step=0011500) Train Loss: 0.2835, Train Steps/Sec: 0.28, Epoch: 0.22347454333462885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11501, "loss": 0.36574095487594604, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9575805664062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:06] (step=0011501) Train Loss: 0.3579, Train Steps/Sec: 0.28, Epoch: 0.22349397590361444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11502, "loss": 0.2704049348831177, "memory_gb": 7.721559524536133, "step_time_ms": 3351.4137268066406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:10] (step=0011502) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.22351340847260007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11503, "loss": 0.20600536465644836, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5501747131348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:14] (step=0011503) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.2235328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11504, "loss": 0.20791643857955933, "memory_gb": 7.721559524536133, "step_time_ms": 3356.351375579834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:17] (step=0011504) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.2235522736105713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11505, "loss": 0.18648065626621246, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8973846435547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:21] (step=0011505) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.22357170617955693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11506, "loss": 0.26579052209854126, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7559719085693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:24] (step=0011506) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.22359113874854256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11507, "loss": 0.15623041987419128, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9485416412354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:28] (step=0011507) Train Loss: 0.2140, Train Steps/Sec: 0.28, Epoch: 0.22361057131752818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11508, "loss": 0.2804414629936218, "memory_gb": 7.721559524536133, "step_time_ms": 3345.5324172973633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:31] (step=0011508) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.2236300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11509, "loss": 0.2697637677192688, "memory_gb": 7.715639114379883, "step_time_ms": 3304.619312286377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:35] (step=0011509) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.22364943645549942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11510, "loss": 0.20362038910388947, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5802783966064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:39] (step=0011510) Train Loss: 0.2719, Train Steps/Sec: 0.28, Epoch: 0.22366886902448505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11511, "loss": 0.18514582514762878, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7072620391846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:42] (step=0011511) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.22368830159347067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11512, "loss": 0.1843211054801941, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1129570007324, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:46] (step=0011512) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.2237077341624563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11513, "loss": 0.22192132472991943, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0101680755615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:49] (step=0011513) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.22372716673144188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11514, "loss": 0.22728052735328674, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8691272735596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:53] (step=0011514) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.2237465993004275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:31:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11515, "loss": 0.3075852692127228, "memory_gb": 7.721559524536133, "step_time_ms": 3360.33034324646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:31:56] (step=0011515) Train Loss: 0.2706, Train Steps/Sec: 0.28, Epoch: 0.22376603186941313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11516, "loss": 0.18119680881500244, "memory_gb": 7.721559524536133, "step_time_ms": 3357.947826385498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:00] (step=0011516) Train Loss: 0.1462, Train Steps/Sec: 0.28, Epoch: 0.22378546443839875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11517, "loss": 0.24121299386024475, "memory_gb": 7.721559524536133, "step_time_ms": 3358.189105987549, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:04] (step=0011517) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.22380489700738437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11518, "loss": 0.2769339978694916, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3595027923584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:07] (step=0011518) Train Loss: 0.2538, Train Steps/Sec: 0.28, Epoch: 0.22382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11519, "loss": 0.34467118978500366, "memory_gb": 7.721559524536133, "step_time_ms": 3368.878126144409, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:11] (step=0011519) Train Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.22384376214535562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11520, "loss": 0.3071928918361664, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6280040740967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:14] (step=0011520) Train Loss: 0.3214, Train Steps/Sec: 0.28, Epoch: 0.22386319471434124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11521, "loss": 0.2549540102481842, "memory_gb": 7.721559524536133, "step_time_ms": 3365.828275680542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:18] (step=0011521) Train Loss: 0.2553, Train Steps/Sec: 0.28, Epoch: 0.22388262728332686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11522, "loss": 0.1276429295539856, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6316108703613, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:21] (step=0011522) Train Loss: 0.1854, Train Steps/Sec: 0.28, Epoch: 0.22390205985231248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11523, "loss": 0.24303552508354187, "memory_gb": 7.721559524536133, "step_time_ms": 3364.365816116333, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:25] (step=0011523) Train Loss: 0.2133, Train Steps/Sec: 0.28, Epoch: 0.2239214924212981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11524, "loss": 0.14620515704154968, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3104305267334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:29] (step=0011524) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.2239409249902837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11525, "loss": 0.2462657243013382, "memory_gb": 7.721559524536133, "step_time_ms": 3365.537166595459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:32] (step=0011525) Train Loss: 0.2866, Train Steps/Sec: 0.28, Epoch: 0.22396035755926932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11526, "loss": 0.23042765259742737, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9132766723633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:36] (step=0011526) Train Loss: 0.1972, Train Steps/Sec: 0.27, Epoch: 0.22397979012825495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11527, "loss": 0.3120020031929016, "memory_gb": 7.721559524536133, "step_time_ms": 3362.476348876953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:40] (step=0011527) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.22399922269724057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11528, "loss": 0.20020529627799988, "memory_gb": 7.721559524536133, "step_time_ms": 3363.286256790161, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:43] (step=0011528) Train Loss: 0.2245, Train Steps/Sec: 0.28, Epoch: 0.2240186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11529, "loss": 0.21134456992149353, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0877265930176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:47] (step=0011529) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.2240380878352118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11530, "loss": 0.2824494242668152, "memory_gb": 7.721559524536133, "step_time_ms": 3361.783504486084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:50] (step=0011530) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.22405752040419744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11531, "loss": 0.29166343808174133, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2468452453613, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:54] (step=0011531) Train Loss: 0.2443, Train Steps/Sec: 0.28, Epoch: 0.22407695297318306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:32:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11532, "loss": 0.3171209990978241, "memory_gb": 7.721559524536133, "step_time_ms": 3361.988067626953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:32:57] (step=0011532) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.22409638554216868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11533, "loss": 0.24209515750408173, "memory_gb": 7.721559524536133, "step_time_ms": 3506.157159805298, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:01] (step=0011533) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.2241158181111543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11534, "loss": 0.23014575242996216, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4091148376465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:05] (step=0011534) Train Loss: 0.2078, Train Steps/Sec: 0.28, Epoch: 0.22413525068013992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11535, "loss": 0.1721188724040985, "memory_gb": 7.721559524536133, "step_time_ms": 3369.4796562194824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:08] (step=0011535) Train Loss: 0.1515, Train Steps/Sec: 0.28, Epoch: 0.22415468324912555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11536, "loss": 0.23484963178634644, "memory_gb": 7.721559524536133, "step_time_ms": 3364.811658859253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:12] (step=0011536) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.22417411581811114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11537, "loss": 0.20759055018424988, "memory_gb": 7.721559524536133, "step_time_ms": 3363.271713256836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:15] (step=0011537) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.22419354838709676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11538, "loss": 0.25187546014785767, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5060272216797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:19] (step=0011538) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.22421298095608239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11539, "loss": 0.26766955852508545, "memory_gb": 7.721559524536133, "step_time_ms": 3359.412908554077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:22] (step=0011539) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.224232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11540, "loss": 0.16814851760864258, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4680347442627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:26] (step=0011540) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.22425184609405363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11541, "loss": 0.2608467936515808, "memory_gb": 7.715639114379883, "step_time_ms": 3338.1752967834473, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:30] (step=0011541) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.22427127866303925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11542, "loss": 0.17201101779937744, "memory_gb": 7.721559524536133, "step_time_ms": 3366.816520690918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:33] (step=0011542) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.22429071123202488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11543, "loss": 0.1505173146724701, "memory_gb": 7.721559524536133, "step_time_ms": 3368.589162826538, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:37] (step=0011543) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.2243101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11544, "loss": 0.11872455477714539, "memory_gb": 7.721559524536133, "step_time_ms": 3348.475933074951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:40] (step=0011544) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.22432957636999612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11545, "loss": 0.1412162184715271, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4510765075684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:44] (step=0011545) Train Loss: 0.1909, Train Steps/Sec: 0.28, Epoch: 0.22434900893898174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11546, "loss": 0.28809526562690735, "memory_gb": 7.721559524536133, "step_time_ms": 3362.675666809082, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:48] (step=0011546) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.22436844150796736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11547, "loss": 0.27767467498779297, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3900413513184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:51] (step=0011547) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.224387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11548, "loss": 0.2718985080718994, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7185821533203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:55] (step=0011548) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.22440730664593858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:33:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11549, "loss": 0.2940610647201538, "memory_gb": 7.721559524536133, "step_time_ms": 3368.274211883545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:33:58] (step=0011549) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.2244267392149242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11550, "loss": 0.1013852208852768, "memory_gb": 7.721559524536133, "step_time_ms": 3358.609914779663, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:02] (step=0011550) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.22444617178390983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11551, "loss": 0.19660672545433044, "memory_gb": 7.721559524536133, "step_time_ms": 3360.175848007202, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:05] (step=0011551) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.22446560435289545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11552, "loss": 0.34695637226104736, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8343601226807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:09] (step=0011552) Train Loss: 0.3002, Train Steps/Sec: 0.28, Epoch: 0.22448503692188107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11553, "loss": 0.26080119609832764, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7213916778564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:13] (step=0011553) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 0.2245044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11554, "loss": 0.2794729471206665, "memory_gb": 7.721559524536133, "step_time_ms": 3362.205982208252, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:16] (step=0011554) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.22452390205985231, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11555, "loss": 0.175923690199852, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3107204437256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:20] (step=0011555) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.22454333462883794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11556, "loss": 0.21902354061603546, "memory_gb": 7.721559524536133, "step_time_ms": 3360.663652420044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:23] (step=0011556) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.22456276719782356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11557, "loss": 0.3143576681613922, "memory_gb": 7.721559524536133, "step_time_ms": 3365.776777267456, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:27] (step=0011557) Train Loss: 0.2968, Train Steps/Sec: 0.28, Epoch: 0.22458219976680918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11558, "loss": 0.2859750986099243, "memory_gb": 7.721559524536133, "step_time_ms": 3363.743782043457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:31] (step=0011558) Train Loss: 0.2443, Train Steps/Sec: 0.28, Epoch: 0.2246016323357948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:34:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11559, "loss": 0.25499844551086426, "memory_gb": 7.721559524536133, "step_time_ms": 3347.230911254883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:34:34] (step=0011559) Train Loss: 0.2926, Train Steps/Sec: 0.28, Epoch: 0.2246210649047804, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:34:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11560, "loss": 0.10571016371250153, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7445697784424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:34:38] (step=0011560) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.22464049747376602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:34:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11561, "loss": 0.1671009510755539, "memory_gb": 7.721559524536133, "step_time_ms": 3365.49711227417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:34:41] (step=0011561) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.22465993004275164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:34:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11562, "loss": 0.23454611003398895, "memory_gb": 7.721559524536133, "step_time_ms": 3362.001895904541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:34:45] (step=0011562) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.22467936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:34:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11563, "loss": 0.25310614705085754, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9725704193115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:34:49] (step=0011563) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.2246987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:34:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11564, "loss": 0.24239149689674377, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4534816741943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:34:52] (step=0011564) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.2247182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:34:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11565, "loss": 0.3602757453918457, "memory_gb": 
7.721559524536133, "step_time_ms": 3356.8108081817627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:34:56] (step=0011565) Train Loss: 0.3074, Train Steps/Sec: 0.28, Epoch: 0.22473766031869413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:34:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11566, "loss": 0.22318971157073975, "memory_gb": 7.715639114379883, "step_time_ms": 3323.780059814453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:34:59] (step=0011566) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.22475709288767975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11567, "loss": 0.1661549210548401, "memory_gb": 7.721559524536133, "step_time_ms": 3360.583543777466, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:03] (step=0011567) Train Loss: 0.2654, Train Steps/Sec: 0.28, Epoch: 0.22477652545666538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11568, "loss": 0.19767025113105774, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6528606414795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:06] (step=0011568) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.224795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11569, "loss": 0.3164617419242859, "memory_gb": 7.721559524536133, "step_time_ms": 3361.644506454468, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:10] (step=0011569) Train Loss: 0.3355, Train Steps/Sec: 0.28, Epoch: 0.22481539059463662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11570, "loss": 0.25527000427246094, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2397956848145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:14] (step=0011570) Train Loss: 0.2905, Train 
Steps/Sec: 0.28, Epoch: 0.22483482316362224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11571, "loss": 0.2882941663265228, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5033626556396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:17] (step=0011571) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.22485425573260784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11572, "loss": 0.27390849590301514, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0398139953613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:21] (step=0011572) Train Loss: 0.2880, Train Steps/Sec: 0.28, Epoch: 0.22487368830159346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11573, "loss": 0.31918951869010925, "memory_gb": 7.721559524536133, "step_time_ms": 3360.412836074829, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:25] (step=0011573) Train Loss: 0.2618, Train Steps/Sec: 0.27, Epoch: 0.22489312087057908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11574, "loss": 0.2463233470916748, "memory_gb": 7.721559524536133, "step_time_ms": 3357.574224472046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:28] (step=0011574) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.2249125534395647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11575, "loss": 0.32531648874282837, "memory_gb": 7.721559524536133, "step_time_ms": 3358.771800994873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:32] (step=0011575) Train Loss: 0.2804, Train Steps/Sec: 0.28, Epoch: 0.22493198600855033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11576, "loss": 
0.2675431966781616, "memory_gb": 7.721559524536133, "step_time_ms": 3358.280897140503, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:35] (step=0011576) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.22495141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11577, "loss": 0.26945430040359497, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3128967285156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:39] (step=0011577) Train Loss: 0.3014, Train Steps/Sec: 0.28, Epoch: 0.22497085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11578, "loss": 0.20044387876987457, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9959869384766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:42] (step=0011578) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.2249902837155072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11579, "loss": 0.18966706097126007, "memory_gb": 7.721559524536133, "step_time_ms": 3353.294849395752, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:46] (step=0011579) Train Loss: 0.2153, Train Steps/Sec: 0.28, Epoch: 0.22500971628449282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11580, "loss": 0.2395819127559662, "memory_gb": 7.721559524536133, "step_time_ms": 3536.139726638794, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:50] (step=0011580) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 0.22502914885347844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11581, "loss": 0.33639246225357056, "memory_gb": 7.721559524536133, "step_time_ms": 3355.881452560425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:53] 
(step=0011581) Train Loss: 0.3220, Train Steps/Sec: 0.28, Epoch: 0.22504858142246406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:35:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11582, "loss": 0.21561704576015472, "memory_gb": 7.721559524536133, "step_time_ms": 3357.837438583374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:35:57] (step=0011582) Train Loss: 0.2724, Train Steps/Sec: 0.28, Epoch: 0.22506801399144966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11583, "loss": 0.2155984789133072, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2535820007324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:00] (step=0011583) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.22508744656043528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11584, "loss": 0.2691715955734253, "memory_gb": 7.721559524536133, "step_time_ms": 3357.675790786743, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:04] (step=0011584) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.2251068791294209, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11585, "loss": 0.3092530071735382, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7371368408203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:07] (step=0011585) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.22512631169840652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11586, "loss": 0.2941552996635437, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9584617614746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:11] (step=0011586) Train Loss: 0.2471, Train Steps/Sec: 0.28, Epoch: 0.22514574426739214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:15] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11587, "loss": 0.28596505522727966, "memory_gb": 7.715639114379883, "step_time_ms": 3322.2813606262207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:15] (step=0011587) Train Loss: 0.2619, Train Steps/Sec: 0.28, Epoch: 0.22516517683637777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11588, "loss": 0.280392587184906, "memory_gb": 7.721559524536133, "step_time_ms": 3355.496644973755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:18] (step=0011588) Train Loss: 0.2893, Train Steps/Sec: 0.28, Epoch: 0.2251846094053634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11589, "loss": 0.27177053689956665, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5290699005127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:22] (step=0011589) Train Loss: 0.2593, Train Steps/Sec: 0.28, Epoch: 0.225204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11590, "loss": 0.22051408886909485, "memory_gb": 7.721559524536133, "step_time_ms": 3340.5137062072754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:25] (step=0011590) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.22522347454333463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11591, "loss": 0.2956699728965759, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4417362213135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:29] (step=0011591) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.22524290711232026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11592, "loss": 0.18831312656402588, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0240268707275, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 11:36:32] (step=0011592) Train Loss: 0.2063, Train Steps/Sec: 0.28, Epoch: 0.22526233968130588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11593, "loss": 0.2703549861907959, "memory_gb": 7.721559524536133, "step_time_ms": 3358.314037322998, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:36] (step=0011593) Train Loss: 0.2538, Train Steps/Sec: 0.28, Epoch: 0.2252817722502915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11594, "loss": 0.243150994181633, "memory_gb": 7.721559524536133, "step_time_ms": 3346.6718196868896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:40] (step=0011594) Train Loss: 0.2870, Train Steps/Sec: 0.28, Epoch: 0.2253012048192771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11595, "loss": 0.24753902852535248, "memory_gb": 7.721559524536133, "step_time_ms": 3357.971668243408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:43] (step=0011595) Train Loss: 0.2156, Train Steps/Sec: 0.28, Epoch: 0.22532063738826272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11596, "loss": 0.20227539539337158, "memory_gb": 7.715639114379883, "step_time_ms": 3315.9570693969727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:47] (step=0011596) Train Loss: 0.2161, Train Steps/Sec: 0.28, Epoch: 0.22534006995724834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11597, "loss": 0.22416314482688904, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4161014556885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:50] (step=0011597) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.22535950252623396, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 11:36:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11598, "loss": 0.20120719075202942, "memory_gb": 7.721559524536133, "step_time_ms": 3360.095500946045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:54] (step=0011598) Train Loss: 0.1594, Train Steps/Sec: 0.28, Epoch: 0.22537893509521958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:36:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11599, "loss": 0.25971537828445435, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2475910186768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:36:57] (step=0011599) Train Loss: 0.2538, Train Steps/Sec: 0.28, Epoch: 0.2253983676642052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11600, "loss": 0.16671669483184814, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5150032043457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:01] (step=0011600) Train Loss: 0.1835, Train Steps/Sec: 0.28, Epoch: 0.22541780023319083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11601, "loss": 0.2894088625907898, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4139347076416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:05] (step=0011601) Train Loss: 0.3105, Train Steps/Sec: 0.28, Epoch: 0.22543723280217645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11602, "loss": 0.25976210832595825, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7613830566406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:08] (step=0011602) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.22545666537116207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11603, "loss": 0.28646665811538696, "memory_gb": 7.721559524536133, "step_time_ms": 
3359.525203704834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:12] (step=0011603) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.2254760979401477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11604, "loss": 0.23282024264335632, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4204444885254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:15] (step=0011604) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.22549553050913332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11605, "loss": 0.19915062189102173, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9844703674316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:19] (step=0011605) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.22551496307811894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11606, "loss": 0.2632812261581421, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8223247528076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:22] (step=0011606) Train Loss: 0.2505, Train Steps/Sec: 0.28, Epoch: 0.22553439564710454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11607, "loss": 0.30963677167892456, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8382148742676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:26] (step=0011607) Train Loss: 0.3118, Train Steps/Sec: 0.28, Epoch: 0.22555382821609016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11608, "loss": 0.2340201437473297, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1528663635254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:30] (step=0011608) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 
0.22557326078507578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11609, "loss": 0.27822017669677734, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1604747772217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:33] (step=0011609) Train Loss: 0.2803, Train Steps/Sec: 0.28, Epoch: 0.2255926933540614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11610, "loss": 0.28225696086883545, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5518131256104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:37] (step=0011610) Train Loss: 0.2983, Train Steps/Sec: 0.28, Epoch: 0.22561212592304702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11611, "loss": 0.18759465217590332, "memory_gb": 7.721559524536133, "step_time_ms": 3358.245611190796, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:40] (step=0011611) Train Loss: 0.2357, Train Steps/Sec: 0.28, Epoch: 0.22563155849203265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11612, "loss": 0.22309726476669312, "memory_gb": 7.721559524536133, "step_time_ms": 3357.705593109131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:44] (step=0011612) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.22565099106101827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11613, "loss": 0.3250390291213989, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1716289520264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:48] (step=0011613) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.2256704236300039, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11614, "loss": 0.30178558826446533, 
"memory_gb": 7.721559524536133, "step_time_ms": 3356.8527698516846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:51] (step=0011614) Train Loss: 0.2465, Train Steps/Sec: 0.27, Epoch: 0.2256898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11615, "loss": 0.27374184131622314, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7733974456787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:55] (step=0011615) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.22570928876797514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:37:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11616, "loss": 0.36461561918258667, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9183349609375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:37:58] (step=0011616) Train Loss: 0.3026, Train Steps/Sec: 0.28, Epoch: 0.22572872133696076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11617, "loss": 0.18784253299236298, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3802967071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:02] (step=0011617) Train Loss: 0.1935, Train Steps/Sec: 0.28, Epoch: 0.22574815390594635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11618, "loss": 0.28158628940582275, "memory_gb": 7.721559524536133, "step_time_ms": 3362.460136413574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:06] (step=0011618) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.22576758647493197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11619, "loss": 0.16582950949668884, "memory_gb": 7.721559524536133, "step_time_ms": 3361.677646636963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:09] (step=0011619) Train Loss: 
0.1996, Train Steps/Sec: 0.28, Epoch: 0.2257870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11620, "loss": 0.15416213870048523, "memory_gb": 7.721559524536133, "step_time_ms": 3361.339807510376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:13] (step=0011620) Train Loss: 0.1764, Train Steps/Sec: 0.28, Epoch: 0.22580645161290322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11621, "loss": 0.34374985098838806, "memory_gb": 7.721559524536133, "step_time_ms": 3499.799966812134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:16] (step=0011621) Train Loss: 0.3026, Train Steps/Sec: 0.28, Epoch: 0.22582588418188884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11622, "loss": 0.16923058032989502, "memory_gb": 7.721559524536133, "step_time_ms": 3357.60235786438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:20] (step=0011622) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.22584531675087446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11623, "loss": 0.33550381660461426, "memory_gb": 7.721559524536133, "step_time_ms": 3346.5137481689453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:23] (step=0011623) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.2258647493198601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11624, "loss": 0.1931561380624771, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4307899475098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:27] (step=0011624) Train Loss: 0.1996, Train Steps/Sec: 0.28, Epoch: 0.2258841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 
11625, "loss": 0.21268975734710693, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0875911712646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:31] (step=0011625) Train Loss: 0.2029, Train Steps/Sec: 0.28, Epoch: 0.22590361445783133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11626, "loss": 0.17699357867240906, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6977462768555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:34] (step=0011626) Train Loss: 0.2634, Train Steps/Sec: 0.28, Epoch: 0.22592304702681695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11627, "loss": 0.1825924515724182, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9563369750977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:38] (step=0011627) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.22594247959580258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11628, "loss": 0.3021382987499237, "memory_gb": 7.721559524536133, "step_time_ms": 3364.736795425415, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:41] (step=0011628) Train Loss: 0.2920, Train Steps/Sec: 0.28, Epoch: 0.2259619121647882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11629, "loss": 0.26446259021759033, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1652641296387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:45] (step=0011629) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.2259813447337738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11630, "loss": 0.20332062244415283, "memory_gb": 7.721559524536133, "step_time_ms": 3364.678382873535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
11:38:48] (step=0011630) Train Loss: 0.2011, Train Steps/Sec: 0.28, Epoch: 0.22600077730275941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11631, "loss": 0.2615828514099121, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1673583984375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:52] (step=0011631) Train Loss: 0.2811, Train Steps/Sec: 0.28, Epoch: 0.22602020987174504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11632, "loss": 0.16076283156871796, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9826049804688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:56] (step=0011632) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.22603964244073066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:38:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11633, "loss": 0.21901866793632507, "memory_gb": 7.721559524536133, "step_time_ms": 3358.752489089966, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:38:59] (step=0011633) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.22605907500971628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11634, "loss": 0.19395047426223755, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1346950531006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:03] (step=0011634) Train Loss: 0.1918, Train Steps/Sec: 0.28, Epoch: 0.2260785075787019, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11635, "loss": 0.31406527757644653, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5054874420166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:06] (step=0011635) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.22609794014768753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:10] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11636, "loss": 0.3681071996688843, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2072677612305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:10] (step=0011636) Train Loss: 0.3139, Train Steps/Sec: 0.28, Epoch: 0.22611737271667315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11637, "loss": 0.21117711067199707, "memory_gb": 7.721559524536133, "step_time_ms": 3367.262601852417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:13] (step=0011637) Train Loss: 0.2033, Train Steps/Sec: 0.28, Epoch: 0.22613680528565877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11638, "loss": 0.2665891647338867, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4747009277344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:17] (step=0011638) Train Loss: 0.2105, Train Steps/Sec: 0.28, Epoch: 0.2261562378546444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11639, "loss": 0.19035789370536804, "memory_gb": 7.721559524536133, "step_time_ms": 3364.114999771118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:21] (step=0011639) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.22617567042363002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11640, "loss": 0.2316020280122757, "memory_gb": 7.721559524536133, "step_time_ms": 3356.99725151062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:24] (step=0011640) Train Loss: 0.2000, Train Steps/Sec: 0.28, Epoch: 0.22619510299261564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11641, "loss": 0.28245818614959717, "memory_gb": 7.721559524536133, "step_time_ms": 3367.243528366089, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 11:39:28] (step=0011641) Train Loss: 0.2945, Train Steps/Sec: 0.28, Epoch: 0.22621453556160123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11642, "loss": 0.21609236299991608, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8511638641357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:31] (step=0011642) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.22623396813058685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11643, "loss": 0.22775088250637054, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5782260894775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:35] (step=0011643) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.22625340069957248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11644, "loss": 0.19205304980278015, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7163124084473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:39] (step=0011644) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.2262728332685581, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11645, "loss": 0.21258202195167542, "memory_gb": 7.721559524536133, "step_time_ms": 3364.976644515991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:42] (step=0011645) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.22629226583754372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11646, "loss": 0.22998681664466858, "memory_gb": 7.721559524536133, "step_time_ms": 3357.158899307251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:46] (step=0011646) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.22631169840652934, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 11:39:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11647, "loss": 0.319236159324646, "memory_gb": 7.721559524536133, "step_time_ms": 3365.115165710449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:49] (step=0011647) Train Loss: 0.2869, Train Steps/Sec: 0.28, Epoch: 0.22633113097551497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11648, "loss": 0.22984860837459564, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6671772003174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:53] (step=0011648) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.2263505635445006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11649, "loss": 0.20004546642303467, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6784343719482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:39:57] (step=0011649) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.2263699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11650, "loss": 0.3054853081703186, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7847061157227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:00] (step=0011650) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.22638942868247183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11651, "loss": 0.17455649375915527, "memory_gb": 7.721559524536133, "step_time_ms": 3363.870143890381, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:04] (step=0011651) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.22640886125145745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11652, "loss": 0.21256861090660095, "memory_gb": 7.721559524536133, "step_time_ms": 
3362.743377685547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:07] (step=0011652) Train Loss: 0.1657, Train Steps/Sec: 0.28, Epoch: 0.22642829382044305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11653, "loss": 0.34395042061805725, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9769134521484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:11] (step=0011653) Train Loss: 0.2925, Train Steps/Sec: 0.28, Epoch: 0.22644772638942867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11654, "loss": 0.19398143887519836, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2374744415283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:15] (step=0011654) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.2264671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11655, "loss": 0.2097020149230957, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0339374542236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:18] (step=0011655) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.22648659152739992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11656, "loss": 0.19783322513103485, "memory_gb": 7.721559524536133, "step_time_ms": 3360.027074813843, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:22] (step=0011656) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.22650602409638554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11657, "loss": 0.26208022236824036, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5029582977295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:25] (step=0011657) Train Loss: 0.2921, Train Steps/Sec: 0.28, Epoch: 
0.22652545666537116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11658, "loss": 0.23168879747390747, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2115383148193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:29] (step=0011658) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.22654488923435678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11659, "loss": 0.3363785743713379, "memory_gb": 7.721559524536133, "step_time_ms": 3360.543966293335, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:33] (step=0011659) Train Loss: 0.3169, Train Steps/Sec: 0.28, Epoch: 0.2265643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11660, "loss": 0.24845245480537415, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1252307891846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:36] (step=0011660) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.22658375437232803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11661, "loss": 0.18547323346138, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8333854675293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:40] (step=0011661) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.22660318694131365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11662, "loss": 0.21046791970729828, "memory_gb": 7.721559524536133, "step_time_ms": 3360.615015029907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:43] (step=0011662) Train Loss: 0.2083, Train Steps/Sec: 0.27, Epoch: 0.22662261951029927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11663, "loss": 0.3295553922653198, 
"memory_gb": 7.721559524536133, "step_time_ms": 3360.090970993042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:47] (step=0011663) Train Loss: 0.2841, Train Steps/Sec: 0.28, Epoch: 0.2266420520792849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11664, "loss": 0.2710023522377014, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5828285217285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:51] (step=0011664) Train Loss: 0.2856, Train Steps/Sec: 0.28, Epoch: 0.2266614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11665, "loss": 0.1868540197610855, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3935546875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:54] (step=0011665) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.2266809172172561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:40:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11666, "loss": 0.1787242591381073, "memory_gb": 7.721559524536133, "step_time_ms": 3360.443353652954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:40:58] (step=0011666) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.22670034978624173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11667, "loss": 0.21980495750904083, "memory_gb": 7.721559524536133, "step_time_ms": 3363.430976867676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:01] (step=0011667) Train Loss: 0.2538, Train Steps/Sec: 0.28, Epoch: 0.22671978235522736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11668, "loss": 0.20534870028495789, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8502616882324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:05] (step=0011668) Train Loss: 0.2258, 
Train Steps/Sec: 0.28, Epoch: 0.22673921492421298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11669, "loss": 0.19750043749809265, "memory_gb": 7.721559524536133, "step_time_ms": 3497.8270530700684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:09] (step=0011669) Train Loss: 0.1792, Train Steps/Sec: 0.28, Epoch: 0.2267586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11670, "loss": 0.1982310563325882, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9746017456055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:12] (step=0011670) Train Loss: 0.2024, Train Steps/Sec: 0.28, Epoch: 0.22677808006218422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11671, "loss": 0.3251853585243225, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2982807159424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:16] (step=0011671) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.22679751263116985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11672, "loss": 0.10401000082492828, "memory_gb": 7.721559524536133, "step_time_ms": 3355.354070663452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:19] (step=0011672) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.22681694520015547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11673, "loss": 0.32684582471847534, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1668605804443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:23] (step=0011673) Train Loss: 0.3164, Train Steps/Sec: 0.28, Epoch: 0.2268363777691411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11674, 
"loss": 0.19756020605564117, "memory_gb": 7.721559524536133, "step_time_ms": 3355.816602706909, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:27] (step=0011674) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.2268558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11675, "loss": 0.19781646132469177, "memory_gb": 7.721559524536133, "step_time_ms": 3349.849224090576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:30] (step=0011675) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.2268752429071123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11676, "loss": 0.20991793274879456, "memory_gb": 7.721559524536133, "step_time_ms": 3353.766441345215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:34] (step=0011676) Train Loss: 0.2909, Train Steps/Sec: 0.28, Epoch: 0.22689467547609793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11677, "loss": 0.3155575692653656, "memory_gb": 7.721559524536133, "step_time_ms": 3353.832244873047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:37] (step=0011677) Train Loss: 0.2471, Train Steps/Sec: 0.28, Epoch: 0.22691410804508355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11678, "loss": 0.30462077260017395, "memory_gb": 7.721559524536133, "step_time_ms": 3357.309341430664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:41] (step=0011678) Train Loss: 0.2995, Train Steps/Sec: 0.28, Epoch: 0.22693354061406917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11679, "loss": 0.25713229179382324, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5735092163086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:44] 
(step=0011679) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.2269529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11680, "loss": 0.17590902745723724, "memory_gb": 7.721559524536133, "step_time_ms": 3354.618549346924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:48] (step=0011680) Train Loss: 0.2448, Train Steps/Sec: 0.28, Epoch: 0.22697240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11681, "loss": 0.1525832712650299, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7446937561035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:52] (step=0011681) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 0.22699183832102604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11682, "loss": 0.20090840756893158, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3690185546875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:55] (step=0011682) Train Loss: 0.1857, Train Steps/Sec: 0.28, Epoch: 0.22701127089001166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:41:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11683, "loss": 0.3445833623409271, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9631366729736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:41:59] (step=0011683) Train Loss: 0.3165, Train Steps/Sec: 0.28, Epoch: 0.22703070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11684, "loss": 0.24943502247333527, "memory_gb": 7.721559524536133, "step_time_ms": 3352.735757827759, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:02] (step=0011684) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.2270501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:06] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11685, "loss": 0.24557405710220337, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0615100860596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:06] (step=0011685) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.22706956859696853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11686, "loss": 0.2250451147556305, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7357788085938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:09] (step=0011686) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.22708900116595415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11687, "loss": 0.17592500150203705, "memory_gb": 7.721559524536133, "step_time_ms": 3355.358123779297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:13] (step=0011687) Train Loss: 0.1843, Train Steps/Sec: 0.28, Epoch: 0.22710843373493975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11688, "loss": 0.3321020007133484, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3198318481445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:17] (step=0011688) Train Loss: 0.2913, Train Steps/Sec: 0.28, Epoch: 0.22712786630392537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11689, "loss": 0.31883567571640015, "memory_gb": 7.721559524536133, "step_time_ms": 3343.707323074341, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:20] (step=0011689) Train Loss: 0.2931, Train Steps/Sec: 0.28, Epoch: 0.227147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11690, "loss": 0.19571399688720703, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0874462127686, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 11:42:24] (step=0011690) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.2271667314418966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11691, "loss": 0.19683079421520233, "memory_gb": 7.721559524536133, "step_time_ms": 3351.88627243042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:27] (step=0011691) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.22718616401088224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11692, "loss": 0.2694705128669739, "memory_gb": 7.721559524536133, "step_time_ms": 3348.9859104156494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:31] (step=0011692) Train Loss: 0.2758, Train Steps/Sec: 0.28, Epoch: 0.22720559657986786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11693, "loss": 0.3268163204193115, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9369316101074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:35] (step=0011693) Train Loss: 0.2925, Train Steps/Sec: 0.28, Epoch: 0.22722502914885348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11694, "loss": 0.34845736622810364, "memory_gb": 7.721559524536133, "step_time_ms": 3348.8101959228516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:38] (step=0011694) Train Loss: 0.3190, Train Steps/Sec: 0.28, Epoch: 0.2272444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11695, "loss": 0.2509618401527405, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2651405334473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:42] (step=0011695) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.22726389428682472, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 11:42:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11696, "loss": 0.28793931007385254, "memory_gb": 7.721559524536133, "step_time_ms": 3355.541706085205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:45] (step=0011696) Train Loss: 0.2394, Train Steps/Sec: 0.28, Epoch: 0.22728332685581035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11697, "loss": 0.2310396134853363, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7727336883545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:49] (step=0011697) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.22730275942479597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11698, "loss": 0.2460424154996872, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0423126220703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:52] (step=0011698) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.2273221919937816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:42:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11699, "loss": 0.2545846700668335, "memory_gb": 7.721559524536133, "step_time_ms": 3354.881525039673, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:42:56] (step=0011699) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.22734162456276719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11700, "loss": 0.18720166385173798, "memory_gb": 7.721559524536133, "step_time_ms": 3342.991828918457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:00] (step=0011700) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.2273610571317528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11701, "loss": 0.18202432990074158, "memory_gb": 7.721559524536133, "step_time_ms": 
3356.5263748168945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:03] (step=0011701) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.22738048970073843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11702, "loss": 0.25061532855033875, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7683696746826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:07] (step=0011702) Train Loss: 0.2581, Train Steps/Sec: 0.27, Epoch: 0.22739992226972405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11703, "loss": 0.21169808506965637, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0069541931152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:10] (step=0011703) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.22741935483870968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11704, "loss": 0.2798624038696289, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5661182403564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:14] (step=0011704) Train Loss: 0.2973, Train Steps/Sec: 0.28, Epoch: 0.2274387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11705, "loss": 0.32859963178634644, "memory_gb": 7.721559524536133, "step_time_ms": 3357.478618621826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:18] (step=0011705) Train Loss: 0.2938, Train Steps/Sec: 0.28, Epoch: 0.22745821997668092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11706, "loss": 0.21405373513698578, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3107719421387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:21] (step=0011706) Train Loss: 0.1910, Train Steps/Sec: 0.28, Epoch: 
0.22747765254566654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11707, "loss": 0.32600170373916626, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6767024993896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:25] (step=0011707) Train Loss: 0.2875, Train Steps/Sec: 0.28, Epoch: 0.22749708511465216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11708, "loss": 0.19165009260177612, "memory_gb": 7.721559524536133, "step_time_ms": 3352.494955062866, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:28] (step=0011708) Train Loss: 0.1814, Train Steps/Sec: 0.28, Epoch: 0.2275165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11709, "loss": 0.24610093235969543, "memory_gb": 7.721559524536133, "step_time_ms": 3494.0223693847656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:32] (step=0011709) Train Loss: 0.2278, Train Steps/Sec: 0.28, Epoch: 0.2275359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11710, "loss": 0.16440024971961975, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3942699432373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:35] (step=0011710) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.227555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11711, "loss": 0.21356481313705444, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1250438690186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:39] (step=0011711) Train Loss: 0.2870, Train Steps/Sec: 0.28, Epoch: 0.22757481539059463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11712, "loss": 0.2060338705778122, 
"memory_gb": 7.721559524536133, "step_time_ms": 3354.4797897338867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:43] (step=0011712) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.22759424795958025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11713, "loss": 0.24080967903137207, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9818267822266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:46] (step=0011713) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.22761368052856587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11714, "loss": 0.2919756770133972, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3621044158936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:50] (step=0011714) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.2276331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11715, "loss": 0.29207658767700195, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4958572387695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:53] (step=0011715) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.22765254566653711, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:43:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11716, "loss": 0.2938835322856903, "memory_gb": 7.721559524536133, "step_time_ms": 3353.644371032715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:43:57] (step=0011716) Train Loss: 0.3136, Train Steps/Sec: 0.28, Epoch: 0.22767197823552274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11717, "loss": 0.2776685357093811, "memory_gb": 7.721559524536133, "step_time_ms": 3363.149404525757, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:00] (step=0011717) Train Loss: 
0.2643, Train Steps/Sec: 0.28, Epoch: 0.22769141080450836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11718, "loss": 0.23035511374473572, "memory_gb": 7.721559524536133, "step_time_ms": 3361.842155456543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:04] (step=0011718) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.22771084337349398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11719, "loss": 0.281724214553833, "memory_gb": 7.721559524536133, "step_time_ms": 3346.3189601898193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:08] (step=0011719) Train Loss: 0.2726, Train Steps/Sec: 0.28, Epoch: 0.2277302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11720, "loss": 0.29526931047439575, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1972122192383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:11] (step=0011720) Train Loss: 0.2810, Train Steps/Sec: 0.28, Epoch: 0.22774970851146523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11721, "loss": 0.2891700863838196, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7256412506104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:15] (step=0011721) Train Loss: 0.2504, Train Steps/Sec: 0.28, Epoch: 0.22776914108045085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11722, "loss": 0.28868618607521057, "memory_gb": 7.721559524536133, "step_time_ms": 3357.076644897461, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:18] (step=0011722) Train Loss: 0.2727, Train Steps/Sec: 0.28, Epoch: 0.22778857364943644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 
11723, "loss": 0.30558645725250244, "memory_gb": 7.721559524536133, "step_time_ms": 3362.457513809204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:22] (step=0011723) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.22780800621842207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11724, "loss": 0.20111393928527832, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5686893463135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:26] (step=0011724) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.2278274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11725, "loss": 0.2059965431690216, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9961738586426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:29] (step=0011725) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.2278468713563933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11726, "loss": 0.29850122332572937, "memory_gb": 7.721559524536133, "step_time_ms": 3344.0351486206055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:33] (step=0011726) Train Loss: 0.2962, Train Steps/Sec: 0.28, Epoch: 0.22786630392537893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11727, "loss": 0.1983242779970169, "memory_gb": 7.721559524536133, "step_time_ms": 3364.100694656372, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:44:36] (step=0011727) Train Loss: 0.1958, Train Steps/Sec: 0.28, Epoch: 0.22788573649436455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:44:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11728, "loss": 0.19601276516914368, "memory_gb": 7.721559524536133, "step_time_ms": 3354.417324066162, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
11:44:40] (step=0011728) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.22790516906335018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:44:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11729, "loss": 0.1785130500793457, "memory_gb": 7.721559524536133, "step_time_ms": 3362.841844558716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:44:43] (step=0011729) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.2279246016323358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:44:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11730, "loss": 0.23245888948440552, "memory_gb": 7.721559524536133, "step_time_ms": 3360.790491104126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:44:47] (step=0011730) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.22794403420132142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:44:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11731, "loss": 0.3465467095375061, "memory_gb": 7.721559524536133, "step_time_ms": 3362.903594970703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:44:51] (step=0011731) Train Loss: 0.3008, Train Steps/Sec: 0.28, Epoch: 0.22796346677030704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:44:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11732, "loss": 0.2531784176826477, "memory_gb": 7.721559524536133, "step_time_ms": 3360.128402709961, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:44:54] (step=0011732) Train Loss: 0.2789, Train Steps/Sec: 0.28, Epoch: 0.22798289933929267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:44:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11733, "loss": 0.15781328082084656, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3155307769775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:44:58] (step=0011733) Train Loss: 0.1875, Train Steps/Sec: 0.28, Epoch: 0.22800233190827826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11734, "loss": 0.23788192868232727, "memory_gb": 7.721559524536133, "step_time_ms": 3360.03041267395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:01] (step=0011734) Train Loss: 0.2232, Train Steps/Sec: 0.28, Epoch: 0.22802176447726388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11735, "loss": 0.17634740471839905, "memory_gb": 7.721559524536133, "step_time_ms": 3365.783214569092, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:05] (step=0011735) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.2280411970462495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11736, "loss": 0.20035293698310852, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6740703582764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:08] (step=0011736) Train Loss: 0.2532, Train Steps/Sec: 0.28, Epoch: 0.22806062961523513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11737, "loss": 0.23785285651683807, "memory_gb": 7.721559524536133, "step_time_ms": 3364.215612411499, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:12] (step=0011737) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.22808006218422075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11738, "loss": 0.28569135069847107, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3480072021484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:16] (step=0011738) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.22809949475320637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11739, "loss": 0.19677311182022095, "memory_gb": 7.721559524536133, "step_time_ms": 3366.941452026367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:19] (step=0011739) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.228118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11740, "loss": 0.2798524796962738, "memory_gb": 7.721559524536133, "step_time_ms": 3356.555938720703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:23] (step=0011740) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.22813835989117762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11741, "loss": 0.20031343400478363, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9202842712402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:26] (step=0011741) Train Loss: 0.1801, Train Steps/Sec: 0.28, Epoch: 0.22815779246016324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11742, "loss": 0.2525850534439087, "memory_gb": 7.721559524536133, "step_time_ms": 3361.750841140747, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:30] (step=0011742) Train Loss: 0.2404, Train Steps/Sec: 0.28, Epoch: 0.22817722502914886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11743, "loss": 0.30783307552337646, "memory_gb": 7.721559524536133, "step_time_ms": 3354.560375213623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:34] (step=0011743) Train Loss: 0.2865, Train Steps/Sec: 0.28, Epoch: 0.22819665759813448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11744, "loss": 0.3620188236236572, "memory_gb": 7.721559524536133, "step_time_ms": 3358.835220336914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:37] (step=0011744) Train Loss: 0.3768, Train Steps/Sec: 0.28, Epoch: 0.2282160901671201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11745, "loss": 0.2438967078924179, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2536449432373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:41] (step=0011745) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.2282355227361057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11746, "loss": 0.34937259554862976, "memory_gb": 7.721559524536133, "step_time_ms": 3358.682632446289, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:44] (step=0011746) Train Loss: 0.2799, Train Steps/Sec: 0.28, Epoch: 0.22825495530509132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11747, "loss": 0.18233726918697357, "memory_gb": 7.721559524536133, "step_time_ms": 3355.638027191162, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:48] (step=0011747) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.22827438787407694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11748, "loss": 0.20484477281570435, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1839332580566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:52] (step=0011748) Train Loss: 0.2069, Train Steps/Sec: 0.28, Epoch: 0.22829382044306257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11749, "loss": 0.296322762966156, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0239963531494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:55] (step=0011749) Train Loss: 0.2475, Train Steps/Sec: 0.27, Epoch: 0.2283132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:45:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11750, "loss": 0.23925793170928955, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6172847747803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:45:59] (step=0011750) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.2283326855810338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11751, "loss": 0.16442406177520752, "memory_gb": 7.721559524536133, "step_time_ms": 3501.3489723205566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:02] (step=0011751) Train Loss: 0.1624, Train Steps/Sec: 0.28, Epoch: 0.22835211815001943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11752, "loss": 0.3075030744075775, "memory_gb": 7.721559524536133, "step_time_ms": 3362.006425857544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:06] (step=0011752) Train Loss: 0.2938, Train Steps/Sec: 0.28, Epoch: 0.22837155071900506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11753, "loss": 0.2448752075433731, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7636013031006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:10] (step=0011753) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.22839098328799068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11754, "loss": 0.30697178840637207, "memory_gb": 7.721559524536133, "step_time_ms": 3358.436346054077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:13] (step=0011754) Train Loss: 0.2995, Train Steps/Sec: 0.28, Epoch: 0.2284104158569763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11755, "loss": 0.2197268307209015, "memory_gb": 7.721559524536133, "step_time_ms": 3355.700731277466, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:17] (step=0011755) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.22842984842596192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11756, "loss": 0.3497925400733948, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7002754211426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:20] (step=0011756) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.22844928099494755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11757, "loss": 0.3471597731113434, "memory_gb": 7.721559524536133, "step_time_ms": 3346.1313247680664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:24] (step=0011757) Train Loss: 0.3148, Train Steps/Sec: 0.28, Epoch: 0.22846871356393314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11758, "loss": 0.14615429937839508, "memory_gb": 7.721559524536133, "step_time_ms": 3358.982801437378, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:28] (step=0011758) Train Loss: 0.1884, Train Steps/Sec: 0.28, Epoch: 0.22848814613291876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11759, "loss": 0.25222158432006836, "memory_gb": 7.721559524536133, "step_time_ms": 3357.82790184021, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:31] (step=0011759) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.22850757870190438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11760, "loss": 0.2965424060821533, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7516803741455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:35] (step=0011760) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.22852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11761, "loss": 0.24456965923309326, "memory_gb": 7.721559524536133, "step_time_ms": 3345.3454971313477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:38] (step=0011761) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.22854644383987563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11762, "loss": 0.3472650647163391, "memory_gb": 7.721559524536133, "step_time_ms": 3363.267660140991, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:42] (step=0011762) Train Loss: 0.2770, Train Steps/Sec: 0.28, Epoch: 0.22856587640886125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11763, "loss": 0.21898695826530457, "memory_gb": 7.721559524536133, "step_time_ms": 3360.370397567749, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:45] (step=0011763) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.22858530897784687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11764, "loss": 0.18776355683803558, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1016063690186, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:49] (step=0011764) Train Loss: 0.1994, Train Steps/Sec: 0.28, Epoch: 0.2286047415468325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11765, "loss": 0.2487090528011322, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7978191375732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:53] (step=0011765) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.22862417411581812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:46:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11766, "loss": 0.29064929485321045, "memory_gb": 7.721559524536133, "step_time_ms": 3349.320888519287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:46:56] (step=0011766) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.22864360668480374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11767, "loss": 0.30980581045150757, "memory_gb": 7.721559524536133, "step_time_ms": 3356.433391571045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:00] (step=0011767) Train Loss: 0.2653, Train Steps/Sec: 0.28, Epoch: 0.22866303925378936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11768, "loss": 0.2635136842727661, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6584796905518, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:03] (step=0011768) Train Loss: 0.2904, Train Steps/Sec: 0.28, Epoch: 0.22868247182277496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11769, "loss": 0.1580219864845276, "memory_gb": 7.721559524536133, "step_time_ms": 3353.182792663574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:07] (step=0011769) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.22870190439176058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11770, "loss": 0.25068342685699463, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5353622436523, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:11] (step=0011770) Train Loss: 0.2082, Train Steps/Sec: 0.28, Epoch: 0.2287213369607462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11771, "loss": 0.2744137644767761, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7663173675537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:14] (step=0011771) Train Loss: 0.2167, Train Steps/Sec: 0.28, Epoch: 0.22874076952973182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11772, "loss": 0.21372099220752716, "memory_gb": 7.721559524536133, "step_time_ms": 3357.428789138794, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:18] (step=0011772) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.22876020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11773, "loss": 0.23148395121097565, "memory_gb": 7.721559524536133, "step_time_ms": 3353.351593017578, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:21] (step=0011773) Train Loss: 0.2563, Train Steps/Sec: 0.28, Epoch: 0.22877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11774, "loss": 0.30280667543411255, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3948192596436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:25] (step=0011774) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.2287990672366887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11775, "loss": 0.1855471134185791, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0286293029785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:29] (step=0011775) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.2288184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11776, "loss": 0.2898247539997101, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0188026428223, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:32] (step=0011776) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.22883793237465994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11777, "loss": 0.28513550758361816, "memory_gb": 7.721559524536133, "step_time_ms": 3353.444814682007, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:36] (step=0011777) Train Loss: 0.2854, Train Steps/Sec: 0.28, Epoch: 0.22885736494364556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11778, "loss": 0.1852392554283142, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9279956817627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:39] (step=0011778) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.22887679751263118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11779, "loss": 0.19730548560619354, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1769256591797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:43] (step=0011779) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.2288962300816168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11780, "loss": 0.15802498161792755, "memory_gb": 7.721559524536133, "step_time_ms": 3346.331834793091, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:46] (step=0011780) Train Loss: 0.2391, Train Steps/Sec: 0.28, Epoch: 0.2289156626506024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11781, "loss": 0.1997821033000946, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5001487731934, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:50] (step=0011781) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.22893509521958802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11782, "loss": 0.29955846071243286, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2544326782227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:54] (step=0011782) Train Loss: 0.2633, Train Steps/Sec: 0.28, Epoch: 0.22895452778857364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:47:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11783, "loss": 0.13030947744846344, "memory_gb": 7.721559524536133, "step_time_ms": 3351.0093688964844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:47:57] (step=0011783) Train Loss: 0.2005, Train Steps/Sec: 0.28, Epoch: 0.22897396035755926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11784, "loss": 0.18346261978149414, "memory_gb": 7.721559524536133, "step_time_ms": 3347.48911857605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:01] (step=0011784) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.2289933929265449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11785, "loss": 0.24101731181144714, "memory_gb": 7.721559524536133, "step_time_ms": 3350.768566131592, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:04] (step=0011785) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.2290128254955305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11786, "loss": 0.1948608160018921, "memory_gb": 7.721559524536133, "step_time_ms": 3351.984977722168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:08] (step=0011786) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.22903225806451613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11787, "loss": 0.2372191846370697, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6314754486084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:12] (step=0011787) Train Loss: 0.1988, Train Steps/Sec: 0.28, Epoch: 0.22905169063350175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11788, "loss": 0.26062193512916565, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5804748535156, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:15] (step=0011788) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.22907112320248738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11789, "loss": 0.319282591342926, "memory_gb": 7.721559524536133, "step_time_ms": 3349.760055541992, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:19] (step=0011789) Train Loss: 0.2924, Train Steps/Sec: 0.28, Epoch: 0.229090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11790, "loss": 0.16830947995185852, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5373210906982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:22] (step=0011790) Train Loss: 0.1857, Train Steps/Sec: 0.27, Epoch: 0.22910998834045862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11791, "loss": 0.20884230732917786, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5923023223877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:26] (step=0011791) Train Loss: 0.1908, Train Steps/Sec: 0.28, Epoch: 0.22912942090944421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11792, "loss": 0.27978724241256714, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3928184509277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:30] (step=0011792) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.22914885347842984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11793, "loss": 0.23603767156600952, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8902740478516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:33] (step=0011793) Train Loss: 0.2896, Train Steps/Sec: 0.28, Epoch: 0.22916828604741546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11794, "loss": 0.23040589690208435, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8970432281494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:37] (step=0011794) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.22918771861640108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11795, "loss": 0.1664610207080841, "memory_gb": 7.721559524536133, "step_time_ms": 3344.1927433013916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:40] (step=0011795) Train Loss: 0.1784, Train Steps/Sec: 0.28, Epoch: 0.2292071511853867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11796, "loss": 0.2698224186897278, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8003902435303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:44] (step=0011796) Train Loss: 0.2295, Train Steps/Sec: 0.28, Epoch: 0.22922658375437233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11797, "loss": 0.2887241840362549, "memory_gb": 7.721559524536133, "step_time_ms": 3355.896234512329, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:47] (step=0011797) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.22924601632335795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11798, "loss": 0.33655327558517456, "memory_gb": 7.721559524536133, "step_time_ms": 3500.870704650879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:51] (step=0011798) Train Loss: 0.2926, Train Steps/Sec: 0.28, Epoch: 0.22926544889234357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11799, "loss": 0.2715636193752289, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6997985839844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:55] (step=0011799) Train Loss: 0.2542, Train Steps/Sec: 0.28, Epoch: 0.2292848814613292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:48:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11800, "loss": 0.2711432874202728, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6903648376465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:48:58] (step=0011800) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.22930431403031482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11801, "loss": 0.24364073574543, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8592071533203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:02] (step=0011801) Train Loss: 0.3129, Train Steps/Sec: 0.28, Epoch: 0.22932374659930044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11802, "loss": 0.19226090610027313, "memory_gb": 7.721559524536133, "step_time_ms": 3352.952241897583, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:05] (step=0011802) Train Loss: 0.1791, Train Steps/Sec: 0.28, Epoch: 0.22934317916828606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11803, "loss": 0.251369833946228, "memory_gb": 7.721559524536133, "step_time_ms": 3346.1477756500244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:09] (step=0011803) Train Loss: 0.2971, Train Steps/Sec: 0.28, Epoch: 0.22936261173727165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11804, "loss": 0.336642324924469, "memory_gb": 7.721559524536133, "step_time_ms": 3341.601610183716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:12] (step=0011804) Train Loss: 0.2789, Train Steps/Sec: 0.28, Epoch: 0.22938204430625728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11805, "loss": 0.2666442394256592, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4562587738037, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:16] (step=0011805) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.2294014768752429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11806, "loss": 0.2185169905424118, "memory_gb": 7.721559524536133, "step_time_ms": 3340.0609493255615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:19] (step=0011806) Train Loss: 0.2053, Train Steps/Sec: 0.28, Epoch: 0.22942090944422852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11807, "loss": 0.22317831218242645, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3203086853027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:23] (step=0011807) Train Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.22944034201321414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11808, "loss": 0.28797826170921326, "memory_gb": 7.721559524536133, "step_time_ms": 3345.693349838257, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:27] (step=0011808) Train Loss: 0.2528, Train Steps/Sec: 0.28, Epoch: 0.22945977458219977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11809, "loss": 0.34557390213012695, "memory_gb": 7.721559524536133, "step_time_ms": 3348.5262393951416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:30] (step=0011809) Train Loss: 0.3160, Train Steps/Sec: 0.28, Epoch: 0.2294792071511854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11810, "loss": 0.1999138593673706, "memory_gb": 7.721559524536133, "step_time_ms": 3356.894016265869, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:34] (step=0011810) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.229498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11811, "loss": 0.2465319037437439, "memory_gb": 7.721559524536133, "step_time_ms": 3356.687545776367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:37] (step=0011811) Train Loss: 0.2429, Train Steps/Sec: 0.28, Epoch: 0.22951807228915663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11812, "loss": 0.21150782704353333, "memory_gb": 7.721559524536133, "step_time_ms": 3344.512462615967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:41] (step=0011812) Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.22953750485814226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11813, "loss": 0.2535107731819153, "memory_gb": 7.721559524536133, "step_time_ms": 3344.4786071777344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:44] (step=0011813) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.22955693742712788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11814, "loss": 0.2620643973350525, "memory_gb": 7.721559524536133, "step_time_ms": 3359.15207862854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:48] (step=0011814) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.2295763699961135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11815, "loss": 0.17542800307273865, "memory_gb": 7.721559524536133, "step_time_ms": 3349.560022354126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:52] (step=0011815) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.2295958025650991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11816, "loss": 0.26295173168182373, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3525161743164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:55] (step=0011816) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.22961523513408472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:49:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11817, "loss": 0.21333631873130798, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6548824310303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:49:59] (step=0011817) Train Loss: 0.2097, Train Steps/Sec: 0.28, Epoch: 0.22963466770307034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11818, "loss": 0.2954927086830139, "memory_gb": 7.721559524536133, "step_time_ms": 3357.646703720093, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:02] (step=0011818) Train Loss: 0.3038, Train Steps/Sec: 0.28, Epoch: 0.22965410027205596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11819, "loss": 0.25920987129211426, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3411750793457, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:06] (step=0011819) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.22967353284104158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11820, "loss": 0.16039887070655823, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9423637390137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:10] (step=0011820) Train Loss: 0.1445, Train Steps/Sec: 0.28, Epoch: 0.2296929654100272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11821, "loss": 0.3688426613807678, "memory_gb": 7.721559524536133, "step_time_ms": 3356.170892715454, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:13] (step=0011821) Train Loss: 0.3230, Train Steps/Sec: 0.28, Epoch: 0.22971239797901283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11822, "loss": 0.3468627333641052, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6126823425293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:17] (step=0011822) Train Loss: 0.2718, Train Steps/Sec: 0.28, Epoch: 0.22973183054799845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11823, "loss": 0.3304588794708252, "memory_gb": 7.715639114379883, "step_time_ms": 3321.530342102051, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:20] (step=0011823) Train Loss: 0.3290, Train Steps/Sec: 0.28, Epoch: 0.22975126311698407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11824, "loss": 0.16881641745567322, "memory_gb": 7.721559524536133, "step_time_ms": 3357.776165008545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:24] (step=0011824) Train Loss: 0.1938, Train Steps/Sec: 0.28, Epoch: 0.2297706956859697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11825, "loss": 0.14809177815914154, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3523502349854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:27] (step=0011825) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 0.22979012825495532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11826, "loss": 0.26006603240966797, "memory_gb": 7.721559524536133, "step_time_ms": 3359.912157058716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:31] (step=0011826) Train Loss: 0.3159, Train Steps/Sec: 0.28, Epoch: 0.2298095608239409, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11827, "loss": 0.231062114238739, "memory_gb": 7.721559524536133, "step_time_ms": 3358.609437942505, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:35] (step=0011827) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.22982899339292653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11828, "loss": 0.21137197315692902, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9117736816406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:38] (step=0011828) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.22984842596191216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11829, "loss": 0.23173820972442627, "memory_gb": 7.721559524536133, "step_time_ms": 3356.407403945923, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:42] (step=0011829) Train Loss: 0.2793, Train Steps/Sec: 0.28, Epoch: 0.22986785853089778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11830, "loss": 0.23245444893836975, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2354011535645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:45] (step=0011830) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.2298872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11831, "loss": 0.22242581844329834, "memory_gb": 7.721559524536133, "step_time_ms": 3356.433629989624, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:49] (step=0011831) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.22990672366886902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11832, "loss": 0.21773335337638855, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5645637512207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:52] (step=0011832) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.22992615623785465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:50:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11833, "loss": 0.22950375080108643, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8787879943848, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:50:56] (step=0011833) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.22994558880684027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:51:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11834, "loss": 0.1919340044260025, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5094470977783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:51:00] (step=0011834) Train Loss: 0.1807, Train Steps/Sec: 0.28, Epoch: 0.2299650213758259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:51:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11835, "loss": 0.3675747513771057, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6719455718994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:51:03] (step=0011835) Train Loss: 0.2848, Train Steps/Sec: 0.28, Epoch: 0.2299844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:51:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11836, "loss": 0.27781742811203003, "memory_gb": 7.721559524536133, "step_time_ms": 3360.830545425415, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 11:51:07] (step=0011836) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.23000388651379713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 11:51:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11837, "loss": 0.20598269999027252, "memory_gb": 7.715639114379883, "step_time_ms": 3327.282667160034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29
11:51:10] (step=0011837) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.23002331908278276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11838, "loss": 0.18599830567836761, "memory_gb": 7.721559524536133, "step_time_ms": 3361.032009124756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:14] (step=0011838) Train Loss: 0.1883, Train Steps/Sec: 0.27, Epoch: 0.23004275165176835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11839, "loss": 0.23694133758544922, "memory_gb": 7.721559524536133, "step_time_ms": 3503.3252239227295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:18] (step=0011839) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.23006218422075397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11840, "loss": 0.268984854221344, "memory_gb": 7.721559524536133, "step_time_ms": 3346.625328063965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:21] (step=0011840) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.2300816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11841, "loss": 0.2412116527557373, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0093383789062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:25] (step=0011841) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.23010104935872522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11842, "loss": 0.28580573201179504, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1509075164795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:28] (step=0011842) Train Loss: 0.2895, Train Steps/Sec: 0.28, Epoch: 0.23012048192771084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:32] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11843, "loss": 0.25617343187332153, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2704277038574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:32] (step=0011843) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.23013991449669646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11844, "loss": 0.18670278787612915, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6134605407715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:36] (step=0011844) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.23015934706568209, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11845, "loss": 0.2679349184036255, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7913208007812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:39] (step=0011845) Train Loss: 0.2452, Train Steps/Sec: 0.28, Epoch: 0.2301787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11846, "loss": 0.3080746829509735, "memory_gb": 7.721559524536133, "step_time_ms": 3365.537405014038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:43] (step=0011846) Train Loss: 0.2674, Train Steps/Sec: 0.28, Epoch: 0.23019821220365333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11847, "loss": 0.28864359855651855, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8200759887695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:46] (step=0011847) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.23021764477263895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11848, "loss": 0.24151024222373962, "memory_gb": 7.721559524536133, "step_time_ms": 3361.34672164917, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 11:51:50] (step=0011848) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.23023707734162457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11849, "loss": 0.2994201183319092, "memory_gb": 7.721559524536133, "step_time_ms": 3363.852024078369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:53] (step=0011849) Train Loss: 0.2703, Train Steps/Sec: 0.28, Epoch: 0.2302565099106102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:51:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11850, "loss": 0.20288851857185364, "memory_gb": 7.721559524536133, "step_time_ms": 3364.696741104126, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:51:57] (step=0011850) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.2302759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11851, "loss": 0.1474558711051941, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2325191497803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:01] (step=0011851) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.2302953750485814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11852, "loss": 0.244575634598732, "memory_gb": 7.721559524536133, "step_time_ms": 3363.497495651245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:04] (step=0011852) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.23031480761756704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11853, "loss": 0.23156990110874176, "memory_gb": 7.721559524536133, "step_time_ms": 3370.194673538208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:08] (step=0011853) Train Loss: 0.2254, Train Steps/Sec: 0.28, Epoch: 0.23033424018655266, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 11:52:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11854, "loss": 0.2540634572505951, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3369884490967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:11] (step=0011854) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.23035367275553828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11855, "loss": 0.2556498646736145, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9022789001465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:15] (step=0011855) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.2303731053245239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11856, "loss": 0.2547740936279297, "memory_gb": 7.721559524536133, "step_time_ms": 3354.891061782837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:18] (step=0011856) Train Loss: 0.2881, Train Steps/Sec: 0.28, Epoch: 0.23039253789350952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11857, "loss": 0.17298075556755066, "memory_gb": 7.721559524536133, "step_time_ms": 3365.618944168091, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:22] (step=0011857) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.23041197046249515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11858, "loss": 0.20753392577171326, "memory_gb": 7.721559524536133, "step_time_ms": 3365.878105163574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:26] (step=0011858) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.23043140303148077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11859, "loss": 0.2525641918182373, "memory_gb": 7.721559524536133, "step_time_ms": 
3366.13130569458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:29] (step=0011859) Train Loss: 0.2725, Train Steps/Sec: 0.28, Epoch: 0.2304508356004664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11860, "loss": 0.13908356428146362, "memory_gb": 7.721559524536133, "step_time_ms": 3355.724573135376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:33] (step=0011860) Train Loss: 0.1408, Train Steps/Sec: 0.28, Epoch: 0.23047026816945201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11861, "loss": 0.28686821460723877, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7371158599854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:36] (step=0011861) Train Loss: 0.3040, Train Steps/Sec: 0.28, Epoch: 0.2304897007384376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11862, "loss": 0.22549468278884888, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4427318573, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:40] (step=0011862) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.23050913330742323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11863, "loss": 0.22782449424266815, "memory_gb": 7.721559524536133, "step_time_ms": 3364.804267883301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:44] (step=0011863) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.23052856587640885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11864, "loss": 0.2830997109413147, "memory_gb": 7.721559524536133, "step_time_ms": 3357.332944869995, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:47] (step=0011864) Train Loss: 0.3011, Train Steps/Sec: 0.28, Epoch: 0.23054799844539448, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11865, "loss": 0.17722488939762115, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6717071533203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:51] (step=0011865) Train Loss: 0.1657, Train Steps/Sec: 0.28, Epoch: 0.2305674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11866, "loss": 0.27305689454078674, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5496158599854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:54] (step=0011866) Train Loss: 0.2772, Train Steps/Sec: 0.28, Epoch: 0.23058686358336572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:52:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11867, "loss": 0.27093642950057983, "memory_gb": 7.721559524536133, "step_time_ms": 3359.426736831665, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:52:58] (step=0011867) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.23060629615235134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 11868, "loss": 0.12970365583896637, "memory_gb": 7.721559524536133, "step_time_ms": 3356.875419616699, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:02] (step=0011868) Train Loss: 0.2099, Train Steps/Sec: 0.28, Epoch: 0.23062572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11869, "loss": 0.15316355228424072, "memory_gb": 7.721559524536133, "step_time_ms": 3344.7299003601074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:05] (step=0011869) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.2306451612903226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11870, "loss": 0.3161178231239319, "memory_gb": 
7.721559524536133, "step_time_ms": 3357.398509979248, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:09] (step=0011870) Train Loss: 0.3071, Train Steps/Sec: 0.28, Epoch: 0.2306645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11871, "loss": 0.203277125954628, "memory_gb": 7.721559524536133, "step_time_ms": 3361.539840698242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:12] (step=0011871) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.23068402642829383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11872, "loss": 0.22480110824108124, "memory_gb": 7.721559524536133, "step_time_ms": 3360.639810562134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:16] (step=0011872) Train Loss: 0.2023, Train Steps/Sec: 0.28, Epoch: 0.23070345899727945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11873, "loss": 0.23250067234039307, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2937717437744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:19] (step=0011873) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.23072289156626505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11874, "loss": 0.29425352811813354, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6854000091553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:23] (step=0011874) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.23074232413525067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11875, "loss": 0.1309206187725067, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1407794952393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:27] (step=0011875) Train Loss: 0.1699, Train 
Steps/Sec: 0.28, Epoch: 0.2307617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11876, "loss": 0.24948182702064514, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1785221099854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:30] (step=0011876) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.23078118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11877, "loss": 0.2544906735420227, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4065437316895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:34] (step=0011877) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.23080062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11878, "loss": 0.25265684723854065, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6855449676514, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:37] (step=0011878) Train Loss: 0.2631, Train Steps/Sec: 0.27, Epoch: 0.23082005441119316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11879, "loss": 0.2354920506477356, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1879653930664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:41] (step=0011879) Train Loss: 0.2059, Train Steps/Sec: 0.28, Epoch: 0.23083948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11880, "loss": 0.2117331624031067, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8173904418945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:45] (step=0011880) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.2308589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11881, "loss": 
0.3408510088920593, "memory_gb": 7.721559524536133, "step_time_ms": 3357.309103012085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:48] (step=0011881) Train Loss: 0.3287, Train Steps/Sec: 0.28, Epoch: 0.23087835211815003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11882, "loss": 0.30665963888168335, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4235439300537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:52] (step=0011882) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.23089778468713565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11883, "loss": 0.2123790979385376, "memory_gb": 7.721559524536133, "step_time_ms": 3350.637435913086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:55] (step=0011883) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.23091721725612127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:53:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11884, "loss": 0.23391656577587128, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1040630340576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:53:59] (step=0011884) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.23093664982510687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11885, "loss": 0.2596881687641144, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5728664398193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:03] (step=0011885) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.2309560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11886, "loss": 0.16094771027565002, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0917797088623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:06] 
(step=0011886) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.2309755149630781, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11887, "loss": 0.261910617351532, "memory_gb": 7.721559524536133, "step_time_ms": 3492.103338241577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:10] (step=0011887) Train Loss: 0.2714, Train Steps/Sec: 0.28, Epoch: 0.23099494753206373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11888, "loss": 0.21685145795345306, "memory_gb": 7.721559524536133, "step_time_ms": 3357.626438140869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:13] (step=0011888) Train Loss: 0.2722, Train Steps/Sec: 0.28, Epoch: 0.23101438010104935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11889, "loss": 0.18801675736904144, "memory_gb": 7.721559524536133, "step_time_ms": 3357.862949371338, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:17] (step=0011889) Train Loss: 0.2700, Train Steps/Sec: 0.28, Epoch: 0.23103381267003498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11890, "loss": 0.22435873746871948, "memory_gb": 7.721559524536133, "step_time_ms": 3356.54878616333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:20] (step=0011890) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.2310532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11891, "loss": 0.1648091971874237, "memory_gb": 7.721559524536133, "step_time_ms": 3349.5736122131348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:24] (step=0011891) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.23107267780800622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:28] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 11892, "loss": 0.27957600355148315, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7142486572266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:28] (step=0011892) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.23109211037699184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11893, "loss": 0.3218410611152649, "memory_gb": 7.721559524536133, "step_time_ms": 3359.611988067627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:31] (step=0011893) Train Loss: 0.3197, Train Steps/Sec: 0.28, Epoch: 0.23111154294597747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11894, "loss": 0.23690126836299896, "memory_gb": 7.721559524536133, "step_time_ms": 3340.0964736938477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:35] (step=0011894) Train Loss: 0.2913, Train Steps/Sec: 0.28, Epoch: 0.2311309755149631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11895, "loss": 0.17829783260822296, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8218479156494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:38] (step=0011895) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.2311504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11896, "loss": 0.26410001516342163, "memory_gb": 7.721559524536133, "step_time_ms": 3353.487968444824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:42] (step=0011896) Train Loss: 0.2221, Train Steps/Sec: 0.28, Epoch: 0.2311698406529343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11897, "loss": 0.3649093508720398, "memory_gb": 7.721559524536133, "step_time_ms": 3353.264570236206, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 11:54:45] (step=0011897) Train Loss: 0.2828, Train Steps/Sec: 0.28, Epoch: 0.23118927322191993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11898, "loss": 0.25988054275512695, "memory_gb": 7.721559524536133, "step_time_ms": 3353.915214538574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:49] (step=0011898) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.23120870579090555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11899, "loss": 0.22603026032447815, "memory_gb": 7.721559524536133, "step_time_ms": 3350.587844848633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:53] (step=0011899) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.23122813835989117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:54:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11900, "loss": 0.18047833442687988, "memory_gb": 7.721559524536133, "step_time_ms": 3359.480619430542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:54:56] (step=0011900) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.2312475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11901, "loss": 0.27401110529899597, "memory_gb": 7.721559524536133, "step_time_ms": 3339.5633697509766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:00] (step=0011901) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.23126700349786242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11902, "loss": 0.132843017578125, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1700115203857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:03] (step=0011902) Train Loss: 0.1486, Train Steps/Sec: 0.28, Epoch: 0.23128643606684804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
11:55:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11903, "loss": 0.26978036761283875, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5681705474854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:07] (step=0011903) Train Loss: 0.2561, Train Steps/Sec: 0.28, Epoch: 0.23130586863583366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11904, "loss": 0.2512550354003906, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4169921875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:10] (step=0011904) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.23132530120481928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11905, "loss": 0.15692976117134094, "memory_gb": 7.721559524536133, "step_time_ms": 3355.246067047119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:14] (step=0011905) Train Loss: 0.2701, Train Steps/Sec: 0.28, Epoch: 0.2313447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11906, "loss": 0.17585837841033936, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3098697662354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:18] (step=0011906) Train Loss: 0.1649, Train Steps/Sec: 0.28, Epoch: 0.23136416634279053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11907, "loss": 0.31958526372909546, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9187393188477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:21] (step=0011907) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.23138359891177615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11908, "loss": 0.21932174265384674, "memory_gb": 7.721559524536133, "step_time_ms": 3352.902889251709, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:25] (step=0011908) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.23140303148076174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11909, "loss": 0.2752857506275177, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2890758514404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:28] (step=0011909) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.23142246404974737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11910, "loss": 0.2904178500175476, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6931018829346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:32] (step=0011910) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.231441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11911, "loss": 0.2879181504249573, "memory_gb": 7.721559524536133, "step_time_ms": 3360.490322113037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:35] (step=0011911) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.2314613291877186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11912, "loss": 0.2406434863805771, "memory_gb": 7.721559524536133, "step_time_ms": 3344.003677368164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:39] (step=0011912) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.23148076175670423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11913, "loss": 0.17976070940494537, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3099327087402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:43] (step=0011913) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.23150019432568986, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 11:55:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11914, "loss": 0.21654056012630463, "memory_gb": 7.721559524536133, "step_time_ms": 3358.891248703003, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:46] (step=0011914) Train Loss: 0.2087, Train Steps/Sec: 0.28, Epoch: 0.23151962689467548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11915, "loss": 0.24032819271087646, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6315593719482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:50] (step=0011915) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.2315390594636611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11916, "loss": 0.216695636510849, "memory_gb": 7.721559524536133, "step_time_ms": 3357.52534866333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:53] (step=0011916) Train Loss: 0.1689, Train Steps/Sec: 0.28, Epoch: 0.23155849203264672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:55:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11917, "loss": 0.17101040482521057, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3689670562744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:55:57] (step=0011917) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.23157792460163235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11918, "loss": 0.25100791454315186, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8436584472656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:00] (step=0011918) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.23159735717061797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11919, "loss": 0.20807993412017822, "memory_gb": 7.721559524536133, 
"step_time_ms": 3362.7305030822754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:04] (step=0011919) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.23161678973960356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 11920, "loss": 0.16468389332294464, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7985553741455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:07] (step=0011920) Train Loss: 0.1926, Train Steps/Sec: 0.28, Epoch: 0.23163622230858918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11921, "loss": 0.256727933883667, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8909587860107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:11] (step=0011921) Train Loss: 0.2037, Train Steps/Sec: 0.28, Epoch: 0.2316556548775748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11922, "loss": 0.27810829877853394, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4957847595215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:15] (step=0011922) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.23167508744656043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11923, "loss": 0.31257426738739014, "memory_gb": 7.715639114379883, "step_time_ms": 3317.6193237304688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:18] (step=0011923) Train Loss: 0.3389, Train Steps/Sec: 0.28, Epoch: 0.23169452001554605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11924, "loss": 0.2280808687210083, "memory_gb": 7.721559524536133, "step_time_ms": 3345.2537059783936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:22] (step=0011924) Train Loss: 0.2482, Train Steps/Sec: 0.29, 
Epoch: 0.23171395258453167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11925, "loss": 0.27545106410980225, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3028526306152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:25] (step=0011925) Train Loss: 0.2508, Train Steps/Sec: 0.27, Epoch: 0.2317333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11926, "loss": 0.2215801328420639, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3544540405273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:29] (step=0011926) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.23175281772250292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11927, "loss": 0.26144248247146606, "memory_gb": 7.721559524536133, "step_time_ms": 3493.2644367218018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:33] (step=0011927) Train Loss: 0.2189, Train Steps/Sec: 0.28, Epoch: 0.23177225029148854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11928, "loss": 0.1653585284948349, "memory_gb": 7.721559524536133, "step_time_ms": 3350.374937057495, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:36] (step=0011928) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.23179168286047416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11929, "loss": 0.2085283398628235, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0759506225586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:40] (step=0011929) Train Loss: 0.1955, Train Steps/Sec: 0.28, Epoch: 0.23181111542945979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11930, "loss": 
0.22417433559894562, "memory_gb": 7.721559524536133, "step_time_ms": 3365.236282348633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:43] (step=0011930) Train Loss: 0.1886, Train Steps/Sec: 0.28, Epoch: 0.2318305479984454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11931, "loss": 0.31403639912605286, "memory_gb": 7.721559524536133, "step_time_ms": 3364.030361175537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:47] (step=0011931) Train Loss: 0.2894, Train Steps/Sec: 0.28, Epoch: 0.231849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11932, "loss": 0.311245858669281, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7095737457275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:50] (step=0011932) Train Loss: 0.3238, Train Steps/Sec: 0.28, Epoch: 0.23186941313641662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 11933, "loss": 0.2992168664932251, "memory_gb": 7.721559524536133, "step_time_ms": 3359.510898590088, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:54] (step=0011933) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.23188884570540225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:56:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11934, "loss": 0.17561858892440796, "memory_gb": 7.721559524536133, "step_time_ms": 3360.374689102173, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:56:58] (step=0011934) Train Loss: 0.1666, Train Steps/Sec: 0.28, Epoch: 0.23190827827438787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 11935, "loss": 0.2333393394947052, "memory_gb": 7.721559524536133, "step_time_ms": 3366.842031478882, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:01] (step=0011935) 
Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.2319277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11936, "loss": 0.2583305537700653, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7891750335693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:05] (step=0011936) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.2319471434123591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11937, "loss": 0.2315436154603958, "memory_gb": 7.721559524536133, "step_time_ms": 3362.15877532959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:08] (step=0011937) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.23196657598134474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 11938, "loss": 0.31079262495040894, "memory_gb": 7.721559524536133, "step_time_ms": 3364.231586456299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:12] (step=0011938) Train Loss: 0.2830, Train Steps/Sec: 0.28, Epoch: 0.23198600855033036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11939, "loss": 0.33187079429626465, "memory_gb": 7.721559524536133, "step_time_ms": 3367.687463760376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:15] (step=0011939) Train Loss: 0.3009, Train Steps/Sec: 0.28, Epoch: 0.23200544111931598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 11940, "loss": 0.3138018846511841, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1722507476807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:19] (step=0011940) Train Loss: 0.2445, Train Steps/Sec: 0.28, Epoch: 0.2320248736883016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:23] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 11941, "loss": 0.14052806794643402, "memory_gb": 7.721559524536133, "step_time_ms": 3369.168758392334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:23] (step=0011941) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.23204430625728723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 11942, "loss": 0.23451653122901917, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9721145629883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:26] (step=0011942) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.23206373882627282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 11943, "loss": 0.33059442043304443, "memory_gb": 7.721559524536133, "step_time_ms": 3366.666793823242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:30] (step=0011943) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.23208317139525844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11944, "loss": 0.24234020709991455, "memory_gb": 7.715639114379883, "step_time_ms": 3319.833517074585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:33] (step=0011944) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.23210260396424406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 11945, "loss": 0.23327970504760742, "memory_gb": 7.721559524536133, "step_time_ms": 3359.694242477417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:37] (step=0011945) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.2321220365332297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11946, "loss": 0.2604539394378662, "memory_gb": 7.721559524536133, "step_time_ms": 3365.391969680786, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
11:57:40] (step=0011946) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.2321414691022153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 11947, "loss": 0.25222304463386536, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0663414001465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:44] (step=0011947) Train Loss: 0.2316, Train Steps/Sec: 0.28, Epoch: 0.23216090167120093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 11948, "loss": 0.3002421259880066, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6019439697266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:48] (step=0011948) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.23218033424018655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 11949, "loss": 0.26392796635627747, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2750511169434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:51] (step=0011949) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.23219976680917218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 11950, "loss": 0.2367827594280243, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0379905700684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:55] (step=0011950) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.2322191993781578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:57:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 11951, "loss": 0.23414957523345947, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6631031036377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:57:58] (step=0011951) Train Loss: 0.2136, Train Steps/Sec: 0.28, Epoch: 0.23223863194714342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:02] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 11952, "loss": 0.27695441246032715, "memory_gb": 7.721559524536133, "step_time_ms": 3355.442762374878, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:02] (step=0011952) Train Loss: 0.2781, Train Steps/Sec: 0.28, Epoch: 0.23225806451612904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 11953, "loss": 0.21028047800064087, "memory_gb": 7.715639114379883, "step_time_ms": 3333.071708679199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:05] (step=0011953) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.23227749708511466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 11954, "loss": 0.19470560550689697, "memory_gb": 7.721559524536133, "step_time_ms": 3365.283489227295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:09] (step=0011954) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.23229692965410026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 11955, "loss": 0.17993058264255524, "memory_gb": 7.721559524536133, "step_time_ms": 3360.271453857422, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:13] (step=0011955) Train Loss: 0.2167, Train Steps/Sec: 0.28, Epoch: 0.23231636222308588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 11956, "loss": 0.20680055022239685, "memory_gb": 7.721559524536133, "step_time_ms": 3360.769748687744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:16] (step=0011956) Train Loss: 0.1894, Train Steps/Sec: 0.28, Epoch: 0.2323357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 11957, "loss": 0.25725582242012024, "memory_gb": 7.721559524536133, "step_time_ms": 3365.945339202881, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 11:58:20] (step=0011957) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.23235522736105713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 11958, "loss": 0.2977890372276306, "memory_gb": 7.721559524536133, "step_time_ms": 3366.262435913086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:23] (step=0011958) Train Loss: 0.2769, Train Steps/Sec: 0.28, Epoch: 0.23237465993004275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 11959, "loss": 0.19883155822753906, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4935665130615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:27] (step=0011959) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.23239409249902837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 11960, "loss": 0.23997829854488373, "memory_gb": 7.721559524536133, "step_time_ms": 3356.38165473938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:31] (step=0011960) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.232413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 11961, "loss": 0.27321359515190125, "memory_gb": 7.721559524536133, "step_time_ms": 3363.866090774536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:34] (step=0011961) Train Loss: 0.2719, Train Steps/Sec: 0.28, Epoch: 0.23243295763699962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 11962, "loss": 0.14876459538936615, "memory_gb": 7.721559524536133, "step_time_ms": 3360.445737838745, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:38] (step=0011962) Train Loss: 0.2224, Train Steps/Sec: 0.28, Epoch: 0.23245239020598524, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 11:58:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 11963, "loss": 0.2877735495567322, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9019050598145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:41] (step=0011963) Train Loss: 0.2692, Train Steps/Sec: 0.28, Epoch: 0.23247182277497086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 11964, "loss": 0.26467788219451904, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4724407196045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:45] (step=0011964) Train Loss: 0.2465, Train Steps/Sec: 0.28, Epoch: 0.23249125534395648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 11965, "loss": 0.25275087356567383, "memory_gb": 7.721559524536133, "step_time_ms": 3363.130807876587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:49] (step=0011965) Train Loss: 0.2991, Train Steps/Sec: 0.28, Epoch: 0.2325106879129421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 11966, "loss": 0.25751903653144836, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3132705688477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:52] (step=0011966) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.2325301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 11967, "loss": 0.23693789541721344, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6251430511475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:56] (step=0011967) Train Loss: 0.2488, Train Steps/Sec: 0.28, Epoch: 0.23254955305091332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:58:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 11968, "loss": 0.1706331968307495, "memory_gb": 7.721559524536133, "step_time_ms": 
3361.6671562194824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:58:59] (step=0011968) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.23256898561989894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 11969, "loss": 0.24520611763000488, "memory_gb": 7.721559524536133, "step_time_ms": 3344.2752361297607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:03] (step=0011969) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.23258841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 11970, "loss": 0.18895722925662994, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3898029327393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:06] (step=0011970) Train Loss: 0.1995, Train Steps/Sec: 0.28, Epoch: 0.2326078507578702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 11971, "loss": 0.29265105724334717, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8035106658936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:10] (step=0011971) Train Loss: 0.3082, Train Steps/Sec: 0.28, Epoch: 0.2326272833268558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 11972, "loss": 0.25214695930480957, "memory_gb": 7.721559524536133, "step_time_ms": 3363.614797592163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:14] (step=0011972) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.23264671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 11973, "loss": 0.2639308571815491, "memory_gb": 7.721559524536133, "step_time_ms": 3366.325616836548, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:17] (step=0011973) Train Loss: 0.2444, Train Steps/Sec: 0.27, Epoch: 
0.23266614846482706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 11974, "loss": 0.24940067529678345, "memory_gb": 7.721559524536133, "step_time_ms": 3498.6586570739746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:21] (step=0011974) Train Loss: 0.2447, Train Steps/Sec: 0.28, Epoch: 0.23268558103381268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 11975, "loss": 0.2095886468887329, "memory_gb": 7.721559524536133, "step_time_ms": 3359.848976135254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:24] (step=0011975) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.2327050136027983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 11976, "loss": 0.2803523540496826, "memory_gb": 7.721559524536133, "step_time_ms": 3359.386205673218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:28] (step=0011976) Train Loss: 0.2645, Train Steps/Sec: 0.28, Epoch: 0.23272444617178392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 11977, "loss": 0.1701968014240265, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5446815490723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:32] (step=0011977) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.23274387874076952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 11978, "loss": 0.279034823179245, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0660610198975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:35] (step=0011978) Train Loss: 0.2670, Train Steps/Sec: 0.28, Epoch: 0.23276331130975514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 11979, "loss": 0.20679785311222076, 
"memory_gb": 7.721559524536133, "step_time_ms": 3362.694501876831, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:39] (step=0011979) Train Loss: 0.1846, Train Steps/Sec: 0.28, Epoch: 0.23278274387874076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 11980, "loss": 0.2393336147069931, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5474281311035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:42] (step=0011980) Train Loss: 0.2789, Train Steps/Sec: 0.28, Epoch: 0.23280217644772638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 11981, "loss": 0.2923278510570526, "memory_gb": 7.721559524536133, "step_time_ms": 3363.248825073242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:46] (step=0011981) Train Loss: 0.2987, Train Steps/Sec: 0.28, Epoch: 0.232821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11982, "loss": 0.2814691662788391, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0939044952393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:50] (step=0011982) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.23284104158569763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 11983, "loss": 0.3126019239425659, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8040084838867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:53] (step=0011983) Train Loss: 0.2974, Train Steps/Sec: 0.28, Epoch: 0.23286047415468325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 11:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 11984, "loss": 0.29114872217178345, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8601093292236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 11:59:57] (step=0011984) Train Loss: 0.2144, 
Train Steps/Sec: 0.28, Epoch: 0.23287990672366887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 11985, "loss": 0.2204570174217224, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5823307037354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:00] (step=0011985) Train Loss: 0.2441, Train Steps/Sec: 0.28, Epoch: 0.2328993392926545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 11986, "loss": 0.30476343631744385, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3685626983643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:04] (step=0011986) Train Loss: 0.3365, Train Steps/Sec: 0.28, Epoch: 0.23291877186164012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 11987, "loss": 0.2806469202041626, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9893836975098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:08] (step=0011987) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.23293820443062574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 11988, "loss": 0.21554625034332275, "memory_gb": 7.721559524536133, "step_time_ms": 3352.1270751953125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:11] (step=0011988) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.23295763699961136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 11989, "loss": 0.20620869100093842, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5688343048096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:15] (step=0011989) Train Loss: 0.2167, Train Steps/Sec: 0.28, Epoch: 0.23297706956859696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 11990, 
"loss": 0.2753101587295532, "memory_gb": 7.721559524536133, "step_time_ms": 3351.0284423828125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:18] (step=0011990) Train Loss: 0.3076, Train Steps/Sec: 0.28, Epoch: 0.23299650213758258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 11991, "loss": 0.3328680694103241, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6575775146484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:22] (step=0011991) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.2330159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 11992, "loss": 0.20181843638420105, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2067279815674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:25] (step=0011992) Train Loss: 0.1900, Train Steps/Sec: 0.28, Epoch: 0.23303536727555382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 11993, "loss": 0.2287784069776535, "memory_gb": 7.721559524536133, "step_time_ms": 3355.211019515991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:29] (step=0011993) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 0.23305479984453945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 11994, "loss": 0.2567260265350342, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0072860717773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:33] (step=0011994) Train Loss: 0.2709, Train Steps/Sec: 0.28, Epoch: 0.23307423241352507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 11995, "loss": 0.2112198770046234, "memory_gb": 7.721559524536133, "step_time_ms": 3340.256452560425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:36] 
(step=0011995) Train Loss: 0.2615, Train Steps/Sec: 0.29, Epoch: 0.2330936649825107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 11996, "loss": 0.24600449204444885, "memory_gb": 7.721559524536133, "step_time_ms": 3341.616630554199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:40] (step=0011996) Train Loss: 0.2309, Train Steps/Sec: 0.29, Epoch: 0.2331130975514963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 11997, "loss": 0.19571858644485474, "memory_gb": 7.721559524536133, "step_time_ms": 3347.529411315918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:43] (step=0011997) Train Loss: 0.1985, Train Steps/Sec: 0.29, Epoch: 0.23313253012048193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 11998, "loss": 0.2828274965286255, "memory_gb": 7.721559524536133, "step_time_ms": 3355.800151824951, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:47] (step=0011998) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.23315196268946756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 11999, "loss": 0.1701299101114273, "memory_gb": 7.721559524536133, "step_time_ms": 3347.506523132324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:50] (step=0011999) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.23317139525845318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12000, "loss": 0.4033834636211395, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6671867370605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:54] (step=0012000) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.23319082782743877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:00:54] Saved checkpoint to 
/nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0012000/ [2025-07-29 12:00:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12001, "loss": 0.1543424129486084, "memory_gb": 7.721559524536133, "step_time_ms": 3349.336862564087, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:00:57] (step=0012001) Train Loss: 0.1753, Train Steps/Sec: 0.28, Epoch: 0.2332102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:01:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12002, "loss": 0.2798304259777069, "memory_gb": 7.721559524536133, "step_time_ms": 3339.773654937744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:01:01] (step=0012002) Train Loss: 0.1990, Train Steps/Sec: 0.29, Epoch: 0.23322969296541002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:01:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12003, "loss": 0.2626110017299652, "memory_gb": 7.721559524536133, "step_time_ms": 3353.208541870117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:01:04] (step=0012003) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.23324912553439564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:01:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12004, "loss": 0.3347776234149933, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6014251708984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:01:08] (step=0012004) Train Loss: 0.3059, Train Steps/Sec: 0.28, Epoch: 0.23326855810338126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:01:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12005, "loss": 0.303477019071579, "memory_gb": 7.721559524536133, "step_time_ms": 3353.677749633789, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:01:11] (step=0012005) Train Loss: 0.2667, Train Steps/Sec: 0.28, Epoch: 0.23328799067236689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:01:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12006, "loss": 
0.23842725157737732, "memory_gb": 7.721559524536133, "step_time_ms": 3353.306531906128, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:15] (step=0012006) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.2333074232413525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12007, "loss": 0.28834646940231323, "memory_gb": 7.721559524536133, "step_time_ms": 3358.692169189453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:19] (step=0012007) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.23332685581033813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12008, "loss": 0.22792388498783112, "memory_gb": 7.721559524536133, "step_time_ms": 3348.5665321350098, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:22] (step=0012008) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.23334628837932375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12009, "loss": 0.3332071006298065, "memory_gb": 7.721559524536133, "step_time_ms": 3352.285861968994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:26] (step=0012009) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.23336572094830937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12010, "loss": 0.3013342618942261, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4626445770264, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:29] (step=0012010) Train Loss: 0.2876, Train Steps/Sec: 0.28, Epoch: 0.233385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12011, "loss": 0.1839882880449295, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8662662506104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:33] (step=0012011) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.23340458608628062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12012, "loss": 0.2045212984085083, "memory_gb": 7.721559524536133, "step_time_ms": 3358.46209526062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:36] (step=0012012) Train Loss: 0.1971, Train Steps/Sec: 0.28, Epoch: 0.2334240186552662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12013, "loss": 0.30096954107284546, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4767322540283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:40] (step=0012013) Train Loss: 0.2662, Train Steps/Sec: 0.28, Epoch: 0.23344345122425184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12014, "loss": 0.19407397508621216, "memory_gb": 7.721559524536133, "step_time_ms": 3348.2792377471924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:44] (step=0012014) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.23346288379323746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12015, "loss": 0.22490745782852173, "memory_gb": 7.721559524536133, "step_time_ms": 3360.26930809021, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:47] (step=0012015) Train Loss: 0.2112, Train Steps/Sec: 0.27, Epoch: 0.23348231636222308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12016, "loss": 0.3570305109024048, "memory_gb": 7.721559524536133, "step_time_ms": 3340.9125804901123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:51] (step=0012016) Train Loss: 0.2825, Train Steps/Sec: 0.28, Epoch: 0.2335017489312087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12017, "loss": 0.23425808548927307, "memory_gb": 7.721559524536133, "step_time_ms": 3355.973720550537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:54] (step=0012017) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.23352118150019432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:01:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12018, "loss": 0.19752028584480286, "memory_gb": 7.721559524536133, "step_time_ms": 3490.49711227417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:01:58] (step=0012018) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.23354061406917995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12019, "loss": 0.23998510837554932, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7351875305176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:02] (step=0012019) Train Loss: 0.2335, Train Steps/Sec: 0.28, Epoch: 0.23356004663816557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12020, "loss": 0.30575287342071533, "memory_gb": 7.721559524536133, "step_time_ms": 3344.447135925293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:05] (step=0012020) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.2335794792071512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12021, "loss": 0.3060888648033142, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0733070373535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:09] (step=0012021) Train Loss: 0.2875, Train Steps/Sec: 0.28, Epoch: 0.23359891177613681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12022, "loss": 0.22793376445770264, "memory_gb": 7.715639114379883, "step_time_ms": 3324.1329193115234, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:12] (step=0012022) Train Loss: 0.2407, Train Steps/Sec: 0.28, Epoch: 0.23361834434512244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12023, "loss": 0.22319582104682922, "memory_gb": 7.721559524536133, "step_time_ms": 3356.99725151062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:16] (step=0012023) Train Loss: 0.1992, Train Steps/Sec: 0.28, Epoch: 0.23363777691410806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12024, "loss": 0.27447476983070374, "memory_gb": 7.721559524536133, "step_time_ms": 3360.457420349121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:19] (step=0012024) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.23365720948309365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12025, "loss": 0.11663024127483368, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3272228240967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:23] (step=0012025) Train Loss: 0.1817, Train Steps/Sec: 0.28, Epoch: 0.23367664205207928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12026, "loss": 0.24750164151191711, "memory_gb": 7.721559524536133, "step_time_ms": 3359.565019607544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:27] (step=0012026) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.2336960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12027, "loss": 0.2748870253562927, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8784885406494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:30] (step=0012027) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.23371550719005052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12028, "loss": 0.1979072093963623, "memory_gb": 7.721559524536133, "step_time_ms": 3360.811948776245, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:34] (step=0012028) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.23373493975903614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12029, "loss": 0.34256845712661743, "memory_gb": 7.715639114379883, "step_time_ms": 3325.2243995666504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:37] (step=0012029) Train Loss: 0.2739, Train Steps/Sec: 0.28, Epoch: 0.23375437232802176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12030, "loss": 0.2987474799156189, "memory_gb": 7.721559524536133, "step_time_ms": 3358.416795730591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:41] (step=0012030) Train Loss: 0.2900, Train Steps/Sec: 0.28, Epoch: 0.2337738048970074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12031, "loss": 0.20037701725959778, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7122898101807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:44] (step=0012031) Train Loss: 0.2633, Train Steps/Sec: 0.28, Epoch: 0.233793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12032, "loss": 0.17556129395961761, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6012287139893, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:48] (step=0012032) Train Loss: 0.1676, Train Steps/Sec: 0.28, Epoch: 0.23381267003497863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12033, "loss": 0.23103532195091248, "memory_gb": 7.721559524536133, "step_time_ms": 3359.950542449951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:52] (step=0012033) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 0.23383210260396425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12034, "loss": 0.15885823965072632, "memory_gb": 7.721559524536133, "step_time_ms": 3362.238645553589, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:55] (step=0012034) Train Loss: 0.1916, Train Steps/Sec: 0.28, Epoch: 0.23385153517294988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:02:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12035, "loss": 0.252896249294281, "memory_gb": 7.721559524536133, "step_time_ms": 3361.835718154907, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:02:59] (step=0012035) Train Loss: 0.2504, Train Steps/Sec: 0.28, Epoch: 0.23387096774193547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12036, "loss": 0.2190949022769928, "memory_gb": 7.721559524536133, "step_time_ms": 3366.774320602417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:02] (step=0012036) Train Loss: 0.1804, Train Steps/Sec: 0.28, Epoch: 0.2338904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12037, "loss": 0.21481938660144806, "memory_gb": 7.715639114379883, "step_time_ms": 3321.3624954223633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:06] (step=0012037) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.23390983287990672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12038, "loss": 0.32814979553222656, "memory_gb": 7.721559524536133, "step_time_ms": 3364.215135574341, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:09] (step=0012038) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.23392926544889234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12039, "loss": 0.19724041223526, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1649017333984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:13] (step=0012039) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.23394869801787796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12040, "loss": 0.14851899445056915, "memory_gb": 7.721559524536133, "step_time_ms": 3364.96639251709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:17] (step=0012040) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.23396813058686358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12041, "loss": 0.26281607151031494, "memory_gb": 7.721559524536133, "step_time_ms": 3360.766887664795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:20] (step=0012041) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.2339875631558492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12042, "loss": 0.23026788234710693, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8366718292236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:24] (step=0012042) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.23400699572483483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12043, "loss": 0.20170579850673676, "memory_gb": 7.721559524536133, "step_time_ms": 3363.675832748413, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:27] (step=0012043) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.23402642829382045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12044, "loss": 0.2859325408935547, "memory_gb": 7.721559524536133, "step_time_ms": 3354.623794555664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:31] (step=0012044) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.23404586086280607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12045, "loss": 0.21219249069690704, "memory_gb": 7.721559524536133, "step_time_ms": 3362.290859222412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:35] (step=0012045) Train Loss: 0.1742, Train Steps/Sec: 0.28, Epoch: 0.2340652934317917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12046, "loss": 0.20061424374580383, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5973510742188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:38] (step=0012046) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.23408472600077732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12047, "loss": 0.30669575929641724, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3742332458496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:42] (step=0012047) Train Loss: 0.2775, Train Steps/Sec: 0.28, Epoch: 0.2341041585697629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12048, "loss": 0.2452457845211029, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3030910491943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:45] (step=0012048) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.23412359113874853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12049, "loss": 0.3155231177806854, "memory_gb": 7.721559524536133, "step_time_ms": 3361.574411392212, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:49] (step=0012049) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.23414302370773415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12050, "loss": 0.32103431224823, "memory_gb": 7.715639114379883, "step_time_ms": 3330.310106277466, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:53] (step=0012050) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.23416245627671978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:03:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12051, "loss": 0.2090926617383957, "memory_gb": 7.721559524536133, "step_time_ms": 3364.837884902954, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:03:56] (step=0012051) Train Loss: 0.2748, Train Steps/Sec: 0.28, Epoch: 0.2341818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12052, "loss": 0.13145074248313904, "memory_gb": 7.721559524536133, "step_time_ms": 3362.989902496338, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:00] (step=0012052) Train Loss: 0.1357, Train Steps/Sec: 0.28, Epoch: 0.23420132141469102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12053, "loss": 0.1439821422100067, "memory_gb": 7.721559524536133, "step_time_ms": 3360.475540161133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:03] (step=0012053) Train Loss: 0.1957, Train Steps/Sec: 0.28, Epoch: 0.23422075398367664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12054, "loss": 0.35957521200180054, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2889099121094, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:07] (step=0012054) Train Loss: 0.2854, Train Steps/Sec: 0.28, Epoch: 0.23424018655266227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12055, "loss": 0.24151012301445007, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5644187927246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:10] (step=0012055) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.2342596191216479, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12056, "loss": 0.2114902138710022, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4748973846436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:14] (step=0012056) Train Loss: 0.2414, Train Steps/Sec: 0.27, Epoch: 0.2342790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12057, "loss": 0.2326834499835968, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9255294799805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:18] (step=0012057) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.23429848425961913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12058, "loss": 0.31998851895332336, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0637912750244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:21] (step=0012058) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.23431791682860476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12059, "loss": 0.41003283858299255, "memory_gb": 7.715639114379883, "step_time_ms": 3471.2119102478027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:25] (step=0012059) Train Loss: 0.3190, Train Steps/Sec: 0.28, Epoch: 0.23433734939759035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12060, "loss": 0.29477545619010925, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4182987213135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:28] (step=0012060) Train Loss: 0.2904, Train Steps/Sec: 0.28, Epoch: 0.23435678196657597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12061, "loss": 0.2481852024793625, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1136627197266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:32] (step=0012061) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.2343762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12062, "loss": 0.22411441802978516, "memory_gb": 7.721559524536133, "step_time_ms": 3363.236427307129, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:36] (step=0012062) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.23439564710454722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12063, "loss": 0.294017493724823, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7642135620117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:39] (step=0012063) Train Loss: 0.2808, Train Steps/Sec: 0.28, Epoch: 0.23441507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12064, "loss": 0.22354453802108765, "memory_gb": 7.721559524536133, "step_time_ms": 3359.898567199707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:43] (step=0012064) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.23443451224251846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12065, "loss": 0.3055803179740906, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4421310424805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:46] (step=0012065) Train Loss: 0.2978, Train Steps/Sec: 0.28, Epoch: 0.23445394481150408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12066, "loss": 0.19130614399909973, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9817962646484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:50] (step=0012066) Train Loss: 0.1882, Train Steps/Sec: 0.28, Epoch: 0.2344733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12067, "loss": 0.16613616049289703, "memory_gb": 7.715639114379883, "step_time_ms": 3329.200506210327, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:54] (step=0012067) Train Loss: 0.2525, Train Steps/Sec: 0.28, Epoch: 0.23449280994947533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12068, "loss": 0.34858018159866333, "memory_gb": 7.721559524536133, "step_time_ms": 3361.528158187866, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:04:57] (step=0012068) Train Loss: 0.3254, Train Steps/Sec: 0.28, Epoch: 0.23451224251846095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12069, "loss": 0.3099343776702881, "memory_gb": 7.721559524536133, "step_time_ms": 3362.058401107788, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:01] (step=0012069) Train Loss: 0.3043, Train Steps/Sec: 0.28, Epoch: 0.23453167508744657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12070, "loss": 0.2745114862918854, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9085807800293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:04] (step=0012070) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.23455110765643217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12071, "loss": 0.22461280226707458, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5712394714355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:08] (step=0012071) Train Loss: 0.2012, Train Steps/Sec: 0.28, Epoch: 0.2345705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12072, "loss": 0.2170620709657669, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4208488464355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:12] (step=0012072) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.2345899727944034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12073, "loss": 0.09975729137659073, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5851402282715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:15] (step=0012073) Train Loss: 0.1977, Train Steps/Sec: 0.28, Epoch: 0.23460940536338903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12074, "loss": 0.1711709201335907, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1112670898438, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:19] (step=0012074) Train Loss: 0.1712, Train Steps/Sec: 0.28, Epoch: 0.23462883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12075, "loss": 0.23232415318489075, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0305786132812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:22] (step=0012075) Train Loss: 0.2975, Train Steps/Sec: 0.28, Epoch: 0.23464827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12076, "loss": 0.3411098122596741, "memory_gb": 7.715639114379883, "step_time_ms": 3326.5397548675537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:26] (step=0012076) Train Loss: 0.3016, Train Steps/Sec: 0.28, Epoch: 0.2346677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12077, "loss": 0.30073386430740356, "memory_gb": 7.721559524536133, "step_time_ms": 3346.0445404052734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:30] (step=0012077) Train Loss: 0.2891, Train Steps/Sec: 0.28, Epoch: 0.23468713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12078, "loss": 0.27998995780944824, "memory_gb": 7.715639114379883, "step_time_ms": 3322.453498840332, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:33] (step=0012078) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.23470656820831715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12079, "loss": 0.24630206823349, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3270053863525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:37] (step=0012079) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.23472600077730277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12080, "loss": 0.3057233393192291, "memory_gb": 7.721559524536133, "step_time_ms": 3360.276222229004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:40] (step=0012080) Train Loss: 0.2450, Train Steps/Sec: 0.28, Epoch: 0.2347454333462884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12081, "loss": 0.1732572466135025, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6925525665283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:44] (step=0012081) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.234764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12082, "loss": 0.25759008526802063, "memory_gb": 7.721559524536133, "step_time_ms": 3359.015464782715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:48] (step=0012082) Train Loss: 0.2058, Train Steps/Sec: 0.28, Epoch: 0.2347842984842596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12083, "loss": 0.20587027072906494, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9514026641846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:51] (step=0012083) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.23480373105324523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12084, "loss": 0.31528913974761963, "memory_gb": 7.721559524536133, "step_time_ms": 3360.682725906372, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:55] (step=0012084) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.23482316362223085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12085, "loss": 0.24880820512771606, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7798347473145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:05:58] (step=0012085) Train Loss: 0.2615, Train Steps/Sec: 0.28, Epoch: 0.23484259619121647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12086, "loss": 0.20728638768196106, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7049503326416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:02] (step=0012086) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.2348620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12087, "loss": 0.13996383547782898, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1291484832764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:05] (step=0012087) Train Loss: 0.2209, Train Steps/Sec: 0.28, Epoch: 0.23488146132918772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12088, "loss": 0.21273678541183472, "memory_gb": 7.721559524536133, "step_time_ms": 3357.341527938843, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:09] (step=0012088) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.23490089389817334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12089, "loss": 0.1460517942905426, "memory_gb": 7.721559524536133, "step_time_ms": 3344.984292984009, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:13] (step=0012089) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.23492032646715896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12090, "loss": 0.19508953392505646, "memory_gb": 7.721559524536133, "step_time_ms": 3362.290620803833, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:16] (step=0012090) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.23493975903614459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12091, "loss": 0.22298195958137512, "memory_gb": 7.721559524536133, "step_time_ms": 3355.271577835083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:20] (step=0012091) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.2349591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12092, "loss": 0.16750836372375488, "memory_gb": 7.721559524536133, "step_time_ms": 3360.459327697754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:23] (step=0012092) Train Loss: 0.1731, Train Steps/Sec: 0.28, Epoch: 0.23497862417411583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12093, "loss": 0.21681387722492218, "memory_gb": 7.721559524536133, "step_time_ms": 3349.1177558898926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:27] (step=0012093) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.23499805674310142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12094, "loss": 0.3249547481536865, "memory_gb": 7.715639114379883, "step_time_ms": 3322.1802711486816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:31] (step=0012094) Train Loss: 0.3065, Train Steps/Sec: 0.28, Epoch: 0.23501748931208705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12095, "loss": 0.17196246981620789, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2631912231445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:34] (step=0012095) Train Loss: 0.2123, Train Steps/Sec: 0.28, Epoch: 0.23503692188107267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12096, "loss": 0.2934061288833618, "memory_gb": 7.721559524536133, "step_time_ms": 3358.839273452759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:38] (step=0012096) Train Loss: 0.3350, Train Steps/Sec: 0.28, Epoch: 0.2350563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12097, "loss": 0.25686490535736084, "memory_gb": 7.721559524536133, "step_time_ms": 3352.325677871704, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:41] (step=0012097) Train Loss: 0.2472, Train Steps/Sec: 0.28, Epoch: 0.2350757870190439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12098, "loss": 0.15820957720279694, "memory_gb": 7.721559524536133, "step_time_ms": 3349.0970134735107, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:45] (step=0012098) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.23509521958802954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12099, "loss": 0.2290867269039154, "memory_gb": 7.721559524536133, "step_time_ms": 3357.327699661255, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:48] (step=0012099) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.23511465215701516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12100, "loss": 0.28580984473228455, "memory_gb": 7.721559524536133, "step_time_ms": 3360.408306121826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:52] (step=0012100) Train Loss: 0.2932, Train Steps/Sec: 0.28, Epoch: 0.23513408472600078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12101, "loss": 0.22930029034614563, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6390018463135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:56] (step=0012101) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.2351535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12102, "loss": 0.27708369493484497, "memory_gb": 7.721559524536133, "step_time_ms": 3356.295347213745, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:06:59] (step=0012102) Train Loss: 0.2915, Train Steps/Sec: 0.28, Epoch: 0.23517294986397203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12103, "loss": 0.3239946961402893, "memory_gb": 7.721559524536133, "step_time_ms": 3355.994701385498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:03] (step=0012103) Train Loss: 0.2345, Train Steps/Sec: 0.27, Epoch: 0.23519238243295765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12104, "loss": 0.3074001371860504, "memory_gb": 7.721559524536133, "step_time_ms": 3359.441041946411, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:06] (step=0012104) Train Loss: 0.3329, Train Steps/Sec: 0.28, Epoch: 0.23521181500194327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12105, "loss": 0.26608341932296753, "memory_gb": 7.715639114379883, "step_time_ms": 3320.2319145202637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:10] (step=0012105) Train Loss: 0.2414, Train Steps/Sec: 0.28, Epoch: 0.23523124757092886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12106, "loss": 0.33270788192749023, "memory_gb": 7.721559524536133, "step_time_ms": 3355.250597000122, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:14] (step=0012106) Train Loss: 0.2852, Train Steps/Sec: 0.28, Epoch: 0.2352506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12107, "loss": 0.18452733755111694, "memory_gb": 7.721559524536133, "step_time_ms": 3496.3603019714355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:17] (step=0012107) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.2352701127089001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12108, "loss": 0.2478983998298645, "memory_gb": 7.721559524536133, "step_time_ms": 3339.721918106079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:21] (step=0012108) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.23528954527788573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12109, "loss": 0.23868972063064575, "memory_gb": 7.721559524536133, "step_time_ms": 3356.142044067383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:24] (step=0012109) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.23530897784687135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12110, "loss": 0.24158824980258942, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5466918945312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:28] (step=0012110) Train Loss: 0.2175, Train Steps/Sec: 0.28, Epoch: 0.23532841041585698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12111, "loss": 0.14617733657360077, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6792011260986, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:31] (step=0012111) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.2353478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12112, "loss": 0.23799706995487213, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2254180908203, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:35] (step=0012112) Train Loss: 0.2801, Train Steps/Sec: 0.28, Epoch: 0.23536727555382822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12113, "loss": 0.225155308842659, "memory_gb": 7.721559524536133, "step_time_ms": 3355.307102203369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:39] (step=0012113) Train Loss: 0.1590, Train Steps/Sec: 0.28, Epoch: 0.23538670812281384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12114, "loss": 0.1433928906917572, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2888679504395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:42] (step=0012114) Train Loss: 0.1755, Train Steps/Sec: 0.28, Epoch: 0.23540614069179946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12115, "loss": 0.15098604559898376, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9859523773193, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:46] (step=0012115) Train Loss: 0.1334, Train Steps/Sec: 0.28, Epoch: 0.2354255732607851, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12116, "loss": 0.11929111927747726, "memory_gb": 7.721559524536133, "step_time_ms": 3355.159282684326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:49] (step=0012116) Train Loss: 0.1716, Train Steps/Sec: 0.28, Epoch: 0.2354450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12117, "loss": 0.2358599305152893, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1288890838623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:53] (step=0012117) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.2354644383987563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12118, "loss": 0.28727009892463684, "memory_gb": 7.721559524536133, "step_time_ms": 3350.446939468384, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:07:56] (step=0012118) Train Loss: 0.2768, Train Steps/Sec: 0.28, Epoch: 0.23548387096774193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:08:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12119, "loss": 0.2569630742073059, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1666221618652, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:08:00] (step=0012119) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.23550330353672755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12120, "loss": 0.18578317761421204, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1132164001465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:08:04]
(step=0012120) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.23552273610571317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12121, "loss": 0.23034676909446716, "memory_gb": 7.721559524536133, "step_time_ms": 3359.236478805542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:07] (step=0012121) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.2355421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12122, "loss": 0.19980907440185547, "memory_gb": 7.721559524536133, "step_time_ms": 3340.859889984131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:11] (step=0012122) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.23556160124368442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12123, "loss": 0.18045125901699066, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6057682037354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:14] (step=0012123) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.23558103381267004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12124, "loss": 0.19578030705451965, "memory_gb": 7.721559524536133, "step_time_ms": 3353.917121887207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:18] (step=0012124) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.23560046638165566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12125, "loss": 0.30145302414894104, "memory_gb": 7.715639114379883, "step_time_ms": 3319.657325744629, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:21] (step=0012125) Train Loss: 0.3213, Train Steps/Sec: 0.28, Epoch: 0.23561989895064128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:25] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12126, "loss": 0.2767719626426697, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2369346618652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:25] (step=0012126) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.2356393315196269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12127, "loss": 0.2712770402431488, "memory_gb": 7.721559524536133, "step_time_ms": 3357.029438018799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:29] (step=0012127) Train Loss: 0.2795, Train Steps/Sec: 0.28, Epoch: 0.23565876408861253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12128, "loss": 0.23685762286186218, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7566871643066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:32] (step=0012128) Train Loss: 0.2429, Train Steps/Sec: 0.28, Epoch: 0.23567819665759812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12129, "loss": 0.20010706782341003, "memory_gb": 7.721559524536133, "step_time_ms": 3341.6783809661865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:36] (step=0012129) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.23569762922658374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12130, "loss": 0.1533878743648529, "memory_gb": 7.721559524536133, "step_time_ms": 3353.074073791504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:39] (step=0012130) Train Loss: 0.1879, Train Steps/Sec: 0.28, Epoch: 0.23571706179556937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12131, "loss": 0.13145577907562256, "memory_gb": 7.721559524536133, "step_time_ms": 3366.069555282593, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:08:43] (step=0012131) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.235736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12132, "loss": 0.1723019778728485, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5447750091553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:47] (step=0012132) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.2357559269335406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12133, "loss": 0.19828465580940247, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0379695892334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:50] (step=0012133) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.23577535950252623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12134, "loss": 0.34287768602371216, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6128482818604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:54] (step=0012134) Train Loss: 0.3059, Train Steps/Sec: 0.28, Epoch: 0.23579479207151186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:08:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12135, "loss": 0.2631041407585144, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3550662994385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:08:57] (step=0012135) Train Loss: 0.2887, Train Steps/Sec: 0.28, Epoch: 0.23581422464049748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12136, "loss": 0.2505037784576416, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4532947540283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:01] (step=0012136) Train Loss: 0.2585, Train Steps/Sec: 0.28, Epoch: 0.2358336572094831, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:09:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12137, "loss": 0.25450456142425537, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4163608551025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:04] (step=0012137) Train Loss: 0.2295, Train Steps/Sec: 0.28, Epoch: 0.23585308977846872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12138, "loss": 0.1604561060667038, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9725284576416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:08] (step=0012138) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.23587252234745434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12139, "loss": 0.32962822914123535, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7115955352783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:12] (step=0012139) Train Loss: 0.3390, Train Steps/Sec: 0.28, Epoch: 0.23589195491643997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12140, "loss": 0.1785610318183899, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0010662078857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:15] (step=0012140) Train Loss: 0.1628, Train Steps/Sec: 0.28, Epoch: 0.23591138748542556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12141, "loss": 0.21972061693668365, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8624210357666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:19] (step=0012141) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.23593082005441118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12142, "loss": 0.25300294160842896, "memory_gb": 7.721559524536133, "step_time_ms": 
3356.3575744628906, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:22] (step=0012142) Train Loss: 0.2568, Train Steps/Sec: 0.28, Epoch: 0.2359502526233968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12143, "loss": 0.3225787580013275, "memory_gb": 7.721559524536133, "step_time_ms": 3362.483263015747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:26] (step=0012143) Train Loss: 0.2457, Train Steps/Sec: 0.27, Epoch: 0.23596968519238243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12144, "loss": 0.3331286311149597, "memory_gb": 7.721559524536133, "step_time_ms": 3363.360643386841, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:30] (step=0012144) Train Loss: 0.3130, Train Steps/Sec: 0.28, Epoch: 0.23598911776136805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12145, "loss": 0.22151558101177216, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2077026367188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:33] (step=0012145) Train Loss: 0.1794, Train Steps/Sec: 0.28, Epoch: 0.23600855033035367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12146, "loss": 0.16212210059165955, "memory_gb": 7.721559524536133, "step_time_ms": 3363.893985748291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:37] (step=0012146) Train Loss: 0.2472, Train Steps/Sec: 0.28, Epoch: 0.2360279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12147, "loss": 0.23873966932296753, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4819259643555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:40] (step=0012147) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 
0.23604741546832492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12148, "loss": 0.3575788736343384, "memory_gb": 7.715639114379883, "step_time_ms": 3464.4057750701904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:44] (step=0012148) Train Loss: 0.3466, Train Steps/Sec: 0.28, Epoch: 0.23606684803731054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12149, "loss": 0.2873958945274353, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3086681365967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:47] (step=0012149) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.23608628060629616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12150, "loss": 0.15000887215137482, "memory_gb": 7.721559524536133, "step_time_ms": 3358.27898979187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:51] (step=0012150) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.23610571317528178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12151, "loss": 0.2575090527534485, "memory_gb": 7.721559524536133, "step_time_ms": 3362.694501876831, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:55] (step=0012151) Train Loss: 0.2917, Train Steps/Sec: 0.28, Epoch: 0.23612514574426738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:09:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12152, "loss": 0.26912808418273926, "memory_gb": 7.721559524536133, "step_time_ms": 3358.68763923645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:09:58] (step=0012152) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.236144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12153, "loss": 0.19156178832054138, 
"memory_gb": 7.721559524536133, "step_time_ms": 3365.56339263916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:02] (step=0012153) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.23616401088223862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12154, "loss": 0.2570381760597229, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2777252197266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:05] (step=0012154) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.23618344345122425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12155, "loss": 0.23010264337062836, "memory_gb": 7.721559524536133, "step_time_ms": 3363.222122192383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:09] (step=0012155) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.23620287602020987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12156, "loss": 0.17053347826004028, "memory_gb": 7.721559524536133, "step_time_ms": 3345.107316970825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:13] (step=0012156) Train Loss: 0.2341, Train Steps/Sec: 0.28, Epoch: 0.2362223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12157, "loss": 0.2570039629936218, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4354133605957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:16] (step=0012157) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.2362417411581811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12158, "loss": 0.23006263375282288, "memory_gb": 7.721559524536133, "step_time_ms": 3363.203525543213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:20] (step=0012158) Train Loss: 0.2160, 
Train Steps/Sec: 0.28, Epoch: 0.23626117372716673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12159, "loss": 0.15999428927898407, "memory_gb": 7.721559524536133, "step_time_ms": 3362.506151199341, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:23] (step=0012159) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.23628060629615236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12160, "loss": 0.14975155889987946, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0079498291016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:27] (step=0012160) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.23630003886513798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12161, "loss": 0.269355833530426, "memory_gb": 7.721559524536133, "step_time_ms": 3359.724283218384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:30] (step=0012161) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.2363194714341236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12162, "loss": 0.15533246099948883, "memory_gb": 7.721559524536133, "step_time_ms": 3367.043972015381, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:34] (step=0012162) Train Loss: 0.1622, Train Steps/Sec: 0.28, Epoch: 0.23633890400310922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12163, "loss": 0.11448408663272858, "memory_gb": 7.721559524536133, "step_time_ms": 3364.577054977417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:38] (step=0012163) Train Loss: 0.2044, Train Steps/Sec: 0.28, Epoch: 0.23635833657209482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12164, 
"loss": 0.27774927020072937, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8533096313477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:41] (step=0012164) Train Loss: 0.2592, Train Steps/Sec: 0.28, Epoch: 0.23637776914108044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12165, "loss": 0.26681023836135864, "memory_gb": 7.721559524536133, "step_time_ms": 3359.163761138916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:45] (step=0012165) Train Loss: 0.2865, Train Steps/Sec: 0.28, Epoch: 0.23639720171006606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12166, "loss": 0.20858076214790344, "memory_gb": 7.721559524536133, "step_time_ms": 3357.602119445801, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:48] (step=0012166) Train Loss: 0.2937, Train Steps/Sec: 0.28, Epoch: 0.23641663427905169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12167, "loss": 0.2544860243797302, "memory_gb": 7.721559524536133, "step_time_ms": 3357.999563217163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:52] (step=0012167) Train Loss: 0.2994, Train Steps/Sec: 0.28, Epoch: 0.2364360668480373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12168, "loss": 0.2550140917301178, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9890727996826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:55] (step=0012168) Train Loss: 0.2391, Train Steps/Sec: 0.28, Epoch: 0.23645549941702293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:10:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12169, "loss": 0.17443802952766418, "memory_gb": 7.721559524536133, "step_time_ms": 3344.831705093384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:10:59] 
(step=0012169) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.23647493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12170, "loss": 0.33364686369895935, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8099994659424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:03] (step=0012170) Train Loss: 0.3014, Train Steps/Sec: 0.28, Epoch: 0.23649436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12171, "loss": 0.24057260155677795, "memory_gb": 7.721559524536133, "step_time_ms": 3359.721899032593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:06] (step=0012171) Train Loss: 0.2127, Train Steps/Sec: 0.28, Epoch: 0.2365137971239798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12172, "loss": 0.24852171540260315, "memory_gb": 7.721559524536133, "step_time_ms": 3358.656406402588, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:10] (step=0012172) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.23653322969296542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12173, "loss": 0.2203265279531479, "memory_gb": 7.721559524536133, "step_time_ms": 3346.7817306518555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:13] (step=0012173) Train Loss: 0.2176, Train Steps/Sec: 0.28, Epoch: 0.23655266226195104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12174, "loss": 0.23073448240756989, "memory_gb": 7.721559524536133, "step_time_ms": 3362.903118133545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:17] (step=0012174) Train Loss: 0.1965, Train Steps/Sec: 0.28, Epoch: 0.23657209483093666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:20] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12175, "loss": 0.2735796868801117, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8551025390625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:20] (step=0012175) Train Loss: 0.3054, Train Steps/Sec: 0.28, Epoch: 0.23659152739992226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12176, "loss": 0.2164250761270523, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6177101135254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:24] (step=0012176) Train Loss: 0.3187, Train Steps/Sec: 0.28, Epoch: 0.23661095996890788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12177, "loss": 0.2921663224697113, "memory_gb": 7.721559524536133, "step_time_ms": 3367.896556854248, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:28] (step=0012177) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.2366303925378935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12178, "loss": 0.24632179737091064, "memory_gb": 7.721559524536133, "step_time_ms": 3365.309238433838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:31] (step=0012178) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.23664982510687912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12179, "loss": 0.28811219334602356, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3334426879883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:35] (step=0012179) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 0.23666925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12180, "loss": 0.278221070766449, "memory_gb": 7.721559524536133, "step_time_ms": 3366.265058517456, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 12:11:38] (step=0012180) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.23668869024485037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12181, "loss": 0.3643074929714203, "memory_gb": 7.721559524536133, "step_time_ms": 3356.086015701294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:42] (step=0012181) Train Loss: 0.3296, Train Steps/Sec: 0.28, Epoch: 0.236708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12182, "loss": 0.24733325839042664, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0968074798584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:46] (step=0012182) Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.23672755538282161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12183, "loss": 0.36090555787086487, "memory_gb": 7.721559524536133, "step_time_ms": 3361.370325088501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:49] (step=0012183) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.23674698795180724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12184, "loss": 0.20463024079799652, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6505603790283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:53] (step=0012184) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.23676642052079286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:11:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12185, "loss": 0.2374594509601593, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2890033721924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:11:56] (step=0012185) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.23678585308977848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 12:12:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12186, "loss": 0.22384853661060333, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0736808776855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:00] (step=0012186) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.23680528565876408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12187, "loss": 0.2112216353416443, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9384765625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:04] (step=0012187) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.2368247182277497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12188, "loss": 0.26224544644355774, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6478958129883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:07] (step=0012188) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.23684415079673532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12189, "loss": 0.2800571024417877, "memory_gb": 7.721559524536133, "step_time_ms": 3356.480360031128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:11] (step=0012189) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.23686358336572094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12190, "loss": 0.3226500153541565, "memory_gb": 7.721559524536133, "step_time_ms": 3342.2861099243164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:14] (step=0012190) Train Loss: 0.2945, Train Steps/Sec: 0.28, Epoch: 0.23688301593470656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12191, "loss": 0.2538568377494812, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4939708709717, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:18] (step=0012191) Train Loss: 0.2273, Train Steps/Sec: 0.27, Epoch: 0.2369024485036922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12192, "loss": 0.20307663083076477, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1315956115723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:22] (step=0012192) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.2369218810726778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12193, "loss": 0.20714417099952698, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2845249176025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:25] (step=0012193) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.23694131364166343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12194, "loss": 0.18159309029579163, "memory_gb": 7.721559524536133, "step_time_ms": 3362.212896347046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:29] (step=0012194) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.23696074621064905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12195, "loss": 0.2047852724790573, "memory_gb": 7.721559524536133, "step_time_ms": 3495.197296142578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:32] (step=0012195) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.23698017877963468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12196, "loss": 0.16496706008911133, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5711460113525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:36] (step=0012196) Train Loss: 0.2255, Train Steps/Sec: 0.28, Epoch: 0.2369996113486203, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12197, "loss": 0.20993457734584808, "memory_gb": 7.721559524536133, "step_time_ms": 3357.856512069702, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:39] (step=0012197) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.23701904391760592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12198, "loss": 0.20439596474170685, "memory_gb": 7.721559524536133, "step_time_ms": 3359.633207321167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:43] (step=0012198) Train Loss: 0.2441, Train Steps/Sec: 0.28, Epoch: 0.23703847648659152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12199, "loss": 0.25892701745033264, "memory_gb": 7.721559524536133, "step_time_ms": 3358.032464981079, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:47] (step=0012199) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.23705790905557714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12200, "loss": 0.15076816082000732, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5849742889404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:50] (step=0012200) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.23707734162456276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12201, "loss": 0.27048259973526, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7617149353027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:54] (step=0012201) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.23709677419354838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:12:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12202, "loss": 0.15210838615894318, "memory_gb": 7.721559524536133, 
"step_time_ms": 3361.9959354400635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:12:57] (step=0012202) Train Loss: 0.1420, Train Steps/Sec: 0.28, Epoch: 0.237116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12203, "loss": 0.29528433084487915, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9298725128174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:01] (step=0012203) Train Loss: 0.3040, Train Steps/Sec: 0.28, Epoch: 0.23713563933151963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12204, "loss": 0.2700594663619995, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5281372070312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:05] (step=0012204) Train Loss: 0.2716, Train Steps/Sec: 0.28, Epoch: 0.23715507190050525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12205, "loss": 0.14159342646598816, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1387271881104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:08] (step=0012205) Train Loss: 0.1980, Train Steps/Sec: 0.28, Epoch: 0.23717450446949087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12206, "loss": 0.2002146691083908, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5218658447266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:12] (step=0012206) Train Loss: 0.1560, Train Steps/Sec: 0.28, Epoch: 0.2371939370384765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12207, "loss": 0.22476796805858612, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8285961151123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:15] (step=0012207) Train Loss: 0.1904, Train Steps/Sec: 0.28, Epoch: 
0.23721336960746212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12208, "loss": 0.22681652009487152, "memory_gb": 7.721559524536133, "step_time_ms": 3345.9737300872803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:19] (step=0012208) Train Loss: 0.2296, Train Steps/Sec: 0.28, Epoch: 0.23723280217644774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12209, "loss": 0.31372737884521484, "memory_gb": 7.721559524536133, "step_time_ms": 3358.706474304199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:22] (step=0012209) Train Loss: 0.3211, Train Steps/Sec: 0.28, Epoch: 0.23725223474543333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12210, "loss": 0.2731142044067383, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5421829223633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:26] (step=0012210) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.23727166731441895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12211, "loss": 0.23016008734703064, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7748794555664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:30] (step=0012211) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.23729109988340458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12212, "loss": 0.3089679479598999, "memory_gb": 7.721559524536133, "step_time_ms": 3352.473735809326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:33] (step=0012212) Train Loss: 0.2512, Train Steps/Sec: 0.28, Epoch: 0.2373105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12213, "loss": 0.18483631312847137, 
"memory_gb": 7.721559524536133, "step_time_ms": 3358.7863445281982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:37] (step=0012213) Train Loss: 0.1892, Train Steps/Sec: 0.28, Epoch: 0.23732996502137582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12214, "loss": 0.12236040830612183, "memory_gb": 7.721559524536133, "step_time_ms": 3353.961706161499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:40] (step=0012214) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.23734939759036144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12215, "loss": 0.19997474551200867, "memory_gb": 7.721559524536133, "step_time_ms": 3342.3376083374023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:44] (step=0012215) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.23736883015934707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12216, "loss": 0.2656863331794739, "memory_gb": 7.721559524536133, "step_time_ms": 3356.374740600586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:47] (step=0012216) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.2373882627283327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12217, "loss": 0.2905987501144409, "memory_gb": 7.721559524536133, "step_time_ms": 3358.827590942383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:51] (step=0012217) Train Loss: 0.3018, Train Steps/Sec: 0.28, Epoch: 0.2374076952973183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12218, "loss": 0.2268456220626831, "memory_gb": 7.721559524536133, "step_time_ms": 3347.0211029052734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:55] (step=0012218) Train Loss: 0.2714, 
Train Steps/Sec: 0.28, Epoch: 0.23742712786630393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:13:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12219, "loss": 0.3124989867210388, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2541942596436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:13:58] (step=0012219) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.23744656043528956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12220, "loss": 0.18421171605587006, "memory_gb": 7.721559524536133, "step_time_ms": 3358.797311782837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:02] (step=0012220) Train Loss: 0.2902, Train Steps/Sec: 0.28, Epoch: 0.23746599300427518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12221, "loss": 0.2700435221195221, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4659099578857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:05] (step=0012221) Train Loss: 0.2888, Train Steps/Sec: 0.28, Epoch: 0.23748542557326077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12222, "loss": 0.2562808096408844, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4532012939453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:09] (step=0012222) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.2375048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12223, "loss": 0.1766715943813324, "memory_gb": 7.721559524536133, "step_time_ms": 3360.182523727417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:12] (step=0012223) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.23752429071123202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12224, 
"loss": 0.17638270556926727, "memory_gb": 7.721559524536133, "step_time_ms": 3358.036756515503, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:16] (step=0012224) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.23754372328021764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12225, "loss": 0.363353967666626, "memory_gb": 7.715639114379883, "step_time_ms": 3325.4826068878174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:20] (step=0012225) Train Loss: 0.2443, Train Steps/Sec: 0.28, Epoch: 0.23756315584920326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12226, "loss": 0.127365380525589, "memory_gb": 7.721559524536133, "step_time_ms": 3356.807231903076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:23] (step=0012226) Train Loss: 0.1818, Train Steps/Sec: 0.28, Epoch: 0.23758258841818888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12227, "loss": 0.3189423382282257, "memory_gb": 7.721559524536133, "step_time_ms": 3361.175537109375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:27] (step=0012227) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.2376020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12228, "loss": 0.34682708978652954, "memory_gb": 7.721559524536133, "step_time_ms": 3355.513572692871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:30] (step=0012228) Train Loss: 0.2925, Train Steps/Sec: 0.28, Epoch: 0.23762145355616013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12229, "loss": 0.24673455953598022, "memory_gb": 7.721559524536133, "step_time_ms": 3359.551191329956, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:34] 
(step=0012229) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.23764088612514575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12230, "loss": 0.18477566540241241, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1182231903076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:37] (step=0012230) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.23766031869413137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12231, "loss": 0.11964856833219528, "memory_gb": 7.721559524536133, "step_time_ms": 3362.041473388672, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:41] (step=0012231) Train Loss: 0.1322, Train Steps/Sec: 0.27, Epoch: 0.237679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12232, "loss": 0.22428521513938904, "memory_gb": 7.721559524536133, "step_time_ms": 3356.187105178833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:45] (step=0012232) Train Loss: 0.2053, Train Steps/Sec: 0.28, Epoch: 0.23769918383210262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12233, "loss": 0.2398228943347931, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7966899871826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:48] (step=0012233) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.2377186164010882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12234, "loss": 0.28800711035728455, "memory_gb": 7.721559524536133, "step_time_ms": 3346.4341163635254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:52] (step=0012234) Train Loss: 0.3104, Train Steps/Sec: 0.28, Epoch: 0.23773804897007383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:55] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12235, "loss": 0.2803611159324646, "memory_gb": 7.721559524536133, "step_time_ms": 3346.177577972412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:55] (step=0012235) Train Loss: 0.2839, Train Steps/Sec: 0.28, Epoch: 0.23775748153905946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:14:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12236, "loss": 0.21536366641521454, "memory_gb": 7.721559524536133, "step_time_ms": 3358.226537704468, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:14:59] (step=0012236) Train Loss: 0.2096, Train Steps/Sec: 0.28, Epoch: 0.23777691410804508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12237, "loss": 0.16992008686065674, "memory_gb": 7.721559524536133, "step_time_ms": 3361.788511276245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:03] (step=0012237) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.2377963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12238, "loss": 0.28232279419898987, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7652912139893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:06] (step=0012238) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.23781577924601632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12239, "loss": 0.20263460278511047, "memory_gb": 7.721559524536133, "step_time_ms": 3364.326238632202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:10] (step=0012239) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.23783521181500195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12240, "loss": 0.14350968599319458, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2802753448486, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:15:13] (step=0012240) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.23785464438398757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12241, "loss": 0.2699454426765442, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6129417419434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:17] (step=0012241) Train Loss: 0.2178, Train Steps/Sec: 0.28, Epoch: 0.2378740769529732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12242, "loss": 0.3110811412334442, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2046241760254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:21] (step=0012242) Train Loss: 0.2689, Train Steps/Sec: 0.28, Epoch: 0.2378935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12243, "loss": 0.1957162618637085, "memory_gb": 7.721559524536133, "step_time_ms": 3502.500534057617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:24] (step=0012243) Train Loss: 0.2395, Train Steps/Sec: 0.28, Epoch: 0.23791294209094443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12244, "loss": 0.22850380837917328, "memory_gb": 7.721559524536133, "step_time_ms": 3359.548568725586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:28] (step=0012244) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.23793237465993003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12245, "loss": 0.17813953757286072, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6201877593994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:31] (step=0012245) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.23795180722891565, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:15:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12246, "loss": 0.15302486717700958, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6838970184326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:35] (step=0012246) Train Loss: 0.1613, Train Steps/Sec: 0.28, Epoch: 0.23797123979790127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12247, "loss": 0.16409283876419067, "memory_gb": 7.721559524536133, "step_time_ms": 3356.762647628784, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:38] (step=0012247) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.2379906723668869, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12248, "loss": 0.2514404356479645, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6341819763184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:42] (step=0012248) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.23801010493587252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12249, "loss": 0.1708928644657135, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1927757263184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:46] (step=0012249) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.23802953750485814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12250, "loss": 0.167280375957489, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1818084716797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:49] (step=0012250) Train Loss: 0.2330, Train Steps/Sec: 0.28, Epoch: 0.23804897007384376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12251, "loss": 0.2762373685836792, "memory_gb": 7.721559524536133, "step_time_ms": 
3361.2911701202393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:53] (step=0012251) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 0.23806840264282939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:15:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12252, "loss": 0.2339126467704773, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1627349853516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:15:56] (step=0012252) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.238087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12253, "loss": 0.2578955292701721, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7823638916016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:00] (step=0012253) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.23810726778080063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12254, "loss": 0.30362921953201294, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3084812164307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:03] (step=0012254) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.23812670034978625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12255, "loss": 0.22850659489631653, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5239601135254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:07] (step=0012255) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.23814613291877187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12256, "loss": 0.28176698088645935, "memory_gb": 7.721559524536133, "step_time_ms": 3355.390787124634, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:11] (step=0012256) Train Loss: 0.2680, Train Steps/Sec: 0.28, Epoch: 
0.23816556548775747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12257, "loss": 0.2528833746910095, "memory_gb": 7.721559524536133, "step_time_ms": 3360.539436340332, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:14] (step=0012257) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.2381849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12258, "loss": 0.3352561295032501, "memory_gb": 7.721559524536133, "step_time_ms": 3363.229513168335, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:18] (step=0012258) Train Loss: 0.2794, Train Steps/Sec: 0.28, Epoch: 0.2382044306257287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12259, "loss": 0.1413596272468567, "memory_gb": 7.721559524536133, "step_time_ms": 3361.488103866577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:21] (step=0012259) Train Loss: 0.1554, Train Steps/Sec: 0.28, Epoch: 0.23822386319471434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12260, "loss": 0.30388960242271423, "memory_gb": 7.721559524536133, "step_time_ms": 3352.370500564575, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:25] (step=0012260) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.23824329576369996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12261, "loss": 0.20342929661273956, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4542274475098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:29] (step=0012261) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.23826272833268558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12262, "loss": 0.3477388620376587, 
"memory_gb": 7.721559524536133, "step_time_ms": 3361.142873764038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:32] (step=0012262) Train Loss: 0.3101, Train Steps/Sec: 0.28, Epoch: 0.2382821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12263, "loss": 0.15218991041183472, "memory_gb": 7.721559524536133, "step_time_ms": 3352.297782897949, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:36] (step=0012263) Train Loss: 0.1897, Train Steps/Sec: 0.28, Epoch: 0.23830159347065683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12264, "loss": 0.2750682234764099, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4513664245605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:39] (step=0012264) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.23832102603964245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12265, "loss": 0.19055819511413574, "memory_gb": 7.721559524536133, "step_time_ms": 3362.99991607666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:43] (step=0012265) Train Loss: 0.2472, Train Steps/Sec: 0.28, Epoch: 0.23834045860862807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12266, "loss": 0.2011350840330124, "memory_gb": 7.721559524536133, "step_time_ms": 3358.00838470459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:46] (step=0012266) Train Loss: 0.1823, Train Steps/Sec: 0.28, Epoch: 0.2383598911776137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12267, "loss": 0.1595143973827362, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4776649475098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:50] (step=0012267) Train Loss: 0.1398, 
Train Steps/Sec: 0.28, Epoch: 0.23837932374659931, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12268, "loss": 0.1163070946931839, "memory_gb": 7.721559524536133, "step_time_ms": 3360.99910736084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:54] (step=0012268) Train Loss: 0.1736, Train Steps/Sec: 0.28, Epoch: 0.2383987563155849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:16:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12269, "loss": 0.3068409562110901, "memory_gb": 7.721559524536133, "step_time_ms": 3359.438180923462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:16:57] (step=0012269) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.23841818888457053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12270, "loss": 0.2029055655002594, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3781204223633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:01] (step=0012270) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 0.23843762145355615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12271, "loss": 0.21767902374267578, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7698726654053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:04] (step=0012271) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.23845705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12272, "loss": 0.17920756340026855, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4846935272217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:08] (step=0012272) Train Loss: 0.1834, Train Steps/Sec: 0.27, Epoch: 0.2384764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12273, 
"loss": 0.206947460770607, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2661361694336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:12] (step=0012273) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.23849591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12274, "loss": 0.30050379037857056, "memory_gb": 7.721559524536133, "step_time_ms": 3361.534357070923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:15] (step=0012274) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.23851535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12275, "loss": 0.31443458795547485, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6577949523926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:19] (step=0012275) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.23853478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12276, "loss": 0.1935417652130127, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5117588043213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:22] (step=0012276) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.2385542168674699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12277, "loss": 0.2741389870643616, "memory_gb": 7.721559524536133, "step_time_ms": 3351.557493209839, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:26] (step=0012277) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.2385736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12278, "loss": 0.2291848361492157, "memory_gb": 7.721559524536133, "step_time_ms": 3342.4134254455566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:30] 
(step=0012278) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.23859308200544113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12279, "loss": 0.13443642854690552, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8200340270996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:33] (step=0012279) Train Loss: 0.1555, Train Steps/Sec: 0.28, Epoch: 0.23861251457442673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12280, "loss": 0.23903679847717285, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3437881469727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:37] (step=0012280) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.23863194714341235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12281, "loss": 0.2675570547580719, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6151599884033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:40] (step=0012281) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.23865137971239797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12282, "loss": 0.26658356189727783, "memory_gb": 7.721559524536133, "step_time_ms": 3356.358051300049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:44] (step=0012282) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.2386708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12283, "loss": 0.15551133453845978, "memory_gb": 7.721559524536133, "step_time_ms": 3354.379415512085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:17:48] (step=0012283) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.23869024485036922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:17:51] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12284, "loss": 0.24909809231758118, "memory_gb": 7.721559524536133, "step_time_ms": 3524.49369430542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:17:51] (step=0012284) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.23870967741935484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:17:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12285, "loss": 0.10653994232416153, "memory_gb": 7.721559524536133, "step_time_ms": 3342.480421066284, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:17:55] (step=0012285) Train Loss: 0.1283, Train Steps/Sec: 0.28, Epoch: 0.23872910998834046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:17:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12286, "loss": 0.32671719789505005, "memory_gb": 7.721559524536133, "step_time_ms": 3358.55770111084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:17:58] (step=0012286) Train Loss: 0.3169, Train Steps/Sec: 0.28, Epoch: 0.23874854255732608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12287, "loss": 0.3613358736038208, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2068729400635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:02] (step=0012287) Train Loss: 0.2654, Train Steps/Sec: 0.28, Epoch: 0.2387679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12288, "loss": 0.23516403138637543, "memory_gb": 7.721559524536133, "step_time_ms": 3352.70357131958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:05] (step=0012288) Train Loss: 0.2353, Train Steps/Sec: 0.28, Epoch: 0.23878740769529733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12289, "loss": 0.2267698496580124, "memory_gb": 7.721559524536133, "step_time_ms": 3349.193572998047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:09] (step=0012289) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.23880684026428295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12290, "loss": 0.20640945434570312, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4275035858154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:13] (step=0012290) Train Loss: 0.1826, Train Steps/Sec: 0.28, Epoch: 0.23882627283326857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12291, "loss": 0.21547815203666687, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7027111053467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:16] (step=0012291) Train Loss: 0.2273, Train Steps/Sec: 0.28, Epoch: 0.23884570540225417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12292, "loss": 0.2973206639289856, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5592346191406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:20] (step=0012292) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.2388651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12293, "loss": 0.3379940390586853, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6079139709473, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:23] (step=0012293) Train Loss: 0.2839, Train Steps/Sec: 0.28, Epoch: 0.2388845705402254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12294, "loss": 0.17139701545238495, "memory_gb": 7.721559524536133, "step_time_ms": 3349.720001220703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:27] (step=0012294) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.23890400310921103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12295, "loss": 0.2091023474931717, "memory_gb": 7.721559524536133, "step_time_ms": 3355.098009109497, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:30] (step=0012295) Train Loss: 0.1921, Train Steps/Sec: 0.28, Epoch: 0.23892343567819666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12296, "loss": 0.23011353611946106, "memory_gb": 7.721559524536133, "step_time_ms": 3346.7748165130615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:34] (step=0012296) Train Loss: 0.2295, Train Steps/Sec: 0.28, Epoch: 0.23894286824718228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12297, "loss": 0.2163732349872589, "memory_gb": 7.721559524536133, "step_time_ms": 3347.71990776062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:38] (step=0012297) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.2389623008161679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12298, "loss": 0.22262723743915558, "memory_gb": 7.721559524536133, "step_time_ms": 3356.642007827759, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:41] (step=0012298) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.23898173338515352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12299, "loss": 0.23070959746837616, "memory_gb": 7.721559524536133, "step_time_ms": 3359.088897705078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:45] (step=0012299) Train Loss: 0.3043, Train Steps/Sec: 0.28, Epoch: 0.23900116595413914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12300, "loss": 0.3331054151058197, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4750423431396, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:48] (step=0012300) Train Loss: 0.3007, Train Steps/Sec: 0.28, Epoch: 0.23902059852312477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12301, "loss": 0.22996902465820312, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8978309631348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:52] (step=0012301) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.2390400310921104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12302, "loss": 0.21027842164039612, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6763401031494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:56] (step=0012302) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.23905946366109598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:18:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12303, "loss": 0.332037091255188, "memory_gb": 7.721559524536133, "step_time_ms": 3356.780529022217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:18:59] (step=0012303) Train Loss: 0.2958, Train Steps/Sec: 0.28, Epoch: 0.2390788962300816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12304, "loss": 0.1591242104768753, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3907146453857, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:03] (step=0012304) Train Loss: 0.1977, Train Steps/Sec: 0.28, Epoch: 0.23909832879906723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12305, "loss": 0.1967889368534088, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4490852355957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:06] (step=0012305) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.23911776136805285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12306, "loss": 0.14412954449653625, "memory_gb": 7.721559524536133, "step_time_ms": 3341.3968086242676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:10] (step=0012306) Train Loss: 0.1389, Train Steps/Sec: 0.28, Epoch: 0.23913719393703847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12307, "loss": 0.37190473079681396, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9136390686035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:13] (step=0012307) Train Loss: 0.3398, Train Steps/Sec: 0.28, Epoch: 0.2391566265060241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12308, "loss": 0.30980128049850464, "memory_gb": 7.721559524536133, "step_time_ms": 3358.276844024658, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:17] (step=0012308) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.23917605907500972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12309, "loss": 0.16162511706352234, "memory_gb": 7.721559524536133, "step_time_ms": 3346.409559249878, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:21] (step=0012309) Train Loss: 0.1568, Train Steps/Sec: 0.28, Epoch: 0.23919549164399534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12310, "loss": 0.33737701177597046, "memory_gb": 7.721559524536133, "step_time_ms": 3356.023073196411, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:24] (step=0012310) Train Loss: 0.3283, Train Steps/Sec: 0.28, Epoch: 0.23921492421298096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12311, "loss": 0.22961047291755676, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2329540252686, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:28] (step=0012311) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.23923435678196658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12312, "loss": 0.3129100799560547, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2160053253174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:31] (step=0012312) Train Loss: 0.2662, Train Steps/Sec: 0.28, Epoch: 0.2392537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12313, "loss": 0.2745487093925476, "memory_gb": 7.715639114379883, "step_time_ms": 3310.8391761779785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:35] (step=0012313) Train Loss: 0.2664, Train Steps/Sec: 0.28, Epoch: 0.23927322191993783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12314, "loss": 0.3691381812095642, "memory_gb": 7.721559524536133, "step_time_ms": 3347.698450088501, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:38] (step=0012314) Train Loss: 0.3033, Train Steps/Sec: 0.28, Epoch: 0.23929265448892342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12315, "loss": 0.15203160047531128, "memory_gb": 7.721559524536133, "step_time_ms": 3337.648391723633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:42] (step=0012315) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.23931208705790905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12316, "loss": 0.27217087149620056, "memory_gb": 7.721559524536133, "step_time_ms": 3354.987144470215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:45] (step=0012316) Train Loss: 0.2896, Train Steps/Sec: 0.28, Epoch: 0.23933151962689467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12317, "loss": 0.28662776947021484, "memory_gb": 7.721559524536133, "step_time_ms": 3339.8685455322266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:49] (step=0012317) Train Loss: 0.2833, Train Steps/Sec: 0.28, Epoch: 0.2393509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12318, "loss": 0.12394057959318161, "memory_gb": 7.721559524536133, "step_time_ms": 3351.573944091797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:53] (step=0012318) Train Loss: 0.1695, Train Steps/Sec: 0.28, Epoch: 0.2393703847648659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:19:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12319, "loss": 0.23265552520751953, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0873947143555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:19:56] (step=0012319) Train Loss: 0.2036, Train Steps/Sec: 0.27, Epoch: 0.23938981733385153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12320, "loss": 0.16346487402915955, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0516624450684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:00] (step=0012320) Train Loss: 0.2618, Train Steps/Sec: 0.28, Epoch: 0.23940924990283716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12321, "loss": 0.2930126190185547, "memory_gb": 7.721559524536133, "step_time_ms": 3360.11004447937, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:03] (step=0012321) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.23942868247182278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12322, "loss": 0.27646321058273315, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3076725006104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:07] (step=0012322) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.2394481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12323, "loss": 0.2800680994987488, "memory_gb": 7.721559524536133, "step_time_ms": 3354.963779449463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:11] (step=0012323) Train Loss: 0.2690, Train Steps/Sec: 0.28, Epoch: 0.23946754760979402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12324, "loss": 0.2506313621997833, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7209968566895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:14] (step=0012324) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.23948698017877965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12325, "loss": 0.22345639765262604, "memory_gb": 7.721559524536133, "step_time_ms": 3357.470989227295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:18] (step=0012325) Train Loss: 0.2247, Train Steps/Sec: 0.28, Epoch: 0.23950641274776527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12326, "loss": 0.2604764699935913, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0190715789795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:21] (step=0012326) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.23952584531675086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12327, "loss": 0.22582149505615234, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4066162109375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:25] (step=0012327) Train Loss: 0.1856, Train Steps/Sec: 0.28, Epoch: 0.23954527788573649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12328, "loss": 0.19829554855823517, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4275035858154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:28] (step=0012328) Train Loss: 0.1841, Train Steps/Sec: 0.28, Epoch: 0.2395647104547221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12329, "loss": 0.2766788899898529, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8531742095947, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:32] (step=0012329) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.23958414302370773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12330, "loss": 0.10230957716703415, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7143211364746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:36] (step=0012330) Train Loss: 0.1725, Train Steps/Sec: 0.28, Epoch: 0.23960357559269335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12331, "loss": 0.18196821212768555, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0892810821533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:39] (step=0012331) Train Loss: 0.1793, Train Steps/Sec: 0.28, Epoch: 0.23962300816167897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12332, "loss": 0.27282220125198364, "memory_gb": 7.721559524536133, "step_time_ms": 3499.4027614593506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:43] (step=0012332) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.2396424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12333, "loss": 0.2632124423980713, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8548641204834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:46] (step=0012333) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.23966187329965022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12334, "loss": 0.19660350680351257, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6157093048096, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:50] (step=0012334) Train Loss: 0.1812, Train Steps/Sec: 0.28, Epoch: 0.23968130586863584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12335, "loss": 0.1441897749900818, "memory_gb": 7.721559524536133, "step_time_ms": 3359.203577041626, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:53] (step=0012335) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.23970073843762146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:20:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12336, "loss": 0.20794662833213806, "memory_gb": 7.715639114379883, "step_time_ms": 3321.551561355591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:20:57] (step=0012336) Train Loss: 0.1648, Train Steps/Sec: 0.28, Epoch: 0.23972017100660709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12337, "loss": 0.15026848018169403, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9813499450684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:01] (step=0012337) Train Loss: 0.1321, Train Steps/Sec: 0.28, Epoch: 0.23973960357559268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12338, "loss": 0.2823891341686249, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6226024627686, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:04] (step=0012338) Train Loss: 0.2837, Train Steps/Sec: 0.28, Epoch: 0.2397590361445783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12339, "loss": 0.1862030029296875, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3997745513916, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:08] (step=0012339) Train Loss: 0.1969, Train Steps/Sec: 0.28, Epoch: 0.23977846871356392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12340, "loss": 0.3898032307624817, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4347496032715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:11] (step=0012340) Train Loss: 0.3069, Train Steps/Sec: 0.28, Epoch: 0.23979790128254955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12341, "loss": 0.284220427274704, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3314418792725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:15] (step=0012341) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.23981733385153517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12342, "loss": 0.16842854022979736, "memory_gb": 7.721559524536133, "step_time_ms": 3345.551013946533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:18] (step=0012342) Train Loss: 0.1932, Train Steps/Sec: 0.28, Epoch: 0.2398367664205208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12343, "loss": 0.26292186975479126, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5796871185303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:22] (step=0012343) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.23985619898950641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12344, "loss": 0.2098861038684845, "memory_gb": 7.721559524536133, "step_time_ms": 3357.750415802002, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:26] (step=0012344) Train Loss: 0.2553, Train Steps/Sec: 0.28, Epoch: 0.23987563155849204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12345, "loss": 0.23365534842014313, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8437309265137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:29] (step=0012345) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.23989506412747766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12346, "loss": 0.17488379776477814, "memory_gb": 7.721559524536133, "step_time_ms": 3352.970838546753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:33] (step=0012346) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.23991449669646328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12347, "loss": 0.2609636187553406, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2848052978516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:36] (step=0012347) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.2399339292654489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12348, "loss": 0.16335302591323853, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8546772003174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:40] (step=0012348) Train Loss: 0.1591, Train Steps/Sec: 0.28, Epoch: 0.23995336183443453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12349, "loss": 0.25009751319885254, "memory_gb": 7.721559524536133, "step_time_ms": 3361.140727996826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:43] (step=0012349) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.23997279440342012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12350, "loss": 0.1458578109741211, "memory_gb": 7.721559524536133, "step_time_ms": 3358.760118484497, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:47] (step=0012350) Train Loss: 0.1795, Train Steps/Sec: 0.28, Epoch: 0.23999222697240574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12351, "loss": 0.2643078863620758, "memory_gb": 7.721559524536133, "step_time_ms": 3349.5535850524902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:51] (step=0012351) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.24001165954139136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12352, "loss": 0.1706511378288269, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4441413879395, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:54] (step=0012352) Train Loss: 0.1935, Train Steps/Sec: 0.28, Epoch: 0.240031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:21:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12353, "loss": 0.23908250033855438, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6364936828613, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:21:58] (step=0012353) Train Loss: 0.2020, Train Steps/Sec: 0.28, Epoch: 0.2400505246793626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12354, "loss": 0.28947773575782776, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0169372558594, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:01] (step=0012354) Train Loss: 0.2929, Train Steps/Sec: 0.28, Epoch: 0.24006995724834823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12355, "loss": 0.30470553040504456, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6561164855957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:05] (step=0012355) Train Loss: 0.2971, Train Steps/Sec: 0.28, Epoch: 0.24008938981733385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12356, "loss": 0.3505253195762634, "memory_gb": 7.721559524536133, "step_time_ms": 3363.224744796753, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:08] (step=0012356) Train Loss: 0.3322, Train Steps/Sec: 0.28, Epoch: 0.24010882238631948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12357, "loss": 0.15707233548164368, "memory_gb": 7.721559524536133, "step_time_ms": 3360.018014907837, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:12] (step=0012357) Train Loss: 0.1841, Train Steps/Sec: 0.28, Epoch: 0.2401282549553051, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12358, "loss": 0.30580365657806396, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1511459350586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:16] (step=0012358) Train Loss: 0.2276, Train Steps/Sec: 0.28, Epoch: 0.24014768752429072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12359, "loss": 0.19492629170417786, "memory_gb": 7.721559524536133, "step_time_ms": 3363.874912261963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:19] (step=0012359) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.24016712009327634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12360, "loss": 0.26029321551322937, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1694202423096, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:23] (step=0012360) Train Loss: 0.2631, Train Steps/Sec: 0.27, Epoch: 0.24018655266226194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12361, "loss": 0.17728932201862335, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2257080078125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:26] (step=0012361) Train Loss: 0.1647, Train Steps/Sec: 0.28, Epoch: 0.24020598523124756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12362, "loss": 0.15924644470214844, "memory_gb": 7.721559524536133, "step_time_ms": 3350.680112838745, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:30] (step=0012362) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.24022541780023318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12363, "loss": 0.24140477180480957, "memory_gb": 7.721559524536133, "step_time_ms": 3363.813877105713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:34] (step=0012363) Train Loss: 0.2406, Train Steps/Sec: 0.28, Epoch: 0.2402448503692188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12364, "loss": 0.24382954835891724, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1510009765625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:37] (step=0012364) Train Loss: 0.2864, Train Steps/Sec: 0.28, Epoch: 0.24026428293820443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12365, "loss": 0.2328357994556427, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5622520446777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:41] (step=0012365) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.24028371550719005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12366, "loss": 0.24764575064182281, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5968227386475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:44] (step=0012366) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.24030314807617567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12367, "loss": 0.2283649742603302, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5054359436035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:48] (step=0012367) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.2403225806451613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12368, "loss": 0.32425928115844727, "memory_gb": 7.721559524536133, "step_time_ms": 3360.29314994812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:52] (step=0012368) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.24034201321414692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12369, "loss": 0.37015387415885925, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2349033355713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:55] (step=0012369) Train Loss: 0.3044, Train Steps/Sec: 0.28, Epoch: 0.24036144578313254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:22:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12370, "loss": 0.3081471025943756, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1279258728027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:22:59] (step=0012370) Train Loss: 0.2773, Train Steps/Sec: 0.28, Epoch: 0.24038087835211816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12371, "loss": 0.17774280905723572, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3692054748535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:02] (step=0012371) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.24040031092110378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12372, "loss": 0.24408209323883057, "memory_gb": 7.721559524536133, "step_time_ms": 3501.6229152679443, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:06] (step=0012372) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.24041974349008938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12373, "loss": 0.2773798704147339, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0549182891846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:09] (step=0012373) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.240439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12374, "loss": 0.18645885586738586, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1809272766113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:13] (step=0012374) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.24045860862806062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12375, "loss": 0.20721076428890228, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4164638519287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:17] (step=0012375) Train Loss: 0.2109, Train Steps/Sec: 0.28, Epoch: 0.24047804119704624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12376, "loss": 0.16886162757873535, "memory_gb": 7.721559524536133, "step_time_ms": 3359.337329864502, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:20] (step=0012376) Train Loss: 0.1840, Train Steps/Sec: 0.28, Epoch: 0.24049747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12377, "loss": 0.26363205909729004, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9584617614746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:24] (step=0012377) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.2405169063350175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12378, "loss": 0.19464465975761414, "memory_gb": 7.721559524536133, "step_time_ms": 3356.966257095337, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:27] (step=0012378) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.2405363389040031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12379, "loss": 0.22868677973747253, "memory_gb": 7.721559524536133, "step_time_ms": 3360.457420349121, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:31] (step=0012379) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.24055577147298873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12380, "loss": 0.2093067467212677, "memory_gb": 7.721559524536133, "step_time_ms": 3362.140417098999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:34] (step=0012380) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.24057520404197436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12381, "loss": 0.24559234082698822, "memory_gb": 7.721559524536133, "step_time_ms": 3349.738121032715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:38] (step=0012381) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.24059463661095998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12382, "loss": 0.21380901336669922, "memory_gb": 7.721559524536133, "step_time_ms": 3359.537124633789, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:42] (step=0012382) Train Loss: 0.2489, Train Steps/Sec: 0.28, Epoch: 0.2406140691799456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12383, "loss": 0.2111385464668274, "memory_gb": 7.721559524536133, "step_time_ms": 3359.560251235962, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:45] (step=0012383) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.24063350174893122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12384, "loss": 0.28441131114959717, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2427501678467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:49] (step=0012384) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.24065293431791682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12385, "loss": 0.22592571377754211, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7048263549805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:52] (step=0012385) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 0.24067236688690244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:23:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12386, "loss": 0.1923895627260208, "memory_gb": 7.721559524536133, "step_time_ms": 3362.954616546631, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:23:56] (step=0012386) Train Loss: 0.2215, Train Steps/Sec: 0.28, Epoch: 0.24069179945588806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:24:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12387, "loss": 0.2253170907497406, "memory_gb": 7.721559524536133, "step_time_ms": 3356.308698654175, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:24:00] (step=0012387) Train Loss: 0.2734, Train Steps/Sec: 0.28, Epoch: 0.24071123202487368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:24:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12388, "loss": 0.15138961374759674, "memory_gb": 7.721559524536133, "step_time_ms": 3361.706256866455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:24:03] (step=0012388) Train Loss: 0.1636, Train Steps/Sec: 0.28, Epoch: 0.2407306645938593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:24:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12389, "loss": 0.20299175381660461, "memory_gb": 7.721559524536133, "step_time_ms": 3359.104871749878, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:24:07] (step=0012389) Train Loss: 0.1574, Train Steps/Sec: 0.28, Epoch: 0.24075009716284493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:24:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12390, "loss": 0.3212701380252838, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1418266296387, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:24:10] (step=0012390) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.24076952973183055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:24:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12391, "loss": 0.14566722512245178, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6649894714355, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:24:14] (step=0012391) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.24078896230081617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:24:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12392, "loss": 0.16476957499980927, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2053184509277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:24:17] (step=0012392) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.2408083948698018, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:24:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12393, "loss": 0.2569018602371216, "memory_gb": 7.721559524536133, "step_time_ms": 3356.212377548218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:21] (step=0012393) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.24082782743878742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12394, "loss": 0.2631552517414093, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1478385925293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:25] (step=0012394) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.24084726000777304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12395, "loss": 0.16249124705791473, "memory_gb": 7.721559524536133, "step_time_ms": 3356.66561126709, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:28] (step=0012395) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.24086669257675863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12396, "loss": 0.21131819486618042, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9863777160645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:32] (step=0012396) Train Loss: 0.1736, Train Steps/Sec: 0.28, Epoch: 0.24088612514574426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12397, "loss": 0.3121381998062134, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7091999053955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:35] (step=0012397) Train Loss: 0.2932, Train Steps/Sec: 0.28, Epoch: 0.24090555771472988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12398, "loss": 0.18850360810756683, "memory_gb": 7.721559524536133, "step_time_ms": 
3350.212335586548, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:39] (step=0012398) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.2409249902837155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12399, "loss": 0.23430244624614716, "memory_gb": 7.721559524536133, "step_time_ms": 3358.466863632202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:43] (step=0012399) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.24094442285270112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12400, "loss": 0.2818228304386139, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5048027038574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:46] (step=0012400) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.24096385542168675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12401, "loss": 0.12447671592235565, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8006286621094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:50] (step=0012401) Train Loss: 0.1264, Train Steps/Sec: 0.28, Epoch: 0.24098328799067237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12402, "loss": 0.19408757984638214, "memory_gb": 7.721559524536133, "step_time_ms": 3348.785877227783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:53] (step=0012402) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.241002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:24:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12403, "loss": 0.32548412680625916, "memory_gb": 7.721559524536133, "step_time_ms": 3354.128360748291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:24:57] (step=0012403) Train Loss: 0.2913, Train Steps/Sec: 0.28, Epoch: 
0.2410221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12404, "loss": 0.23707681894302368, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9131622314453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:00] (step=0012404) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.24104158569762923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12405, "loss": 0.17607024312019348, "memory_gb": 7.721559524536133, "step_time_ms": 3345.3714847564697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:04] (step=0012405) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.24106101826661486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12406, "loss": 0.27061861753463745, "memory_gb": 7.721559524536133, "step_time_ms": 3358.344078063965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:08] (step=0012406) Train Loss: 0.2374, Train Steps/Sec: 0.28, Epoch: 0.24108045083560048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12407, "loss": 0.21608120203018188, "memory_gb": 7.721559524536133, "step_time_ms": 3356.710433959961, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:11] (step=0012407) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.24109988340458607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12408, "loss": 0.23256593942642212, "memory_gb": 7.721559524536133, "step_time_ms": 3355.419397354126, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:15] (step=0012408) Train Loss: 0.2388, Train Steps/Sec: 0.27, Epoch: 0.2411193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12409, "loss": 0.08889242261648178, 
"memory_gb": 7.721559524536133, "step_time_ms": 3350.0466346740723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:18] (step=0012409) Train Loss: 0.1743, Train Steps/Sec: 0.28, Epoch: 0.24113874854255732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12410, "loss": 0.2923126518726349, "memory_gb": 7.721559524536133, "step_time_ms": 3343.7371253967285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:22] (step=0012410) Train Loss: 0.2685, Train Steps/Sec: 0.28, Epoch: 0.24115818111154294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12411, "loss": 0.24467235803604126, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1453819274902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:26] (step=0012411) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.24117761368052856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12412, "loss": 0.2135678231716156, "memory_gb": 7.721559524536133, "step_time_ms": 3346.327543258667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:29] (step=0012412) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.24119704624951419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12413, "loss": 0.15733925998210907, "memory_gb": 7.721559524536133, "step_time_ms": 3350.9602546691895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:33] (step=0012413) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.2412164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12414, "loss": 0.19777555763721466, "memory_gb": 7.721559524536133, "step_time_ms": 3355.314016342163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:36] (step=0012414) Train Loss: 
0.2348, Train Steps/Sec: 0.28, Epoch: 0.24123591138748543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12415, "loss": 0.3062390685081482, "memory_gb": 7.721559524536133, "step_time_ms": 3353.550910949707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:40] (step=0012415) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 0.24125534395647105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12416, "loss": 0.25995171070098877, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7630310058594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:43] (step=0012416) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.24127477652545667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12417, "loss": 0.3032483458518982, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7421951293945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:47] (step=0012417) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.2412942090944423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12418, "loss": 0.14607927203178406, "memory_gb": 7.721559524536133, "step_time_ms": 3352.471351623535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:51] (step=0012418) Train Loss: 0.1468, Train Steps/Sec: 0.28, Epoch: 0.2413136416634279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12419, "loss": 0.15801797807216644, "memory_gb": 7.721559524536133, "step_time_ms": 3495.9781169891357, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:54] (step=0012419) Train Loss: 0.1893, Train Steps/Sec: 0.28, Epoch: 0.2413330742324135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:25:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 
12420, "loss": 0.31093350052833557, "memory_gb": 7.721559524536133, "step_time_ms": 3352.285385131836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:25:58] (step=0012420) Train Loss: 0.2716, Train Steps/Sec: 0.28, Epoch: 0.24135250680139914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12421, "loss": 0.2929317355155945, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0376892089844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:01] (step=0012421) Train Loss: 0.3175, Train Steps/Sec: 0.28, Epoch: 0.24137193937038476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12422, "loss": 0.3281431198120117, "memory_gb": 7.721559524536133, "step_time_ms": 3347.5778102874756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:05] (step=0012422) Train Loss: 0.2692, Train Steps/Sec: 0.28, Epoch: 0.24139137193937038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12423, "loss": 0.17081928253173828, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2795906066895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:08] (step=0012423) Train Loss: 0.2120, Train Steps/Sec: 0.28, Epoch: 0.241410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12424, "loss": 0.2878108024597168, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3222885131836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:12] (step=0012424) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.24143023707734163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12425, "loss": 0.1937798261642456, "memory_gb": 7.721559524536133, "step_time_ms": 3349.9035835266113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:26:16] (step=0012425) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.24144966964632725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12426, "loss": 0.17498712241649628, "memory_gb": 7.721559524536133, "step_time_ms": 3355.146646499634, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:19] (step=0012426) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.24146910221531287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12427, "loss": 0.2303169220685959, "memory_gb": 7.721559524536133, "step_time_ms": 3351.497173309326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:23] (step=0012427) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.2414885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12428, "loss": 0.3725779354572296, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0920486450195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:26] (step=0012428) Train Loss: 0.3508, Train Steps/Sec: 0.28, Epoch: 0.24150796735328411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12429, "loss": 0.2828518748283386, "memory_gb": 7.721559524536133, "step_time_ms": 3349.573850631714, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:30] (step=0012429) Train Loss: 0.3116, Train Steps/Sec: 0.28, Epoch: 0.24152739992226974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12430, "loss": 0.15647393465042114, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8923053741455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:33] (step=0012430) Train Loss: 0.1735, Train Steps/Sec: 0.28, Epoch: 0.24154683249125533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:37] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12431, "loss": 0.23573720455169678, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5596084594727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:37] (step=0012431) Train Loss: 0.2618, Train Steps/Sec: 0.28, Epoch: 0.24156626506024095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12432, "loss": 0.29860448837280273, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1877994537354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:41] (step=0012432) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.24158569762922658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12433, "loss": 0.3071728050708771, "memory_gb": 7.721559524536133, "step_time_ms": 3355.268955230713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:44] (step=0012433) Train Loss: 0.2580, Train Steps/Sec: 0.28, Epoch: 0.2416051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12434, "loss": 0.30505359172821045, "memory_gb": 7.721559524536133, "step_time_ms": 3352.703094482422, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:48] (step=0012434) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.24162456276719782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12435, "loss": 0.17914563417434692, "memory_gb": 7.721559524536133, "step_time_ms": 3346.851348876953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:51] (step=0012435) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.24164399533618344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12436, "loss": 0.15997472405433655, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4302196502686, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:26:55] (step=0012436) Train Loss: 0.1851, Train Steps/Sec: 0.28, Epoch: 0.24166342790516906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:26:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12437, "loss": 0.10169558227062225, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3079833984375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:26:58] (step=0012437) Train Loss: 0.1783, Train Steps/Sec: 0.28, Epoch: 0.2416828604741547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12438, "loss": 0.22428640723228455, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7166118621826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:02] (step=0012438) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.2417022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12439, "loss": 0.3677004873752594, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4087829589844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:06] (step=0012439) Train Loss: 0.3355, Train Steps/Sec: 0.28, Epoch: 0.24172172561212593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12440, "loss": 0.3187605142593384, "memory_gb": 7.721559524536133, "step_time_ms": 3359.025239944458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:09] (step=0012440) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.24174115818111155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12441, "loss": 0.33003097772598267, "memory_gb": 7.721559524536133, "step_time_ms": 3353.614330291748, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:13] (step=0012441) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.24176059075009718, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:27:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12442, "loss": 0.1898893117904663, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2239360809326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:16] (step=0012442) Train Loss: 0.2202, Train Steps/Sec: 0.28, Epoch: 0.24178002331908277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12443, "loss": 0.2705276608467102, "memory_gb": 7.721559524536133, "step_time_ms": 3339.8680686950684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:20] (step=0012443) Train Loss: 0.2097, Train Steps/Sec: 0.29, Epoch: 0.2417994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12444, "loss": 0.19192439317703247, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4419956207275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:23] (step=0012444) Train Loss: 0.2616, Train Steps/Sec: 0.28, Epoch: 0.24181888845705402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12445, "loss": 0.19553034007549286, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4905395507812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:27] (step=0012445) Train Loss: 0.2504, Train Steps/Sec: 0.28, Epoch: 0.24183832102603964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12446, "loss": 0.1407812237739563, "memory_gb": 7.721559524536133, "step_time_ms": 3350.9249687194824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:30] (step=0012446) Train Loss: 0.1836, Train Steps/Sec: 0.28, Epoch: 0.24185775359502526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12447, "loss": 0.2712514400482178, "memory_gb": 7.721559524536133, "step_time_ms": 
3357.267379760742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:34] (step=0012447) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.24187718616401088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12448, "loss": 0.24821540713310242, "memory_gb": 7.721559524536133, "step_time_ms": 3360.089063644409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:38] (step=0012448) Train Loss: 0.2451, Train Steps/Sec: 0.27, Epoch: 0.2418966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12449, "loss": 0.20203548669815063, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3902378082275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:41] (step=0012449) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.24191605130198213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12450, "loss": 0.19321973621845245, "memory_gb": 7.721559524536133, "step_time_ms": 3347.2936153411865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:45] (step=0012450) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.24193548387096775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12451, "loss": 0.2764558792114258, "memory_gb": 7.721559524536133, "step_time_ms": 3350.898265838623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:48] (step=0012451) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.24195491643995337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12452, "loss": 0.22446918487548828, "memory_gb": 7.721559524536133, "step_time_ms": 3360.919237136841, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:52] (step=0012452) Train Loss: 0.2839, Train Steps/Sec: 0.28, Epoch: 
0.241974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12453, "loss": 0.20907273888587952, "memory_gb": 7.721559524536133, "step_time_ms": 3359.527826309204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:56] (step=0012453) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.2419937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:27:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12454, "loss": 0.35611820220947266, "memory_gb": 7.721559524536133, "step_time_ms": 3346.403121948242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:27:59] (step=0012454) Train Loss: 0.3102, Train Steps/Sec: 0.28, Epoch: 0.2420132141469102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12455, "loss": 0.20927110314369202, "memory_gb": 7.721559524536133, "step_time_ms": 3361.227035522461, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:03] (step=0012455) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.24203264671589583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12456, "loss": 0.22661659121513367, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7442378997803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:06] (step=0012456) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.24205207928488146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12457, "loss": 0.2464565932750702, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1489067077637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:10] (step=0012457) Train Loss: 0.1892, Train Steps/Sec: 0.28, Epoch: 0.24207151185386708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12458, "loss": 0.2015272080898285, 
"memory_gb": 7.721559524536133, "step_time_ms": 3360.0292205810547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:13] (step=0012458) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.2420909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12459, "loss": 0.3317752778530121, "memory_gb": 7.721559524536133, "step_time_ms": 3363.187551498413, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:17] (step=0012459) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.24211037699183832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12460, "loss": 0.24218730628490448, "memory_gb": 7.721559524536133, "step_time_ms": 3499.6707439422607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:21] (step=0012460) Train Loss: 0.2883, Train Steps/Sec: 0.28, Epoch: 0.24212980956082394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12461, "loss": 0.24862419068813324, "memory_gb": 7.721559524536133, "step_time_ms": 3363.875150680542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:24] (step=0012461) Train Loss: 0.2640, Train Steps/Sec: 0.28, Epoch: 0.24214924212980957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12462, "loss": 0.2397395670413971, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7900562286377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:28] (step=0012462) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.2421686746987952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12463, "loss": 0.30390238761901855, "memory_gb": 7.721559524536133, "step_time_ms": 3360.196828842163, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:31] (step=0012463) Train Loss: 0.2179, 
Train Steps/Sec: 0.28, Epoch: 0.2421881072677808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12464, "loss": 0.17179690301418304, "memory_gb": 7.721559524536133, "step_time_ms": 3362.393856048584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:35] (step=0012464) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.24220753983676643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12465, "loss": 0.20346646010875702, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4932556152344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:38] (step=0012465) Train Loss: 0.2806, Train Steps/Sec: 0.28, Epoch: 0.24222697240575203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12466, "loss": 0.26209813356399536, "memory_gb": 7.721559524536133, "step_time_ms": 3362.105369567871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:42] (step=0012466) Train Loss: 0.3137, Train Steps/Sec: 0.28, Epoch: 0.24224640497473765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12467, "loss": 0.15724217891693115, "memory_gb": 7.721559524536133, "step_time_ms": 3352.607488632202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:46] (step=0012467) Train Loss: 0.1897, Train Steps/Sec: 0.28, Epoch: 0.24226583754372327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12468, "loss": 0.24029189348220825, "memory_gb": 7.721559524536133, "step_time_ms": 3369.4043159484863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:49] (step=0012468) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.2422852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12469, 
"loss": 0.29324641823768616, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0623302459717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:53] (step=0012469) Train Loss: 0.2965, Train Steps/Sec: 0.28, Epoch: 0.24230470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:28:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12470, "loss": 0.3151811957359314, "memory_gb": 7.721559524536133, "step_time_ms": 3356.255292892456, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:28:56] (step=0012470) Train Loss: 0.2819, Train Steps/Sec: 0.28, Epoch: 0.24232413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12471, "loss": 0.1649399697780609, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2357425689697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:00] (step=0012471) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.24234356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12472, "loss": 0.3005799651145935, "memory_gb": 7.721559524536133, "step_time_ms": 3365.666627883911, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:03] (step=0012472) Train Loss: 0.2325, Train Steps/Sec: 0.28, Epoch: 0.24236300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12473, "loss": 0.2480594962835312, "memory_gb": 7.721559524536133, "step_time_ms": 3364.086627960205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:07] (step=0012473) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.242382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12474, "loss": 0.17968323826789856, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4707927703857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:11] 
(step=0012474) Train Loss: 0.2099, Train Steps/Sec: 0.28, Epoch: 0.24240186552662263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12475, "loss": 0.25274163484573364, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4698696136475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:14] (step=0012475) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.24242129809560825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12476, "loss": 0.31055721640586853, "memory_gb": 7.721559524536133, "step_time_ms": 3364.889621734619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:18] (step=0012476) Train Loss: 0.2689, Train Steps/Sec: 0.28, Epoch: 0.24244073066459387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12477, "loss": 0.3242414593696594, "memory_gb": 7.721559524536133, "step_time_ms": 3355.835437774658, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:21] (step=0012477) Train Loss: 0.2797, Train Steps/Sec: 0.28, Epoch: 0.24246016323357947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12478, "loss": 0.2049446851015091, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3440265655518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:25] (step=0012478) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.2424795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12479, "loss": 0.24188140034675598, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1103134155273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:29] (step=0012479) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.2424990283715507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:32] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12480, "loss": 0.21213603019714355, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7543239593506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:32] (step=0012480) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.24251846094053633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12481, "loss": 0.24504923820495605, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5941276550293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:36] (step=0012481) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.24253789350952196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12482, "loss": 0.2189396321773529, "memory_gb": 7.721559524536133, "step_time_ms": 3345.297336578369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:39] (step=0012482) Train Loss: 0.1910, Train Steps/Sec: 0.28, Epoch: 0.24255732607850758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12483, "loss": 0.14626768231391907, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3310375213623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:43] (step=0012483) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.2425767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12484, "loss": 0.3412485420703888, "memory_gb": 7.721559524536133, "step_time_ms": 3362.993001937866, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:46] (step=0012484) Train Loss: 0.3022, Train Steps/Sec: 0.28, Epoch: 0.24259619121647882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12485, "loss": 0.33923688530921936, "memory_gb": 7.721559524536133, "step_time_ms": 3361.680746078491, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:29:50] (step=0012485) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.24261562378546445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12486, "loss": 0.16309207677841187, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8006496429443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:54] (step=0012486) Train Loss: 0.1698, Train Steps/Sec: 0.28, Epoch: 0.24263505635445007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:29:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12487, "loss": 0.2575860619544983, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7595176696777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:29:57] (step=0012487) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.2426544889234357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12488, "loss": 0.26222214102745056, "memory_gb": 7.721559524536133, "step_time_ms": 3360.015630722046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:01] (step=0012488) Train Loss: 0.2640, Train Steps/Sec: 0.28, Epoch: 0.24267392149242129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12489, "loss": 0.2452879548072815, "memory_gb": 7.721559524536133, "step_time_ms": 3362.175941467285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:04] (step=0012489) Train Loss: 0.2968, Train Steps/Sec: 0.28, Epoch: 0.2426933540614069, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12490, "loss": 0.17415617406368256, "memory_gb": 7.721559524536133, "step_time_ms": 3363.835334777832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:08] (step=0012490) Train Loss: 0.1713, Train Steps/Sec: 0.28, Epoch: 0.24271278663039253, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:30:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12491, "loss": 0.22234031558036804, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1155490875244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:11] (step=0012491) Train Loss: 0.2533, Train Steps/Sec: 0.28, Epoch: 0.24273221919937815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12492, "loss": 0.32574033737182617, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3861331939697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:15] (step=0012492) Train Loss: 0.2820, Train Steps/Sec: 0.28, Epoch: 0.24275165176836377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12493, "loss": 0.20350395143032074, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6109619140625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:19] (step=0012493) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.2427710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12494, "loss": 0.20758283138275146, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3193759918213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:22] (step=0012494) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.24279051690633502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12495, "loss": 0.21319571137428284, "memory_gb": 7.721559524536133, "step_time_ms": 3352.919816970825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:26] (step=0012495) Train Loss: 0.2056, Train Steps/Sec: 0.27, Epoch: 0.24280994947532064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12496, "loss": 0.24946700036525726, "memory_gb": 7.715639114379883, "step_time_ms": 
3325.5391120910645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:29] (step=0012496) Train Loss: 0.2652, Train Steps/Sec: 0.28, Epoch: 0.24282938204430626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12497, "loss": 0.14258983731269836, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1743240356445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:33] (step=0012497) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.24284881461329189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12498, "loss": 0.16503901779651642, "memory_gb": 7.715639114379883, "step_time_ms": 3327.744245529175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:37] (step=0012498) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 0.2428682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12499, "loss": 0.1978955864906311, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5047931671143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:40] (step=0012499) Train Loss: 0.2629, Train Steps/Sec: 0.28, Epoch: 0.24288767975126313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12500, "loss": 0.17224827408790588, "memory_gb": 7.721559524536133, "step_time_ms": 3359.726667404175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:44] (step=0012500) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.24290711232024872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12501, "loss": 0.24399176239967346, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1808853149414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:47] (step=0012501) Train Loss: 0.2123, Train Steps/Sec: 0.28, Epoch: 
0.24292654488923435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12502, "loss": 0.33415690064430237, "memory_gb": 7.721559524536133, "step_time_ms": 3359.013319015503, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:51] (step=0012502) Train Loss: 0.2948, Train Steps/Sec: 0.28, Epoch: 0.24294597745821997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12503, "loss": 0.25974905490875244, "memory_gb": 7.721559524536133, "step_time_ms": 3355.748414993286, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:54] (step=0012503) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.2429654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:30:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12504, "loss": 0.3214573264122009, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8473587036133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:30:58] (step=0012504) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.24298484259619121, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12505, "loss": 0.25968340039253235, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6177825927734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:02] (step=0012505) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.24300427516517684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12506, "loss": 0.1982739269733429, "memory_gb": 7.721559524536133, "step_time_ms": 3353.834867477417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:05] (step=0012506) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.24302370773416246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12507, "loss": 0.1686634123325348, 
"memory_gb": 7.721559524536133, "step_time_ms": 3356.02068901062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:09] (step=0012507) Train Loss: 0.2006, Train Steps/Sec: 0.28, Epoch: 0.24304314030314808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12508, "loss": 0.3038649559020996, "memory_gb": 7.715639114379883, "step_time_ms": 3461.4951610565186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:12] (step=0012508) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.2430625728721337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12509, "loss": 0.3455577790737152, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1950664520264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:16] (step=0012509) Train Loss: 0.2869, Train Steps/Sec: 0.28, Epoch: 0.24308200544111933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12510, "loss": 0.36177751421928406, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4532012939453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:19] (step=0012510) Train Loss: 0.3146, Train Steps/Sec: 0.28, Epoch: 0.24310143801010495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12511, "loss": 0.29802146553993225, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0403118133545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:23] (step=0012511) Train Loss: 0.2866, Train Steps/Sec: 0.28, Epoch: 0.24312087057909054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12512, "loss": 0.34547337889671326, "memory_gb": 7.721559524536133, "step_time_ms": 3357.898712158203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:27] (step=0012512) Train Loss: 
0.3148, Train Steps/Sec: 0.28, Epoch: 0.24314030314807616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12513, "loss": 0.30140334367752075, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4955463409424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:30] (step=0012513) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.2431597357170618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12514, "loss": 0.2609911859035492, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5748462677, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:34] (step=0012514) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.2431791682860474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12515, "loss": 0.1554879993200302, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1271686553955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:37] (step=0012515) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.24319860085503303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12516, "loss": 0.28317755460739136, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8056144714355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:41] (step=0012516) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.24321803342401865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12517, "loss": 0.3124624490737915, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8574962615967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:44] (step=0012517) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.24323746599300428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 
12518, "loss": 0.23411661386489868, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8267097473145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:48] (step=0012518) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.2432568985619899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12519, "loss": 0.1983334869146347, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4515323638916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:51] (step=0012519) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.24327633113097552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12520, "loss": 0.18110570311546326, "memory_gb": 7.721559524536133, "step_time_ms": 3349.189043045044, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:55] (step=0012520) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.24329576369996114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:31:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12521, "loss": 0.16672158241271973, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3788661956787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:31:59] (step=0012521) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.24331519626894677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12522, "loss": 0.28756988048553467, "memory_gb": 7.721559524536133, "step_time_ms": 3360.884428024292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:02] (step=0012522) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.2433346288379324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12523, "loss": 0.22217708826065063, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6273498535156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:32:06] (step=0012523) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.24335406140691798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12524, "loss": 0.35395705699920654, "memory_gb": 7.721559524536133, "step_time_ms": 3353.213310241699, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:09] (step=0012524) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.2433734939759036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12525, "loss": 0.259107768535614, "memory_gb": 7.721559524536133, "step_time_ms": 3357.203483581543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:13] (step=0012525) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.24339292654488923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12526, "loss": 0.2953469455242157, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8357696533203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:16] (step=0012526) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.24341235911387485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12527, "loss": 0.21552270650863647, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8998832702637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:20] (step=0012527) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.24343179168286047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12528, "loss": 0.2268216609954834, "memory_gb": 7.721559524536133, "step_time_ms": 3356.243371963501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:24] (step=0012528) Train Loss: 0.1886, Train Steps/Sec: 0.28, Epoch: 0.2434512242518461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:27] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12529, "loss": 0.16556601226329803, "memory_gb": 7.721559524536133, "step_time_ms": 3351.102590560913, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:27] (step=0012529) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.24347065682083172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12530, "loss": 0.19630581140518188, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3405227661133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:31] (step=0012530) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.24349008938981734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12531, "loss": 0.34221476316452026, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4252338409424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:34] (step=0012531) Train Loss: 0.2929, Train Steps/Sec: 0.28, Epoch: 0.24350952195880296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12532, "loss": 0.16221001744270325, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2845458984375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:38] (step=0012532) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.24352895452778858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12533, "loss": 0.29225724935531616, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4587574005127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:41] (step=0012533) Train Loss: 0.2817, Train Steps/Sec: 0.28, Epoch: 0.2435483870967742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12534, "loss": 0.23370227217674255, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1412353515625, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:32:45] (step=0012534) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.24356781966575983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12535, "loss": 0.2282184213399887, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2100143432617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:49] (step=0012535) Train Loss: 0.2796, Train Steps/Sec: 0.28, Epoch: 0.24358725223474542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12536, "loss": 0.24592137336730957, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3274097442627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:52] (step=0012536) Train Loss: 0.2617, Train Steps/Sec: 0.27, Epoch: 0.24360668480373104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12537, "loss": 0.2642230689525604, "memory_gb": 7.721559524536133, "step_time_ms": 3354.160785675049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:56] (step=0012537) Train Loss: 0.2339, Train Steps/Sec: 0.28, Epoch: 0.24362611737271667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:32:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12538, "loss": 0.2007586658000946, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7799072265625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:32:59] (step=0012538) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.2436455499417023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12539, "loss": 0.21485736966133118, "memory_gb": 7.721559524536133, "step_time_ms": 3358.419895172119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:03] (step=0012539) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.2436649825106879, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:33:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12540, "loss": 0.25087636709213257, "memory_gb": 7.721559524536133, "step_time_ms": 3357.989549636841, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:06] (step=0012540) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.24368441507967353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12541, "loss": 0.26031380891799927, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6743602752686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:10] (step=0012541) Train Loss: 0.2744, Train Steps/Sec: 0.28, Epoch: 0.24370384764865916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12542, "loss": 0.3018415570259094, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8027019500732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:14] (step=0012542) Train Loss: 0.2987, Train Steps/Sec: 0.28, Epoch: 0.24372328021764478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12543, "loss": 0.2259439080953598, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8268966674805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:17] (step=0012543) Train Loss: 0.2035, Train Steps/Sec: 0.28, Epoch: 0.2437427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12544, "loss": 0.2880117893218994, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7826957702637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:21] (step=0012544) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.24376214535561602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12545, "loss": 0.2881585955619812, "memory_gb": 7.721559524536133, "step_time_ms": 
3359.6749305725098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:24] (step=0012545) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.24378157792460164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12546, "loss": 0.1715037226676941, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9798469543457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:28] (step=0012546) Train Loss: 0.1903, Train Steps/Sec: 0.28, Epoch: 0.24380101049358724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12547, "loss": 0.13412326574325562, "memory_gb": 7.721559524536133, "step_time_ms": 3357.856512069702, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:31] (step=0012547) Train Loss: 0.1554, Train Steps/Sec: 0.28, Epoch: 0.24382044306257286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12548, "loss": 0.3256009519100189, "memory_gb": 7.721559524536133, "step_time_ms": 3498.6071586608887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:35] (step=0012548) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.24383987563155848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12549, "loss": 0.30268269777297974, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5668334960938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:39] (step=0012549) Train Loss: 0.2733, Train Steps/Sec: 0.28, Epoch: 0.2438593082005441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12550, "loss": 0.2624868154525757, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0704669952393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:42] (step=0012550) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 
0.24387874076952973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12551, "loss": 0.27850380539894104, "memory_gb": 7.721559524536133, "step_time_ms": 3358.307123184204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:46] (step=0012551) Train Loss: 0.2656, Train Steps/Sec: 0.28, Epoch: 0.24389817333851535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12552, "loss": 0.22346100211143494, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0269813537598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:49] (step=0012552) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.24391760590750097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12553, "loss": 0.3069306015968323, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1862449645996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:53] (step=0012553) Train Loss: 0.2794, Train Steps/Sec: 0.28, Epoch: 0.2439370384764866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12554, "loss": 0.26673102378845215, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0409755706787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:33:57] (step=0012554) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.24395647104547222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12555, "loss": 0.26395487785339355, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3781204223633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:00] (step=0012555) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.24397590361445784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12556, "loss": 0.16463348269462585, 
"memory_gb": 7.721559524536133, "step_time_ms": 3360.0761890411377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:04] (step=0012556) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.24399533618344346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12557, "loss": 0.1746707558631897, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1201095581055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:07] (step=0012557) Train Loss: 0.1883, Train Steps/Sec: 0.28, Epoch: 0.24401476875242908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12558, "loss": 0.17584459483623505, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1295738220215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:11] (step=0012558) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.24403420132141468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12559, "loss": 0.3321647346019745, "memory_gb": 7.721559524536133, "step_time_ms": 3361.20867729187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:14] (step=0012559) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.2440536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12560, "loss": 0.20675326883792877, "memory_gb": 7.721559524536133, "step_time_ms": 3364.733934402466, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:18] (step=0012560) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.24407306645938592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12561, "loss": 0.24838794767856598, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6975288391113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:22] (step=0012561) Train Loss: 
0.2658, Train Steps/Sec: 0.28, Epoch: 0.24409249902837155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12562, "loss": 0.32137492299079895, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6509647369385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:25] (step=0012562) Train Loss: 0.2905, Train Steps/Sec: 0.28, Epoch: 0.24411193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12563, "loss": 0.2229788601398468, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2309226989746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:29] (step=0012563) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.2441313641663428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12564, "loss": 0.29260915517807007, "memory_gb": 7.721559524536133, "step_time_ms": 3366.100311279297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:32] (step=0012564) Train Loss: 0.2966, Train Steps/Sec: 0.28, Epoch: 0.2441507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12565, "loss": 0.15456044673919678, "memory_gb": 7.721559524536133, "step_time_ms": 3354.449510574341, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:36] (step=0012565) Train Loss: 0.2174, Train Steps/Sec: 0.28, Epoch: 0.24417022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12566, "loss": 0.24526220560073853, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2992248535156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:39] (step=0012566) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.24418966187329966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 
12567, "loss": 0.24838808178901672, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2469062805176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:43] (step=0012567) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.24420909444228528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12568, "loss": 0.15693753957748413, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9698543548584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:46] (step=0012568) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.2442285270112709, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12569, "loss": 0.238124817609787, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3716106414795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:50] (step=0012569) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.2442479595802565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12570, "loss": 0.3085443079471588, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6767654418945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:54] (step=0012570) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.24426739214924212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:34:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12571, "loss": 0.3434199392795563, "memory_gb": 7.721559524536133, "step_time_ms": 3365.800142288208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:34:57] (step=0012571) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.24428682471822774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12572, "loss": 0.21919435262680054, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4841651916504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:35:01] (step=0012572) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.24430625728721336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12573, "loss": 0.22138436138629913, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2807941436768, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:04] (step=0012573) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.24432568985619899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12574, "loss": 0.21345946192741394, "memory_gb": 7.721559524536133, "step_time_ms": 3362.096071243286, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:08] (step=0012574) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.2443451224251846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12575, "loss": 0.20608699321746826, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2326431274414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:11] (step=0012575) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.24436455499417023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12576, "loss": 0.21733428537845612, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0754013061523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:15] (step=0012576) Train Loss: 0.1984, Train Steps/Sec: 0.28, Epoch: 0.24438398756315585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12577, "loss": 0.17048826813697815, "memory_gb": 7.721559524536133, "step_time_ms": 3360.957622528076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:19] (step=0012577) Train Loss: 0.1819, Train Steps/Sec: 0.28, Epoch: 0.24440342013214147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:22] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12578, "loss": 0.2651079297065735, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9071502685547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:22] (step=0012578) Train Loss: 0.2382, Train Steps/Sec: 0.28, Epoch: 0.2444228527011271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12579, "loss": 0.221521258354187, "memory_gb": 7.721559524536133, "step_time_ms": 3358.062505722046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:26] (step=0012579) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.24444228527011272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12580, "loss": 0.19981318712234497, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7804775238037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:29] (step=0012580) Train Loss: 0.2418, Train Steps/Sec: 0.28, Epoch: 0.24446171783909834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12581, "loss": 0.24496912956237793, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9766960144043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:33] (step=0012581) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.24448115040808394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12582, "loss": 0.25534161925315857, "memory_gb": 7.721559524536133, "step_time_ms": 3359.39884185791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:36] (step=0012582) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.24450058297706956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12583, "loss": 0.3227093517780304, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6952171325684, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:35:40] (step=0012583) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.24452001554605518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12584, "loss": 0.2540664076805115, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5562915802, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:44] (step=0012584) Train Loss: 0.2712, Train Steps/Sec: 0.27, Epoch: 0.2445394481150408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12585, "loss": 0.2684245705604553, "memory_gb": 7.721559524536133, "step_time_ms": 3362.086296081543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:47] (step=0012585) Train Loss: 0.2780, Train Steps/Sec: 0.28, Epoch: 0.24455888068402643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12586, "loss": 0.26383402943611145, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3312969207764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:51] (step=0012586) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.24457831325301205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12587, "loss": 0.2220780998468399, "memory_gb": 7.721559524536133, "step_time_ms": 3358.86549949646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:54] (step=0012587) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.24459774582199767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12588, "loss": 0.283696711063385, "memory_gb": 7.721559524536133, "step_time_ms": 3357.827663421631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:35:58] (step=0012588) Train Loss: 0.3030, Train Steps/Sec: 0.28, Epoch: 0.2446171783909833, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:36:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12589, "loss": 0.29163074493408203, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8998107910156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:02] (step=0012589) Train Loss: 0.2898, Train Steps/Sec: 0.28, Epoch: 0.24463661095996891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12590, "loss": 0.158979132771492, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3075065612793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:05] (step=0012590) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.24465604352895454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12591, "loss": 0.18868547677993774, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4988536834717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:09] (step=0012591) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.24467547609794016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12592, "loss": 0.19636216759681702, "memory_gb": 7.721559524536133, "step_time_ms": 3359.886884689331, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:12] (step=0012592) Train Loss: 0.2799, Train Steps/Sec: 0.28, Epoch: 0.24469490866692578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12593, "loss": 0.29672670364379883, "memory_gb": 7.721559524536133, "step_time_ms": 3347.6922512054443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:16] (step=0012593) Train Loss: 0.2329, Train Steps/Sec: 0.28, Epoch: 0.24471434123591138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12594, "loss": 0.3073834180831909, "memory_gb": 7.721559524536133, "step_time_ms": 
3353.0595302581787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:19] (step=0012594) Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.244733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12595, "loss": 0.28065061569213867, "memory_gb": 7.721559524536133, "step_time_ms": 3508.8868141174316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:23] (step=0012595) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.24475320637388262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12596, "loss": 0.22563132643699646, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2955436706543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:27] (step=0012596) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.24477263894286824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12597, "loss": 0.3364737629890442, "memory_gb": 7.721559524536133, "step_time_ms": 3341.0964012145996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:30] (step=0012597) Train Loss: 0.3350, Train Steps/Sec: 0.28, Epoch: 0.24479207151185386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12598, "loss": 0.23924747109413147, "memory_gb": 7.721559524536133, "step_time_ms": 3355.816602706909, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:34] (step=0012598) Train Loss: 0.2279, Train Steps/Sec: 0.28, Epoch: 0.2448115040808395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12599, "loss": 0.16529972851276398, "memory_gb": 7.721559524536133, "step_time_ms": 3355.344533920288, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:37] (step=0012599) Train Loss: 0.2488, Train Steps/Sec: 0.28, Epoch: 
0.2448309366498251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12600, "loss": 0.19506612420082092, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6906452178955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:41] (step=0012600) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.24485036921881073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12601, "loss": 0.2160644829273224, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4140071868896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:44] (step=0012601) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.24486980178779635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12602, "loss": 0.13773313164710999, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1325073242188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:48] (step=0012602) Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.24488923435678198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12603, "loss": 0.21071408689022064, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2748432159424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:52] (step=0012603) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.2449086669257676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12604, "loss": 0.24551944434642792, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2114028930664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:55] (step=0012604) Train Loss: 0.1896, Train Steps/Sec: 0.28, Epoch: 0.2449280994947532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:36:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12605, "loss": 0.3032315969467163, 
"memory_gb": 7.721559524536133, "step_time_ms": 3358.367681503296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:36:59] (step=0012605) Train Loss: 0.2772, Train Steps/Sec: 0.28, Epoch: 0.24494753206373882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12606, "loss": 0.27063649892807007, "memory_gb": 7.721559524536133, "step_time_ms": 3345.2064990997314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:02] (step=0012606) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.24496696463272444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12607, "loss": 0.36193323135375977, "memory_gb": 7.721559524536133, "step_time_ms": 3355.625629425049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:06] (step=0012607) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.24498639720171006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12608, "loss": 0.27773356437683105, "memory_gb": 7.721559524536133, "step_time_ms": 3356.553077697754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:09] (step=0012608) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.24500582977069568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12609, "loss": 0.1586328148841858, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2474250793457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:13] (step=0012609) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.2450252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12610, "loss": 0.1363353431224823, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5706787109375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:17] (step=0012610) Train Loss: 
0.2056, Train Steps/Sec: 0.28, Epoch: 0.24504469490866693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12611, "loss": 0.29987087845802307, "memory_gb": 7.721559524536133, "step_time_ms": 3349.595785140991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:20] (step=0012611) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.24506412747765255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12612, "loss": 0.25977224111557007, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3547859191895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:24] (step=0012612) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.24508356004663817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12613, "loss": 0.26749199628829956, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9066524505615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:27] (step=0012613) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.2451029926156238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12614, "loss": 0.2977067232131958, "memory_gb": 7.721559524536133, "step_time_ms": 3354.368209838867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:31] (step=0012614) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.24512242518460942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12615, "loss": 0.20539426803588867, "memory_gb": 7.721559524536133, "step_time_ms": 3341.1362171173096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:34] (step=0012615) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.24514185775359504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 
12616, "loss": 0.20364132523536682, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4962615966797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:38] (step=0012616) Train Loss: 0.2439, Train Steps/Sec: 0.28, Epoch: 0.24516129032258063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12617, "loss": 0.3013915419578552, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4558963775635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:41] (step=0012617) Train Loss: 0.2880, Train Steps/Sec: 0.28, Epoch: 0.24518072289156626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12618, "loss": 0.31885164976119995, "memory_gb": 7.721559524536133, "step_time_ms": 3336.1563682556152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:45] (step=0012618) Train Loss: 0.2939, Train Steps/Sec: 0.28, Epoch: 0.24520015546055188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12619, "loss": 0.27104252576828003, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7112941741943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:49] (step=0012619) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.2452195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12620, "loss": 0.24173569679260254, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1877269744873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:52] (step=0012620) Train Loss: 0.1724, Train Steps/Sec: 0.28, Epoch: 0.24523902059852312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12621, "loss": 0.2927486300468445, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7585945129395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:37:56] (step=0012621) Train Loss: 0.2792, Train Steps/Sec: 0.28, Epoch: 0.24525845316750874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:37:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12622, "loss": 0.18121609091758728, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1320514678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:37:59] (step=0012622) Train Loss: 0.1503, Train Steps/Sec: 0.28, Epoch: 0.24527788573649437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12623, "loss": 0.21528375148773193, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0893325805664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:03] (step=0012623) Train Loss: 0.2667, Train Steps/Sec: 0.28, Epoch: 0.24529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12624, "loss": 0.19837971031665802, "memory_gb": 7.721559524536133, "step_time_ms": 3354.741334915161, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:07] (step=0012624) Train Loss: 0.1807, Train Steps/Sec: 0.27, Epoch: 0.2453167508744656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12625, "loss": 0.2377389669418335, "memory_gb": 7.721559524536133, "step_time_ms": 3354.588508605957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:10] (step=0012625) Train Loss: 0.2156, Train Steps/Sec: 0.28, Epoch: 0.24533618344345123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12626, "loss": 0.1408190131187439, "memory_gb": 7.721559524536133, "step_time_ms": 3354.060649871826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:14] (step=0012626) Train Loss: 0.1818, Train Steps/Sec: 0.28, Epoch: 0.24535561601243686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:17] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12627, "loss": 0.1694098860025406, "memory_gb": 7.721559524536133, "step_time_ms": 3354.311943054199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:17] (step=0012627) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.24537504858142245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12628, "loss": 0.15999388694763184, "memory_gb": 7.721559524536133, "step_time_ms": 3357.451915740967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:21] (step=0012628) Train Loss: 0.1956, Train Steps/Sec: 0.28, Epoch: 0.24539448115040807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12629, "loss": 0.2384759783744812, "memory_gb": 7.721559524536133, "step_time_ms": 3351.2632846832275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:24] (step=0012629) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.2454139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12630, "loss": 0.13355623185634613, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9434204101562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:28] (step=0012630) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.24543334628837932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12631, "loss": 0.23838801681995392, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3427200317383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:31] (step=0012631) Train Loss: 0.1856, Train Steps/Sec: 0.28, Epoch: 0.24545277885736494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12632, "loss": 0.23499687016010284, "memory_gb": 7.721559524536133, "step_time_ms": 3352.527141571045, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:38:35] (step=0012632) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.24547221142635056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12633, "loss": 0.19775578379631042, "memory_gb": 7.721559524536133, "step_time_ms": 3338.7231826782227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:39] (step=0012633) Train Loss: 0.2984, Train Steps/Sec: 0.28, Epoch: 0.24549164399533618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12634, "loss": 0.18134933710098267, "memory_gb": 7.721559524536133, "step_time_ms": 3355.396032333374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:42] (step=0012634) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.2455110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12635, "loss": 0.20990604162216187, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7192344665527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:46] (step=0012635) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.24553050913330743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12636, "loss": 0.28611990809440613, "memory_gb": 7.721559524536133, "step_time_ms": 3497.4899291992188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:49] (step=0012636) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.24554994170229305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:38:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12637, "loss": 0.18384644389152527, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8924922943115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:53] (step=0012637) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.24556937427127867, LR: 0.001, Memory: 7.72GB, 
Params: 4,718,592 [2025-07-29 12:38:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12638, "loss": 0.2683770954608917, "memory_gb": 7.721559524536133, "step_time_ms": 3362.38694190979, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:38:56] (step=0012638) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.2455888068402643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12639, "loss": 0.20017585158348083, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1432571411133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:00] (step=0012639) Train Loss: 0.1922, Train Steps/Sec: 0.28, Epoch: 0.2456082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12640, "loss": 0.1969740092754364, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9680919647217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:04] (step=0012640) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.2456276719782355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12641, "loss": 0.24777354300022125, "memory_gb": 7.721559524536133, "step_time_ms": 3357.595682144165, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:07] (step=0012641) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.24564710454722113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12642, "loss": 0.2816660404205322, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2101078033447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:11] (step=0012642) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.24566653711620676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12643, "loss": 0.12465069442987442, "memory_gb": 7.721559524536133, "step_time_ms": 
3361.6907596588135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:14] (step=0012643) Train Loss: 0.2443, Train Steps/Sec: 0.28, Epoch: 0.24568596968519238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12644, "loss": 0.25714123249053955, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6339740753174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:18] (step=0012644) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.245705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12645, "loss": 0.25464683771133423, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3765964508057, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:21] (step=0012645) Train Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.24572483482316362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12646, "loss": 0.23647135496139526, "memory_gb": 7.721559524536133, "step_time_ms": 3362.7891540527344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:25] (step=0012646) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.24574426739214925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12647, "loss": 0.26376691460609436, "memory_gb": 7.721559524536133, "step_time_ms": 3352.470636367798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:29] (step=0012647) Train Loss: 0.2759, Train Steps/Sec: 0.28, Epoch: 0.24576369996113487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12648, "loss": 0.24672716856002808, "memory_gb": 7.721559524536133, "step_time_ms": 3349.687099456787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:32] (step=0012648) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 
0.2457831325301205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12649, "loss": 0.20488205552101135, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6867065429688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:36] (step=0012649) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.2458025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12650, "loss": 0.24829712510108948, "memory_gb": 7.721559524536133, "step_time_ms": 3363.832950592041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:39] (step=0012650) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.24582199766809174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12651, "loss": 0.28381407260894775, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1126461029053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:43] (step=0012651) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.24584143023707733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12652, "loss": 0.2156330645084381, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4844036102295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:46] (step=0012652) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.24586086280606295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12653, "loss": 0.25502702593803406, "memory_gb": 7.721559524536133, "step_time_ms": 3353.436231613159, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:50] (step=0012653) Train Loss: 0.2028, Train Steps/Sec: 0.28, Epoch: 0.24588029537504857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12654, "loss": 0.2125086486339569, 
"memory_gb": 7.721559524536133, "step_time_ms": 3366.24813079834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:54] (step=0012654) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.2458997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12655, "loss": 0.27870064973831177, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5239391326904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:39:57] (step=0012655) Train Loss: 0.2500, Train Steps/Sec: 0.28, Epoch: 0.24591916051301982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12656, "loss": 0.2355811446905136, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2919063568115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:01] (step=0012656) Train Loss: 0.2077, Train Steps/Sec: 0.28, Epoch: 0.24593859308200544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12657, "loss": 0.22632722556591034, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8786125183105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:04] (step=0012657) Train Loss: 0.2732, Train Steps/Sec: 0.28, Epoch: 0.24595802565099106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12658, "loss": 0.30240052938461304, "memory_gb": 7.721559524536133, "step_time_ms": 3364.107847213745, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:08] (step=0012658) Train Loss: 0.3080, Train Steps/Sec: 0.28, Epoch: 0.24597745821997669, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12659, "loss": 0.25619494915008545, "memory_gb": 7.721559524536133, "step_time_ms": 3366.2285804748535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:11] (step=0012659) Train Loss: 
0.2620, Train Steps/Sec: 0.28, Epoch: 0.2459968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12660, "loss": 0.17152512073516846, "memory_gb": 7.721559524536133, "step_time_ms": 3360.816478729248, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:15] (step=0012660) Train Loss: 0.2092, Train Steps/Sec: 0.28, Epoch: 0.24601632335794793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12661, "loss": 0.3053877353668213, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1883602142334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:19] (step=0012661) Train Loss: 0.3004, Train Steps/Sec: 0.28, Epoch: 0.24603575592693355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12662, "loss": 0.3100181221961975, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9802417755127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:22] (step=0012662) Train Loss: 0.2972, Train Steps/Sec: 0.28, Epoch: 0.24605518849591915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12663, "loss": 0.2898484766483307, "memory_gb": 7.721559524536133, "step_time_ms": 3363.48557472229, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:26] (step=0012663) Train Loss: 0.2189, Train Steps/Sec: 0.28, Epoch: 0.24607462106490477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12664, "loss": 0.22227685153484344, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7513389587402, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:29] (step=0012664) Train Loss: 0.1974, Train Steps/Sec: 0.28, Epoch: 0.2460940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 
12665, "loss": 0.1778758317232132, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4632358551025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:33] (step=0012665) Train Loss: 0.2070, Train Steps/Sec: 0.28, Epoch: 0.24611348620287601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12666, "loss": 0.3171822428703308, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6397075653076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:36] (step=0012666) Train Loss: 0.3104, Train Steps/Sec: 0.28, Epoch: 0.24613291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12667, "loss": 0.28021764755249023, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9792461395264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:40] (step=0012667) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.24615235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12668, "loss": 0.30685746669769287, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7566051483154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:44] (step=0012668) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.24617178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12669, "loss": 0.15628600120544434, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5450344085693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:47] (step=0012669) Train Loss: 0.1535, Train Steps/Sec: 0.28, Epoch: 0.2461912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12670, "loss": 0.20397870242595673, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9633140563965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:40:51] (step=0012670) Train Loss: 0.1742, Train Steps/Sec: 0.28, Epoch: 0.24621064904780413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12671, "loss": 0.23026058077812195, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7875785827637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:54] (step=0012671) Train Loss: 0.2171, Train Steps/Sec: 0.27, Epoch: 0.24623008161678975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:40:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12672, "loss": 0.17912884056568146, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8339042663574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:40:58] (step=0012672) Train Loss: 0.2140, Train Steps/Sec: 0.28, Epoch: 0.24624951418577537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12673, "loss": 0.22469039261341095, "memory_gb": 7.721559524536133, "step_time_ms": 3364.748477935791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:02] (step=0012673) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.246268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12674, "loss": 0.2578919231891632, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1130714416504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:05] (step=0012674) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.2462883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12675, "loss": 0.3662465214729309, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6161556243896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:09] (step=0012675) Train Loss: 0.3065, Train Steps/Sec: 0.28, Epoch: 0.2463078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:12] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12676, "loss": 0.18093587458133698, "memory_gb": 7.721559524536133, "step_time_ms": 3354.280948638916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:12] (step=0012676) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.24632724446171783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12677, "loss": 0.14867472648620605, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4910793304443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:16] (step=0012677) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.24634667703070345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12678, "loss": 0.2623973488807678, "memory_gb": 7.721559524536133, "step_time_ms": 3364.811897277832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:19] (step=0012678) Train Loss: 0.2656, Train Steps/Sec: 0.28, Epoch: 0.24636610959968908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12679, "loss": 0.24681119620800018, "memory_gb": 7.721559524536133, "step_time_ms": 3359.292507171631, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:23] (step=0012679) Train Loss: 0.2773, Train Steps/Sec: 0.28, Epoch: 0.2463855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12680, "loss": 0.31281474232673645, "memory_gb": 7.721559524536133, "step_time_ms": 3368.417501449585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:27] (step=0012680) Train Loss: 0.3114, Train Steps/Sec: 0.28, Epoch: 0.24640497473766032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12681, "loss": 0.22632640600204468, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3998680114746, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:41:30] (step=0012681) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.24642440730664594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12682, "loss": 0.12327784299850464, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6106510162354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:34] (step=0012682) Train Loss: 0.1643, Train Steps/Sec: 0.28, Epoch: 0.24644383987563157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12683, "loss": 0.2986224591732025, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1981658935547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:37] (step=0012683) Train Loss: 0.2876, Train Steps/Sec: 0.28, Epoch: 0.2464632724446172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12684, "loss": 0.2962384819984436, "memory_gb": 7.721559524536133, "step_time_ms": 3506.303071975708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:41] (step=0012684) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.2464827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12685, "loss": 0.19049999117851257, "memory_gb": 7.721559524536133, "step_time_ms": 3347.240924835205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:44] (step=0012685) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.24650213758258843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12686, "loss": 0.19116944074630737, "memory_gb": 7.721559524536133, "step_time_ms": 3360.771417617798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:48] (step=0012686) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.24652157015157403, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:41:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12687, "loss": 0.352379709482193, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5648021698, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:52] (step=0012687) Train Loss: 0.3364, Train Steps/Sec: 0.28, Epoch: 0.24654100272055965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12688, "loss": 0.17104962468147278, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5801849365234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:55] (step=0012688) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.24656043528954527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:41:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12689, "loss": 0.2879621684551239, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8954677581787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:41:59] (step=0012689) Train Loss: 0.2780, Train Steps/Sec: 0.28, Epoch: 0.2465798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12690, "loss": 0.17923283576965332, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9306812286377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:02] (step=0012690) Train Loss: 0.2441, Train Steps/Sec: 0.28, Epoch: 0.24659930042751652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12691, "loss": 0.17952175438404083, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3945922851562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:06] (step=0012691) Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.24661873299650214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12692, "loss": 0.24553337693214417, "memory_gb": 7.721559524536133, "step_time_ms": 
3346.94242477417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:09] (step=0012692) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.24663816556548776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12693, "loss": 0.26948821544647217, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3266735076904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:13] (step=0012693) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.24665759813447338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12694, "loss": 0.2757721543312073, "memory_gb": 7.721559524536133, "step_time_ms": 3360.283851623535, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:17] (step=0012694) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.246677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12695, "loss": 0.24443760514259338, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1322174072266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:20] (step=0012695) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.24669646327244463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12696, "loss": 0.2712475061416626, "memory_gb": 7.721559524536133, "step_time_ms": 3361.506700515747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:24] (step=0012696) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.24671589584143025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12697, "loss": 0.3233250379562378, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9915809631348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:27] (step=0012697) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 
0.24673532841041584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12698, "loss": 0.22741714119911194, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4910373687744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:31] (step=0012698) Train Loss: 0.2240, Train Steps/Sec: 0.28, Epoch: 0.24675476097940147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12699, "loss": 0.27335238456726074, "memory_gb": 7.721559524536133, "step_time_ms": 3362.518310546875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:34] (step=0012699) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.2467741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12700, "loss": 0.32197248935699463, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2701168060303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:38] (step=0012700) Train Loss: 0.3045, Train Steps/Sec: 0.28, Epoch: 0.2467936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12701, "loss": 0.1670798361301422, "memory_gb": 7.721559524536133, "step_time_ms": 3356.403112411499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:42] (step=0012701) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.24681305868635833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12702, "loss": 0.0712321400642395, "memory_gb": 7.721559524536133, "step_time_ms": 3357.503652572632, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:45] (step=0012702) Train Loss: 0.1658, Train Steps/Sec: 0.28, Epoch: 0.24683249125534396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12703, "loss": 0.3074986934661865, 
"memory_gb": 7.721559524536133, "step_time_ms": 3360.0893020629883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:49] (step=0012703) Train Loss: 0.2795, Train Steps/Sec: 0.28, Epoch: 0.24685192382432958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12704, "loss": 0.2292976677417755, "memory_gb": 7.721559524536133, "step_time_ms": 3341.8524265289307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:52] (step=0012704) Train Loss: 0.2097, Train Steps/Sec: 0.28, Epoch: 0.2468713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12705, "loss": 0.30847835540771484, "memory_gb": 7.721559524536133, "step_time_ms": 3359.06982421875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:56] (step=0012705) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.24689078896230082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:42:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12706, "loss": 0.20527684688568115, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5425453186035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:42:59] (step=0012706) Train Loss: 0.2038, Train Steps/Sec: 0.28, Epoch: 0.24691022153128644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12707, "loss": 0.2137071192264557, "memory_gb": 7.721559524536133, "step_time_ms": 3356.752395629883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:03] (step=0012707) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.24692965410027207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12708, "loss": 0.29608064889907837, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9694499969482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:06] (step=0012708) Train Loss: 
0.2994, Train Steps/Sec: 0.28, Epoch: 0.2469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12709, "loss": 0.19274452328681946, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1604957580566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:10] (step=0012709) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.24696851923824328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12710, "loss": 0.24627530574798584, "memory_gb": 7.721559524536133, "step_time_ms": 3355.137348175049, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:14] (step=0012710) Train Loss: 0.2464, Train Steps/Sec: 0.28, Epoch: 0.2469879518072289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12711, "loss": 0.2858017086982727, "memory_gb": 7.721559524536133, "step_time_ms": 3358.013153076172, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:17] (step=0012711) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.24700738437621453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12712, "loss": 0.285015344619751, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4908714294434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:21] (step=0012712) Train Loss: 0.2400, Train Steps/Sec: 0.27, Epoch: 0.24702681694520015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12713, "loss": 0.2116691768169403, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6998291015625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:24] (step=0012713) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.24704624951418577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 
12714, "loss": 0.23347705602645874, "memory_gb": 7.721559524536133, "step_time_ms": 3347.6850986480713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:28] (step=0012714) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.2470656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12715, "loss": 0.29670971632003784, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7444343566895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:32] (step=0012715) Train Loss: 0.2863, Train Steps/Sec: 0.28, Epoch: 0.24708511465215702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12716, "loss": 0.2617063522338867, "memory_gb": 7.721559524536133, "step_time_ms": 3356.125831604004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:35] (step=0012716) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.24710454722114264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12717, "loss": 0.09366600960493088, "memory_gb": 7.721559524536133, "step_time_ms": 3337.291955947876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:39] (step=0012717) Train Loss: 0.1361, Train Steps/Sec: 0.29, Epoch: 0.24712397979012826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12718, "loss": 0.21728450059890747, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8642654418945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:42] (step=0012718) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.24714341235911388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12719, "loss": 0.2098693996667862, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6154499053955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:43:46] (step=0012719) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.2471628449280995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12720, "loss": 0.14614835381507874, "memory_gb": 7.721559524536133, "step_time_ms": 3353.947877883911, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:49] (step=0012720) Train Loss: 0.1887, Train Steps/Sec: 0.28, Epoch: 0.2471822774970851, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12721, "loss": 0.1978321522474289, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4624366760254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:53] (step=0012721) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.24720171006607072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:43:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12722, "loss": 0.25272154808044434, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2206287384033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:43:56] (step=0012722) Train Loss: 0.2842, Train Steps/Sec: 0.28, Epoch: 0.24722114263505635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12723, "loss": 0.1753203272819519, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9751510620117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:00] (step=0012723) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.24724057520404197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12724, "loss": 0.29675787687301636, "memory_gb": 7.721559524536133, "step_time_ms": 3495.501756668091, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:04] (step=0012724) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.2472600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:07] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12725, "loss": 0.20280343294143677, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0098667144775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:07] (step=0012725) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.2472794403420132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12726, "loss": 0.19029247760772705, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4765968322754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:11] (step=0012726) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.24729887291099883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12727, "loss": 0.20765036344528198, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5947589874268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:14] (step=0012727) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.24731830547998446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12728, "loss": 0.27888020873069763, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5290699005127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:18] (step=0012728) Train Loss: 0.2838, Train Steps/Sec: 0.28, Epoch: 0.24733773804897008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12729, "loss": 0.1440560519695282, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3342304229736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:21] (step=0012729) Train Loss: 0.1224, Train Steps/Sec: 0.28, Epoch: 0.2473571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12730, "loss": 0.2193969488143921, "memory_gb": 7.721559524536133, "step_time_ms": 3352.442502975464, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:44:25] (step=0012730) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.24737660318694132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12731, "loss": 0.31688907742500305, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9624938964844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:29] (step=0012731) Train Loss: 0.3299, Train Steps/Sec: 0.28, Epoch: 0.24739603575592695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12732, "loss": 0.22967077791690826, "memory_gb": 7.721559524536133, "step_time_ms": 3350.9433269500732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:32] (step=0012732) Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.24741546832491254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12733, "loss": 0.19459684193134308, "memory_gb": 7.721559524536133, "step_time_ms": 3357.793092727661, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:36] (step=0012733) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.24743490089389816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12734, "loss": 0.22936230897903442, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1488857269287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:39] (step=0012734) Train Loss: 0.1798, Train Steps/Sec: 0.28, Epoch: 0.24745433346288379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12735, "loss": 0.14707359671592712, "memory_gb": 7.721559524536133, "step_time_ms": 3349.412679672241, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:43] (step=0012735) Train Loss: 0.1923, Train Steps/Sec: 0.28, Epoch: 0.2474737660318694, LR: 0.001, Memory: 7.72GB, 
Params: 4,718,592 [2025-07-29 12:44:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12736, "loss": 0.32889899611473083, "memory_gb": 7.721559524536133, "step_time_ms": 3356.348991394043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:46] (step=0012736) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.24749319860085503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12737, "loss": 0.28586992621421814, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4793338775635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:50] (step=0012737) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.24751263116984065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12738, "loss": 0.25420039892196655, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4647693634033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:54] (step=0012738) Train Loss: 0.2033, Train Steps/Sec: 0.28, Epoch: 0.24753206373882627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:44:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12739, "loss": 0.24202045798301697, "memory_gb": 7.715639114379883, "step_time_ms": 3315.8247470855713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:44:57] (step=0012739) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.2475514963078119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12740, "loss": 0.26081597805023193, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1855297088623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:01] (step=0012740) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.24757092887679752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12741, "loss": 0.2808617949485779, "memory_gb": 7.721559524536133, "step_time_ms": 
3357.8810691833496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:04] (step=0012741) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.24759036144578314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12742, "loss": 0.13963301479816437, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2206497192383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:08] (step=0012742) Train Loss: 0.1791, Train Steps/Sec: 0.28, Epoch: 0.24760979401476876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12743, "loss": 0.28011491894721985, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1794033050537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:11] (step=0012743) Train Loss: 0.2734, Train Steps/Sec: 0.28, Epoch: 0.24762922658375439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12744, "loss": 0.35660505294799805, "memory_gb": 7.715639114379883, "step_time_ms": 3323.425054550171, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:15] (step=0012744) Train Loss: 0.3365, Train Steps/Sec: 0.28, Epoch: 0.24764865915273998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12745, "loss": 0.23462752997875214, "memory_gb": 7.721559524536133, "step_time_ms": 3354.865312576294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:19] (step=0012745) Train Loss: 0.2232, Train Steps/Sec: 0.28, Epoch: 0.2476680917217256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12746, "loss": 0.17875799536705017, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7182807922363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:22] (step=0012746) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 
0.24768752429071123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12747, "loss": 0.18268471956253052, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5454692840576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:26] (step=0012747) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.24770695685969685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12748, "loss": 0.24866604804992676, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3954105377197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:29] (step=0012748) Train Loss: 0.2881, Train Steps/Sec: 0.28, Epoch: 0.24772638942868247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12749, "loss": 0.28217947483062744, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4952354431152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:33] (step=0012749) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.2477458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12750, "loss": 0.2852305769920349, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6443195343018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:36] (step=0012750) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.24776525456665371, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12751, "loss": 0.298043817281723, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3008613586426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:40] (step=0012751) Train Loss: 0.2603, Train Steps/Sec: 0.28, Epoch: 0.24778468713563934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12752, "loss": 0.17854923009872437, 
"memory_gb": 7.721559524536133, "step_time_ms": 3356.1487197875977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:43] (step=0012752) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.24780411970462496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12753, "loss": 0.2361157238483429, "memory_gb": 7.721559524536133, "step_time_ms": 3359.670877456665, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:47] (step=0012753) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.24782355227361058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12754, "loss": 0.30932170152664185, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2373600006104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:51] (step=0012754) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.2478429848425962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12755, "loss": 0.17179708182811737, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5775623321533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:54] (step=0012755) Train Loss: 0.2754, Train Steps/Sec: 0.28, Epoch: 0.2478624174115818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:45:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12756, "loss": 0.2155756950378418, "memory_gb": 7.721559524536133, "step_time_ms": 3354.599714279175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:45:58] (step=0012756) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.24788184998056742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12757, "loss": 0.19223633408546448, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0480346679688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:01] (step=0012757) Train Loss: 
0.2102, Train Steps/Sec: 0.28, Epoch: 0.24790128254955304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12758, "loss": 0.23708753287792206, "memory_gb": 7.721559524536133, "step_time_ms": 3359.375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:05] (step=0012758) Train Loss: 0.1884, Train Steps/Sec: 0.28, Epoch: 0.24792071511853866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12759, "loss": 0.10645104944705963, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2968711853027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:09] (step=0012759) Train Loss: 0.1684, Train Steps/Sec: 0.28, Epoch: 0.2479401476875243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12760, "loss": 0.2755208909511566, "memory_gb": 7.721559524536133, "step_time_ms": 3354.562997817993, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:12] (step=0012760) Train Loss: 0.2819, Train Steps/Sec: 0.27, Epoch: 0.2479595802565099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12761, "loss": 0.2426215410232544, "memory_gb": 7.715639114379883, "step_time_ms": 3323.939561843872, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:16] (step=0012761) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.24797901282549553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12762, "loss": 0.28072312474250793, "memory_gb": 7.721559524536133, "step_time_ms": 3358.367681503296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:19] (step=0012762) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.24799844539448115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12763, "loss": 
0.3121253252029419, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9324741363525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:23] (step=0012763) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.24801787796346678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12764, "loss": 0.30675071477890015, "memory_gb": 7.721559524536133, "step_time_ms": 3354.565382003784, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:26] (step=0012764) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.2480373105324524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12765, "loss": 0.2260173261165619, "memory_gb": 7.721559524536133, "step_time_ms": 3363.720655441284, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:30] (step=0012765) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.24805674310143802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12766, "loss": 0.33447667956352234, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2215518951416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:34] (step=0012766) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.24807617567042364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12767, "loss": 0.2069852203130722, "memory_gb": 7.721559524536133, "step_time_ms": 3361.429214477539, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:37] (step=0012767) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.24809560823940924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12768, "loss": 0.2628783583641052, "memory_gb": 7.721559524536133, "step_time_ms": 3359.344482421875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:41] (step=0012768) 
Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.24811504080839486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12769, "loss": 0.24093078076839447, "memory_gb": 7.721559524536133, "step_time_ms": 3362.738609313965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:44] (step=0012769) Train Loss: 0.2806, Train Steps/Sec: 0.28, Epoch: 0.24813447337738048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12770, "loss": 0.24740739166736603, "memory_gb": 7.721559524536133, "step_time_ms": 3361.927270889282, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:48] (step=0012770) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.2481539059463661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12771, "loss": 0.2186887562274933, "memory_gb": 7.721559524536133, "step_time_ms": 3505.1276683807373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:52] (step=0012771) Train Loss: 0.1954, Train Steps/Sec: 0.28, Epoch: 0.24817333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12772, "loss": 0.2204396277666092, "memory_gb": 7.721559524536133, "step_time_ms": 3363.581895828247, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:55] (step=0012772) Train Loss: 0.1768, Train Steps/Sec: 0.28, Epoch: 0.24819277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12773, "loss": 0.20565439760684967, "memory_gb": 7.721559524536133, "step_time_ms": 3361.180305480957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:46:59] (step=0012773) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.24821220365332297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:02] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 12774, "loss": 0.3804478347301483, "memory_gb": 7.721559524536133, "step_time_ms": 3361.961841583252, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:02] (step=0012774) Train Loss: 0.2922, Train Steps/Sec: 0.28, Epoch: 0.2482316362223086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12775, "loss": 0.33103275299072266, "memory_gb": 7.721559524536133, "step_time_ms": 3357.297897338867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:06] (step=0012775) Train Loss: 0.2553, Train Steps/Sec: 0.28, Epoch: 0.24825106879129422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12776, "loss": 0.2609138488769531, "memory_gb": 7.721559524536133, "step_time_ms": 3354.022979736328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:09] (step=0012776) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.24827050136027984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12777, "loss": 0.11779564619064331, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6805381774902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:13] (step=0012777) Train Loss: 0.1717, Train Steps/Sec: 0.28, Epoch: 0.24828993392926546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12778, "loss": 0.2742850184440613, "memory_gb": 7.721559524536133, "step_time_ms": 3358.480930328369, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:17] (step=0012778) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.24830936649825106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12779, "loss": 0.3253134787082672, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5072288513184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:47:20] (step=0012779) Train Loss: 0.3021, Train Steps/Sec: 0.28, Epoch: 0.24832879906723668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12780, "loss": 0.256096214056015, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7660789489746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:24] (step=0012780) Train Loss: 0.3091, Train Steps/Sec: 0.28, Epoch: 0.2483482316362223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12781, "loss": 0.3395293951034546, "memory_gb": 7.721559524536133, "step_time_ms": 3360.179662704468, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:27] (step=0012781) Train Loss: 0.3064, Train Steps/Sec: 0.28, Epoch: 0.24836766420520792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12782, "loss": 0.25534600019454956, "memory_gb": 7.721559524536133, "step_time_ms": 3362.545967102051, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:31] (step=0012782) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.24838709677419354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12783, "loss": 0.24115607142448425, "memory_gb": 7.721559524536133, "step_time_ms": 3358.659267425537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:34] (step=0012783) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.24840652934317917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12784, "loss": 0.1844765692949295, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4319610595703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:38] (step=0012784) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.2484259619121648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:42] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12785, "loss": 0.29844969511032104, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3064289093018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:42] (step=0012785) Train Loss: 0.3013, Train Steps/Sec: 0.28, Epoch: 0.2484453944811504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12786, "loss": 0.15893961489200592, "memory_gb": 7.721559524536133, "step_time_ms": 3356.827735900879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:45] (step=0012786) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.24846482705013603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12787, "loss": 0.25864964723587036, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2184314727783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:49] (step=0012787) Train Loss: 0.2493, Train Steps/Sec: 0.28, Epoch: 0.24848425961912166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12788, "loss": 0.25953832268714905, "memory_gb": 7.721559524536133, "step_time_ms": 3360.822916030884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:52] (step=0012788) Train Loss: 0.2374, Train Steps/Sec: 0.28, Epoch: 0.24850369218810728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12789, "loss": 0.2707371711730957, "memory_gb": 7.721559524536133, "step_time_ms": 3361.967086791992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:47:56] (step=0012789) Train Loss: 0.2439, Train Steps/Sec: 0.28, Epoch: 0.2485231247570929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:47:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12790, "loss": 0.25166526436805725, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6550483703613, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 12:48:00] (step=0012790) Train Loss: 0.2311, Train Steps/Sec: 0.28, Epoch: 0.2485425573260785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12791, "loss": 0.1806003451347351, "memory_gb": 7.721559524536133, "step_time_ms": 3357.436418533325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:03] (step=0012791) Train Loss: 0.2028, Train Steps/Sec: 0.28, Epoch: 0.24856198989506412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12792, "loss": 0.17473620176315308, "memory_gb": 7.721559524536133, "step_time_ms": 3356.548070907593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:07] (step=0012792) Train Loss: 0.1794, Train Steps/Sec: 0.28, Epoch: 0.24858142246404974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12793, "loss": 0.27402302622795105, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8590412139893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:10] (step=0012793) Train Loss: 0.2674, Train Steps/Sec: 0.28, Epoch: 0.24860085503303536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12794, "loss": 0.2026289850473404, "memory_gb": 7.715639114379883, "step_time_ms": 3321.592330932617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:14] (step=0012794) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.24862028760202098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12795, "loss": 0.2655716836452484, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0565452575684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:17] (step=0012795) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.2486397201710066, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 12:48:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12796, "loss": 0.24070687592029572, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5936603546143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:21] (step=0012796) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.24865915273999223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12797, "loss": 0.23116709291934967, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5456867218018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:25] (step=0012797) Train Loss: 0.2880, Train Steps/Sec: 0.28, Epoch: 0.24867858530897785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12798, "loss": 0.21815136075019836, "memory_gb": 7.721559524536133, "step_time_ms": 3356.091022491455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:28] (step=0012798) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.24869801787796347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12799, "loss": 0.2640749514102936, "memory_gb": 7.721559524536133, "step_time_ms": 3353.565216064453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:32] (step=0012799) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.2487174504469491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12800, "loss": 0.2737237513065338, "memory_gb": 7.721559524536133, "step_time_ms": 3345.960855484009, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:35] (step=0012800) Train Loss: 0.3122, Train Steps/Sec: 0.27, Epoch: 0.24873688301593472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12801, "loss": 0.1710178405046463, "memory_gb": 7.721559524536133, "step_time_ms": 
3336.987018585205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:39] (step=0012801) Train Loss: 0.1813, Train Steps/Sec: 0.28, Epoch: 0.24875631558492034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12802, "loss": 0.27998578548431396, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3618450164795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:42] (step=0012802) Train Loss: 0.2860, Train Steps/Sec: 0.28, Epoch: 0.24877574815390593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12803, "loss": 0.2292988896369934, "memory_gb": 7.721559524536133, "step_time_ms": 3352.907180786133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:46] (step=0012803) Train Loss: 0.1917, Train Steps/Sec: 0.28, Epoch: 0.24879518072289156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12804, "loss": 0.2859230041503906, "memory_gb": 7.721559524536133, "step_time_ms": 3352.170944213867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:50] (step=0012804) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.24881461329187718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12805, "loss": 0.16259858012199402, "memory_gb": 7.721559524536133, "step_time_ms": 3352.501153945923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:53] (step=0012805) Train Loss: 0.2048, Train Steps/Sec: 0.28, Epoch: 0.2488340458608628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:48:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12806, "loss": 0.30548885464668274, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3622913360596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:48:57] (step=0012806) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 
0.24885347842984842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12807, "loss": 0.23687826097011566, "memory_gb": 7.721559524536133, "step_time_ms": 3344.900608062744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:00] (step=0012807) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.24887291099883405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12808, "loss": 0.1322183459997177, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7135334014893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:04] (step=0012808) Train Loss: 0.1385, Train Steps/Sec: 0.28, Epoch: 0.24889234356781967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12809, "loss": 0.23504406213760376, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4834594726562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:07] (step=0012809) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.2489117761368053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12810, "loss": 0.3048744201660156, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2640419006348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:11] (step=0012810) Train Loss: 0.3181, Train Steps/Sec: 0.28, Epoch: 0.2489312087057909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12811, "loss": 0.1519167274236679, "memory_gb": 7.721559524536133, "step_time_ms": 3355.85355758667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:15] (step=0012811) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.24895064127477654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12812, "loss": 0.2133409082889557, 
"memory_gb": 7.721559524536133, "step_time_ms": 3357.734441757202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:18] (step=0012812) Train Loss: 0.1872, Train Steps/Sec: 0.28, Epoch: 0.24897007384376216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12813, "loss": 0.17236682772636414, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6511821746826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:22] (step=0012813) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.24898950641274775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12814, "loss": 0.32742777466773987, "memory_gb": 7.721559524536133, "step_time_ms": 3359.186887741089, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:25] (step=0012814) Train Loss: 0.2995, Train Steps/Sec: 0.28, Epoch: 0.24900893898173337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12815, "loss": 0.2715822458267212, "memory_gb": 7.721559524536133, "step_time_ms": 3347.6130962371826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:29] (step=0012815) Train Loss: 0.2480, Train Steps/Sec: 0.28, Epoch: 0.249028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12816, "loss": 0.21543477475643158, "memory_gb": 7.721559524536133, "step_time_ms": 3357.550621032715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:32] (step=0012816) Train Loss: 0.2761, Train Steps/Sec: 0.28, Epoch: 0.24904780411970462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12817, "loss": 0.3158194422721863, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3125133514404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:36] (step=0012817) Train Loss: 0.2213, 
Train Steps/Sec: 0.28, Epoch: 0.24906723668869024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12818, "loss": 0.25139421224594116, "memory_gb": 7.721559524536133, "step_time_ms": 3358.633041381836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:40] (step=0012818) Train Loss: 0.2769, Train Steps/Sec: 0.28, Epoch: 0.24908666925767586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12819, "loss": 0.2683253884315491, "memory_gb": 7.721559524536133, "step_time_ms": 3506.4549446105957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:43] (step=0012819) Train Loss: 0.2747, Train Steps/Sec: 0.28, Epoch: 0.24910610182666149, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12820, "loss": 0.2934313118457794, "memory_gb": 7.721559524536133, "step_time_ms": 3351.818561553955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:47] (step=0012820) Train Loss: 0.2942, Train Steps/Sec: 0.28, Epoch: 0.2491255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12821, "loss": 0.2655576169490814, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6221466064453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:50] (step=0012821) Train Loss: 0.2620, Train Steps/Sec: 0.28, Epoch: 0.24914496696463273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12822, "loss": 0.2725105583667755, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6718101501465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:54] (step=0012822) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.24916439953361835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:49:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12823, 
"loss": 0.2631380259990692, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6973209381104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:49:57] (step=0012823) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.24918383210260397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12824, "loss": 0.160396009683609, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8949699401855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:01] (step=0012824) Train Loss: 0.1709, Train Steps/Sec: 0.28, Epoch: 0.2492032646715896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12825, "loss": 0.29674699902534485, "memory_gb": 7.715639114379883, "step_time_ms": 3321.357488632202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:04] (step=0012825) Train Loss: 0.2473, Train Steps/Sec: 0.28, Epoch: 0.2492226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12826, "loss": 0.25413617491722107, "memory_gb": 7.721559524536133, "step_time_ms": 3356.175661087036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:08] (step=0012826) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.24924212980956081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12827, "loss": 0.25504270195961, "memory_gb": 7.721559524536133, "step_time_ms": 3359.508275985718, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:12] (step=0012827) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.24926156237854644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12828, "loss": 0.25552037358283997, "memory_gb": 7.721559524536133, "step_time_ms": 3360.077381134033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:15] 
(step=0012828) Train Loss: 0.2829, Train Steps/Sec: 0.28, Epoch: 0.24928099494753206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12829, "loss": 0.2908417880535126, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8836917877197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:19] (step=0012829) Train Loss: 0.2970, Train Steps/Sec: 0.28, Epoch: 0.24930042751651768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12830, "loss": 0.12420177459716797, "memory_gb": 7.721559524536133, "step_time_ms": 3356.93097114563, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:22] (step=0012830) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.2493198600855033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12831, "loss": 0.2118290662765503, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2254390716553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:26] (step=0012831) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.24933929265448893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12832, "loss": 0.2932147979736328, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6184043884277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:29] (step=0012832) Train Loss: 0.2759, Train Steps/Sec: 0.28, Epoch: 0.24935872522347455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12833, "loss": 0.20936119556427002, "memory_gb": 7.721559524536133, "step_time_ms": 3360.201835632324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:33] (step=0012833) Train Loss: 0.1922, Train Steps/Sec: 0.28, Epoch: 0.24937815779246017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:37] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12834, "loss": 0.3045816123485565, "memory_gb": 7.721559524536133, "step_time_ms": 3361.652135848999, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:37] (step=0012834) Train Loss: 0.2851, Train Steps/Sec: 0.28, Epoch: 0.2493975903614458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12835, "loss": 0.34565088152885437, "memory_gb": 7.721559524536133, "step_time_ms": 3360.166311264038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:40] (step=0012835) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.24941702293043141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12836, "loss": 0.21765215694904327, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9844913482666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:44] (step=0012836) Train Loss: 0.1870, Train Steps/Sec: 0.28, Epoch: 0.249436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12837, "loss": 0.19406744837760925, "memory_gb": 7.721559524536133, "step_time_ms": 3362.593412399292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:47] (step=0012837) Train Loss: 0.1535, Train Steps/Sec: 0.28, Epoch: 0.24945588806840263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12838, "loss": 0.21285809576511383, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3929328918457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:50:51] (step=0012838) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.24947532063738825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:50:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12839, "loss": 0.23306693136692047, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7335815429688, "trainable_params": 
4718592, "method": "lora"}
[2025-07-29 12:50:54] (step=0012839) Train Loss: 0.2832, Train Steps/Sec: 0.28, Epoch: 0.24949475320637388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:50:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12840, "loss": 0.2699800133705139, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9866371154785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:50:58] (step=0012840) Train Loss: 0.2925, Train Steps/Sec: 0.28, Epoch: 0.2495141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12841, "loss": 0.21440891921520233, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3719940185547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:02] (step=0012841) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.24953361834434512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12842, "loss": 0.29566526412963867, "memory_gb": 7.715639114379883, "step_time_ms": 3325.0057697296143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:05] (step=0012842) Train Loss: 0.2891, Train Steps/Sec: 0.28, Epoch: 0.24955305091333074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12843, "loss": 0.21434414386749268, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9955520629883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:09] (step=0012843) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.24957248348231637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12844, "loss": 0.2247202843427658, "memory_gb": 7.721559524536133, "step_time_ms": 3360.71515083313, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:12] (step=0012844) Train Loss: 0.2128, Train Steps/Sec: 0.28, Epoch: 0.249591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12845, "loss": 0.21365618705749512, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1482849121094, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:16] (step=0012845) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.2496113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12846, "loss": 0.14166386425495148, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3922805786133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:19] (step=0012846) Train Loss: 0.1694, Train Steps/Sec: 0.28, Epoch: 0.24963078118927323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12847, "loss": 0.2655043303966522, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7356033325195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:23] (step=0012847) Train Loss: 0.2611, Train Steps/Sec: 0.27, Epoch: 0.24965021375825885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12848, "loss": 0.27554917335510254, "memory_gb": 7.721559524536133, "step_time_ms": 3364.058494567871, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:27] (step=0012848) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.24966964632724445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12849, "loss": 0.21256093680858612, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4677963256836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:30] (step=0012849) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.24968907889623007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12850, "loss": 0.31092846393585205, "memory_gb": 7.721559524536133, "step_time_ms": 3359.250783920288, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:34] (step=0012850) Train Loss: 0.2879, Train Steps/Sec: 0.28, Epoch: 0.2497085114652157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12851, "loss": 0.20681680738925934, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5192642211914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:37] (step=0012851) Train Loss: 0.2035, Train Steps/Sec: 0.28, Epoch: 0.24972794403420132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12852, "loss": 0.27816659212112427, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4234199523926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:41] (step=0012852) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.24974737660318694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12853, "loss": 0.3033057451248169, "memory_gb": 7.721559524536133, "step_time_ms": 3361.248254776001, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:44] (step=0012853) Train Loss: 0.3058, Train Steps/Sec: 0.28, Epoch: 0.24976680917217256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12854, "loss": 0.2153944969177246, "memory_gb": 7.721559524536133, "step_time_ms": 3356.546640396118, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:48] (step=0012854) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.24978624174115818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12855, "loss": 0.3242523670196533, "memory_gb": 7.715639114379883, "step_time_ms": 3231.27818107605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:51] (step=0012855) Train Loss: 0.2833, Train Steps/Sec: 0.29, Epoch: 0.2498056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12856, "loss": 0.3672519326210022, "memory_gb": 7.721559524536133, "step_time_ms": 3359.858989715576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:55] (step=0012856) Train Loss: 0.3303, Train Steps/Sec: 0.28, Epoch: 0.24982510687912943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:51:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12857, "loss": 0.2940276861190796, "memory_gb": 7.721559524536133, "step_time_ms": 3356.435537338257, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:51:59] (step=0012857) Train Loss: 0.2693, Train Steps/Sec: 0.28, Epoch: 0.24984453944811505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12858, "loss": 0.2057909220457077, "memory_gb": 7.721559524536133, "step_time_ms": 3356.34183883667, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:02] (step=0012858) Train Loss: 0.1986, Train Steps/Sec: 0.28, Epoch: 0.24986397201710067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12859, "loss": 0.18209926784038544, "memory_gb": 7.721559524536133, "step_time_ms": 3362.264394760132, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:06] (step=0012859) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.2498834045860863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12860, "loss": 0.19110387563705444, "memory_gb": 7.721559524536133, "step_time_ms": 3510.1966857910156, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:09] (step=0012860) Train Loss: 0.1939, Train Steps/Sec: 0.28, Epoch: 0.2499028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12861, "loss": 0.30709290504455566, "memory_gb": 7.721559524536133, "step_time_ms": 3357.696771621704, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:13] (step=0012861) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.2499222697240575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12862, "loss": 0.16550874710083008, "memory_gb": 7.721559524536133, "step_time_ms": 3364.912986755371, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:16] (step=0012862) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.24994170229304313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12863, "loss": 0.20033179223537445, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7382774353027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:20] (step=0012863) Train Loss: 0.1843, Train Steps/Sec: 0.28, Epoch: 0.24996113486202876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12864, "loss": 0.17971158027648926, "memory_gb": 7.721559524536133, "step_time_ms": 3361.097812652588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:24] (step=0012864) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.24998056743101438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12865, "loss": 0.2009362429380417, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1152477264404, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:27] (step=0012865) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.25, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12866, "loss": 0.24597296118736267, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9531650543213, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:31] (step=0012866) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.2500194325689856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12867, "loss": 0.2504989504814148, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0750999450684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:34] (step=0012867) Train Loss: 0.2474, Train Steps/Sec: 0.28, Epoch: 0.25003886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12868, "loss": 0.26240479946136475, "memory_gb": 7.721559524536133, "step_time_ms": 3357.92875289917, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:38] (step=0012868) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.25005829770695687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12869, "loss": 0.27966856956481934, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6297454833984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:41] (step=0012869) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.2500777302759425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 12870, "loss": 0.22418472170829773, "memory_gb": 7.721559524536133, "step_time_ms": 3363.590955734253, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:45] (step=0012870) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.2500971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12871, "loss": 0.3672023415565491, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4934425354004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:49] (step=0012871) Train Loss: 0.3151, Train Steps/Sec: 0.28, Epoch: 0.25011659541391373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12872, "loss": 0.28161269426345825, "memory_gb": 7.721559524536133, "step_time_ms": 3364.701986312866, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:52] (step=0012872) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.25013602798289936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12873, "loss": 0.1830645501613617, "memory_gb": 7.721559524536133, "step_time_ms": 3365.015983581543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:56] (step=0012873) Train Loss: 0.2742, Train Steps/Sec: 0.28, Epoch: 0.250155460551885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:52:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12874, "loss": 0.22692295908927917, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5634441375732, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:52:59] (step=0012874) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.2501748931208706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12875, "loss": 0.23660114407539368, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8644008636475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:03] (step=0012875) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.2501943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12876, "loss": 0.38418030738830566, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7539196014404, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:06] (step=0012876) Train Loss: 0.3408, Train Steps/Sec: 0.28, Epoch: 0.25021375825884185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12877, "loss": 0.31995677947998047, "memory_gb": 7.721559524536133, "step_time_ms": 3359.079360961914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:10] (step=0012877) Train Loss: 0.3438, Train Steps/Sec: 0.28, Epoch: 0.2502331908278274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12878, "loss": 0.20137298107147217, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3548488616943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:14] (step=0012878) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.25025262339681303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12879, "loss": 0.19292107224464417, "memory_gb": 7.721559524536133, "step_time_ms": 3361.502170562744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:17] (step=0012879) Train Loss: 0.2529, Train Steps/Sec: 0.28, Epoch: 0.25027205596579866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12880, "loss": 0.3160429000854492, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2137775421143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:21] (step=0012880) Train Loss: 0.3120, Train Steps/Sec: 0.28, Epoch: 0.2502914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12881, "loss": 0.19917500019073486, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2516860961914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:24] (step=0012881) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.2503109211037699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12882, "loss": 0.20545358955860138, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6935997009277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:28] (step=0012882) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.2503303536727555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12883, "loss": 0.2118387520313263, "memory_gb": 7.721559524536133, "step_time_ms": 3365.285634994507, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:31] (step=0012883) Train Loss: 0.2272, Train Steps/Sec: 0.28, Epoch: 0.25034978624174115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12884, "loss": 0.23128876090049744, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8710765838623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:35] (step=0012884) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.25036921881072677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12885, "loss": 0.20115546882152557, "memory_gb": 7.721559524536133, "step_time_ms": 3364.755868911743, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:39] (step=0012885) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.2503886513797124, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12886, "loss": 0.26312950253486633, "memory_gb": 7.715639114379883, "step_time_ms": 3324.052333831787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:42] (step=0012886) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.250408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12887, "loss": 0.21032924950122833, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3291511535645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:46] (step=0012887) Train Loss: 0.2796, Train Steps/Sec: 0.28, Epoch: 0.25042751651768363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12888, "loss": 0.17408548295497894, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7895889282227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:49] (step=0012888) Train Loss: 0.2239, Train Steps/Sec: 0.27, Epoch: 0.25044694908666926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12889, "loss": 0.19222518801689148, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3479862213135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:53] (step=0012889) Train Loss: 0.1897, Train Steps/Sec: 0.28, Epoch: 0.2504663816556549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:53:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12890, "loss": 0.2634369134902954, "memory_gb": 7.721559524536133, "step_time_ms": 3343.0936336517334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:53:57] (step=0012890) Train Loss: 0.3020, Train Steps/Sec: 0.28, Epoch: 0.2504858142246405, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12891, "loss": 0.18081676959991455, "memory_gb": 7.721559524536133, "step_time_ms": 3358.564615249634, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:00] (step=0012891) Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.2505052467936261, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12892, "loss": 0.23312929272651672, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6487770080566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:04] (step=0012892) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.25052467936261175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12893, "loss": 0.2231241911649704, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9155673980713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:07] (step=0012893) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.25054411193159737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12894, "loss": 0.294673889875412, "memory_gb": 7.721559524536133, "step_time_ms": 3347.9886054992676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:11] (step=0012894) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.250563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12895, "loss": 0.18847237527370453, "memory_gb": 7.721559524536133, "step_time_ms": 3350.242853164673, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:14] (step=0012895) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.2505829770695686, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12896, "loss": 0.2898961901664734, "memory_gb": 7.721559524536133, "step_time_ms": 3353.634834289551, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:18] (step=0012896) Train Loss: 0.2908, Train Steps/Sec: 0.28, Epoch: 0.25060240963855424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12897, "loss": 0.2004900425672531, "memory_gb": 7.721559524536133, "step_time_ms": 3342.723608016968, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:21] (step=0012897) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.25062184220753986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12898, "loss": 0.29913225769996643, "memory_gb": 7.721559524536133, "step_time_ms": 3358.823776245117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:25] (step=0012898) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.2506412747765255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12899, "loss": 0.22942599654197693, "memory_gb": 7.721559524536133, "step_time_ms": 3360.04376411438, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:29] (step=0012899) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.2506607073455111, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12900, "loss": 0.25851625204086304, "memory_gb": 7.721559524536133, "step_time_ms": 3353.78360748291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:32] (step=0012900) Train Loss: 0.3035, Train Steps/Sec: 0.28, Epoch: 0.25068013991449667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12901, "loss": 0.21861423552036285, "memory_gb": 7.721559524536133, "step_time_ms": 3356.482982635498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:36] (step=0012901) Train Loss: 0.2144, Train Steps/Sec: 0.28, Epoch: 0.2506995724834823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12902, "loss": 0.3141945004463196, "memory_gb": 7.721559524536133, "step_time_ms": 3357.032299041748, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:39] (step=0012902) Train Loss: 0.2443, Train Steps/Sec: 0.28, Epoch: 0.2507190050524679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12903, "loss": 0.16436398029327393, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3079109191895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:43] (step=0012903) Train Loss: 0.2037, Train Steps/Sec: 0.28, Epoch: 0.25073843762145354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12904, "loss": 0.2163296639919281, "memory_gb": 7.721559524536133, "step_time_ms": 3356.208562850952, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:46] (step=0012904) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.25075787019043916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12905, "loss": 0.2239534854888916, "memory_gb": 7.721559524536133, "step_time_ms": 3355.102777481079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:50] (step=0012905) Train Loss: 0.2407, Train Steps/Sec: 0.28, Epoch: 0.2507773027594248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12906, "loss": 0.24746409058570862, "memory_gb": 7.721559524536133, "step_time_ms": 3357.055902481079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:54] (step=0012906) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.2507967353284104, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:54:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12907, "loss": 0.31038182973861694, "memory_gb": 7.721559524536133, "step_time_ms": 3359.96675491333, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:54:57] (step=0012907) Train Loss: 0.2707, Train Steps/Sec: 0.28, Epoch: 0.250816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12908, "loss": 0.2608680725097656, "memory_gb": 7.721559524536133, "step_time_ms": 3496.760129928589, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:01] (step=0012908) Train Loss: 0.2749, Train Steps/Sec: 0.28, Epoch: 0.25083560046638165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12909, "loss": 0.2123238742351532, "memory_gb": 7.721559524536133, "step_time_ms": 3357.311964035034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:04] (step=0012909) Train Loss: 0.2527, Train Steps/Sec: 0.28, Epoch: 0.25085503303536727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12910, "loss": 0.23476596176624298, "memory_gb": 7.721559524536133, "step_time_ms": 3355.126142501831, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:08] (step=0012910) Train Loss: 0.2078, Train Steps/Sec: 0.28, Epoch: 0.2508744656043529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12911, "loss": 0.2290724664926529, "memory_gb": 7.721559524536133, "step_time_ms": 3351.3412475585938, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:11] (step=0012911) Train Loss: 0.2023, Train Steps/Sec: 0.28, Epoch: 0.2508938981733385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12912, "loss": 0.13768434524536133, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3355464935303, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:15] (step=0012912) Train Loss: 0.1681, Train Steps/Sec: 0.28, Epoch: 0.25091333074232414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12913, "loss": 0.30365753173828125, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2981357574463, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:19] (step=0012913) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.25093276331130976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12914, "loss": 0.23766326904296875, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9131412506104, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:22] (step=0012914) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.2509521958802954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12915, "loss": 0.2782067060470581, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3753623962402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:26] (step=0012915) Train Loss: 0.2411, Train Steps/Sec: 0.28, Epoch: 0.250971628449281, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12916, "loss": 0.24090264737606049, "memory_gb": 7.721559524536133, "step_time_ms": 3354.288339614868, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:29] (step=0012916) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.2509910610182666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12917, "loss": 0.2687382698059082, "memory_gb": 7.721559524536133, "step_time_ms": 3350.9011268615723, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:33] (step=0012917) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.25101049358725225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12918, "loss": 0.22529907524585724, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2652130126953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:36] (step=0012918) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.25102992615623787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 12919, "loss": 0.2843751311302185, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7542819976807, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:40] (step=0012919) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 0.2510493587252235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12920, "loss": 0.23877090215682983, "memory_gb": 7.721559524536133, "step_time_ms": 3356.184244155884, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:44] (step=0012920) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.2510687912942091, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 12921, "loss": 0.23145732283592224, "memory_gb": 7.721559524536133, "step_time_ms": 3356.326103210449, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:47] (step=0012921) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.25108822386319474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12922, "loss": 0.17930692434310913, "memory_gb": 7.721559524536133, "step_time_ms": 3350.032329559326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:51] (step=0012922) Train Loss: 0.1707, Train Steps/Sec: 0.28, Epoch: 0.25110765643218036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12923, "loss": 0.19152921438217163, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7473678588867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:54] (step=0012923) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.251127089001166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:55:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 12924, "loss": 0.16058728098869324, "memory_gb": 7.721559524536133, "step_time_ms": 3351.4199256896973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:55:58] (step=0012924) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.25114652157015155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12925, "loss": 0.13919006288051605, "memory_gb": 7.721559524536133, "step_time_ms": 3356.02068901062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:01] (step=0012925) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.25116595413913717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 12926, "loss": 0.1829807162284851, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6460399627686, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:05] (step=0012926) Train Loss: 0.1945, Train Steps/Sec: 0.28, Epoch: 0.2511853867081228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12927, "loss": 0.3411477208137512, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3449382781982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:08] (step=0012927) Train Loss: 0.2863, Train Steps/Sec: 0.28, Epoch: 0.2512048192771084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12928, "loss": 0.24003471434116364, "memory_gb": 7.721559524536133, "step_time_ms": 3358.555316925049, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:12] (step=0012928) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 0.25122425184609404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12929, "loss": 0.2072935700416565, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6930294036865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:16] (step=0012929) Train Loss: 0.1823, Train Steps/Sec: 0.28, Epoch: 0.25124368441507966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12930, "loss": 0.10313467681407928, "memory_gb": 7.721559524536133, "step_time_ms": 3349.323272705078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:19] (step=0012930) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.2512631169840653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 12931, "loss": 0.26927992701530457, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5354347229004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:23] (step=0012931) Train Loss: 0.2732, Train Steps/Sec: 0.28, Epoch: 0.2512825495530509, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12932, "loss": 0.22542127966880798, "memory_gb": 7.721559524536133, "step_time_ms": 3354.205369949341, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:26] (step=0012932) Train Loss: 0.1904, Train Steps/Sec: 0.28, Epoch: 0.2513019821220365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 12933, "loss": 0.1133788526058197, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2092781066895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:30] (step=0012933) Train Loss: 0.1662, Train Steps/Sec: 0.28, Epoch: 0.25132141469102215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 12934, "loss": 0.27768078446388245, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7565937042236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:33] (step=0012934) Train Loss: 0.2640, Train Steps/Sec: 0.28, Epoch: 0.25134084726000777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 12935, "loss": 0.3768257796764374, "memory_gb": 7.715639114379883, "step_time_ms": 3322.6935863494873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:37] (step=0012935) Train Loss: 0.2940, Train Steps/Sec: 0.28, Epoch: 0.2513602798289934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12936, "loss": 0.12077079713344574, "memory_gb": 7.721559524536133, "step_time_ms": 3361.078977584839, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:41] (step=0012936) Train Loss: 0.1610, Train Steps/Sec: 0.27, Epoch: 0.251379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 12937, "loss": 0.2654199004173279, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1629314422607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:44] (step=0012937) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.25139914496696464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12938, "loss": 0.3753809332847595, "memory_gb": 7.721559524536133, "step_time_ms": 3351.3386249542236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:48] (step=0012938) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.25141857753595026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 12939, "loss": 0.3096381425857544, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7563762664795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:51] (step=0012939) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.2514380101049359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 12940, "loss": 0.2465842366218567, "memory_gb": 7.721559524536133, "step_time_ms": 3360.415458679199, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:55] (step=0012940) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.2514574426739215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:56:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12941, "loss": 0.19714727997779846, "memory_gb": 7.721559524536133, "step_time_ms": 3357.184886932373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:56:59] (step=0012941) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.2514768752429071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 12942, "loss": 0.20740894973278046, "memory_gb": 7.721559524536133, "step_time_ms": 3354.417562484741, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:02] (step=0012942) Train Loss: 0.1867, Train Steps/Sec: 0.28, Epoch: 0.25149630781189275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12943, "loss": 0.18439993262290955, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1509799957275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:06] (step=0012943) Train Loss: 0.1839, Train Steps/Sec: 0.28, Epoch: 0.25151574038087837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 12944, "loss": 0.15767133235931396, "memory_gb": 7.721559524536133, "step_time_ms": 3362.926483154297, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:09] (step=0012944) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.251535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12945, "loss": 0.2861006259918213, "memory_gb": 7.721559524536133, "step_time_ms": 3356.609106063843, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:13] (step=0012945) Train Loss: 0.2590, Train Steps/Sec: 0.28, Epoch: 0.2515546055188496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 12946, "loss": 0.15845444798469543, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2763671875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:16] (step=0012946) Train Loss: 0.1644, Train Steps/Sec: 0.28, Epoch: 0.25157403808783524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 12947, "loss": 0.17761453986167908, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3811264038086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:20] (step=0012947) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.2515934706568208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 12948, "loss": 0.29883354902267456, "memory_gb": 7.721559524536133, "step_time_ms": 3509.9589824676514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:24] (step=0012948) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.25161290322580643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 12949, "loss": 0.15175817906856537, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2593879699707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:27] (step=0012949) Train Loss: 0.1355, Train Steps/Sec: 0.28, Epoch: 0.25163233579479205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12950, "loss": 0.26299017667770386, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2008304595947, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:31] (step=0012950) Train Loss: 0.2509, Train Steps/Sec: 0.28, Epoch: 0.2516517683637777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 12951, "loss": 0.23425127565860748, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9160232543945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:34] (step=0012951) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.2516712009327633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12952, "loss": 0.23165756464004517, "memory_gb": 7.721559524536133, "step_time_ms": 3361.680746078491, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:38] (step=0012952) Train Loss: 0.2155, Train Steps/Sec: 0.28, Epoch: 0.2516906335017489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 12953, "loss": 0.1902613788843155, "memory_gb": 7.721559524536133, "step_time_ms": 3358.501434326172, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 12:57:41] (step=0012953) Train Loss: 0.2647, Train Steps/Sec: 0.28, Epoch: 0.25171006607073454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 12:57:45] EFFICIENCY_METRICS: {"epoch": 0,
"step": 12954, "loss": 0.30148813128471375, "memory_gb": 7.721559524536133, "step_time_ms": 3356.718063354492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:57:45] (step=0012954) Train Loss: 0.2489, Train Steps/Sec: 0.28, Epoch: 0.25172949863972016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:57:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 12955, "loss": 0.19937613606452942, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9871864318848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:57:48] (step=0012955) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.2517489312087058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:57:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 12956, "loss": 0.18166467547416687, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0336990356445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:57:52] (step=0012956) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.2517683637776914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:57:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12957, "loss": 0.1804433912038803, "memory_gb": 7.721559524536133, "step_time_ms": 3361.874580383301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:57:56] (step=0012957) Train Loss: 0.1829, Train Steps/Sec: 0.28, Epoch: 0.25178779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:57:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 12958, "loss": 0.1639484167098999, "memory_gb": 7.721559524536133, "step_time_ms": 3363.057851791382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:57:59] (step=0012958) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.25180722891566265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 12959, "loss": 0.18021294474601746, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0805740356445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
12:58:03] (step=0012959) Train Loss: 0.2091, Train Steps/Sec: 0.28, Epoch: 0.2518266614846483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 12960, "loss": 0.25327375531196594, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9188327789307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:06] (step=0012960) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.2518460940536339, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 12961, "loss": 0.1553725153207779, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3363971710205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:10] (step=0012961) Train Loss: 0.2053, Train Steps/Sec: 0.28, Epoch: 0.2518655266226195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 12962, "loss": 0.14538627862930298, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9115867614746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:13] (step=0012962) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.25188495919160514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 12963, "loss": 0.2230740487575531, "memory_gb": 7.721559524536133, "step_time_ms": 3358.015775680542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:17] (step=0012963) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.25190439176059076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12964, "loss": 0.319374144077301, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9229068756104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:21] (step=0012964) Train Loss: 0.2661, Train Steps/Sec: 0.28, Epoch: 0.2519238243295764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:24] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 12965, "loss": 0.30331045389175415, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3040657043457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:24] (step=0012965) Train Loss: 0.2585, Train Steps/Sec: 0.28, Epoch: 0.251943256898562, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 12966, "loss": 0.22544586658477783, "memory_gb": 7.721559524536133, "step_time_ms": 3358.295202255249, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:28] (step=0012966) Train Loss: 0.2272, Train Steps/Sec: 0.28, Epoch: 0.25196268946754763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 12967, "loss": 0.2938254177570343, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4609241485596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:31] (step=0012967) Train Loss: 0.2571, Train Steps/Sec: 0.28, Epoch: 0.25198212203653325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 12968, "loss": 0.22352376580238342, "memory_gb": 7.721559524536133, "step_time_ms": 3363.461971282959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:35] (step=0012968) Train Loss: 0.1875, Train Steps/Sec: 0.28, Epoch: 0.2520015546055189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 12969, "loss": 0.19756412506103516, "memory_gb": 7.721559524536133, "step_time_ms": 3359.968662261963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:38] (step=0012969) Train Loss: 0.2731, Train Steps/Sec: 0.28, Epoch: 0.2520209871745045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 12970, "loss": 0.25663554668426514, "memory_gb": 7.721559524536133, "step_time_ms": 3365.417718887329, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 12:58:42] (step=0012970) Train Loss: 0.2217, Train Steps/Sec: 0.28, Epoch: 0.25204041974349006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12971, "loss": 0.2630254924297333, "memory_gb": 7.721559524536133, "step_time_ms": 3359.616994857788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:46] (step=0012971) Train Loss: 0.2547, Train Steps/Sec: 0.28, Epoch: 0.2520598523124757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 12972, "loss": 0.3533264994621277, "memory_gb": 7.721559524536133, "step_time_ms": 3367.271661758423, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:49] (step=0012972) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.2520792848814613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 12973, "loss": 0.22807539999485016, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1229705810547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:53] (step=0012973) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.25209871745044693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:58:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 12974, "loss": 0.19587692618370056, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3954524993896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:58:56] (step=0012974) Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.25211815001943255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 12975, "loss": 0.2695748209953308, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0469875335693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:00] (step=0012975) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.2521375825884182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 12:59:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12976, "loss": 0.18760088086128235, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2173023223877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:04] (step=0012976) Train Loss: 0.1703, Train Steps/Sec: 0.27, Epoch: 0.2521570151574038, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 12977, "loss": 0.3738887310028076, "memory_gb": 7.721559524536133, "step_time_ms": 3357.961654663086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:07] (step=0012977) Train Loss: 0.3430, Train Steps/Sec: 0.28, Epoch: 0.2521764477263894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 12978, "loss": 0.2331787645816803, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1878929138184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:11] (step=0012978) Train Loss: 0.2593, Train Steps/Sec: 0.28, Epoch: 0.25219588029537504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 12979, "loss": 0.14032182097434998, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6714477539062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:14] (step=0012979) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.25221531286436066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 12980, "loss": 0.31579864025115967, "memory_gb": 7.721559524536133, "step_time_ms": 3360.356569290161, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:18] (step=0012980) Train Loss: 0.3093, Train Steps/Sec: 0.28, Epoch: 0.2522347454333463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 12981, "loss": 0.2444208562374115, "memory_gb": 7.721559524536133, "step_time_ms": 3347.008228302002, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:21] (step=0012981) Train Loss: 0.2536, Train Steps/Sec: 0.28, Epoch: 0.2522541780023319, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 12982, "loss": 0.14079424738883972, "memory_gb": 7.721559524536133, "step_time_ms": 3356.656074523926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:25] (step=0012982) Train Loss: 0.2099, Train Steps/Sec: 0.28, Epoch: 0.25227361057131753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 12983, "loss": 0.17457810044288635, "memory_gb": 7.721559524536133, "step_time_ms": 3362.653970718384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:29] (step=0012983) Train Loss: 0.1555, Train Steps/Sec: 0.28, Epoch: 0.25229304314030315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 12984, "loss": 0.2113213688135147, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8062057495117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:32] (step=0012984) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.2523124757092888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 12985, "loss": 0.23910163342952728, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0973358154297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:36] (step=0012985) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.2523319082782744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 12986, "loss": 0.34211987257003784, "memory_gb": 7.721559524536133, "step_time_ms": 3357.125759124756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:39] (step=0012986) Train Loss: 0.2493, Train Steps/Sec: 0.28, Epoch: 0.25235134084726, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 12:59:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 12987, "loss": 0.3209370970726013, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8315715789795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:43] (step=0012987) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.25237077341624564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 12988, "loss": 0.1697675585746765, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0704460144043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:46] (step=0012988) Train Loss: 0.1556, Train Steps/Sec: 0.28, Epoch: 0.25239020598523126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 12989, "loss": 0.231130450963974, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9890213012695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:50] (step=0012989) Train Loss: 0.2553, Train Steps/Sec: 0.28, Epoch: 0.2524096385542169, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 12990, "loss": 0.30930060148239136, "memory_gb": 7.721559524536133, "step_time_ms": 3343.2798385620117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:54] (step=0012990) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.2524290711232025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 12:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 12991, "loss": 0.18933118879795074, "memory_gb": 7.721559524536133, "step_time_ms": 3354.860782623291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 12:59:57] (step=0012991) Train Loss: 0.2207, Train Steps/Sec: 0.28, Epoch: 0.25244850369218813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 12992, "loss": 0.27466511726379395, "memory_gb": 7.721559524536133, 
"step_time_ms": 3355.597972869873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:01] (step=0012992) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.25246793626117375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 12993, "loss": 0.10432332754135132, "memory_gb": 7.721559524536133, "step_time_ms": 3357.090711593628, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:04] (step=0012993) Train Loss: 0.1532, Train Steps/Sec: 0.28, Epoch: 0.2524873688301593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 12994, "loss": 0.2727943956851959, "memory_gb": 7.721559524536133, "step_time_ms": 3342.406749725342, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:08] (step=0012994) Train Loss: 0.2657, Train Steps/Sec: 0.28, Epoch: 0.25250680139914494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 12995, "loss": 0.12635895609855652, "memory_gb": 7.721559524536133, "step_time_ms": 3493.654489517212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:12] (step=0012995) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.25252623396813056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 12996, "loss": 0.14264793694019318, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3904552459717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:15] (step=0012996) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.2525456665371162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 12997, "loss": 0.19693027436733246, "memory_gb": 7.721559524536133, "step_time_ms": 3350.264310836792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:19] (step=0012997) Train Loss: 0.2012, Train Steps/Sec: 0.28, Epoch: 
0.2525650991061018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 12998, "loss": 0.323910653591156, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1904430389404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:22] (step=0012998) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.25258453167508743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 12999, "loss": 0.15942466259002686, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3599376678467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:26] (step=0012999) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.25260396424407305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13000, "loss": 0.2273838222026825, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9982776641846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:29] (step=0013000) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.2526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13001, "loss": 0.2395346313714981, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2543907165527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:33] (step=0013001) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.2526428293820443, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13002, "loss": 0.13826805353164673, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4957637786865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:37] (step=0013002) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.2526622619510299, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13003, "loss": 0.2573394179344177, 
"memory_gb": 7.721559524536133, "step_time_ms": 3354.7301292419434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:40] (step=0013003) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.25268169452001554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13004, "loss": 0.17658811807632446, "memory_gb": 7.721559524536133, "step_time_ms": 3357.905864715576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:44] (step=0013004) Train Loss: 0.1921, Train Steps/Sec: 0.28, Epoch: 0.25270112708900117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13005, "loss": 0.2496728003025055, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5025539398193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:47] (step=0013005) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.2527205596579868, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13006, "loss": 0.16997087001800537, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8499813079834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:51] (step=0013006) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.2527399922269724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13007, "loss": 0.19734613597393036, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2304763793945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:54] (step=0013007) Train Loss: 0.1863, Train Steps/Sec: 0.28, Epoch: 0.25275942479595803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:00:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13008, "loss": 0.3148266077041626, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6742153167725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:00:58] (step=0013008) Train Loss: 
0.2521, Train Steps/Sec: 0.28, Epoch: 0.25277885736494365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13009, "loss": 0.2051660120487213, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7519702911377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:02] (step=0013009) Train Loss: 0.2138, Train Steps/Sec: 0.28, Epoch: 0.2527982899339293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13010, "loss": 0.28121674060821533, "memory_gb": 7.721559524536133, "step_time_ms": 3344.95210647583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:05] (step=0013010) Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.2528177225029149, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13011, "loss": 0.2522573471069336, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0397624969482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:09] (step=0013011) Train Loss: 0.2331, Train Steps/Sec: 0.28, Epoch: 0.2528371550719005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13012, "loss": 0.316462904214859, "memory_gb": 7.721559524536133, "step_time_ms": 3353.254556655884, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:12] (step=0013012) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.25285658764088614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13013, "loss": 0.34189844131469727, "memory_gb": 7.715639114379883, "step_time_ms": 3320.8138942718506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:16] (step=0013013) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.25287602020987177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13014, 
"loss": 0.28031906485557556, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9371490478516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:19] (step=0013014) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.2528954527788574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13015, "loss": 0.261128306388855, "memory_gb": 7.721559524536133, "step_time_ms": 3353.837251663208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:23] (step=0013015) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.252914885347843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13016, "loss": 0.332533597946167, "memory_gb": 7.721559524536133, "step_time_ms": 3356.611967086792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:26] (step=0013016) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.2529343179168286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13017, "loss": 0.15193283557891846, "memory_gb": 7.721559524536133, "step_time_ms": 3347.644329071045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:30] (step=0013017) Train Loss: 0.1445, Train Steps/Sec: 0.28, Epoch: 0.2529537504858142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13018, "loss": 0.2998059391975403, "memory_gb": 7.715639114379883, "step_time_ms": 3316.3061141967773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:34] (step=0013018) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.2529731830547998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13019, "loss": 0.21271762251853943, "memory_gb": 7.721559524536133, "step_time_ms": 3357.487916946411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:37] 
(step=0013019) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.25299261562378544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13020, "loss": 0.35707682371139526, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7546558380127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:41] (step=0013020) Train Loss: 0.3198, Train Steps/Sec: 0.28, Epoch: 0.25301204819277107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13021, "loss": 0.22884024679660797, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9492568969727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:44] (step=0013021) Train Loss: 0.2414, Train Steps/Sec: 0.28, Epoch: 0.2530314807617567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13022, "loss": 0.2308778464794159, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0777435302734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:48] (step=0013022) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.2530509133307423, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13023, "loss": 0.2052067667245865, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5400581359863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:52] (step=0013023) Train Loss: 0.2661, Train Steps/Sec: 0.27, Epoch: 0.25307034589972793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13024, "loss": 0.1455763280391693, "memory_gb": 7.721559524536133, "step_time_ms": 3356.989622116089, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:01:55] (step=0013024) Train Loss: 0.1505, Train Steps/Sec: 0.28, Epoch: 0.25308977846871356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:01:59] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 13025, "loss": 0.16249457001686096, "memory_gb": 7.721559524536133, "step_time_ms": 3353.382349014282, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:01:59] (step=0013025) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.2531092110376992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13026, "loss": 0.24213841557502747, "memory_gb": 7.721559524536133, "step_time_ms": 3353.116512298584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:02] (step=0013026) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.2531286436066848, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13027, "loss": 0.16339175403118134, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7987213134766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:06] (step=0013027) Train Loss: 0.1682, Train Steps/Sec: 0.28, Epoch: 0.2531480761756704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13028, "loss": 0.2061852663755417, "memory_gb": 7.721559524536133, "step_time_ms": 3356.088638305664, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:09] (step=0013028) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.25316750874465604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13029, "loss": 0.31026095151901245, "memory_gb": 7.721559524536133, "step_time_ms": 3349.5028018951416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:13] (step=0013029) Train Loss: 0.3032, Train Steps/Sec: 0.28, Epoch: 0.25318694131364167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13030, "loss": 0.22043940424919128, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6126613616943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:17] (step=0013030) Train Loss: 0.1866, Train Steps/Sec: 0.28, Epoch: 0.2532063738826273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13031, "loss": 0.16621606051921844, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6231002807617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:20] (step=0013031) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.2532258064516129, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13032, "loss": 0.20403002202510834, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1882972717285, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:24] (step=0013032) Train Loss: 0.2292, Train Steps/Sec: 0.28, Epoch: 0.25324523902059853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13033, "loss": 0.2797701358795166, "memory_gb": 7.721559524536133, "step_time_ms": 3357.295274734497, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:27] (step=0013033) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.25326467158958416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13034, "loss": 0.21114087104797363, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5809211730957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:31] (step=0013034) Train Loss: 0.1936, Train Steps/Sec: 0.28, Epoch: 0.2532841041585698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13035, "loss": 0.24026547372341156, "memory_gb": 7.721559524536133, "step_time_ms": 3362.320899963379, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:34] (step=0013035) Train Loss: 0.2894, Train Steps/Sec: 0.28, Epoch: 0.2533035367275554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13036, "loss": 0.231004536151886, "memory_gb": 7.721559524536133, "step_time_ms": 3504.8201084136963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:38] (step=0013036) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.253322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13037, "loss": 0.18881861865520477, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6248626708984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:41] (step=0013037) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.25334240186552665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13038, "loss": 0.1921660602092743, "memory_gb": 7.721559524536133, "step_time_ms": 3358.086347579956, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:45] (step=0013038) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.25336183443451227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13039, "loss": 0.3220027983188629, "memory_gb": 7.721559524536133, "step_time_ms": 3347.3758697509766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:49] (step=0013039) Train Loss: 0.2941, Train Steps/Sec: 0.28, Epoch: 0.2533812670034979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13040, "loss": 0.2896229922771454, "memory_gb": 7.721559524536133, "step_time_ms": 3353.295087814331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:52] (step=0013040) Train Loss: 0.2902, Train Steps/Sec: 0.28, Epoch: 0.25340069957248346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13041, "loss": 0.22853314876556396, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2215309143066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:56] (step=0013041) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.2534201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:02:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13042, "loss": 0.28465768694877625, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3921241760254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:02:59] (step=0013042) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.2534395647104547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13043, "loss": 0.2277287244796753, "memory_gb": 7.721559524536133, "step_time_ms": 3356.015682220459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:03] (step=0013043) Train Loss: 0.2755, Train Steps/Sec: 0.28, Epoch: 0.2534589972794403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13044, "loss": 0.14630669355392456, "memory_gb": 7.721559524536133, "step_time_ms": 3344.3214893341064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:06] (step=0013044) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.25347842984842595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13045, "loss": 0.22769735753536224, "memory_gb": 7.721559524536133, "step_time_ms": 3359.508275985718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:10] (step=0013045) Train Loss: 0.2335, Train Steps/Sec: 0.28, Epoch: 0.25349786241741157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13046, "loss": 0.24266237020492554, "memory_gb": 7.721559524536133, "step_time_ms": 3341.6552543640137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:14] (step=0013046) Train Loss: 0.2235, Train Steps/Sec: 0.28, Epoch: 0.2535172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13047, "loss": 0.26681774854660034, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4837913513184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:17] (step=0013047) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.2535367275553828, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13048, "loss": 0.22645114362239838, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5924377441406, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:21] (step=0013048) Train Loss: 0.2843, Train Steps/Sec: 0.28, Epoch: 0.25355616012436843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13049, "loss": 0.2227352261543274, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6784648895264, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:24] (step=0013049) Train Loss: 0.2496, Train Steps/Sec: 0.28, Epoch: 0.25357559269335406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13050, "loss": 0.19443140923976898, "memory_gb": 7.721559524536133, "step_time_ms": 3361.006498336792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:28] (step=0013050) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.2535950252623397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13051, "loss": 0.2550390362739563, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7355518341064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:31] (step=0013051) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.2536144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13052, "loss": 0.19517847895622253, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8061332702637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:35] (step=0013052) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.2536338904003109, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13053, "loss": 0.1463756263256073, "memory_gb": 7.721559524536133, "step_time_ms": 3339.9178981781006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:39] (step=0013053) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.25365332296929655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13054, "loss": 0.23492395877838135, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4323959350586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:42] (step=0013054) Train Loss: 0.2223, Train Steps/Sec: 0.28, Epoch: 0.25367275553828217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13055, "loss": 0.33642682433128357, "memory_gb": 7.721559524536133, "step_time_ms": 3347.9621410369873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:46] (step=0013055) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.2536921881072678, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13056, "loss": 0.30170953273773193, "memory_gb": 7.721559524536133, "step_time_ms": 3346.4901447296143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:49] (step=0013056) Train Loss: 0.2950, Train Steps/Sec: 0.28, Epoch: 0.2537116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13057, "loss": 0.1577197015285492, "memory_gb": 7.721559524536133, "step_time_ms": 3356.621742248535, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:53] (step=0013057) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.25373105324523904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:03:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13058, "loss": 0.3361426591873169, "memory_gb": 7.721559524536133, "step_time_ms": 3354.870557785034, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:03:56] (step=0013058) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.25375048581422466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13059, "loss": 0.18156921863555908, "memory_gb": 7.721559524536133, "step_time_ms": 3363.637924194336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:00] (step=0013059) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.2537699183832103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13060, "loss": 0.22183886170387268, "memory_gb": 7.721559524536133, "step_time_ms": 3356.53018951416, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:03] (step=0013060) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.2537893509521959, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13061, "loss": 0.1752116084098816, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1294078826904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:07] (step=0013061) Train Loss: 0.2627, Train Steps/Sec: 0.28, Epoch: 0.2538087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13062, "loss": 0.19609594345092773, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3105545043945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:11] (step=0013062) Train Loss: 0.2047, Train Steps/Sec: 0.28, Epoch: 0.25382821609016715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13063, "loss": 0.16958168148994446, "memory_gb": 7.721559524536133, "step_time_ms": 3360.055446624756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:14] (step=0013063) Train Loss: 0.1734, Train Steps/Sec: 0.28, Epoch: 0.2538476486591527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13064, "loss": 0.22641894221305847, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5527153015137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:18] (step=0013064) Train Loss: 0.2287, Train Steps/Sec: 0.28, Epoch: 0.25386708122813834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13065, "loss": 0.18616272509098053, "memory_gb": 7.721559524536133, "step_time_ms": 3346.768856048584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:21] (step=0013065) Train Loss: 0.1734, Train Steps/Sec: 0.28, Epoch: 0.25388651379712396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13066, "loss": 0.15434062480926514, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4580211639404, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:25] (step=0013066) Train Loss: 0.1807, Train Steps/Sec: 0.28, Epoch: 0.2539059463661096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13067, "loss": 0.34499430656433105, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9601726531982, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:28] (step=0013067) Train Loss: 0.3022, Train Steps/Sec: 0.28, Epoch: 0.2539253789350952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13068, "loss": 0.25082099437713623, "memory_gb": 7.721559524536133, "step_time_ms": 3358.143091201782, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:32] (step=0013068) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.2539448115040808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13069, "loss": 0.25646060705184937, "memory_gb": 7.721559524536133, "step_time_ms": 3360.978126525879, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:36] (step=0013069) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.25396424407306645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13070, "loss": 0.2604745626449585, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3479442596436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:39] (step=0013070) Train Loss: 0.2142, Train Steps/Sec: 0.28, Epoch: 0.25398367664205207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13071, "loss": 0.2337387502193451, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0217781066895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:43] (step=0013071) Train Loss: 0.2334, Train Steps/Sec: 0.27, Epoch: 0.2540031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13072, "loss": 0.30924510955810547, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6911945343018, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:46] (step=0013072) Train Loss: 0.3147, Train Steps/Sec: 0.28, Epoch: 0.2540225417800233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13073, "loss": 0.24903559684753418, "memory_gb": 7.721559524536133, "step_time_ms": 3355.480432510376, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:50] (step=0013073) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.25404197434900894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13074, "loss": 0.29110991954803467, "memory_gb": 7.715639114379883, "step_time_ms": 3318.56369972229, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:54] (step=0013074) Train Loss: 0.3196, Train Steps/Sec: 0.28, Epoch: 0.25406140691799456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13075, "loss": 0.23531083762645721, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2682819366455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:04:57] (step=0013075) Train Loss: 0.2585, Train Steps/Sec: 0.28, Epoch: 0.2540808394869802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13076, "loss": 0.3719469904899597, "memory_gb": 7.715639114379883, "step_time_ms": 3316.8110847473145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:01] (step=0013076) Train Loss: 0.3005, Train Steps/Sec: 0.28, Epoch: 0.2541002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13077, "loss": 0.2581045925617218, "memory_gb": 7.721559524536133, "step_time_ms": 3356.003522872925, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:04] (step=0013077) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.2541197046249514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13078, "loss": 0.24267008900642395, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1674823760986, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:08] (step=0013078) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.25413913719393705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13079, "loss": 0.1344371736049652, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5097789764404, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:11] (step=0013079) Train Loss: 0.1419, Train Steps/Sec: 0.28, Epoch: 0.25415856976292267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13080, "loss": 0.285063773393631, "memory_gb": 7.721559524536133, "step_time_ms": 3359.142303466797, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:15] (step=0013080) Train Loss: 0.2791, Train Steps/Sec: 0.28, Epoch: 0.2541780023319083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13081, "loss": 0.22633618116378784, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2752475738525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:19] (step=0013081) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.2541974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13082, "loss": 0.14277851581573486, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7709922790527, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:22] (step=0013082) Train Loss: 0.1762, Train Steps/Sec: 0.28, Epoch: 0.25421686746987954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13083, "loss": 0.3153599202632904, "memory_gb": 7.721559524536133, "step_time_ms": 3358.391761779785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:26] (step=0013083) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.25423630003886516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13084, "loss": 0.14387056231498718, "memory_gb": 7.721559524536133, "step_time_ms": 3493.618965148926, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:29] (step=0013084) Train Loss: 0.1599, Train Steps/Sec: 0.28, Epoch: 0.2542557326078508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13085, "loss": 0.26212412118911743, "memory_gb": 7.721559524536133, "step_time_ms": 3357.945203781128, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:33] (step=0013085) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.2542751651768364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13086, "loss": 0.23947493731975555, "memory_gb": 7.721559524536133, "step_time_ms": 3355.151653289795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:36] (step=0013086) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.25429459774582197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13087, "loss": 0.24492861330509186, "memory_gb": 7.721559524536133, "step_time_ms": 3355.38387298584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:40] (step=0013087) Train Loss: 0.2565, Train Steps/Sec: 0.28, Epoch: 0.2543140303148076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13088, "loss": 0.2345714271068573, "memory_gb": 7.721559524536133, "step_time_ms": 3361.276149749756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:44] (step=0013088) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.2543334628837932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13089, "loss": 0.23241201043128967, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6511611938477, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:47] (step=0013089) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.25435289545277884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13090, "loss": 0.29608213901519775, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6128787994385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:51] (step=0013090) Train Loss: 0.3099, Train Steps/Sec: 0.28, Epoch: 0.25437232802176446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13091, "loss": 0.17360301315784454, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9886684417725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:54] (step=0013091) Train Loss: 0.1816, Train Steps/Sec: 0.28, Epoch: 0.2543917605907501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13092, "loss": 0.21942138671875, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4723262786865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:05:58] (step=0013092) Train Loss: 0.2320, Train Steps/Sec: 0.28, Epoch: 0.2544111931597357, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13093, "loss": 0.13492703437805176, "memory_gb": 7.721559524536133, "step_time_ms": 3342.7891731262207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:01] (step=0013093) Train Loss: 0.1891, Train Steps/Sec: 0.28, Epoch: 0.2544306257287213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13094, "loss": 0.23955224454402924, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8053035736084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:05] (step=0013094) Train Loss: 0.2709, Train Steps/Sec: 0.28, Epoch: 0.25445005829770695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13095, "loss": 0.18939077854156494, "memory_gb": 7.721559524536133, "step_time_ms": 3354.471206665039, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:09] (step=0013095) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.25446949086669257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13096, "loss": 0.2208479791879654, "memory_gb": 7.721559524536133, "step_time_ms": 3355.064630508423, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:12] (step=0013096) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.2544889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13097, "loss": 0.21421772241592407, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3192100524902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:16] (step=0013097) Train Loss: 0.2626, Train Steps/Sec: 0.28, Epoch: 0.2545083560046638, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13098, "loss": 0.24229009449481964, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9230823516846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:19] (step=0013098) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.25452778857364944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13099, "loss": 0.14597900211811066, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9275608062744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:23] (step=0013099) Train Loss: 0.1901, Train Steps/Sec: 0.28, Epoch: 0.25454722114263506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13100, "loss": 0.14464804530143738, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8904399871826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:26] (step=0013100) Train Loss: 0.1903, Train Steps/Sec: 0.28, Epoch: 0.2545666537116207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13101, "loss": 0.2711385488510132, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6834411621094, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:30] (step=0013101) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.2545860862806063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13102, "loss": 0.21857915818691254, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6737689971924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:34] (step=0013102) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.2546055188495919, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13103, "loss": 0.24734771251678467, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1504821777344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:37] (step=0013103) Train Loss: 0.2223, Train Steps/Sec: 0.28, Epoch: 0.25462495141857755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13104, "loss": 0.1708124428987503, "memory_gb": 7.721559524536133, "step_time_ms": 3358.171224594116, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:41] (step=0013104) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.25464438398756317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13105, "loss": 0.1690167486667633, "memory_gb": 7.721559524536133, "step_time_ms": 3345.592260360718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:44] (step=0013105) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.2546638165565488, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13106, "loss": 0.27978605031967163, "memory_gb": 7.721559524536133, "step_time_ms": 3339.479684829712, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:48] (step=0013106) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.2546832491255344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13107, "loss": 0.2048959732055664, "memory_gb": 7.721559524536133, "step_time_ms": 3357.924222946167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:51] (step=0013107) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.25470268169452004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13108, "loss": 0.32490918040275574, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4584255218506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:55] (step=0013108) Train Loss: 0.2892, Train Steps/Sec: 0.28, Epoch: 0.25472211426350566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13109, "loss": 0.2303999662399292, "memory_gb": 7.721559524536133, "step_time_ms": 3338.164806365967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:06:59] (step=0013109) Train Loss: 0.2000, Train Steps/Sec: 0.28, Epoch: 0.25474154683249123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13110, "loss": 0.16050127148628235, "memory_gb": 7.721559524536133, "step_time_ms": 3354.464054107666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:02] (step=0013110) Train Loss: 0.2042, Train Steps/Sec: 0.28, Epoch: 0.25476097940147685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13111, "loss": 0.23796263337135315, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0639362335205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:06] (step=0013111) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.2547804119704625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13112, "loss": 0.20135751366615295, "memory_gb": 7.721559524536133, "step_time_ms": 3356.882333755493, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:09] (step=0013112) Train Loss: 0.2466, Train Steps/Sec: 0.27, Epoch: 0.2547998445394481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13113, "loss": 0.15762382745742798, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2931594848633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:13] (step=0013113) Train Loss: 0.2012, Train Steps/Sec: 0.28, Epoch: 0.2548192771084337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13114, "loss": 0.35290732979774475, "memory_gb": 7.721559524536133, "step_time_ms": 3358.039140701294, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:17] (step=0013114) Train Loss: 0.3579, Train Steps/Sec: 0.28, Epoch: 0.25483870967741934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13115, "loss": 0.19756540656089783, "memory_gb": 7.721559524536133, "step_time_ms": 3351.703882217407, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:20] (step=0013115) Train Loss: 0.1952, Train Steps/Sec: 0.28, Epoch: 0.25485814224640496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13116, "loss": 0.227633535861969, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2021980285645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:24] (step=0013116) Train Loss: 0.2816, Train Steps/Sec: 0.28, Epoch: 0.2548775748153906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13117, "loss": 0.2578756511211395, "memory_gb": 7.721559524536133, "step_time_ms": 3338.3848667144775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:27] (step=0013117) Train Loss: 0.2852, Train Steps/Sec: 0.28, Epoch: 0.2548970073843762, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13118, "loss": 0.2118915617465973, "memory_gb": 7.721559524536133, "step_time_ms": 3352.069854736328, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:31] (step=0013118) Train Loss: 0.2764, Train Steps/Sec: 0.28, Epoch: 0.25491643995336183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13119, "loss": 0.27223071455955505, "memory_gb": 7.721559524536133, "step_time_ms": 3348.4086990356445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:34] (step=0013119) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.25493587252234745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13120, "loss": 0.21874256432056427, "memory_gb": 7.721559524536133, "step_time_ms": 3334.603786468506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:38] (step=0013120) Train Loss: 0.3103, Train Steps/Sec: 0.28, Epoch: 0.2549553050913331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13121, "loss": 0.2061784565448761, "memory_gb": 7.721559524536133, "step_time_ms": 3351.221799850464, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:41] (step=0013121) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.2549747376603187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13122, "loss": 0.26059842109680176, "memory_gb": 7.721559524536133, "step_time_ms": 3343.5351848602295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:45] (step=0013122) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.2549941702293043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13123, "loss": 0.19206903874874115, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0845642089844, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:49] (step=0013123) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.25501360279828994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13124, "loss": 0.23816779255867004, "memory_gb": 7.721559524536133, "step_time_ms": 3500.2763271331787, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:52] (step=0013124) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.25503303536727556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13125, "loss": 0.24619325995445251, "memory_gb": 7.721559524536133, "step_time_ms": 3354.478359222412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:56] (step=0013125) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.2550524679362612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:07:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13126, "loss": 0.13028989732265472, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2745838165283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:07:59] (step=0013126) Train Loss: 0.2151, Train Steps/Sec: 0.28, Epoch: 0.2550719005052468, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13127, "loss": 0.2856835722923279, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6129417419434, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:08:03] (step=0013127) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.25509133307423243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13128, "loss": 0.23540827631950378, "memory_gb": 7.721559524536133, "step_time_ms": 3349.668502807617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:08:06] (step=0013128) Train Loss: 0.2549, Train Steps/Sec: 0.28, Epoch: 0.25511076564321805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13129, "loss": 0.3163658380508423, "memory_gb": 7.721559524536133, "step_time_ms": 3352.022647857666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:08:10] (step=0013129) Train Loss: 0.2773, Train Steps/Sec: 0.28, Epoch: 0.2551301982122037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13130, "loss": 0.21528874337673187, "memory_gb": 7.721559524536133, "step_time_ms": 3358.328104019165, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:08:13] (step=0013130) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.2551496307811893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13131, "loss": 0.14543519914150238, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1594066619873, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:08:17] (step=0013131) Train Loss: 0.2103, Train Steps/Sec: 0.28, Epoch: 0.2551690633501749, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13132, "loss": 0.24315960705280304, "memory_gb": 7.721559524536133, "step_time_ms": 3352.773666381836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:08:21] (step=0013132) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.25518849591916054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13133, "loss": 0.16813185811042786, "memory_gb": 7.721559524536133, "step_time_ms": 3358.370780944824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:08:24] (step=0013133) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.2552079284881461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:08:28] EFFICIENCY_METRICS:
{"epoch": 0, "step": 13134, "loss": 0.22166916728019714, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1043014526367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:28] (step=0013134) Train Loss: 0.2029, Train Steps/Sec: 0.28, Epoch: 0.25522736105713173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13135, "loss": 0.2446540892124176, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0174446105957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:31] (step=0013135) Train Loss: 0.1937, Train Steps/Sec: 0.28, Epoch: 0.25524679362611735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13136, "loss": 0.17075267434120178, "memory_gb": 7.721559524536133, "step_time_ms": 3356.685161590576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:35] (step=0013136) Train Loss: 0.2297, Train Steps/Sec: 0.28, Epoch: 0.255266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13137, "loss": 0.14656442403793335, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0079803466797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:38] (step=0013137) Train Loss: 0.2082, Train Steps/Sec: 0.28, Epoch: 0.2552856587640886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13138, "loss": 0.39376339316368103, "memory_gb": 7.721559524536133, "step_time_ms": 3360.403537750244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:42] (step=0013138) Train Loss: 0.3452, Train Steps/Sec: 0.28, Epoch: 0.2553050913330742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13139, "loss": 0.29775843024253845, "memory_gb": 7.721559524536133, "step_time_ms": 3359.311103820801, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 13:08:46] (step=0013139) Train Loss: 0.3064, Train Steps/Sec: 0.28, Epoch: 0.25532452390205984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13140, "loss": 0.2213575839996338, "memory_gb": 7.721559524536133, "step_time_ms": 3357.774257659912, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:49] (step=0013140) Train Loss: 0.2084, Train Steps/Sec: 0.28, Epoch: 0.25534395647104546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13141, "loss": 0.30334019660949707, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1481189727783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:53] (step=0013141) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.2553633890400311, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:08:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13142, "loss": 0.27303346991539, "memory_gb": 7.721559524536133, "step_time_ms": 3358.210563659668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:08:56] (step=0013142) Train Loss: 0.2607, Train Steps/Sec: 0.28, Epoch: 0.2553828216090167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13143, "loss": 0.3055121898651123, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6465797424316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:00] (step=0013143) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.25540225417800233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13144, "loss": 0.3148612082004547, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6549034118652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:03] (step=0013144) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.25542168674698795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:07] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 13145, "loss": 0.2999945878982544, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1197261810303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:07] (step=0013145) Train Loss: 0.2930, Train Steps/Sec: 0.28, Epoch: 0.2554411193159736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13146, "loss": 0.270779550075531, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0262145996094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:11] (step=0013146) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.2554605518849592, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13147, "loss": 0.32953399419784546, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9723110198975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:14] (step=0013147) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.2554799844539448, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13148, "loss": 0.351389616727829, "memory_gb": 7.721559524536133, "step_time_ms": 3355.346918106079, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:18] (step=0013148) Train Loss: 0.3198, Train Steps/Sec: 0.28, Epoch: 0.25549941702293044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13149, "loss": 0.24119892716407776, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2675457000732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:21] (step=0013149) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.25551884959191606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13150, "loss": 0.28121355175971985, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5852851867676, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 13:09:25] (step=0013150) Train Loss: 0.3079, Train Steps/Sec: 0.28, Epoch: 0.2555382821609017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13151, "loss": 0.3115774989128113, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3319396972656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:28] (step=0013151) Train Loss: 0.2929, Train Steps/Sec: 0.28, Epoch: 0.2555577147298873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13152, "loss": 0.257358193397522, "memory_gb": 7.721559524536133, "step_time_ms": 3347.27144241333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:32] (step=0013152) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 0.25557714729887293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13153, "loss": 0.17518340051174164, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5986881256104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:36] (step=0013153) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.25559657986785855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13154, "loss": 0.29597657918930054, "memory_gb": 7.721559524536133, "step_time_ms": 3356.792211532593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:39] (step=0013154) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.2556160124368442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13155, "loss": 0.2220752239227295, "memory_gb": 7.721559524536133, "step_time_ms": 3361.630916595459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:43] (step=0013155) Train Loss: 0.1866, Train Steps/Sec: 0.28, Epoch: 0.2556354450058298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 13:09:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13156, "loss": 0.2829975485801697, "memory_gb": 7.721559524536133, "step_time_ms": 3353.08575630188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:46] (step=0013156) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.25565487757481536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13157, "loss": 0.24342329800128937, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3028106689453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:50] (step=0013157) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.255674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13158, "loss": 0.2985435128211975, "memory_gb": 7.721559524536133, "step_time_ms": 3361.955165863037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:53] (step=0013158) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.2556937427127866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:09:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13159, "loss": 0.14849603176116943, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6245613098145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:09:57] (step=0013159) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.25571317528177223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13160, "loss": 0.141286239027977, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3899478912354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:01] (step=0013160) Train Loss: 0.1914, Train Steps/Sec: 0.27, Epoch: 0.25573260785075785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13161, "loss": 0.33001816272735596, "memory_gb": 7.721559524536133, "step_time_ms": 3358.226776123047, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:04] (step=0013161) Train Loss: 0.2872, Train Steps/Sec: 0.28, Epoch: 0.2557520404197435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13162, "loss": 0.21395844221115112, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3368740081787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:08] (step=0013162) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.2557714729887291, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13163, "loss": 0.16701120138168335, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1512908935547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:11] (step=0013163) Train Loss: 0.1570, Train Steps/Sec: 0.28, Epoch: 0.2557909055577147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13164, "loss": 0.16795623302459717, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1503582000732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:15] (step=0013164) Train Loss: 0.1549, Train Steps/Sec: 0.28, Epoch: 0.25581033812670034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13165, "loss": 0.3418654203414917, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7349910736084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:19] (step=0013165) Train Loss: 0.2722, Train Steps/Sec: 0.28, Epoch: 0.25582977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13166, "loss": 0.17599566280841827, "memory_gb": 7.721559524536133, "step_time_ms": 3356.461763381958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:22] (step=0013166) Train Loss: 0.2056, Train Steps/Sec: 0.28, Epoch: 0.2558492032646716, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13167, "loss": 0.23378589749336243, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5492420196533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:26] (step=0013167) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.2558686358336572, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13168, "loss": 0.12994752824306488, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4085750579834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:29] (step=0013168) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.25588806840264283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13169, "loss": 0.27160292863845825, "memory_gb": 7.721559524536133, "step_time_ms": 3352.675676345825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:33] (step=0013169) Train Loss: 0.2726, Train Steps/Sec: 0.28, Epoch: 0.25590750097162845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13170, "loss": 0.19678455591201782, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5588207244873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:36] (step=0013170) Train Loss: 0.1858, Train Steps/Sec: 0.28, Epoch: 0.2559269335406141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13171, "loss": 0.18535178899765015, "memory_gb": 7.721559524536133, "step_time_ms": 3512.9075050354004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:40] (step=0013171) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.2559463661095997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13172, "loss": 0.25120604038238525, "memory_gb": 7.721559524536133, 
"step_time_ms": 3358.609437942505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:44] (step=0013172) Train Loss: 0.2472, Train Steps/Sec: 0.28, Epoch: 0.2559657986785853, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13173, "loss": 0.14305129647254944, "memory_gb": 7.721559524536133, "step_time_ms": 3361.142873764038, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:47] (step=0013173) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.25598523124757094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13174, "loss": 0.27312833070755005, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9256534576416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:51] (step=0013174) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.25600466381655657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13175, "loss": 0.1668354868888855, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6512126922607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:54] (step=0013175) Train Loss: 0.2315, Train Steps/Sec: 0.28, Epoch: 0.2560240963855422, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:10:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13176, "loss": 0.25274887681007385, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1268997192383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:10:58] (step=0013176) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.2560435289545278, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13177, "loss": 0.298944890499115, "memory_gb": 7.721559524536133, "step_time_ms": 3360.074996948242, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:01] (step=0013177) Train Loss: 0.2772, Train Steps/Sec: 0.28, Epoch: 
0.25606296152351343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13178, "loss": 0.22822225093841553, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6668243408203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:05] (step=0013178) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.25608239409249905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13179, "loss": 0.23276537656784058, "memory_gb": 7.721559524536133, "step_time_ms": 3361.685276031494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:09] (step=0013179) Train Loss: 0.2561, Train Steps/Sec: 0.28, Epoch: 0.2561018266614846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13180, "loss": 0.11845384538173676, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8711280822754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:12] (step=0013180) Train Loss: 0.1571, Train Steps/Sec: 0.28, Epoch: 0.25612125923047024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13181, "loss": 0.25325748324394226, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4026260375977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:16] (step=0013181) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.25614069179945587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13182, "loss": 0.2622937858104706, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8713665008545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:19] (step=0013182) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.2561601243684415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13183, "loss": 0.3736354112625122, 
"memory_gb": 7.721559524536133, "step_time_ms": 3362.807273864746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:23] (step=0013183) Train Loss: 0.3208, Train Steps/Sec: 0.28, Epoch: 0.2561795569374271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13184, "loss": 0.23594480752944946, "memory_gb": 7.721559524536133, "step_time_ms": 3345.98708152771, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:26] (step=0013184) Train Loss: 0.2837, Train Steps/Sec: 0.28, Epoch: 0.25619898950641273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13185, "loss": 0.22795376181602478, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5754165649414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:30] (step=0013185) Train Loss: 0.2020, Train Steps/Sec: 0.28, Epoch: 0.25621842207539836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13186, "loss": 0.12099937349557877, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1863384246826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:34] (step=0013186) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.256237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13187, "loss": 0.3342059254646301, "memory_gb": 7.715639114379883, "step_time_ms": 3325.5701065063477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:37] (step=0013187) Train Loss: 0.3263, Train Steps/Sec: 0.28, Epoch: 0.2562572872133696, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13188, "loss": 0.25913166999816895, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5430431365967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:41] (step=0013188) Train Loss: 0.2720, 
Train Steps/Sec: 0.28, Epoch: 0.2562767197823552, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13189, "loss": 0.2500075101852417, "memory_gb": 7.721559524536133, "step_time_ms": 3358.319044113159, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:44] (step=0013189) Train Loss: 0.2777, Train Steps/Sec: 0.28, Epoch: 0.25629615235134084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13190, "loss": 0.2710953652858734, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5729084014893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:48] (step=0013190) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.25631558492032647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13191, "loss": 0.17683136463165283, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7017784118652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:52] (step=0013191) Train Loss: 0.1797, Train Steps/Sec: 0.28, Epoch: 0.2563350174893121, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13192, "loss": 0.16223369538784027, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2137565612793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:55] (step=0013192) Train Loss: 0.1543, Train Steps/Sec: 0.28, Epoch: 0.2563544500582977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:11:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13193, "loss": 0.1305369734764099, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5618267059326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:11:59] (step=0013193) Train Loss: 0.1741, Train Steps/Sec: 0.28, Epoch: 0.25637388262728333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13194, 
"loss": 0.28349465131759644, "memory_gb": 7.721559524536133, "step_time_ms": 3362.563371658325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:02] (step=0013194) Train Loss: 0.2955, Train Steps/Sec: 0.28, Epoch: 0.25639331519626896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13195, "loss": 0.31874001026153564, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7514934539795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:06] (step=0013195) Train Loss: 0.2967, Train Steps/Sec: 0.28, Epoch: 0.2564127477652546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13196, "loss": 0.21576382219791412, "memory_gb": 7.721559524536133, "step_time_ms": 3345.8030223846436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:09] (step=0013196) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.2564321803342402, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13197, "loss": 0.21006666123867035, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4329357147217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:13] (step=0013197) Train Loss: 0.2207, Train Steps/Sec: 0.28, Epoch: 0.2564516129032258, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13198, "loss": 0.24649256467819214, "memory_gb": 7.721559524536133, "step_time_ms": 3360.454797744751, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:17] (step=0013198) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.25647104547221145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13199, "loss": 0.27790403366088867, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2096099853516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:20] 
(step=0013199) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.25649047804119707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13200, "loss": 0.17699280381202698, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8814945220947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:24] (step=0013200) Train Loss: 0.2068, Train Steps/Sec: 0.27, Epoch: 0.2565099106101827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13201, "loss": 0.3397943377494812, "memory_gb": 7.721559524536133, "step_time_ms": 3354.013442993164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:27] (step=0013201) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.2565293431791683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13202, "loss": 0.258672833442688, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8621616363525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:31] (step=0013202) Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.2565487757481539, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13203, "loss": 0.2599378824234009, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9608459472656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:35] (step=0013203) Train Loss: 0.2152, Train Steps/Sec: 0.28, Epoch: 0.2565682083171395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13204, "loss": 0.24576973915100098, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0278930664062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:38] (step=0013204) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.2565876408861251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:42] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 13205, "loss": 0.25552284717559814, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0891876220703, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:42] (step=0013205) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.25660707345511075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13206, "loss": 0.2404046505689621, "memory_gb": 7.721559524536133, "step_time_ms": 3349.0536212921143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:45] (step=0013206) Train Loss: 0.2703, Train Steps/Sec: 0.28, Epoch: 0.25662650602409637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13207, "loss": 0.22557297348976135, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9349098205566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:49] (step=0013207) Train Loss: 0.2842, Train Steps/Sec: 0.28, Epoch: 0.256645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13208, "loss": 0.17310398817062378, "memory_gb": 7.721559524536133, "step_time_ms": 3342.1943187713623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:52] (step=0013208) Train Loss: 0.1966, Train Steps/Sec: 0.28, Epoch: 0.2566653711620676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:12:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13209, "loss": 0.30094000697135925, "memory_gb": 7.721559524536133, "step_time_ms": 3354.597568511963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:12:56] (step=0013209) Train Loss: 0.2833, Train Steps/Sec: 0.28, Epoch: 0.25668480373105323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13210, "loss": 0.2439352571964264, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9280376434326, "trainable_params": 4718592, "method": 
"lora"} [2025-07-29 13:13:00] (step=0013210) Train Loss: 0.2631, Train Steps/Sec: 0.28, Epoch: 0.25670423630003886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13211, "loss": 0.11262354999780655, "memory_gb": 7.721559524536133, "step_time_ms": 3355.976104736328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:03] (step=0013211) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.2567236688690245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13212, "loss": 0.13558702170848846, "memory_gb": 7.721559524536133, "step_time_ms": 3497.2777366638184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:07] (step=0013212) Train Loss: 0.2030, Train Steps/Sec: 0.28, Epoch: 0.2567431014380101, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13213, "loss": 0.20740878582000732, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0453186035156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:10] (step=0013213) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.2567625340069957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13214, "loss": 0.1429751217365265, "memory_gb": 7.721559524536133, "step_time_ms": 3349.1663932800293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:14] (step=0013214) Train Loss: 0.1562, Train Steps/Sec: 0.28, Epoch: 0.25678196657598135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13215, "loss": 0.3185421824455261, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4726581573486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:17] (step=0013215) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.25680139914496697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 13:13:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13216, "loss": 0.17610836029052734, "memory_gb": 7.721559524536133, "step_time_ms": 3338.7582302093506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:21] (step=0013216) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.2568208317139526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13217, "loss": 0.2949959635734558, "memory_gb": 7.721559524536133, "step_time_ms": 3356.782913208008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:24] (step=0013217) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.2568402642829382, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13218, "loss": 0.24134814739227295, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9329509735107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:28] (step=0013218) Train Loss: 0.2742, Train Steps/Sec: 0.28, Epoch: 0.25685969685192384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13219, "loss": 0.2906828224658966, "memory_gb": 7.721559524536133, "step_time_ms": 3346.851110458374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:32] (step=0013219) Train Loss: 0.2734, Train Steps/Sec: 0.28, Epoch: 0.25687912942090946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13220, "loss": 0.1617325246334076, "memory_gb": 7.721559524536133, "step_time_ms": 3352.759838104248, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:35] (step=0013220) Train Loss: 0.1812, Train Steps/Sec: 0.28, Epoch: 0.2568985619898951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13221, "loss": 0.2661556601524353, "memory_gb": 7.721559524536133, "step_time_ms": 3346.1930751800537, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:39] (step=0013221) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.2569179945588807, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13222, "loss": 0.2065986543893814, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1418781280518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:42] (step=0013222) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.2569374271278663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13223, "loss": 0.2520016133785248, "memory_gb": 7.721559524536133, "step_time_ms": 3351.3081073760986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:46] (step=0013223) Train Loss: 0.2898, Train Steps/Sec: 0.28, Epoch: 0.25695685969685195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13224, "loss": 0.2557764947414398, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7050743103027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:49] (step=0013224) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.25697629226583757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13225, "loss": 0.23521578311920166, "memory_gb": 7.721559524536133, "step_time_ms": 3351.0844707489014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:53] (step=0013225) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.25699572483482314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:13:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13226, "loss": 0.2067689448595047, "memory_gb": 7.721559524536133, "step_time_ms": 3355.881452560425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:13:57] (step=0013226) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.25701515740380876, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13227, "loss": 0.32651275396347046, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5502891540527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:00] (step=0013227) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.2570345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13228, "loss": 0.2546302080154419, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8036556243896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:04] (step=0013228) Train Loss: 0.2475, Train Steps/Sec: 0.28, Epoch: 0.25705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13229, "loss": 0.17992857098579407, "memory_gb": 7.721559524536133, "step_time_ms": 3352.588653564453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:07] (step=0013229) Train Loss: 0.1907, Train Steps/Sec: 0.28, Epoch: 0.2570734551107656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13230, "loss": 0.17652317881584167, "memory_gb": 7.721559524536133, "step_time_ms": 3354.079484939575, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:11] (step=0013230) Train Loss: 0.2289, Train Steps/Sec: 0.28, Epoch: 0.25709288767975125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13231, "loss": 0.3050268590450287, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3291816711426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:14] (step=0013231) Train Loss: 0.2752, Train Steps/Sec: 0.28, Epoch: 0.25711232024873687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13232, "loss": 0.27795591950416565, "memory_gb": 7.721559524536133, 
"step_time_ms": 3354.107141494751, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:18] (step=0013232) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.2571317528177225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13233, "loss": 0.23463965952396393, "memory_gb": 7.721559524536133, "step_time_ms": 3350.149154663086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:22] (step=0013233) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.2571511853867081, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13234, "loss": 0.3213939070701599, "memory_gb": 7.721559524536133, "step_time_ms": 3339.0676975250244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:25] (step=0013234) Train Loss: 0.3434, Train Steps/Sec: 0.29, Epoch: 0.25717061795569374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13235, "loss": 0.23134708404541016, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7001094818115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:29] (step=0013235) Train Loss: 0.2514, Train Steps/Sec: 0.28, Epoch: 0.25719005052467936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13236, "loss": 0.3287777602672577, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2474460601807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:32] (step=0013236) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.257209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13237, "loss": 0.24296364188194275, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1793823242188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:36] (step=0013237) Train Loss: 0.1997, Train Steps/Sec: 0.28, Epoch: 
0.2572289156626506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13238, "loss": 0.22444242238998413, "memory_gb": 7.721559524536133, "step_time_ms": 3344.5301055908203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:39] (step=0013238) Train Loss: 0.1902, Train Steps/Sec: 0.28, Epoch: 0.2572483482316362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13239, "loss": 0.21582180261611938, "memory_gb": 7.721559524536133, "step_time_ms": 3358.999729156494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:43] (step=0013239) Train Loss: 0.2370, Train Steps/Sec: 0.28, Epoch: 0.25726778080062185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13240, "loss": 0.23485299944877625, "memory_gb": 7.715639114379883, "step_time_ms": 3316.8606758117676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:46] (step=0013240) Train Loss: 0.2007, Train Steps/Sec: 0.28, Epoch: 0.25728721336960747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13241, "loss": 0.22862327098846436, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9513092041016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:50] (step=0013241) Train Loss: 0.1854, Train Steps/Sec: 0.28, Epoch: 0.2573066459385931, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13242, "loss": 0.37854838371276855, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2261848449707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:54] (step=0013242) Train Loss: 0.3348, Train Steps/Sec: 0.28, Epoch: 0.2573260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:14:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13243, "loss": 0.1820037066936493, 
"memory_gb": 7.721559524536133, "step_time_ms": 3349.7064113616943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:14:57] (step=0013243) Train Loss: 0.1767, Train Steps/Sec: 0.28, Epoch: 0.25734551107656434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13244, "loss": 0.3152642250061035, "memory_gb": 7.721559524536133, "step_time_ms": 3357.088327407837, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:01] (step=0013244) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.25736494364554996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13245, "loss": 0.26501864194869995, "memory_gb": 7.721559524536133, "step_time_ms": 3359.562873840332, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:04] (step=0013245) Train Loss: 0.2723, Train Steps/Sec: 0.28, Epoch: 0.2573843762145356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13246, "loss": 0.24482478201389313, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0472259521484, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:08] (step=0013246) Train Loss: 0.2743, Train Steps/Sec: 0.28, Epoch: 0.2574038087835212, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13247, "loss": 0.17638719081878662, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1320514678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:12] (step=0013247) Train Loss: 0.1648, Train Steps/Sec: 0.27, Epoch: 0.2574232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13248, "loss": 0.1592358648777008, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5169315338135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:15] (step=0013248) Train Loss: 0.2135, 
Train Steps/Sec: 0.28, Epoch: 0.25744267392149245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13249, "loss": 0.30776357650756836, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9092960357666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:19] (step=0013249) Train Loss: 0.2611, Train Steps/Sec: 0.28, Epoch: 0.257462106490478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13250, "loss": 0.26482850313186646, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7163944244385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:22] (step=0013250) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.25748153905946364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13251, "loss": 0.2657669186592102, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7421741485596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:26] (step=0013251) Train Loss: 0.3078, Train Steps/Sec: 0.28, Epoch: 0.25750097162844926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13252, "loss": 0.29644352197647095, "memory_gb": 7.721559524536133, "step_time_ms": 3351.592779159546, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:29] (step=0013252) Train Loss: 0.2775, Train Steps/Sec: 0.28, Epoch: 0.2575204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13253, "loss": 0.32464849948883057, "memory_gb": 7.721559524536133, "step_time_ms": 3358.004093170166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:33] (step=0013253) Train Loss: 0.2933, Train Steps/Sec: 0.28, Epoch: 0.2575398367664205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13254, 
"loss": 0.3295079469680786, "memory_gb": 7.721559524536133, "step_time_ms": 3349.294424057007, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:36] (step=0013254) Train Loss: 0.3382, Train Steps/Sec: 0.28, Epoch: 0.2575592693354061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13255, "loss": 0.20194226503372192, "memory_gb": 7.721559524536133, "step_time_ms": 3357.456684112549, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:40] (step=0013255) Train Loss: 0.2168, Train Steps/Sec: 0.28, Epoch: 0.25757870190439175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13256, "loss": 0.3091675937175751, "memory_gb": 7.721559524536133, "step_time_ms": 3356.975793838501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:44] (step=0013256) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.25759813447337737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13257, "loss": 0.24992239475250244, "memory_gb": 7.721559524536133, "step_time_ms": 3350.780487060547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:47] (step=0013257) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.257617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13258, "loss": 0.20852415263652802, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5400581359863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:51] (step=0013258) Train Loss: 0.2617, Train Steps/Sec: 0.28, Epoch: 0.2576369996113486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13259, "loss": 0.2773780822753906, "memory_gb": 7.721559524536133, "step_time_ms": 3357.980251312256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:54] 
(step=0013259) Train Loss: 0.2933, Train Steps/Sec: 0.28, Epoch: 0.25765643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:15:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13260, "loss": 0.31199753284454346, "memory_gb": 7.721559524536133, "step_time_ms": 3495.164632797241, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:15:58] (step=0013260) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.25767586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13261, "loss": 0.2178097814321518, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2182655334473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:02] (step=0013261) Train Loss: 0.2490, Train Steps/Sec: 0.28, Epoch: 0.2576952973183055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13262, "loss": 0.20572161674499512, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2544326782227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:05] (step=0013262) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.2577147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13263, "loss": 0.1887291967868805, "memory_gb": 7.721559524536133, "step_time_ms": 3357.633352279663, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:09] (step=0013263) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.2577341624562767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13264, "loss": 0.21526023745536804, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7139797210693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:12] (step=0013264) Train Loss: 0.1845, Train Steps/Sec: 0.28, Epoch: 0.25775359502526235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:16] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 13265, "loss": 0.2420254945755005, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0754318237305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:16] (step=0013265) Train Loss: 0.2363, Train Steps/Sec: 0.28, Epoch: 0.25777302759424797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13266, "loss": 0.218763068318367, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4819049835205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:19] (step=0013266) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.2577924601632336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13267, "loss": 0.29005634784698486, "memory_gb": 7.721559524536133, "step_time_ms": 3361.471176147461, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:23] (step=0013267) Train Loss: 0.2585, Train Steps/Sec: 0.28, Epoch: 0.2578118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13268, "loss": 0.2078249156475067, "memory_gb": 7.721559524536133, "step_time_ms": 3357.165813446045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:27] (step=0013268) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.25783132530120484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13269, "loss": 0.29460409283638, "memory_gb": 7.721559524536133, "step_time_ms": 3343.1310653686523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:30] (step=0013269) Train Loss: 0.2862, Train Steps/Sec: 0.29, Epoch: 0.25785075787019046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13270, "loss": 0.12171252071857452, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9613647460938, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 13:16:34] (step=0013270) Train Loss: 0.1955, Train Steps/Sec: 0.28, Epoch: 0.2578701904391761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13271, "loss": 0.2624828517436981, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6651554107666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:37] (step=0013271) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.2578896230081617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13272, "loss": 0.2021266520023346, "memory_gb": 7.721559524536133, "step_time_ms": 3360.612392425537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:41] (step=0013272) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.2579090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13273, "loss": 0.21219117939472198, "memory_gb": 7.721559524536133, "step_time_ms": 3365.039587020874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:44] (step=0013273) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.2579284881461329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13274, "loss": 0.29880309104919434, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3105964660645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:48] (step=0013274) Train Loss: 0.3082, Train Steps/Sec: 0.28, Epoch: 0.2579479207151185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13275, "loss": 0.1709510087966919, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5638904571533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:52] (step=0013275) Train Loss: 0.1475, Train Steps/Sec: 0.28, Epoch: 0.25796735328410414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 13:16:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13276, "loss": 0.1813032031059265, "memory_gb": 7.721559524536133, "step_time_ms": 3355.348587036133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:55] (step=0013276) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.25798678585308976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:16:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13277, "loss": 0.20677754282951355, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8761768341064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:16:59] (step=0013277) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.2580062184220754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13278, "loss": 0.20713652670383453, "memory_gb": 7.721559524536133, "step_time_ms": 3360.163688659668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:02] (step=0013278) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.258025650991061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13279, "loss": 0.12329784035682678, "memory_gb": 7.721559524536133, "step_time_ms": 3358.58154296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:06] (step=0013279) Train Loss: 0.1826, Train Steps/Sec: 0.28, Epoch: 0.25804508356004663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13280, "loss": 0.38352590799331665, "memory_gb": 7.715639114379883, "step_time_ms": 3326.666831970215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:09] (step=0013280) Train Loss: 0.3378, Train Steps/Sec: 0.28, Epoch: 0.25806451612903225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13281, "loss": 0.2343246340751648, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6574535369873, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:13] (step=0013281) Train Loss: 0.1916, Train Steps/Sec: 0.28, Epoch: 0.2580839486980179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13282, "loss": 0.23361176252365112, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1246185302734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:17] (step=0013282) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.2581033812670035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13283, "loss": 0.1713530421257019, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6531105041504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:20] (step=0013283) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.2581228138359891, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13284, "loss": 0.2714376449584961, "memory_gb": 7.721559524536133, "step_time_ms": 3357.698440551758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:24] (step=0013284) Train Loss: 0.2059, Train Steps/Sec: 0.28, Epoch: 0.25814224640497474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13285, "loss": 0.22618401050567627, "memory_gb": 7.721559524536133, "step_time_ms": 3360.548734664917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:27] (step=0013285) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.25816167897396036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13286, "loss": 0.2315981388092041, "memory_gb": 7.721559524536133, "step_time_ms": 3360.374927520752, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:31] (step=0013286) Train Loss: 0.2406, Train Steps/Sec: 0.28, Epoch: 0.258181111542946, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 13:17:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13287, "loss": 0.20373952388763428, "memory_gb": 7.721559524536133, "step_time_ms": 3356.252670288086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:34] (step=0013287) Train Loss: 0.2406, Train Steps/Sec: 0.28, Epoch: 0.2582005441119316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13288, "loss": 0.3131234645843506, "memory_gb": 7.721559524536133, "step_time_ms": 3358.020305633545, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:38] (step=0013288) Train Loss: 0.2863, Train Steps/Sec: 0.27, Epoch: 0.25821997668091723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13289, "loss": 0.2845447361469269, "memory_gb": 7.721559524536133, "step_time_ms": 3350.020170211792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:42] (step=0013289) Train Loss: 0.3145, Train Steps/Sec: 0.28, Epoch: 0.25823940924990285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13290, "loss": 0.2863565683364868, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8005561828613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:45] (step=0013290) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.2582588418188885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13291, "loss": 0.2046256959438324, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7446937561035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:49] (step=0013291) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.2582782743878741, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13292, "loss": 0.2398928999900818, "memory_gb": 7.721559524536133, "step_time_ms": 
3362.6139163970947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:52] (step=0013292) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.2582977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:17:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13293, "loss": 0.30915457010269165, "memory_gb": 7.721559524536133, "step_time_ms": 3359.464645385742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:17:56] (step=0013293) Train Loss: 0.2574, Train Steps/Sec: 0.28, Epoch: 0.25831713952584534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13294, "loss": 0.2981114387512207, "memory_gb": 7.721559524536133, "step_time_ms": 3358.823537826538, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:00] (step=0013294) Train Loss: 0.2930, Train Steps/Sec: 0.28, Epoch: 0.25833657209483096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13295, "loss": 0.10919024050235748, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7986488342285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:03] (step=0013295) Train Loss: 0.1409, Train Steps/Sec: 0.28, Epoch: 0.25835600466381653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13296, "loss": 0.246545672416687, "memory_gb": 7.721559524536133, "step_time_ms": 3365.006923675537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:07] (step=0013296) Train Loss: 0.2745, Train Steps/Sec: 0.28, Epoch: 0.25837543723280215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13297, "loss": 0.2562379837036133, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3883514404297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:10] (step=0013297) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 
0.2583948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13298, "loss": 0.26915526390075684, "memory_gb": 7.721559524536133, "step_time_ms": 3352.452278137207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:14] (step=0013298) Train Loss: 0.2713, Train Steps/Sec: 0.28, Epoch: 0.2584143023707734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13299, "loss": 0.12800925970077515, "memory_gb": 7.721559524536133, "step_time_ms": 3354.905366897583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:17] (step=0013299) Train Loss: 0.2084, Train Steps/Sec: 0.28, Epoch: 0.258433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13300, "loss": 0.30729538202285767, "memory_gb": 7.721559524536133, "step_time_ms": 3495.192527770996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:21] (step=0013300) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.25845316750874464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13301, "loss": 0.20409417152404785, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2517681121826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:25] (step=0013301) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.25847260007773026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13302, "loss": 0.19114738702774048, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7774295806885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:28] (step=0013302) Train Loss: 0.1759, Train Steps/Sec: 0.28, Epoch: 0.2584920326467159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13303, "loss": 0.25367265939712524, 
"memory_gb": 7.721559524536133, "step_time_ms": 3348.126173019409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:32] (step=0013303) Train Loss: 0.1888, Train Steps/Sec: 0.28, Epoch: 0.2585114652157015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13304, "loss": 0.2356780618429184, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1855087280273, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:35] (step=0013304) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.25853089778468713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13305, "loss": 0.20937398076057434, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8383922576904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:39] (step=0013305) Train Loss: 0.1818, Train Steps/Sec: 0.28, Epoch: 0.25855033035367275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13306, "loss": 0.23028108477592468, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2156944274902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:42] (step=0013306) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.2585697629226584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13307, "loss": 0.2737024426460266, "memory_gb": 7.721559524536133, "step_time_ms": 3349.1382598876953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:46] (step=0013307) Train Loss: 0.2465, Train Steps/Sec: 0.28, Epoch: 0.258589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13308, "loss": 0.17279905080795288, "memory_gb": 7.721559524536133, "step_time_ms": 3348.4108448028564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:18:50] (step=0013308) Train Loss: 0.2269, 
Train Steps/Sec: 0.28, Epoch: 0.2586086280606296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:18:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13309, "loss": 0.23998281359672546, "memory_gb": 7.721559524536133, "step_time_ms": 3350.9960174560547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:18:53] (step=0013309) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.25862806062961524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13310, "loss": 0.207599937915802, "memory_gb": 7.721559524536133, "step_time_ms": 3352.694511413574, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:18:57] (step=0013310) Train Loss: 0.1861, Train Steps/Sec: 0.28, Epoch: 0.25864749319860086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13311, "loss": 0.2811887264251709, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2608280181885, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:00] (step=0013311) Train Loss: 0.3128, Train Steps/Sec: 0.28, Epoch: 0.2586669257675865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13312, "loss": 0.2553102672100067, "memory_gb": 7.721559524536133, "step_time_ms": 3356.792211532593, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:04] (step=0013312) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.2586863583365721, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13313, "loss": 0.3155111074447632, "memory_gb": 7.715639114379883, "step_time_ms": 3321.5060234069824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:07] (step=0013313) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.25870579090555773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13314, "loss": 0.32451969385147095, "memory_gb": 7.721559524536133, "step_time_ms": 3345.9250926971436, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:11] (step=0013314) Train Loss: 0.2931, Train Steps/Sec: 0.28, Epoch: 0.25872522347454335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13315, "loss": 0.1317395269870758, "memory_gb": 7.721559524536133, "step_time_ms": 3356.320381164551, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:15] (step=0013315) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.258744656043529, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13316, "loss": 0.2572081983089447, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9958000183105, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:18] (step=0013316) Train Loss: 0.2733, Train Steps/Sec: 0.28, Epoch: 0.2587640886125146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13317, "loss": 0.2756805419921875, "memory_gb": 7.721559524536133, "step_time_ms": 3356.241464614868, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:22] (step=0013317) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.2587835211815002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13318, "loss": 0.2085004597902298, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5893173217773, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:25] (step=0013318) Train Loss: 0.2456, Train Steps/Sec: 0.28, Epoch: 0.2588029537504858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13319, "loss": 0.1881830096244812, "memory_gb": 7.721559524536133, "step_time_ms": 3353.461503982544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:29] (step=0013319) Train Loss: 0.2425, Train Steps/Sec: 0.28, Epoch: 0.2588223863194714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13320, "loss": 0.251426100730896, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4956912994385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:32] (step=0013320) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.25884181888845703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13321, "loss": 0.24939614534378052, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2821407318115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:36] (step=0013321) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.25886125145744265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13322, "loss": 0.21108421683311462, "memory_gb": 7.721559524536133, "step_time_ms": 3356.492757797241, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:39] (step=0013322) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.2588806840264283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13323, "loss": 0.20793017745018005, "memory_gb": 7.721559524536133, "step_time_ms": 3359.029531478882, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:43] (step=0013323) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.2589001165954139, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13324, "loss": 0.1799391806125641, "memory_gb": 7.721559524536133, "step_time_ms": 3349.6692180633545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:47] (step=0013324) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.2589195491643995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13325, "loss": 0.20922091603279114, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7966480255127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:50] (step=0013325) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.25893898173338514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13326, "loss": 0.2633846402168274, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1202545166016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:54] (step=0013326) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.25895841430237077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:19:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13327, "loss": 0.30176740884780884, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5705032348633, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:19:57] (step=0013327) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.2589778468713564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13328, "loss": 0.19097226858139038, "memory_gb": 7.721559524536133, "step_time_ms": 3352.810859680176, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:01] (step=0013328) Train Loss: 0.2300, Train Steps/Sec: 0.28, Epoch: 0.258997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13329, "loss": 0.11599703133106232, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2813110351562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:04] (step=0013329) Train Loss: 0.1645, Train Steps/Sec: 0.28, Epoch: 0.25901671200932763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13330, "loss": 0.17475315928459167, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5002212524414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:08] (step=0013330) Train Loss: 0.1877, Train Steps/Sec: 0.28, Epoch: 0.25903614457831325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13331, "loss": 0.1723451018333435, "memory_gb": 7.721559524536133, "step_time_ms": 3351.695775985718, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:12] (step=0013331) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.2590555771472989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13332, "loss": 0.31452998518943787, "memory_gb": 7.721559524536133, "step_time_ms": 3356.426477432251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:15] (step=0013332) Train Loss: 0.3575, Train Steps/Sec: 0.28, Epoch: 0.2590750097162845, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13333, "loss": 0.2624998986721039, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6019325256348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:19] (step=0013333) Train Loss: 0.2943, Train Steps/Sec: 0.28, Epoch: 0.2590944422852701, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13334, "loss": 0.1798093020915985, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9436798095703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:22] (step=0013334) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.25911387485425574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13335, "loss": 0.311855673789978, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1741790771484, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:26] (step=0013335) Train Loss: 0.3224, Train Steps/Sec: 0.28, Epoch: 0.25913330742324137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13336, "loss": 0.3351729214191437, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1905574798584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:30] (step=0013336) Train Loss: 0.3008, Train Steps/Sec: 0.27, Epoch: 0.259152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13337, "loss": 0.1528906375169754, "memory_gb": 7.721559524536133, "step_time_ms": 3352.6253700256348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:33] (step=0013337) Train Loss: 0.2167, Train Steps/Sec: 0.28, Epoch: 0.2591721725612126, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13338, "loss": 0.2328280508518219, "memory_gb": 7.721559524536133, "step_time_ms": 3353.564739227295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:37] (step=0013338) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.25919160513019823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13339, "loss": 0.16163751482963562, "memory_gb": 7.721559524536133, "step_time_ms": 3345.120906829834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:40] (step=0013339) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.25921103769918385, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13340, "loss": 0.2208234965801239, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1178188323975, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:44] (step=0013340) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.2592304702681695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13341, "loss": 0.20649486780166626, "memory_gb": 7.721559524536133, "step_time_ms": 3359.51566696167, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:47] (step=0013341) Train Loss: 0.2288, Train Steps/Sec: 0.28, Epoch: 0.2592499028371551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13342, "loss": 0.19438014924526215, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8308238983154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:51] (step=0013342) Train Loss: 0.2446, Train Steps/Sec: 0.29, Epoch: 0.25926933540614067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13343, "loss": 0.17698007822036743, "memory_gb": 7.721559524536133, "step_time_ms": 3360.623836517334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:54] (step=0013343) Train Loss: 0.1850, Train Steps/Sec: 0.28, Epoch: 0.2592887679751263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:20:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13344, "loss": 0.2044447660446167, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6905727386475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:20:58] (step=0013344) Train Loss: 0.2209, Train Steps/Sec: 0.28, Epoch: 0.2593082005441119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13345, "loss": 0.1487586349248886, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7851524353027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:02] (step=0013345) Train Loss: 0.1666, Train Steps/Sec: 0.28, Epoch: 0.25932763311309753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13346, "loss": 0.23711852729320526, "memory_gb": 7.721559524536133, "step_time_ms": 3354.285717010498, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:05] (step=0013346) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.25934706568208316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13347, "loss": 0.13638989627361298, "memory_gb": 7.721559524536133, "step_time_ms": 3496.351718902588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:09] (step=0013347) Train Loss: 0.1542, Train Steps/Sec: 0.28, Epoch: 0.2593664982510688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13348, "loss": 0.323112815618515, "memory_gb": 7.721559524536133, "step_time_ms": 3357.70320892334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:12] (step=0013348) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.2593859308200544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13349, "loss": 0.24037346243858337, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4413528442383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:16] (step=0013349) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.25940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13350, "loss": 0.17836818099021912, "memory_gb": 7.721559524536133, "step_time_ms": 3354.663610458374, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:19] (step=0013350) Train Loss: 0.2447, Train Steps/Sec: 0.28, Epoch: 0.25942479595802564, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13351, "loss": 0.22576113045215607, "memory_gb": 7.721559524536133, "step_time_ms": 3364.861249923706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:23] (step=0013351) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.25944422852701127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13352, "loss": 0.18887120485305786, "memory_gb": 7.721559524536133, "step_time_ms": 3359.429359436035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:26] (step=0013352) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.2594636610959969, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13353, "loss": 0.2660430073738098, "memory_gb": 7.721559524536133, "step_time_ms": 3361.530303955078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:30] (step=0013353) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.2594830936649825, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13354, "loss": 0.2617655396461487, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7184677124023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:34] (step=0013354) Train Loss: 0.1988, Train Steps/Sec: 0.28, Epoch: 0.25950252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13355, "loss": 0.22345076501369476, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4308109283447, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:37] (step=0013355) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.25952195880295376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13356, "loss": 0.19058877229690552, "memory_gb": 7.721559524536133, "step_time_ms": 3360.065460205078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:41] (step=0013356) Train Loss: 0.2010, Train Steps/Sec: 0.28, Epoch: 0.2595413913719394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13357, "loss": 0.21136726438999176, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8296852111816, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:44] (step=0013357) Train Loss: 0.1956, Train Steps/Sec: 0.28, Epoch: 0.259560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13358, "loss": 0.21729090809822083, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0309619903564, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:48] (step=0013358) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.2595802565099106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13359, "loss": 0.2761765718460083, "memory_gb": 7.721559524536133, "step_time_ms": 3361.438035964966, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:51] (step=0013359) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.25959968907889625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13360, "loss": 0.254069983959198, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1952533721924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:55] (step=0013360) Train Loss: 0.2482, Train Steps/Sec: 0.28, Epoch: 0.25961912164788187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:21:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13361, "loss": 0.2472231239080429, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2024269104004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:21:59] (step=0013361) Train Loss: 0.2988, Train Steps/Sec: 0.28, Epoch: 0.2596385542168675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13362, "loss": 0.3050391674041748, "memory_gb": 7.721559524536133, "step_time_ms": 3345.4713821411133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:02] (step=0013362) Train Loss: 0.2912, Train Steps/Sec: 0.28, Epoch: 0.2596579867858531, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13363, "loss": 0.32589250802993774, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6860847473145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:06] (step=0013363) Train Loss: 0.2492, Train Steps/Sec: 0.28, Epoch: 0.25967741935483873, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13364, "loss": 0.2181963473558426, "memory_gb": 7.721559524536133, "step_time_ms": 3362.124443054199, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:09] (step=0013364) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.25969685192382436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13365, "loss": 0.1795400083065033, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9517345428467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:13] (step=0013365) Train Loss: 0.1768, Train Steps/Sec: 0.28, Epoch: 0.2597162844928099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13366, "loss": 0.3016815781593323, "memory_gb": 7.721559524536133, "step_time_ms": 3362.419843673706, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:16] (step=0013366) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.25973571706179555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13367, "loss": 0.266682505607605, "memory_gb": 7.721559524536133, "step_time_ms": 3360.337257385254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:20] (step=0013367) Train Loss: 0.2512, Train Steps/Sec: 0.28, Epoch: 0.25975514963078117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13368, "loss": 0.11733852326869965, "memory_gb": 7.721559524536133, "step_time_ms": 3357.22017288208, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:24] (step=0013368) Train Loss: 0.1870, Train Steps/Sec: 0.28, Epoch: 0.2597745821997668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13369, "loss": 0.1734880805015564, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6677055358887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:27] (step=0013369) Train Loss: 0.1725, Train Steps/Sec: 0.28, Epoch: 0.2597940147687524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13370, "loss": 0.3272397816181183, "memory_gb": 7.721559524536133, "step_time_ms": 3364.121198654175, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:31] (step=0013370) Train Loss: 0.2972, Train Steps/Sec: 0.28, Epoch: 0.25981344733773803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13371, "loss": 0.13696876168251038, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9889278411865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:34] (step=0013371) Train Loss: 0.1434, Train Steps/Sec: 0.28, Epoch: 0.25983287990672366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13372, "loss": 0.24476304650306702, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9157333374023, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:38] (step=0013372) Train Loss: 0.2678, Train Steps/Sec: 0.28, Epoch: 0.2598523124757093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13373, "loss": 0.16631251573562622, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0666522979736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:41] (step=0013373) Train Loss: 0.1976, Train Steps/Sec: 0.28, Epoch: 0.2598717450446949, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13374, "loss": 0.2749090790748596, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1526069641113, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:45] (step=0013374) Train Loss: 0.2892, Train Steps/Sec: 0.28, Epoch: 0.2598911776136805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13375, "loss": 0.2291662096977234, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8973541259766, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:49] (step=0013375) Train Loss: 0.2598, Train Steps/Sec: 0.28, Epoch: 0.25991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13376, "loss": 0.19155636429786682, "memory_gb": 7.721559524536133, "step_time_ms": 3357.689619064331, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:52] (step=0013376) Train Loss: 0.1943, Train Steps/Sec: 0.27, Epoch: 0.25993004275165177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13377, "loss": 0.3357371687889099, "memory_gb": 7.721559524536133, "step_time_ms": 3354.494333267212, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:56] (step=0013377) Train Loss: 0.2971, Train Steps/Sec: 0.28, Epoch: 0.2599494753206374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:22:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13378, "loss": 0.19593545794487, "memory_gb": 7.715639114379883, "step_time_ms": 3307.0878982543945, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:22:59] (step=0013378) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.259968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13379, "loss": 0.23825877904891968, "memory_gb": 7.721559524536133, "step_time_ms": 3354.570150375366, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:03] (step=0013379) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.25998834045860864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13380, "loss": 0.33950039744377136, "memory_gb": 7.715639114379883, "step_time_ms": 3316.667318344116, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:07] (step=0013380) Train Loss: 0.3309, Train Steps/Sec: 0.28, Epoch: 0.26000777302759426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13381, "loss": 0.22567933797836304, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2630672454834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:10] (step=0013381) Train Loss: 0.2633, Train Steps/Sec: 0.28, Epoch: 0.2600272055965799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13382, "loss": 0.30235612392425537, "memory_gb": 7.721559524536133, "step_time_ms": 3349.242687225342, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:14] (step=0013382) Train Loss: 0.2924, Train Steps/Sec: 0.28, Epoch: 0.2600466381655655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13383, "loss": 0.1804458200931549, "memory_gb": 7.721559524536133, "step_time_ms": 3350.17991065979, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:17] (step=0013383) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.2600660707345511, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13384, "loss": 0.2016524225473404, "memory_gb": 7.721559524536133, "step_time_ms": 3361.541748046875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:21] (step=0013384) Train Loss: 0.2254, Train Steps/Sec: 0.28, Epoch: 0.26008550330353675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13385, "loss": 0.24846306443214417, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8198471069336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:25] (step=0013385) Train Loss: 0.2409, Train Steps/Sec: 0.28, Epoch: 0.26010493587252237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13386, "loss": 0.2762347459793091, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1906394958496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:28] (step=0013386) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.260124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13387, "loss": 0.08426405489444733, "memory_gb": 7.721559524536133, "step_time_ms": 3353.276014328003, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:32] (step=0013387) Train Loss: 0.1498, Train Steps/Sec: 0.28, Epoch: 0.2601438010104936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13388, "loss": 0.2547876238822937, "memory_gb": 7.721559524536133, "step_time_ms": 3361.447811126709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:35] (step=0013388) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.2601632335794792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13389, "loss": 0.3092074990272522, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9507598876953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:39] (step=0013389) Train Loss: 0.3258, Train Steps/Sec: 0.28, Epoch: 0.2601826661484648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13390, "loss": 0.14095550775527954, "memory_gb": 7.721559524536133, "step_time_ms": 3352.266311645508, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:42] (step=0013390) Train Loss: 0.2618, Train Steps/Sec: 0.28, Epoch: 0.2602020987174504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13391, "loss": 0.26541876792907715, "memory_gb": 7.721559524536133, "step_time_ms": 3354.858160018921, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:46] (step=0013391) Train Loss: 0.2513, Train Steps/Sec: 0.28, Epoch: 0.26022153128643605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13392, "loss": 0.20452244579792023, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6442260742188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:50] (step=0013392) Train Loss: 0.2414, Train Steps/Sec: 0.28, Epoch: 0.26024096385542167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13393, "loss": 0.30853888392448425, "memory_gb": 7.721559524536133, "step_time_ms": 3356.0385704040527, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:53] (step=0013393) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.2602603964244073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:23:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13394, "loss": 0.2227245569229126, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3995571136475, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:23:57] (step=0013394) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.2602798289933929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13395, "loss": 0.19796936213970184, "memory_gb": 7.721559524536133, "step_time_ms": 3498.1861114501953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:00] (step=0013395) Train Loss: 0.1850, Train Steps/Sec: 0.28, Epoch: 0.26029926156237854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13396, "loss": 0.21794499456882477, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1198196411133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:04] (step=0013396) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.26031869413136416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13397, "loss": 0.26359403133392334, "memory_gb": 7.721559524536133, "step_time_ms": 3341.5515422821045, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:07] (step=0013397) Train Loss: 0.2947, Train Steps/Sec: 0.28, Epoch: 0.2603381267003498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13398, "loss": 0.2497406154870987, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2680740356445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:11] (step=0013398) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.2603575592693354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13399, "loss": 0.29202425479888916, "memory_gb": 7.721559524536133, "step_time_ms": 3335.287570953369, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:14] (step=0013399) Train Loss: 0.2769, Train Steps/Sec: 0.29, Epoch: 0.260376991838321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13400, "loss": 0.23871271312236786, "memory_gb": 7.721559524536133, "step_time_ms": 3339.8849964141846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:18] (step=0013400) Train Loss: 0.2285, Train Steps/Sec: 0.29, Epoch: 0.26039642440730665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13401, "loss": 0.25451189279556274, "memory_gb": 7.721559524536133, "step_time_ms": 3354.052782058716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:21] (step=0013401) Train Loss: 0.2019, Train Steps/Sec: 0.28, Epoch: 0.26041585697629227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13402, "loss": 0.28504735231399536, "memory_gb": 7.721559524536133, "step_time_ms": 3347.597122192383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:25] (step=0013402) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.2604352895452779, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13403, "loss": 0.2651570737361908, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4202575683594, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:29] (step=0013403) Train Loss: 0.2835, Train Steps/Sec: 0.28, Epoch: 0.2604547221142635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13404, "loss": 0.2666049897670746, "memory_gb": 7.721559524536133, "step_time_ms": 3351.336717605591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:32] (step=0013404) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.26047415468324914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13405, "loss": 0.2784840166568756, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4600734710693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:36] (step=0013405) Train Loss: 0.3011, Train Steps/Sec: 0.28, Epoch: 0.26049358725223476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13406, "loss": 0.2637573778629303, "memory_gb": 7.721559524536133, "step_time_ms": 3351.0680198669434, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:39] (step=0013406) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.2605130198212204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13407, "loss": 0.18501316010951996, "memory_gb": 7.721559524536133, "step_time_ms": 3354.361057281494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:43] (step=0013407) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.260532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13408, "loss": 0.26278215646743774, "memory_gb": 7.721559524536133, "step_time_ms": 3356.327533721924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:46] (step=0013408) Train Loss: 0.2750, Train Steps/Sec: 0.28, Epoch: 0.2605518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13409, "loss": 0.2709740698337555, "memory_gb": 7.715639114379883, "step_time_ms": 3317.1463012695312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:50] (step=0013409) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.26057131752817725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13410, "loss": 0.13766895234584808, "memory_gb": 7.721559524536133, "step_time_ms": 3354.189395904541, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:54] (step=0013410) Train Loss: 0.1852, Train Steps/Sec: 0.28, Epoch: 0.26059075009716287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:24:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13411, "loss": 0.17862796783447266, "memory_gb": 7.721559524536133, "step_time_ms": 3348.8941192626953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:24:57] (step=0013411) Train Loss: 0.1545, Train Steps/Sec: 0.28, Epoch: 0.26061018266614844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:25:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13412, "loss": 0.271289587020874, "memory_gb": 7.721559524536133, "step_time_ms": 3344.285726547241, "trainable_params":
4718592, "method": "lora"} [2025-07-29 13:25:01] (step=0013412) Train Loss: 0.2650, Train Steps/Sec: 0.28, Epoch: 0.26062961523513406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13413, "loss": 0.2446843832731247, "memory_gb": 7.721559524536133, "step_time_ms": 3354.214668273926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:04] (step=0013413) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.2606490478041197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13414, "loss": 0.2830306589603424, "memory_gb": 7.721559524536133, "step_time_ms": 3355.804920196533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:08] (step=0013414) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.2606684803731053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13415, "loss": 0.21996667981147766, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1291484832764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:11] (step=0013415) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.2606879129420909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13416, "loss": 0.19380971789360046, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7282009124756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:15] (step=0013416) Train Loss: 0.2335, Train Steps/Sec: 0.28, Epoch: 0.26070734551107655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13417, "loss": 0.28393223881721497, "memory_gb": 7.721559524536133, "step_time_ms": 3354.088544845581, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:18] (step=0013417) Train Loss: 0.2756, Train Steps/Sec: 0.28, Epoch: 0.26072677808006217, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 13:25:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13418, "loss": 0.1958329975605011, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0191650390625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:22] (step=0013418) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.2607462106490478, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13419, "loss": 0.207488015294075, "memory_gb": 7.721559524536133, "step_time_ms": 3345.287561416626, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:26] (step=0013419) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.2607656432180334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13420, "loss": 0.10830289125442505, "memory_gb": 7.721559524536133, "step_time_ms": 3347.5677967071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:29] (step=0013420) Train Loss: 0.1801, Train Steps/Sec: 0.28, Epoch: 0.26078507578701904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13421, "loss": 0.2551259696483612, "memory_gb": 7.721559524536133, "step_time_ms": 3340.3375148773193, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:33] (step=0013421) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.26080450835600466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13422, "loss": 0.3104075789451599, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1879653930664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:36] (step=0013422) Train Loss: 0.2980, Train Steps/Sec: 0.28, Epoch: 0.2608239409249903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13423, "loss": 0.3336249589920044, "memory_gb": 7.721559524536133, "step_time_ms": 
3356.1017513275146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:40] (step=0013423) Train Loss: 0.2590, Train Steps/Sec: 0.27, Epoch: 0.2608433734939759, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13424, "loss": 0.1692974865436554, "memory_gb": 7.721559524536133, "step_time_ms": 3347.5403785705566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:44] (step=0013424) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.2608628060629615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13425, "loss": 0.23373377323150635, "memory_gb": 7.721559524536133, "step_time_ms": 3351.6016006469727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:47] (step=0013425) Train Loss: 0.2583, Train Steps/Sec: 0.28, Epoch: 0.26088223863194715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13426, "loss": 0.22784587740898132, "memory_gb": 7.721559524536133, "step_time_ms": 3344.599485397339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:51] (step=0013426) Train Loss: 0.2251, Train Steps/Sec: 0.28, Epoch: 0.26090167120093277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13427, "loss": 0.23824678361415863, "memory_gb": 7.721559524536133, "step_time_ms": 3350.006580352783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:54] (step=0013427) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.2609211037699184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:25:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13428, "loss": 0.26032158732414246, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4274826049805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:25:58] (step=0013428) Train Loss: 0.2504, Train Steps/Sec: 0.28, Epoch: 
0.260940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13429, "loss": 0.1904151439666748, "memory_gb": 7.721559524536133, "step_time_ms": 3357.926845550537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:01] (step=0013429) Train Loss: 0.2352, Train Steps/Sec: 0.28, Epoch: 0.26095996890788964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13430, "loss": 0.24738729000091553, "memory_gb": 7.721559524536133, "step_time_ms": 3349.67303276062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:05] (step=0013430) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.26097940147687526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13431, "loss": 0.3170897364616394, "memory_gb": 7.721559524536133, "step_time_ms": 3354.525089263916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:09] (step=0013431) Train Loss: 0.3184, Train Steps/Sec: 0.28, Epoch: 0.2609988340458609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13432, "loss": 0.22356164455413818, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7888526916504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:12] (step=0013432) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.2610182666148465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13433, "loss": 0.20240101218223572, "memory_gb": 7.721559524536133, "step_time_ms": 3356.276035308838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:16] (step=0013433) Train Loss: 0.2414, Train Steps/Sec: 0.28, Epoch: 0.26103769918383213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13434, "loss": 0.2910718321800232, 
"memory_gb": 7.721559524536133, "step_time_ms": 3355.4108142852783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:19] (step=0013434) Train Loss: 0.2412, Train Steps/Sec: 0.28, Epoch: 0.2610571317528177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13435, "loss": 0.1538672000169754, "memory_gb": 7.721559524536133, "step_time_ms": 3353.102207183838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:23] (step=0013435) Train Loss: 0.2246, Train Steps/Sec: 0.28, Epoch: 0.2610765643218033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13436, "loss": 0.23802661895751953, "memory_gb": 7.721559524536133, "step_time_ms": 3493.4232234954834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:26] (step=0013436) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.26109599689078894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13437, "loss": 0.38290926814079285, "memory_gb": 7.715639114379883, "step_time_ms": 3322.2718238830566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:30] (step=0013437) Train Loss: 0.3102, Train Steps/Sec: 0.28, Epoch: 0.26111542945977456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13438, "loss": 0.21332979202270508, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4068546295166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:34] (step=0013438) Train Loss: 0.1893, Train Steps/Sec: 0.28, Epoch: 0.2611348620287602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13439, "loss": 0.2938691973686218, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6581478118896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:37] (step=0013439) Train Loss: 
0.2869, Train Steps/Sec: 0.28, Epoch: 0.2611542945977458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13440, "loss": 0.2805994749069214, "memory_gb": 7.721559524536133, "step_time_ms": 3356.328010559082, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:41] (step=0013440) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.26117372716673143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13441, "loss": 0.19062072038650513, "memory_gb": 7.721559524536133, "step_time_ms": 3356.956958770752, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:44] (step=0013441) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.26119315973571705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13442, "loss": 0.2243548184633255, "memory_gb": 7.721559524536133, "step_time_ms": 3355.191707611084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:48] (step=0013442) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.2612125923047027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13443, "loss": 0.17042580246925354, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5762977600098, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:51] (step=0013443) Train Loss: 0.2036, Train Steps/Sec: 0.28, Epoch: 0.2612320248736883, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13444, "loss": 0.23197278380393982, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2776832580566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:55] (step=0013444) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.2612514574426739, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13445, 
"loss": 0.23915472626686096, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8662147521973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:26:59] (step=0013445) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.26127089001165954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13446, "loss": 0.13896659016609192, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2522144317627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:02] (step=0013446) Train Loss: 0.1392, Train Steps/Sec: 0.28, Epoch: 0.26129032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13447, "loss": 0.1373416781425476, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8526458740234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:06] (step=0013447) Train Loss: 0.1575, Train Steps/Sec: 0.28, Epoch: 0.2613097551496308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13448, "loss": 0.24731378257274628, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7978916168213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:09] (step=0013448) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.2613291877186164, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13449, "loss": 0.2666672170162201, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2279891967773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:13] (step=0013449) Train Loss: 0.2809, Train Steps/Sec: 0.28, Epoch: 0.26134862028760203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13450, "loss": 0.25462913513183594, "memory_gb": 7.721559524536133, "step_time_ms": 3366.246223449707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:16] 
(step=0013450) Train Loss: 0.2930, Train Steps/Sec: 0.28, Epoch: 0.26136805285658765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13451, "loss": 0.31464773416519165, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4820079803467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:20] (step=0013451) Train Loss: 0.2814, Train Steps/Sec: 0.28, Epoch: 0.2613874854255733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13452, "loss": 0.21905645728111267, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5923442840576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:24] (step=0013452) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.2614069179945589, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13453, "loss": 0.21753326058387756, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7146530151367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:27] (step=0013453) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.2614263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13454, "loss": 0.22235971689224243, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4345111846924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:31] (step=0013454) Train Loss: 0.2245, Train Steps/Sec: 0.28, Epoch: 0.26144578313253014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13455, "loss": 0.21064452826976776, "memory_gb": 7.721559524536133, "step_time_ms": 3359.377384185791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:34] (step=0013455) Train Loss: 0.2342, Train Steps/Sec: 0.28, Epoch: 0.26146521570151576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:38] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 13456, "loss": 0.217695415019989, "memory_gb": 7.721559524536133, "step_time_ms": 3359.032154083252, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:38] (step=0013456) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.2614846482705014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13457, "loss": 0.2751394510269165, "memory_gb": 7.721559524536133, "step_time_ms": 3354.984998703003, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:41] (step=0013457) Train Loss: 0.2288, Train Steps/Sec: 0.28, Epoch: 0.261504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13458, "loss": 0.23933687806129456, "memory_gb": 7.721559524536133, "step_time_ms": 3360.765218734741, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:45] (step=0013458) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.2615235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13459, "loss": 0.20672491192817688, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0806045532227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:49] (step=0013459) Train Loss: 0.2254, Train Steps/Sec: 0.28, Epoch: 0.2615429459774582, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13460, "loss": 0.1824887990951538, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6796474456787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:52] (step=0013460) Train Loss: 0.1984, Train Steps/Sec: 0.28, Epoch: 0.2615623785464438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13461, "loss": 0.1303068995475769, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3383560180664, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 13:27:56] (step=0013461) Train Loss: 0.1923, Train Steps/Sec: 0.28, Epoch: 0.26158181111542944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:27:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13462, "loss": 0.24312502145767212, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4882488250732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:27:59] (step=0013462) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.26160124368441506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13463, "loss": 0.2689736485481262, "memory_gb": 7.721559524536133, "step_time_ms": 3356.703519821167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:03] (step=0013463) Train Loss: 0.2999, Train Steps/Sec: 0.28, Epoch: 0.2616206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13464, "loss": 0.31323498487472534, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1320304870605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:07] (step=0013464) Train Loss: 0.2850, Train Steps/Sec: 0.27, Epoch: 0.2616401088223863, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13465, "loss": 0.28750351071357727, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9914054870605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:10] (step=0013465) Train Loss: 0.2754, Train Steps/Sec: 0.28, Epoch: 0.26165954139137193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13466, "loss": 0.25669312477111816, "memory_gb": 7.721559524536133, "step_time_ms": 3355.461835861206, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:14] (step=0013466) Train Loss: 0.2760, Train Steps/Sec: 0.28, Epoch: 0.26167897396035755, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 13:28:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13467, "loss": 0.21100710332393646, "memory_gb": 7.721559524536133, "step_time_ms": 3360.072135925293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:17] (step=0013467) Train Loss: 0.2189, Train Steps/Sec: 0.28, Epoch: 0.2616984065293432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13468, "loss": 0.21595969796180725, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3601455688477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:21] (step=0013468) Train Loss: 0.2085, Train Steps/Sec: 0.28, Epoch: 0.2617178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13469, "loss": 0.3264045715332031, "memory_gb": 7.721559524536133, "step_time_ms": 3360.201120376587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:24] (step=0013469) Train Loss: 0.3111, Train Steps/Sec: 0.28, Epoch: 0.2617372716673144, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13470, "loss": 0.277316153049469, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3273887634277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:28] (step=0013470) Train Loss: 0.2691, Train Steps/Sec: 0.28, Epoch: 0.26175670423630004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13471, "loss": 0.30111488699913025, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5313301086426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:32] (step=0013471) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.26177613680528566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13472, "loss": 0.21788346767425537, "memory_gb": 7.721559524536133, "step_time_ms": 
3355.6723594665527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:35] (step=0013472) Train Loss: 0.2339, Train Steps/Sec: 0.28, Epoch: 0.2617955693742713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13473, "loss": 0.15358442068099976, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0175170898438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:39] (step=0013473) Train Loss: 0.2160, Train Steps/Sec: 0.28, Epoch: 0.2618150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13474, "loss": 0.16072040796279907, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1127395629883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:42] (step=0013474) Train Loss: 0.1958, Train Steps/Sec: 0.28, Epoch: 0.26183443451224253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13475, "loss": 0.10164053738117218, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3641777038574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:46] (step=0013475) Train Loss: 0.1520, Train Steps/Sec: 0.28, Epoch: 0.26185386708122815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13476, "loss": 0.10353413224220276, "memory_gb": 7.721559524536133, "step_time_ms": 3352.512836456299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:49] (step=0013476) Train Loss: 0.1908, Train Steps/Sec: 0.28, Epoch: 0.2618732996502138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13477, "loss": 0.26647788286209106, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0780029296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:53] (step=0013477) Train Loss: 0.2864, Train Steps/Sec: 0.28, Epoch: 
0.2618927322191994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:28:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13478, "loss": 0.2428874969482422, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1574897766113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:28:57] (step=0013478) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.261912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13479, "loss": 0.22161519527435303, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7660789489746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:00] (step=0013479) Train Loss: 0.2449, Train Steps/Sec: 0.28, Epoch: 0.26193159735717064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13480, "loss": 0.21416278183460236, "memory_gb": 7.721559524536133, "step_time_ms": 3366.692066192627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:04] (step=0013480) Train Loss: 0.2077, Train Steps/Sec: 0.28, Epoch: 0.26195102992615626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13481, "loss": 0.18024730682373047, "memory_gb": 7.721559524536133, "step_time_ms": 3357.111692428589, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:07] (step=0013481) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.26197046249514183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13482, "loss": 0.21871909499168396, "memory_gb": 7.721559524536133, "step_time_ms": 3350.407361984253, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:11] (step=0013482) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.26198989506412745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13483, "loss": 0.23527908325195312, 
"memory_gb": 7.721559524536133, "step_time_ms": 3358.1478595733643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:15] (step=0013483) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.2620093276331131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13484, "loss": 0.18831662833690643, "memory_gb": 7.721559524536133, "step_time_ms": 3495.1281547546387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:18] (step=0013484) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.2620287602020987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13485, "loss": 0.2288724035024643, "memory_gb": 7.721559524536133, "step_time_ms": 3353.283166885376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:22] (step=0013485) Train Loss: 0.1867, Train Steps/Sec: 0.28, Epoch: 0.2620481927710843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13486, "loss": 0.17357110977172852, "memory_gb": 7.721559524536133, "step_time_ms": 3356.400728225708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:25] (step=0013486) Train Loss: 0.1744, Train Steps/Sec: 0.28, Epoch: 0.26206762534006994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13487, "loss": 0.21226446330547333, "memory_gb": 7.721559524536133, "step_time_ms": 3347.3916053771973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:29] (step=0013487) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.26208705790905557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:29:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13488, "loss": 0.2283039093017578, "memory_gb": 7.721559524536133, "step_time_ms": 3356.757879257202, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:29:32] (step=0013488) Train Loss: 0.2151, 
Train Steps/Sec: 0.28, Epoch: 0.2621064904780412, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:29:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13489, "loss": 0.2906728982925415, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2100143432617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:29:36] (step=0013489) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.2621259230470268, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:29:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13490, "loss": 0.24822142720222473, "memory_gb": 7.721559524536133, "step_time_ms": 3357.433080673218, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:29:40] (step=0013490) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.26214535561601243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:29:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13491, "loss": 0.24544809758663177, "memory_gb": 7.721559524536133, "step_time_ms": 3357.611894607544, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:29:43] (step=0013491) Train Loss: 0.2173, Train Steps/Sec: 0.28, Epoch: 0.26216478818499805, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:29:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13492, "loss": 0.3186247646808624, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4512004852295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:29:47] (step=0013492) Train Loss: 0.2987, Train Steps/Sec: 0.28, Epoch: 0.2621842207539837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:29:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13493, "loss": 0.19348397850990295, "memory_gb": 7.721559524536133, "step_time_ms": 3354.816675186157, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:29:50] (step=0013493) Train Loss: 0.1866, Train Steps/Sec: 0.28, Epoch: 0.2622036533229693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:29:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13494, "loss": 0.29521217942237854, "memory_gb": 7.715639114379883, "step_time_ms": 3320.3275203704834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:29:54] (step=0013494) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.2622230858919549, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:29:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13495, "loss": 0.24813920259475708, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1619262695312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:29:57] (step=0013495) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 0.26224251846094054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13496, "loss": 0.25529196858406067, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1535816192627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:01] (step=0013496) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.26226195102992617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13497, "loss": 0.17461425065994263, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0152263641357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:05] (step=0013497) Train Loss: 0.1797, Train Steps/Sec: 0.28, Epoch: 0.2622813835989118, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13498, "loss": 0.34327614307403564, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1297397613525, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:08] (step=0013498) Train Loss: 0.2735, Train Steps/Sec: 0.28, Epoch: 0.2623008161678974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13499, "loss": 0.20547166466712952, "memory_gb": 7.721559524536133, "step_time_ms": 3358.187675476074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:12] (step=0013499) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.26232024873688303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13500, "loss": 0.24451503157615662, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4960231781006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:15] (step=0013500) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.26233968130586865, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13501, "loss": 0.3013758063316345, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0072441101074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:19] (step=0013501) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.2623591138748543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13502, "loss": 0.21998099982738495, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9323806762695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:22] (step=0013502) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.2623785464438399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13503, "loss": 0.18117904663085938, "memory_gb": 7.721559524536133, "step_time_ms": 3337.9251956939697, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:26] (step=0013503) Train Loss: 0.2371, Train Steps/Sec: 0.29, Epoch: 0.2623979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13504, "loss": 0.2817190885543823, "memory_gb": 7.721559524536133, "step_time_ms": 3349.0288257598877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:29] (step=0013504) Train Loss: 0.2719, Train Steps/Sec: 0.28, Epoch: 0.2624174115818111, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13505, "loss": 0.31631773710250854, "memory_gb": 7.721559524536133, "step_time_ms": 3347.829818725586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:33] (step=0013505) Train Loss: 0.3048, Train Steps/Sec: 0.28, Epoch: 0.2624368441507967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13506, "loss": 0.3374159336090088, "memory_gb": 7.721559524536133, "step_time_ms": 3352.520704269409, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:37] (step=0013506) Train Loss: 0.3133, Train Steps/Sec: 0.28, Epoch: 0.26245627671978233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13507, "loss": 0.21658022701740265, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9349517822266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:40] (step=0013507) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.26247570928876796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13508, "loss": 0.2079782485961914, "memory_gb": 7.721559524536133, "step_time_ms": 3346.656084060669, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:44] (step=0013508) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.2624951418577536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13509, "loss": 0.2513297498226166, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9088497161865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:47] (step=0013509) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.2625145744267392, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13510, "loss": 0.26781314611434937, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1342487335205, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:51] (step=0013510) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.2625340069957248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13511, "loss": 0.2921818792819977, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2825450897217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:54] (step=0013511) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.26255343956471044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:30:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13512, "loss": 0.32185301184654236, "memory_gb": 7.721559524536133, "step_time_ms": 3357.970952987671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:30:58] (step=0013512) Train Loss: 0.2616, Train Steps/Sec: 0.27, Epoch: 0.26257287213369607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13513, "loss": 0.25108638405799866, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4749488830566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:02] (step=0013513) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.2625923047026817, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13514, "loss": 0.15542668104171753, "memory_gb": 7.721559524536133, "step_time_ms": 3356.449604034424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:05] (step=0013514) Train Loss: 0.1852, Train Steps/Sec: 0.28, Epoch: 0.2626117372716673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13515, "loss": 0.24553236365318298, "memory_gb": 7.721559524536133, "step_time_ms": 3354.774236679077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:09] (step=0013515) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.26263116984065293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13516, "loss": 0.2046598345041275, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5712699890137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:12] (step=0013516) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.26265060240963856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13517, "loss": 0.27324795722961426, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4363975524902, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:16] (step=0013517) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.2626700349786242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13518, "loss": 0.24371075630187988, "memory_gb": 7.721559524536133, "step_time_ms": 3348.550319671631, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:20] (step=0013518) Train Loss: 0.2524, Train Steps/Sec: 0.28, Epoch: 0.2626894675476098, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13519, "loss": 0.2928769886493683, "memory_gb": 7.721559524536133, "step_time_ms": 3355.259656906128, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:23] (step=0013519) Train Loss: 0.3282, Train Steps/Sec: 0.28, Epoch: 0.2627089001165954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13520, "loss": 0.28478798270225525, "memory_gb": 7.721559524536133, "step_time_ms": 3353.666305541992, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:27] (step=0013520) Train Loss: 0.2350, Train Steps/Sec: 0.28, Epoch: 0.26272833268558105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13521, "loss": 0.21248465776443481, "memory_gb": 7.721559524536133, "step_time_ms": 3351.840019226074, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:30] (step=0013521) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.26274776525456667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13522, "loss": 0.24463960528373718, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8969078063965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:34] (step=0013522) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.2627671978235523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13523, "loss": 0.28911006450653076, "memory_gb": 7.721559524536133, "step_time_ms": 3352.710485458374, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:37] (step=0013523) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.2627866303925379, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13524, "loss": 0.27503496408462524, "memory_gb": 7.721559524536133, "step_time_ms": 3489.9532794952393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:41] (step=0013524) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.26280606296152353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13525, "loss": 0.18979188799858093, "memory_gb": 7.721559524536133, "step_time_ms": 3350.649356842041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:45] (step=0013525) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.26282549553050916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13526, "loss": 0.1448453813791275, "memory_gb": 7.721559524536133, "step_time_ms": 3341.5544033050537, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:48] (step=0013526) Train Loss: 0.1719, Train Steps/Sec: 0.28, Epoch: 0.2628449280994948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13527, "loss": 0.3198317289352417, "memory_gb": 7.721559524536133, "step_time_ms": 3349.858283996582, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:52] (step=0013527) Train Loss: 0.2685, Train Steps/Sec: 0.28, Epoch: 0.26286436066848035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13528, "loss": 0.1558341532945633, "memory_gb": 7.721559524536133, "step_time_ms": 3352.421283721924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:55] (step=0013528) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.26288379323746597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:31:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13529, "loss": 0.212794691324234, "memory_gb": 7.721559524536133, "step_time_ms": 3349.0264415740967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:31:59] (step=0013529) Train Loss: 0.2775, Train Steps/Sec: 0.28, Epoch: 0.2629032258064516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13530, "loss": 0.23925600945949554, "memory_gb": 7.721559524536133, "step_time_ms": 3338.930606842041, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:02] (step=0013530) Train Loss: 0.2506, Train Steps/Sec: 0.28, Epoch: 0.2629226583754372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13531, "loss": 0.22603119909763336, "memory_gb": 7.721559524536133, "step_time_ms": 3356.40549659729, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:06] (step=0013531) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.26294209094442283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13532, "loss": 0.2605954706668854, "memory_gb": 7.721559524536133, "step_time_ms": 3351.397752761841, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:09] (step=0013532) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.26296152351340846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13533, "loss": 0.20510323345661163, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6320667266846, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:13] (step=0013533) Train Loss: 0.2347, Train Steps/Sec: 0.28, Epoch: 0.2629809560823941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13534, "loss": 0.19302718341350555, "memory_gb": 7.721559524536133, "step_time_ms": 3352.023124694824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:17] (step=0013534) Train Loss: 0.1858, Train Steps/Sec: 0.28, Epoch: 0.2630003886513797, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13535, "loss": 0.2521459460258484, "memory_gb": 7.721559524536133, "step_time_ms": 3351.4442443847656, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:20] (step=0013535) Train Loss: 0.2713, Train Steps/Sec: 0.28, Epoch: 0.2630198212203653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13536, "loss": 0.4288865923881531, "memory_gb": 7.715639114379883, "step_time_ms": 3316.6792392730713, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:24] (step=0013536) Train Loss: 0.2945, Train Steps/Sec: 0.28, Epoch: 0.26303925378935095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13537, "loss": 0.19478626549243927, "memory_gb": 7.721559524536133, "step_time_ms": 3353.564977645874, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:27] (step=0013537) Train Loss: 0.2165, Train Steps/Sec: 0.28, Epoch: 0.26305868635833657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13538, "loss": 0.27582019567489624, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3009757995605, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:31] (step=0013538) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.2630781189273222, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13539, "loss": 0.15652118623256683, "memory_gb": 7.721559524536133, "step_time_ms": 3352.137565612793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:34] (step=0013539) Train Loss: 0.1726, Train Steps/Sec: 0.28, Epoch: 0.2630975514963078, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13540, "loss": 0.19669368863105774, "memory_gb": 7.721559524536133, "step_time_ms": 3352.980613708496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:38] (step=0013540) Train Loss: 0.2015, Train Steps/Sec: 0.28, Epoch: 0.26311698406529344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13541, "loss": 0.21739846467971802, "memory_gb": 7.721559524536133, "step_time_ms": 3353.965997695923, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:42] (step=0013541) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.26313641663427906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13542, "loss": 0.1542748659849167, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9048385620117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:45] (step=0013542) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.2631558492032647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13543, "loss": 0.2231009304523468, "memory_gb": 7.721559524536133, "step_time_ms": 3353.51824760437, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:49] (step=0013543) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.2631752817722503, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13544, "loss": 0.18724772334098816, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4951934814453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:52] (step=0013544) Train Loss: 0.1852, Train Steps/Sec: 0.28, Epoch: 0.2631947143412359, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13545, "loss": 0.1989302784204483, "memory_gb": 7.721559524536133, "step_time_ms": 3354.9413681030273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:56] (step=0013545) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.26321414691022155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:32:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13546, "loss": 0.20533642172813416, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7204990386963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:32:59] (step=0013546) Train Loss: 0.1977, Train Steps/Sec: 0.28, Epoch: 0.26323357947920717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13547, "loss": 0.1683528572320938, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4789085388184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:03] (step=0013547) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.2632530120481928, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13548, "loss": 0.3875453770160675, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6980152130127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:07] (step=0013548) Train Loss: 0.3332, Train Steps/Sec: 0.28, Epoch: 0.2632724446171784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13549, "loss": 0.3071705102920532, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4393520355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:10] (step=0013549) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.26329187718616404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13550, "loss": 0.19997237622737885, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2463989257812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:14] (step=0013550) Train Loss: 0.2471, Train Steps/Sec: 0.28, Epoch: 0.26331130975514966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13551, "loss": 0.25059646368026733, "memory_gb": 7.721559524536133, "step_time_ms": 3353.689193725586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:17] (step=0013551) Train Loss: 0.2910, Train Steps/Sec: 0.28, Epoch: 0.2633307423241352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13552, "loss": 0.2921169400215149, "memory_gb": 7.715639114379883, "step_time_ms": 3320.6334114074707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:21] (step=0013552) Train Loss: 0.2744, Train Steps/Sec: 0.27, Epoch: 0.26335017489312085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13553, "loss": 0.18855473399162292, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9023399353027, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:24] (step=0013553) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.26336960746210647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13554, "loss": 0.2494385540485382, "memory_gb": 7.721559524536133, "step_time_ms": 3349.4882583618164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:28] (step=0013554) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.2633890400310921, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13555, "loss": 0.32023999094963074, "memory_gb": 7.721559524536133, "step_time_ms": 3354.851007461548, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:32] (step=0013555) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.2634084726000777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13556, "loss": 0.29444950819015503, "memory_gb": 7.721559524536133, "step_time_ms": 3354.403257369995, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:35] (step=0013556) Train Loss: 0.2541, Train Steps/Sec: 0.28, Epoch: 0.26342790516906334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13557, "loss": 0.28100723028182983, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1477661132812, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:39] (step=0013557) Train Loss: 0.2676, Train Steps/Sec: 0.28, Epoch: 0.26344733773804896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13558, "loss": 0.19143854081630707, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9182624816895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:42] (step=0013558) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.2634667703070346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13559, "loss": 0.2533246576786041, "memory_gb": 7.721559524536133, "step_time_ms": 3349.299192428589, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:46] (step=0013559) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.2634862028760202, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13560, "loss": 0.2456313669681549, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0849895477295, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:49] (step=0013560) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.2635056354450058, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13561, "loss": 0.20510414242744446, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2868576049805, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:53] (step=0013561) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.26352506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:33:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13562, "loss": 0.21298497915267944, "memory_gb": 7.721559524536133, "step_time_ms": 3350.4929542541504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:33:57] (step=0013562) Train Loss: 0.2588, Train Steps/Sec: 0.28, Epoch: 0.26354450058297707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13563, "loss": 0.11868979036808014, "memory_gb": 7.721559524536133, "step_time_ms": 3345.889091491699, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:00] (step=0013563) Train Loss: 0.1767, Train Steps/Sec: 0.28, Epoch: 0.2635639331519627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13564, "loss": 0.312099814414978, "memory_gb": 7.721559524536133, "step_time_ms": 3356.449604034424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:04] (step=0013564) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.2635833657209483, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13565, "loss": 0.2607715427875519, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0716590881348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:07] (step=0013565) Train Loss: 0.2581, Train Steps/Sec: 0.28, Epoch: 0.26360279828993394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13566, "loss": 0.20340673625469208, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0969619750977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:11] (step=0013566) Train Loss: 0.1841, Train Steps/Sec: 0.28, Epoch: 0.26362223085891956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13567, "loss": 0.19347676634788513, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7825088500977, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:14] (step=0013567) Train Loss: 0.1822, Train Steps/Sec: 0.28, Epoch: 0.2636416634279052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13568, "loss": 0.31869930028915405, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4931621551514, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:18] (step=0013568) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.2636610959968908, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13569, "loss": 0.2191699594259262, "memory_gb": 7.721559524536133, "step_time_ms": 3355.903387069702, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:22] (step=0013569) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.2636805285658764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13570, "loss": 0.17027106881141663, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9627017974854, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:25] (step=0013570) Train Loss: 0.2044, Train Steps/Sec: 0.28, Epoch: 0.26369996113486205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13571, "loss": 0.12595289945602417, "memory_gb": 7.721559524536133, "step_time_ms": 3495.642900466919, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:29] (step=0013571) Train Loss: 0.2064, Train Steps/Sec: 0.28, Epoch: 0.26371939370384767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13572, "loss": 0.10508435219526291, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8263568878174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:32] (step=0013572) Train Loss: 0.1987, Train Steps/Sec: 0.28, Epoch: 0.2637388262728333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13573, "loss": 0.2565405070781708, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6034564971924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:36] (step=0013573) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.2637582588418189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13574, "loss": 0.2492614984512329, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7142791748047, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:39] (step=0013574) Train Loss: 0.3084, Train Steps/Sec: 0.28, Epoch: 0.2637776914108045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13575, "loss": 0.1869039535522461, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7167778015137, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:43] (step=0013575) Train Loss: 0.1547, Train Steps/Sec: 0.28, Epoch: 0.2637971239797901, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13576, "loss": 0.18987765908241272, "memory_gb": 7.721559524536133, "step_time_ms": 3358.593463897705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:47] (step=0013576) Train Loss: 0.2107, Train Steps/Sec: 0.28, Epoch: 0.2638165565487757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13577, "loss": 0.25951769948005676, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6206436157227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:50] (step=0013577) Train Loss: 0.2499, Train Steps/Sec: 0.28, Epoch: 0.26383598911776135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13578, "loss": 0.16892841458320618, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3899269104004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:54] (step=0013578) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.26385542168674697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:34:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13579, "loss": 0.3434835970401764, "memory_gb": 7.721559524536133, "step_time_ms": 3343.4276580810547, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:34:57] (step=0013579) Train Loss: 0.2702, Train Steps/Sec: 0.28, Epoch: 0.2638748542557326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13580, "loss": 0.1814752072095871, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1668815612793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:01] (step=0013580) Train Loss: 0.1747, Train Steps/Sec: 0.28, Epoch: 0.2638942868247182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13581, "loss": 0.29164186120033264, "memory_gb": 7.721559524536133, "step_time_ms": 3362.87260055542, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:04] (step=0013581) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.26391371939370384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13582, "loss": 0.20353421568870544, "memory_gb": 7.721559524536133, "step_time_ms": 3358.668804168701, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:08] (step=0013582) Train Loss: 0.1891, Train Steps/Sec: 0.28, Epoch: 0.26393315196268946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13583, "loss": 0.13648319244384766, "memory_gb": 7.721559524536133, "step_time_ms": 3360.546350479126, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:12] (step=0013583) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.2639525845316751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13584, "loss": 0.13082271814346313, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1973571777344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:15] (step=0013584) Train Loss: 0.1834, Train Steps/Sec: 0.28, Epoch: 0.2639720171006607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13585, "loss": 0.2349696159362793, "memory_gb": 7.721559524536133, "step_time_ms": 3357.807159423828, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:19] (step=0013585) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.2639914496696463, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13586, "loss": 0.2528592348098755, "memory_gb": 7.721559524536133, "step_time_ms": 3356.323480606079, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:22] (step=0013586) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.26401088223863195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13587, "loss": 0.29195207357406616, "memory_gb": 7.721559524536133, "step_time_ms": 3349.344253540039, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:26] (step=0013587) Train Loss: 0.2588, Train Steps/Sec: 0.28, Epoch: 0.26403031480761757, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13588, "loss": 0.25622421503067017, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1252098083496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:30] (step=0013588) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.2640497473766032, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13589, "loss": 0.21952606737613678, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2676906585693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:33] (step=0013589) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.2640691799455888, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13590, "loss": 0.2983889579772949, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4745864868164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:37] (step=0013590) Train Loss: 0.2889, Train Steps/Sec: 0.28, Epoch: 0.26408861251457444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13591, "loss": 0.21690420806407928, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7448177337646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:40] (step=0013591) Train Loss: 0.2916, Train Steps/Sec: 0.28, Epoch: 0.26410804508356006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13592, "loss": 0.2018904834985733, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7771396636963, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:44] (step=0013592) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.2641274776525457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13593, "loss": 0.2693682909011841, "memory_gb": 7.721559524536133, "step_time_ms": 3351.874589920044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:47] (step=0013593) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.2641469102215313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13594, "loss": 0.3416174054145813, "memory_gb": 7.721559524536133, "step_time_ms": 3358.114719390869, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:51] (step=0013594) Train Loss: 0.3196, Train Steps/Sec: 0.28, Epoch: 0.26416634279051693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13595, "loss": 0.2755690813064575, "memory_gb": 7.721559524536133, "step_time_ms": 3354.00652885437, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:55] (step=0013595) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.26418577535950255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13596, "loss": 0.1591658890247345, "memory_gb": 7.721559524536133, "step_time_ms": 3351.8731594085693, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:35:58] (step=0013596) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.2642052079284882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:36:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13597, "loss": 0.2633107900619507, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6816787719727, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:36:02] (step=0013597) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.26422464049747374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:36:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13598, "loss": 0.2637491226196289, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5145778656006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:36:05] (step=0013598) Train Loss: 0.3007, Train Steps/Sec: 0.28, Epoch: 0.26424407306645936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:36:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13599, "loss": 0.24371661245822906, "memory_gb": 7.721559524536133, "step_time_ms": 3359.931230545044, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:36:09] (step=0013599) Train Loss: 0.2545, Train Steps/Sec: 0.27, Epoch: 0.264263505635445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:36:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13600, "loss": 0.23046354949474335, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3667068481445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:36:13] (step=0013600) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.2642829382044306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13601, "loss": 0.2533419728279114, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3404083251953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:36:16] (step=0013601) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 0.26430237077341623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:36:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13602, "loss": 0.22126205265522003, "memory_gb": 7.721559524536133, "step_time_ms": 3356.58860206604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:36:20] (step=0013602) Train Loss: 0.2150, Train Steps/Sec: 0.28, Epoch: 0.26432180334240185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:36:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13603, "loss": 0.21037545800209045,
"memory_gb": 7.721559524536133, "step_time_ms": 3350.8810997009277, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:23] (step=0013603) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.2643412359113875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13604, "loss": 0.17231318354606628, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6596508026123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:27] (step=0013604) Train Loss: 0.1785, Train Steps/Sec: 0.28, Epoch: 0.2643606684803731, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13605, "loss": 0.248394176363945, "memory_gb": 7.721559524536133, "step_time_ms": 3353.604793548584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:30] (step=0013605) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.2643801010493587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13606, "loss": 0.29516616463661194, "memory_gb": 7.721559524536133, "step_time_ms": 3345.2303409576416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:34] (step=0013606) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.26439953361834434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13607, "loss": 0.21213389933109283, "memory_gb": 7.721559524536133, "step_time_ms": 3346.938371658325, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:38] (step=0013607) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 0.26441896618732996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13608, "loss": 0.17048513889312744, "memory_gb": 7.721559524536133, "step_time_ms": 3352.158546447754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:41] (step=0013608) Train Loss: 0.1924, 
Train Steps/Sec: 0.28, Epoch: 0.2644383987563156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13609, "loss": 0.2375515103340149, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8570919036865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:45] (step=0013609) Train Loss: 0.2649, Train Steps/Sec: 0.28, Epoch: 0.2644578313253012, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13610, "loss": 0.331216037273407, "memory_gb": 7.721559524536133, "step_time_ms": 3352.365493774414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:48] (step=0013610) Train Loss: 0.3270, Train Steps/Sec: 0.28, Epoch: 0.26447726389428683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13611, "loss": 0.24017779529094696, "memory_gb": 7.721559524536133, "step_time_ms": 3349.611759185791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:52] (step=0013611) Train Loss: 0.3022, Train Steps/Sec: 0.28, Epoch: 0.26449669646327245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13612, "loss": 0.2722279131412506, "memory_gb": 7.721559524536133, "step_time_ms": 3489.121437072754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:55] (step=0013612) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.2645161290322581, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:36:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13613, "loss": 0.14288341999053955, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5826930999756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:36:59] (step=0013613) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.2645355616012437, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13614, "loss": 
0.2102455496788025, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5779972076416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:03] (step=0013614) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.2645549941702293, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13615, "loss": 0.25802451372146606, "memory_gb": 7.721559524536133, "step_time_ms": 3351.013422012329, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:06] (step=0013615) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.26457442673921494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13616, "loss": 0.22407883405685425, "memory_gb": 7.721559524536133, "step_time_ms": 3350.2659797668457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:10] (step=0013616) Train Loss: 0.2154, Train Steps/Sec: 0.28, Epoch: 0.26459385930820056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13617, "loss": 0.2471047341823578, "memory_gb": 7.721559524536133, "step_time_ms": 3349.677562713623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:13] (step=0013617) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.2646132918771862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13618, "loss": 0.23644399642944336, "memory_gb": 7.721559524536133, "step_time_ms": 3350.66294670105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:17] (step=0013618) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.2646327244461718, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13619, "loss": 0.15596342086791992, "memory_gb": 7.721559524536133, "step_time_ms": 3347.7888107299805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:20] (step=0013619) 
Train Loss: 0.1600, Train Steps/Sec: 0.28, Epoch: 0.26465215701515743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13620, "loss": 0.1917184442281723, "memory_gb": 7.721559524536133, "step_time_ms": 3348.339080810547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:24] (step=0013620) Train Loss: 0.1929, Train Steps/Sec: 0.28, Epoch: 0.264671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13621, "loss": 0.30011388659477234, "memory_gb": 7.721559524536133, "step_time_ms": 3351.236581802368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:28] (step=0013621) Train Loss: 0.2831, Train Steps/Sec: 0.28, Epoch: 0.2646910221531286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13622, "loss": 0.25503334403038025, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5163192749023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:31] (step=0013622) Train Loss: 0.3131, Train Steps/Sec: 0.28, Epoch: 0.26471045472211424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13623, "loss": 0.4047093987464905, "memory_gb": 7.721559524536133, "step_time_ms": 3344.4244861602783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:35] (step=0013623) Train Loss: 0.3010, Train Steps/Sec: 0.28, Epoch: 0.26472988729109986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13624, "loss": 0.18786081671714783, "memory_gb": 7.721559524536133, "step_time_ms": 3342.7982330322266, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:38] (step=0013624) Train Loss: 0.1931, Train Steps/Sec: 0.28, Epoch: 0.2647493198600855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:42] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 13625, "loss": 0.2180429995059967, "memory_gb": 7.721559524536133, "step_time_ms": 3354.905366897583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:42] (step=0013625) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.2647687524290711, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13626, "loss": 0.12438958138227463, "memory_gb": 7.721559524536133, "step_time_ms": 3353.064775466919, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:45] (step=0013626) Train Loss: 0.1669, Train Steps/Sec: 0.28, Epoch: 0.26478818499805673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13627, "loss": 0.2728109359741211, "memory_gb": 7.721559524536133, "step_time_ms": 3349.433422088623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:49] (step=0013627) Train Loss: 0.1991, Train Steps/Sec: 0.28, Epoch: 0.26480761756704235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13628, "loss": 0.2105734944343567, "memory_gb": 7.721559524536133, "step_time_ms": 3344.3548679351807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:53] (step=0013628) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.264827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:37:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13629, "loss": 0.2591986060142517, "memory_gb": 7.721559524536133, "step_time_ms": 3351.513624191284, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:37:56] (step=0013629) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.2648464827050136, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13630, "loss": 0.26513606309890747, "memory_gb": 7.721559524536133, "step_time_ms": 3351.590871810913, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
13:38:00] (step=0013630) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.2648659152739992, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13631, "loss": 0.2249261736869812, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7070541381836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:03] (step=0013631) Train Loss: 0.1858, Train Steps/Sec: 0.28, Epoch: 0.26488534784298484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13632, "loss": 0.2529260516166687, "memory_gb": 7.721559524536133, "step_time_ms": 3348.609685897827, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:07] (step=0013632) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.26490478041197046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13633, "loss": 0.32627981901168823, "memory_gb": 7.721559524536133, "step_time_ms": 3349.459171295166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:10] (step=0013633) Train Loss: 0.2676, Train Steps/Sec: 0.28, Epoch: 0.2649242129809561, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13634, "loss": 0.28653818368911743, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8735847473145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:14] (step=0013634) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.2649436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13635, "loss": 0.2285737693309784, "memory_gb": 7.721559524536133, "step_time_ms": 3350.966453552246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:18] (step=0013635) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.26496307811892733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:21] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 13636, "loss": 0.26956066489219666, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3824005126953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:21] (step=0013636) Train Loss: 0.2314, Train Steps/Sec: 0.28, Epoch: 0.26498251068791295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13637, "loss": 0.2688634991645813, "memory_gb": 7.721559524536133, "step_time_ms": 3351.762294769287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:25] (step=0013637) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.2650019432568986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13638, "loss": 0.26275959610939026, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7885417938232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:28] (step=0013638) Train Loss: 0.1955, Train Steps/Sec: 0.28, Epoch: 0.2650213758258842, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13639, "loss": 0.23207904398441315, "memory_gb": 7.721559524536133, "step_time_ms": 3351.165771484375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:32] (step=0013639) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.2650408083948698, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13640, "loss": 0.18034523725509644, "memory_gb": 7.721559524536133, "step_time_ms": 3347.41473197937, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:35] (step=0013640) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.26506024096385544, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13641, "loss": 0.23846228420734406, "memory_gb": 7.721559524536133, "step_time_ms": 3350.642681121826, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 13:38:39] (step=0013641) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.26507967353284106, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13642, "loss": 0.1380492001771927, "memory_gb": 7.721559524536133, "step_time_ms": 3352.461576461792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:42] (step=0013642) Train Loss: 0.1825, Train Steps/Sec: 0.28, Epoch: 0.2650991061018267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13643, "loss": 0.30624085664749146, "memory_gb": 7.721559524536133, "step_time_ms": 3354.504108428955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:46] (step=0013643) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.26511853867081225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13644, "loss": 0.28504061698913574, "memory_gb": 7.721559524536133, "step_time_ms": 3350.196599960327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:50] (step=0013644) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.2651379712397979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13645, "loss": 0.31829890608787537, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7898273468018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:53] (step=0013645) Train Loss: 0.2910, Train Steps/Sec: 0.28, Epoch: 0.2651574038087835, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:38:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13646, "loss": 0.23496294021606445, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3952445983887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:38:57] (step=0013646) Train Loss: 0.2603, Train Steps/Sec: 0.28, Epoch: 0.2651768363777691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 13:39:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13647, "loss": 0.22309328615665436, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6912879943848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:00] (step=0013647) Train Loss: 0.2709, Train Steps/Sec: 0.27, Epoch: 0.26519626894675474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13648, "loss": 0.2907399535179138, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3876457214355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:04] (step=0013648) Train Loss: 0.2765, Train Steps/Sec: 0.28, Epoch: 0.26521570151574037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13649, "loss": 0.22532978653907776, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6805381774902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:08] (step=0013649) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.265235134084726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13650, "loss": 0.3051829934120178, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4182262420654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:11] (step=0013650) Train Loss: 0.2640, Train Steps/Sec: 0.28, Epoch: 0.2652545666537116, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13651, "loss": 0.2582477331161499, "memory_gb": 7.721559524536133, "step_time_ms": 3354.926824569702, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:15] (step=0013651) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.26527399922269723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13652, "loss": 0.21812966465950012, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4586429595947, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:18] (step=0013652) Train Loss: 0.2132, Train Steps/Sec: 0.28, Epoch: 0.26529343179168285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13653, "loss": 0.3214109241962433, "memory_gb": 7.721559524536133, "step_time_ms": 3347.764253616333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:22] (step=0013653) Train Loss: 0.2836, Train Steps/Sec: 0.28, Epoch: 0.2653128643606685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13654, "loss": 0.16155904531478882, "memory_gb": 7.721559524536133, "step_time_ms": 3358.80184173584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:25] (step=0013654) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.2653322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13655, "loss": 0.25372904539108276, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9931983947754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:29] (step=0013655) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.2653517294986397, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13656, "loss": 0.2510870695114136, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0431423187256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:33] (step=0013656) Train Loss: 0.2555, Train Steps/Sec: 0.28, Epoch: 0.26537116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13657, "loss": 0.2634374797344208, "memory_gb": 7.721559524536133, "step_time_ms": 3359.538793563843, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:36] (step=0013657) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.26539059463661097, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 13:39:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13658, "loss": 0.25067397952079773, "memory_gb": 7.721559524536133, "step_time_ms": 3357.499837875366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:40] (step=0013658) Train Loss: 0.2948, Train Steps/Sec: 0.28, Epoch: 0.2654100272055966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13659, "loss": 0.30906951427459717, "memory_gb": 7.721559524536133, "step_time_ms": 3359.062433242798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:43] (step=0013659) Train Loss: 0.2978, Train Steps/Sec: 0.28, Epoch: 0.2654294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13660, "loss": 0.18063271045684814, "memory_gb": 7.721559524536133, "step_time_ms": 3502.4001598358154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:47] (step=0013660) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.26544889234356783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13661, "loss": 0.2821001410484314, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7042655944824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:50] (step=0013661) Train Loss: 0.2566, Train Steps/Sec: 0.28, Epoch: 0.26546832491255345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13662, "loss": 0.2679280638694763, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7677478790283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:54] (step=0013662) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.2654877574815391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13663, "loss": 0.3081369400024414, "memory_gb": 7.721559524536133, "step_time_ms": 
3360.3694438934326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:39:58] (step=0013663) Train Loss: 0.2826, Train Steps/Sec: 0.28, Epoch: 0.2655071900505247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13664, "loss": 0.2902297079563141, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6128482818604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:01] (step=0013664) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.2655266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13665, "loss": 0.29217228293418884, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3404808044434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:05] (step=0013665) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.26554605518849594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13666, "loss": 0.2032148540019989, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3321781158447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:08] (step=0013666) Train Loss: 0.2198, Train Steps/Sec: 0.28, Epoch: 0.26556548775748157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13667, "loss": 0.19650141894817352, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9322566986084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:12] (step=0013667) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.26558492032646713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13668, "loss": 0.2651996910572052, "memory_gb": 7.721559524536133, "step_time_ms": 3356.813669204712, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:15] (step=0013668) Train Loss: 0.3050, Train Steps/Sec: 0.28, Epoch: 
0.26560435289545276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13669, "loss": 0.23391520977020264, "memory_gb": 7.721559524536133, "step_time_ms": 3360.53466796875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:19] (step=0013669) Train Loss: 0.2223, Train Steps/Sec: 0.28, Epoch: 0.2656237854644384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13670, "loss": 0.30778706073760986, "memory_gb": 7.721559524536133, "step_time_ms": 3350.987672805786, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:23] (step=0013670) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.265643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13671, "loss": 0.2171945422887802, "memory_gb": 7.721559524536133, "step_time_ms": 3355.436086654663, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:26] (step=0013671) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.2656626506024096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13672, "loss": 0.17872726917266846, "memory_gb": 7.721559524536133, "step_time_ms": 3357.034921646118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:30] (step=0013672) Train Loss: 0.1526, Train Steps/Sec: 0.28, Epoch: 0.26568208317139524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13673, "loss": 0.28818032145500183, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5869331359863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:33] (step=0013673) Train Loss: 0.2683, Train Steps/Sec: 0.28, Epoch: 0.26570151574038087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13674, "loss": 0.24976176023483276, 
"memory_gb": 7.721559524536133, "step_time_ms": 3358.8101863861084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:37] (step=0013674) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.2657209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13675, "loss": 0.26762211322784424, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9254360198975, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:40] (step=0013675) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.2657403808783521, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13676, "loss": 0.281097948551178, "memory_gb": 7.721559524536133, "step_time_ms": 3364.874839782715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:44] (step=0013676) Train Loss: 0.2827, Train Steps/Sec: 0.28, Epoch: 0.26575981344733773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13677, "loss": 0.1816723644733429, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0005378723145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:48] (step=0013677) Train Loss: 0.1955, Train Steps/Sec: 0.28, Epoch: 0.26577924601632336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13678, "loss": 0.3029254674911499, "memory_gb": 7.721559524536133, "step_time_ms": 3353.5401821136475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:51] (step=0013678) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.265798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13679, "loss": 0.234534353017807, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2162132263184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:55] (step=0013679) Train Loss: 0.2542, 
Train Steps/Sec: 0.28, Epoch: 0.2658181111542946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:40:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13680, "loss": 0.3225419521331787, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5148792266846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:40:58] (step=0013680) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.2658375437232802, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13681, "loss": 0.22600284218788147, "memory_gb": 7.721559524536133, "step_time_ms": 3357.917070388794, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:02] (step=0013681) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.26585697629226585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13682, "loss": 0.2120981216430664, "memory_gb": 7.721559524536133, "step_time_ms": 3364.049196243286, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:05] (step=0013682) Train Loss: 0.2013, Train Steps/Sec: 0.28, Epoch: 0.26587640886125147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13683, "loss": 0.25246572494506836, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4822158813477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:09] (step=0013683) Train Loss: 0.2249, Train Steps/Sec: 0.28, Epoch: 0.2658958414302371, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13684, "loss": 0.18830588459968567, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1787815093994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:13] (step=0013684) Train Loss: 0.1938, Train Steps/Sec: 0.28, Epoch: 0.2659152739992227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13685, 
"loss": 0.266436904668808, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4751358032227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:16] (step=0013685) Train Loss: 0.3139, Train Steps/Sec: 0.28, Epoch: 0.26593470656820833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13686, "loss": 0.15664160251617432, "memory_gb": 7.721559524536133, "step_time_ms": 3358.210563659668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:20] (step=0013686) Train Loss: 0.1941, Train Steps/Sec: 0.28, Epoch: 0.26595413913719396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13687, "loss": 0.3463869094848633, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7283363342285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:23] (step=0013687) Train Loss: 0.3012, Train Steps/Sec: 0.28, Epoch: 0.2659735717061796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13688, "loss": 0.23751388490200043, "memory_gb": 7.721559524536133, "step_time_ms": 3361.26446723938, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:27] (step=0013688) Train Loss: 0.2840, Train Steps/Sec: 0.27, Epoch: 0.2659930042751652, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13689, "loss": 0.22517675161361694, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7987422943115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:31] (step=0013689) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.2660124368441508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13690, "loss": 0.27686187624931335, "memory_gb": 7.721559524536133, "step_time_ms": 3348.125696182251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:34] 
(step=0013690) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.2660318694131364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13691, "loss": 0.17820283770561218, "memory_gb": 7.721559524536133, "step_time_ms": 3360.682249069214, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:38] (step=0013691) Train Loss: 0.1946, Train Steps/Sec: 0.28, Epoch: 0.266051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13692, "loss": 0.17137038707733154, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5449924468994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:41] (step=0013692) Train Loss: 0.1996, Train Steps/Sec: 0.28, Epoch: 0.26607073455110763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13693, "loss": 0.3250929117202759, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2278747558594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:45] (step=0013693) Train Loss: 0.2819, Train Steps/Sec: 0.28, Epoch: 0.26609016712009326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13694, "loss": 0.2071380615234375, "memory_gb": 7.721559524536133, "step_time_ms": 3341.848850250244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:48] (step=0013694) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.2661095996890789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13695, "loss": 0.31208375096321106, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7426204681396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:52] (step=0013695) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.2661290322580645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:56] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 13696, "loss": 0.307324081659317, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7397384643555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:56] (step=0013696) Train Loss: 0.2772, Train Steps/Sec: 0.28, Epoch: 0.2661484648270501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:41:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13697, "loss": 0.23603029549121857, "memory_gb": 7.721559524536133, "step_time_ms": 3349.2445945739746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:41:59] (step=0013697) Train Loss: 0.2339, Train Steps/Sec: 0.28, Epoch: 0.26616789739603575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13698, "loss": 0.18337096273899078, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6763191223145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:03] (step=0013698) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.26618732996502137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13699, "loss": 0.2880662977695465, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9868965148926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:06] (step=0013699) Train Loss: 0.2879, Train Steps/Sec: 0.28, Epoch: 0.266206762534007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13700, "loss": 0.1694609820842743, "memory_gb": 7.721559524536133, "step_time_ms": 3514.3227577209473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:10] (step=0013700) Train Loss: 0.2109, Train Steps/Sec: 0.28, Epoch: 0.2662261951029926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13701, "loss": 0.28148719668388367, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3338985443115, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 13:42:13] (step=0013701) Train Loss: 0.3129, Train Steps/Sec: 0.28, Epoch: 0.26624562767197824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13702, "loss": 0.27587130665779114, "memory_gb": 7.721559524536133, "step_time_ms": 3362.156867980957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:17] (step=0013702) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.26626506024096386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13703, "loss": 0.1747686266899109, "memory_gb": 7.721559524536133, "step_time_ms": 3359.900951385498, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:21] (step=0013703) Train Loss: 0.2312, Train Steps/Sec: 0.28, Epoch: 0.2662844928099495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13704, "loss": 0.2598598003387451, "memory_gb": 7.721559524536133, "step_time_ms": 3348.942995071411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:24] (step=0013704) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.2663039253789351, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13705, "loss": 0.25102096796035767, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9906692504883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:28] (step=0013705) Train Loss: 0.2687, Train Steps/Sec: 0.28, Epoch: 0.2663233579479207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13706, "loss": 0.18265405297279358, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2142124176025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:31] (step=0013706) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.26634279051690635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
13:42:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13707, "loss": 0.2694169878959656, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9607009887695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:35] (step=0013707) Train Loss: 0.2964, Train Steps/Sec: 0.28, Epoch: 0.26636222308589197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13708, "loss": 0.27352002263069153, "memory_gb": 7.721559524536133, "step_time_ms": 3342.6620960235596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:38] (step=0013708) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.2663816556548776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13709, "loss": 0.16925287246704102, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5738410949707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:42] (step=0013709) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.2664010882238632, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13710, "loss": 0.16891497373580933, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3380031585693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:46] (step=0013710) Train Loss: 0.1911, Train Steps/Sec: 0.28, Epoch: 0.26642052079284884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13711, "loss": 0.1765386313199997, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1978340148926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:49] (step=0013711) Train Loss: 0.1931, Train Steps/Sec: 0.28, Epoch: 0.26643995336183446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13712, "loss": 0.17489677667617798, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8595390319824, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:53] (step=0013712) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.2664593859308201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:42:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13713, "loss": 0.1565934121608734, "memory_gb": 7.721559524536133, "step_time_ms": 3357.400417327881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:42:56] (step=0013713) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.26647881849980565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13714, "loss": 0.25334545969963074, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1768531799316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:00] (step=0013714) Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.26649825106879127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13715, "loss": 0.22626584768295288, "memory_gb": 7.721559524536133, "step_time_ms": 3360.562562942505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:03] (step=0013715) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.2665176836377769, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13716, "loss": 0.18770501017570496, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5988025665283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:07] (step=0013716) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.2665371162067625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13717, "loss": 0.21835727989673615, "memory_gb": 7.721559524536133, "step_time_ms": 3350.235939025879, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:11] (step=0013717) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.26655654877574814, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13718, "loss": 0.2291295975446701, "memory_gb": 7.721559524536133, "step_time_ms": 3348.6108779907227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:14] (step=0013718) Train Loss: 0.2027, Train Steps/Sec: 0.28, Epoch: 0.26657598134473376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13719, "loss": 0.21753966808319092, "memory_gb": 7.721559524536133, "step_time_ms": 3357.837438583374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:18] (step=0013719) Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.2665954139137194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13720, "loss": 0.19997942447662354, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5915565490723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:21] (step=0013720) Train Loss: 0.1922, Train Steps/Sec: 0.28, Epoch: 0.266614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13721, "loss": 0.298658549785614, "memory_gb": 7.721559524536133, "step_time_ms": 3360.201358795166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:25] (step=0013721) Train Loss: 0.2332, Train Steps/Sec: 0.28, Epoch: 0.2666342790516906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13722, "loss": 0.28113216161727905, "memory_gb": 7.721559524536133, "step_time_ms": 3355.38649559021, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:28] (step=0013722) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.26665371162067625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13723, "loss": 0.3282402753829956, "memory_gb": 7.721559524536133, 
"step_time_ms": 3351.6907691955566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:32] (step=0013723) Train Loss: 0.2872, Train Steps/Sec: 0.28, Epoch: 0.26667314418966187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13724, "loss": 0.20692035555839539, "memory_gb": 7.721559524536133, "step_time_ms": 3357.877254486084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:35] (step=0013724) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.2666925767586475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13725, "loss": 0.23954156041145325, "memory_gb": 7.721559524536133, "step_time_ms": 3356.11891746521, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:39] (step=0013725) Train Loss: 0.2215, Train Steps/Sec: 0.28, Epoch: 0.2667120093276331, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13726, "loss": 0.2592601478099823, "memory_gb": 7.721559524536133, "step_time_ms": 3337.874412536621, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:43] (step=0013726) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.26673144189661874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13727, "loss": 0.3092588484287262, "memory_gb": 7.721559524536133, "step_time_ms": 3351.3941764831543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:46] (step=0013727) Train Loss: 0.2845, Train Steps/Sec: 0.28, Epoch: 0.26675087446560436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13728, "loss": 0.18889157474040985, "memory_gb": 7.721559524536133, "step_time_ms": 3349.9088287353516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:50] (step=0013728) Train Loss: 0.2114, Train Steps/Sec: 0.28, Epoch: 
0.26677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13729, "loss": 0.1917416751384735, "memory_gb": 7.721559524536133, "step_time_ms": 3357.062578201294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:53] (step=0013729) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.2667897396035756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:43:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13730, "loss": 0.18677839636802673, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9891662597656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:43:57] (step=0013730) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.2668091721725612, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13731, "loss": 0.23991768062114716, "memory_gb": 7.721559524536133, "step_time_ms": 3351.287364959717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:00] (step=0013731) Train Loss: 0.2788, Train Steps/Sec: 0.28, Epoch: 0.26682860474154685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13732, "loss": 0.24740275740623474, "memory_gb": 7.721559524536133, "step_time_ms": 3355.800151824951, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:04] (step=0013732) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.26684803731053247, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13733, "loss": 0.3379811644554138, "memory_gb": 7.721559524536133, "step_time_ms": 3359.440326690674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:08] (step=0013733) Train Loss: 0.3354, Train Steps/Sec: 0.28, Epoch: 0.2668674698795181, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13734, "loss": 0.2673845887184143, "memory_gb": 
7.721559524536133, "step_time_ms": 3351.9277572631836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:11] (step=0013734) Train Loss: 0.3262, Train Steps/Sec: 0.28, Epoch: 0.2668869024485037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13735, "loss": 0.24199697375297546, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7703704833984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:15] (step=0013735) Train Loss: 0.2431, Train Steps/Sec: 0.28, Epoch: 0.26690633501748934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13736, "loss": 0.2735309898853302, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2395057678223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:18] (step=0013736) Train Loss: 0.2894, Train Steps/Sec: 0.27, Epoch: 0.2669257675864749, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13737, "loss": 0.23056712746620178, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7398529052734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:22] (step=0013737) Train Loss: 0.2735, Train Steps/Sec: 0.28, Epoch: 0.2669452001554605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13738, "loss": 0.21687419712543488, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8457832336426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:26] (step=0013738) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.26696463272444615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13739, "loss": 0.2255156934261322, "memory_gb": 7.721559524536133, "step_time_ms": 3359.133005142212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:29] (step=0013739) Train Loss: 0.2740, Train 
Steps/Sec: 0.28, Epoch: 0.26698406529343177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13740, "loss": 0.20432281494140625, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5170040130615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:33] (step=0013740) Train Loss: 0.2536, Train Steps/Sec: 0.28, Epoch: 0.2670034978624174, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13741, "loss": 0.327129065990448, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5058917999268, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:36] (step=0013741) Train Loss: 0.2995, Train Steps/Sec: 0.28, Epoch: 0.267022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13742, "loss": 0.24406544864177704, "memory_gb": 7.721559524536133, "step_time_ms": 3354.149103164673, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:40] (step=0013742) Train Loss: 0.2495, Train Steps/Sec: 0.28, Epoch: 0.26704236300038864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13743, "loss": 0.22023604810237885, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0237159729004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:43] (step=0013743) Train Loss: 0.2105, Train Steps/Sec: 0.28, Epoch: 0.26706179556937426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13744, "loss": 0.33692193031311035, "memory_gb": 7.721559524536133, "step_time_ms": 3357.391119003296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:47] (step=0013744) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.2670812281383599, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13745, "loss": 
0.1171627789735794, "memory_gb": 7.721559524536133, "step_time_ms": 3357.820510864258, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:51] (step=0013745) Train Loss: 0.1454, Train Steps/Sec: 0.28, Epoch: 0.2671006607073455, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13746, "loss": 0.10173805058002472, "memory_gb": 7.721559524536133, "step_time_ms": 3357.66863822937, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:54] (step=0013746) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.2671200932763311, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:44:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13747, "loss": 0.26147177815437317, "memory_gb": 7.721559524536133, "step_time_ms": 3508.793354034424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:44:58] (step=0013747) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.26713952584531675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13748, "loss": 0.2859097123146057, "memory_gb": 7.721559524536133, "step_time_ms": 3351.834535598755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:01] (step=0013748) Train Loss: 0.2949, Train Steps/Sec: 0.28, Epoch: 0.26715895841430237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13749, "loss": 0.2519838809967041, "memory_gb": 7.721559524536133, "step_time_ms": 3360.680341720581, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:05] (step=0013749) Train Loss: 0.2326, Train Steps/Sec: 0.28, Epoch: 0.267178390983288, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13750, "loss": 0.2256256639957428, "memory_gb": 7.721559524536133, "step_time_ms": 3362.962245941162, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:08] (step=0013750) Train 
Loss: 0.1990, Train Steps/Sec: 0.28, Epoch: 0.2671978235522736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13751, "loss": 0.1846361607313156, "memory_gb": 7.721559524536133, "step_time_ms": 3362.354040145874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:12] (step=0013751) Train Loss: 0.2791, Train Steps/Sec: 0.28, Epoch: 0.26721725612125924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13752, "loss": 0.22951294481754303, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7609481811523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:15] (step=0013752) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.26723668869024486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13753, "loss": 0.2520447373390198, "memory_gb": 7.721559524536133, "step_time_ms": 3362.525701522827, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:19] (step=0013753) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.2672561212592305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13754, "loss": 0.2994968593120575, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2987995147705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:23] (step=0013754) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.2672755538282161, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13755, "loss": 0.1634128838777542, "memory_gb": 7.721559524536133, "step_time_ms": 3363.947868347168, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:26] (step=0013755) Train Loss: 0.2133, Train Steps/Sec: 0.28, Epoch: 0.26729498639720173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 
13756, "loss": 0.23190705478191376, "memory_gb": 7.721559524536133, "step_time_ms": 3355.942487716675, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:30] (step=0013756) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.26731441896618735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13757, "loss": 0.22898167371749878, "memory_gb": 7.721559524536133, "step_time_ms": 3363.093376159668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:33] (step=0013757) Train Loss: 0.2008, Train Steps/Sec: 0.28, Epoch: 0.267333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13758, "loss": 0.30247676372528076, "memory_gb": 7.721559524536133, "step_time_ms": 3355.569839477539, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:37] (step=0013758) Train Loss: 0.2910, Train Steps/Sec: 0.28, Epoch: 0.2673532841041586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13759, "loss": 0.16641205549240112, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5852127075195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:40] (step=0013759) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.2673727166731442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13760, "loss": 0.15835194289684296, "memory_gb": 7.721559524536133, "step_time_ms": 3345.8807468414307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:44] (step=0013760) Train Loss: 0.1822, Train Steps/Sec: 0.28, Epoch: 0.2673921492421298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13761, "loss": 0.3556033968925476, "memory_gb": 7.721559524536133, "step_time_ms": 3360.304832458496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:48] 
(step=0013761) Train Loss: 0.2487, Train Steps/Sec: 0.28, Epoch: 0.2674115818111154, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13762, "loss": 0.2717393934726715, "memory_gb": 7.721559524536133, "step_time_ms": 3359.102964401245, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:51] (step=0013762) Train Loss: 0.2714, Train Steps/Sec: 0.28, Epoch: 0.26743101438010103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13763, "loss": 0.28553643822669983, "memory_gb": 7.721559524536133, "step_time_ms": 3360.349416732788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:55] (step=0013763) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.26745044694908665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:45:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13764, "loss": 0.2895102798938751, "memory_gb": 7.721559524536133, "step_time_ms": 3362.353563308716, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:45:58] (step=0013764) Train Loss: 0.2623, Train Steps/Sec: 0.28, Epoch: 0.2674698795180723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:46:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13765, "loss": 0.2917172908782959, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0351600646973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:46:02] (step=0013765) Train Loss: 0.2568, Train Steps/Sec: 0.28, Epoch: 0.2674893120870579, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:46:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13766, "loss": 0.30505239963531494, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7962131500244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:46:05] (step=0013766) Train Loss: 0.3135, Train Steps/Sec: 0.28, Epoch: 0.2675087446560435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:46:09] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 13767, "loss": 0.2297307699918747, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4567260742188, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:09] (step=0013767) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.26752817722502914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13768, "loss": 0.12718859314918518, "memory_gb": 7.721559524536133, "step_time_ms": 3359.677314758301, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:13] (step=0013768) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.26754760979401476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13769, "loss": 0.29870009422302246, "memory_gb": 7.721559524536133, "step_time_ms": 3361.886501312256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:16] (step=0013769) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.2675670423630004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13770, "loss": 0.12783515453338623, "memory_gb": 7.721559524536133, "step_time_ms": 3364.800453186035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:20] (step=0013770) Train Loss: 0.1990, Train Steps/Sec: 0.28, Epoch: 0.267586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13771, "loss": 0.24730080366134644, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8995723724365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:23] (step=0013771) Train Loss: 0.2447, Train Steps/Sec: 0.28, Epoch: 0.26760590750097163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13772, "loss": 0.3683735132217407, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3038482666016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:27] (step=0013772) Train Loss: 0.3493, Train Steps/Sec: 0.28, Epoch: 0.26762534006995725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13773, "loss": 0.23233403265476227, "memory_gb": 7.721559524536133, "step_time_ms": 3368.255615234375, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:30] (step=0013773) Train Loss: 0.1757, Train Steps/Sec: 0.28, Epoch: 0.2676447726389429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13774, "loss": 0.20778727531433105, "memory_gb": 7.721559524536133, "step_time_ms": 3372.105598449707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:34] (step=0013774) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.2676642052079285, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13775, "loss": 0.1640225052833557, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9986419677734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:38] (step=0013775) Train Loss: 0.1953, Train Steps/Sec: 0.28, Epoch: 0.2676836377769141, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13776, "loss": 0.2813968360424042, "memory_gb": 7.721559524536133, "step_time_ms": 3370.3227043151855, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:41] (step=0013776) Train Loss: 0.1992, Train Steps/Sec: 0.27, Epoch: 0.26770307034589974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13777, "loss": 0.23363062739372253, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5731468200684, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:45] (step=0013777) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.26772250291488536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13778, "loss": 0.1725205034017563, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2095279693604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:48] (step=0013778) Train Loss: 0.1897, Train Steps/Sec: 0.28, Epoch: 0.267741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13779, "loss": 0.24633711576461792, "memory_gb": 7.721559524536133, "step_time_ms": 3365.52357673645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:52] (step=0013779) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.2677613680528566, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13780, "loss": 0.169522225856781, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3009967803955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:55] (step=0013780) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.26778080062184223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:46:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13781, "loss": 0.21066536009311676, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2544116973877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:46:59] (step=0013781) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.26780023319082785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13782, "loss": 0.17387908697128296, "memory_gb": 7.721559524536133, "step_time_ms": 3360.664129257202, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:03] (step=0013782) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.2678196657598135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13783, "loss": 0.2426827847957611, "memory_gb": 7.721559524536133, "step_time_ms": 3356.954574584961, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:06] (step=0013783) Train Loss: 0.1927, Train Steps/Sec: 0.28, Epoch: 0.26783909832879904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13784, "loss": 0.2753218710422516, "memory_gb": 7.721559524536133, "step_time_ms": 3357.600212097168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:10] (step=0013784) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.26785853089778466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13785, "loss": 0.23129861056804657, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1935119628906, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:13] (step=0013785) Train Loss: 0.2153, Train Steps/Sec: 0.28, Epoch: 0.2678779634667703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13786, "loss": 0.17914274334907532, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1134033203125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:17] (step=0013786) Train Loss: 0.2056, Train Steps/Sec: 0.28, Epoch: 0.2678973960357559, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13787, "loss": 0.2556953430175781, "memory_gb": 7.721559524536133, "step_time_ms": 3363.483428955078, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:20] (step=0013787) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.26791682860474153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13788, "loss": 0.18744158744812012, "memory_gb": 7.721559524536133, "step_time_ms": 3510.41316986084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:24] (step=0013788) Train Loss: 0.1793, Train Steps/Sec: 0.28, Epoch: 0.26793626117372715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13789, "loss": 0.2065310776233673, "memory_gb": 7.721559524536133, "step_time_ms": 3352.799654006958, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:28] (step=0013789) Train Loss: 0.2185, Train Steps/Sec: 0.28, Epoch: 0.2679556937427128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13790, "loss": 0.22136223316192627, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8955402374268, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:31] (step=0013790) Train Loss: 0.2047, Train Steps/Sec: 0.28, Epoch: 0.2679751263116984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13791, "loss": 0.1653110235929489, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6761226654053, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:35] (step=0013791) Train Loss: 0.1519, Train Steps/Sec: 0.28, Epoch: 0.267994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13792, "loss": 0.2855294346809387, "memory_gb": 7.721559524536133, "step_time_ms": 3343.810796737671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:38] (step=0013792) Train Loss: 0.3174, Train Steps/Sec: 0.28, Epoch: 0.26801399144966964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13793, "loss": 0.20425266027450562, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8740615844727, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:42] (step=0013793) Train Loss: 0.2261, Train Steps/Sec: 0.28, Epoch: 0.26803342401865526, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13794, "loss": 0.23341023921966553, "memory_gb": 7.721559524536133, "step_time_ms": 3359.116554260254, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:46] (step=0013794) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.2680528565876409, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13795, "loss": 0.21504266560077667, "memory_gb": 7.721559524536133, "step_time_ms": 3356.72664642334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:49] (step=0013795) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.2680722891566265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13796, "loss": 0.1921960413455963, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0069541931152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:53] (step=0013796) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.26809172172561213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:47:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13797, "loss": 0.1910213828086853, "memory_gb": 7.721559524536133, "step_time_ms": 3355.924606323242, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:47:56] (step=0013797) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.26811115429459775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13798, "loss": 0.15317165851593018, "memory_gb": 7.721559524536133, "step_time_ms": 3357.184410095215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:00] (step=0013798) Train Loss: 0.2269, Train Steps/Sec: 0.28, Epoch: 0.2681305868635834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13799, "loss": 0.17233771085739136, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4276905059814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:03] (step=0013799) Train Loss: 0.1876, Train Steps/Sec: 0.28, Epoch: 0.268150019432569, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13800, "loss": 0.2549865245819092, "memory_gb": 7.721559524536133, "step_time_ms": 3344.142198562622, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:07] (step=0013800) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.2681694520015546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13801, "loss": 0.23619531095027924, "memory_gb": 7.721559524536133, "step_time_ms": 3349.119186401367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:11] (step=0013801) Train Loss: 0.2753, Train Steps/Sec: 0.28, Epoch: 0.26818888457054024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13802, "loss": 0.18307623267173767, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2509803771973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:14] (step=0013802) Train Loss: 0.1540, Train Steps/Sec: 0.28, Epoch: 0.26820831713952586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13803, "loss": 0.21085435152053833, "memory_gb": 7.721559524536133, "step_time_ms": 3350.8853912353516, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:18] (step=0013803) Train Loss: 0.1990, Train Steps/Sec: 0.28, Epoch: 0.2682277497085115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13804, "loss": 0.3092847168445587, "memory_gb": 7.721559524536133, "step_time_ms": 3358.483076095581, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:21] (step=0013804) Train Loss: 0.2731, Train Steps/Sec: 0.28, Epoch: 0.2682471822774971, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13805, "loss": 0.19994238018989563, "memory_gb": 7.721559524536133, "step_time_ms": 3357.158899307251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:25] (step=0013805) Train Loss: 0.2383, Train Steps/Sec: 0.28, Epoch: 0.26826661484648273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13806, "loss": 0.23707613348960876, "memory_gb": 7.721559524536133, "step_time_ms": 3355.517625808716, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:28] (step=0013806) Train Loss: 0.2296, Train Steps/Sec: 0.28, Epoch: 0.2682860474154683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13807, "loss": 0.2939276695251465, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7512550354004, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:32] (step=0013807) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.2683054799844539, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13808, "loss": 0.3099174201488495, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5753746032715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:36] (step=0013808) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.26832491255343954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13809, "loss": 0.35403063893318176, "memory_gb": 7.721559524536133, "step_time_ms": 3352.363109588623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:39] (step=0013809) Train Loss: 0.2935, Train Steps/Sec: 0.28, Epoch: 0.26834434512242517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13810, "loss": 0.22990678250789642, "memory_gb": 7.721559524536133, "step_time_ms": 3346.74072265625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:43] (step=0013810) Train Loss: 0.2740, Train Steps/Sec: 0.28, Epoch: 0.2683637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13811, "loss": 0.2739298939704895, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3239154815674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:46] (step=0013811) Train Loss: 0.2244, Train Steps/Sec: 0.28, Epoch: 0.2683832102603964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13812, "loss": 0.2351285219192505, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7076454162598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:50] (step=0013812) Train Loss: 0.2231, Train Steps/Sec: 0.28, Epoch: 0.26840264282938203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13813, "loss": 0.2327008843421936, "memory_gb": 7.721559524536133, "step_time_ms": 3354.421854019165, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:53] (step=0013813) Train Loss: 0.2461, Train Steps/Sec: 0.28, Epoch: 0.26842207539836765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:48:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13814, "loss": 0.3414490222930908, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6137294769287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:48:57] (step=0013814) Train Loss: 0.2860, Train Steps/Sec: 0.28, Epoch: 0.2684415079673533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13815, "loss": 0.1728389859199524, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4111251831055, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:01] (step=0013815) Train Loss: 0.1940, Train Steps/Sec: 0.28, Epoch: 0.2684609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13816, "loss": 0.1584787666797638, "memory_gb": 7.721559524536133, "step_time_ms": 3344.395399093628, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:04] (step=0013816) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.2684803731053245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13817, "loss": 0.10646138340234756, "memory_gb": 7.721559524536133, "step_time_ms": 3351.245164871216, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:08] (step=0013817) Train Loss: 0.1653, Train Steps/Sec: 0.28, Epoch: 0.26849980567431014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13818, "loss": 0.26922500133514404, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5808067321777, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:11] (step=0013818) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.26851923824329577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13819, "loss": 0.335436075925827, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3666343688965, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:15] (step=0013819) Train Loss: 0.3302, Train Steps/Sec: 0.28, Epoch: 0.2685386708122814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13820, "loss": 0.1436392068862915, "memory_gb": 7.721559524536133, "step_time_ms": 3348.811388015747, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:18] (step=0013820) Train Loss: 0.1982, Train Steps/Sec: 0.28, Epoch: 0.268558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13821, "loss": 0.1418224275112152, "memory_gb": 7.721559524536133, "step_time_ms": 3356.555223464966, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:22] (step=0013821) Train Loss: 0.2048, Train Steps/Sec: 0.28, Epoch: 0.26857753595025263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13822, "loss": 0.28680890798568726, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8785190582275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:26] (step=0013822) Train Loss: 0.2798, Train Steps/Sec: 0.28, Epoch: 0.26859696851923826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13823, "loss": 0.3089445233345032, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2636375427246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:29] (step=0013823) Train Loss: 0.2770, Train Steps/Sec: 0.27, Epoch: 0.2686164010882239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13824, "loss": 0.13262757658958435, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6315898895264, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:33] (step=0013824) Train Loss: 0.1981, Train Steps/Sec: 0.28, Epoch: 0.2686358336572095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13825, "loss": 0.2550050616264343, "memory_gb": 7.721559524536133, "step_time_ms": 3351.942300796509, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:36] (step=0013825) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.2686552662261951, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13826, "loss": 0.15715064108371735, "memory_gb": 7.721559524536133, "step_time_ms": 3349.1339683532715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:40] (step=0013826) Train Loss: 0.1791, Train Steps/Sec: 0.28, Epoch: 0.26867469879518074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13827, "loss": 0.22916197776794434, "memory_gb": 7.721559524536133, "step_time_ms": 3353.087902069092, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:43] (step=0013827) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.26869413136416637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13828, "loss": 0.22984428703784943, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1160774230957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:47] (step=0013828) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.268713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13829, "loss": 0.2980530858039856, "memory_gb": 7.721559524536133, "step_time_ms": 3355.344772338867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:51] (step=0013829) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.26873299650213756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13830, "loss": 0.2978152632713318, "memory_gb": 7.721559524536133, "step_time_ms": 3353.278398513794, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:54] (step=0013830) Train Loss: 0.2578, Train Steps/Sec: 0.28, Epoch: 0.2687524290711232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:49:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13831, "loss": 0.3039182424545288, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7398109436035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:49:58] (step=0013831) Train Loss: 0.3256, Train Steps/Sec: 0.28, Epoch: 0.2687718616401088, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13832, "loss": 0.2879391312599182, "memory_gb": 7.721559524536133, "step_time_ms": 3350.3620624542236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:01] (step=0013832) Train Loss: 0.2838, Train Steps/Sec: 0.28, Epoch: 0.2687912942090944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13833, "loss": 0.2712355852127075, "memory_gb": 7.721559524536133, "step_time_ms": 3354.562997817993, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:05] (step=0013833) Train Loss: 0.2803, Train Steps/Sec: 0.28, Epoch: 0.26881072677808004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13834, "loss": 0.20957723259925842, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7204780578613, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:08] (step=0013834) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.26883015934706567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13835, "loss": 0.24375183880329132, "memory_gb": 7.721559524536133, "step_time_ms": 3352.778434753418, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:12] (step=0013835) Train Loss: 0.1999, Train Steps/Sec: 0.28, Epoch: 0.2688495919160513, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13836, "loss": 0.17732161283493042, "memory_gb": 7.721559524536133, "step_time_ms": 3502.1235942840576, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:16] (step=0013836) Train Loss: 0.1866, Train Steps/Sec: 0.28, Epoch: 0.2688690244850369, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13837, "loss": 0.25586679577827454, "memory_gb": 7.721559524536133, "step_time_ms": 3353.1064987182617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:19] (step=0013837) Train Loss: 0.2648, Train Steps/Sec: 0.28, Epoch: 0.26888845705402253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13838, "loss": 0.1656673401594162, "memory_gb": 7.721559524536133, "step_time_ms": 3351.0122299194336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:23] (step=0013838) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.26890788962300816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13839, "loss": 0.18910229206085205, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8378944396973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:26] (step=0013839) Train Loss: 0.2414, Train Steps/Sec: 0.28, Epoch: 0.2689273221919938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13840, "loss": 0.28686246275901794, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6505699157715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:30] (step=0013840) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.2689467547609794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13841, "loss": 0.27247223258018494, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8995723724365, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:33] (step=0013841) Train Loss: 0.3431, Train Steps/Sec: 0.28, Epoch: 0.268966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13842, "loss": 0.28098809719085693, "memory_gb": 7.721559524536133, "step_time_ms": 3353.602409362793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:37] (step=0013842) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.26898561989895065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13843, "loss": 0.36478638648986816, "memory_gb": 7.715639114379883, "step_time_ms": 3323.251485824585, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:41] (step=0013843) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.26900505246793627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13844, "loss": 0.11121668666601181, "memory_gb": 7.721559524536133, "step_time_ms": 3353.883743286133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:44] (step=0013844) Train Loss: 0.2017, Train Steps/Sec: 0.28, Epoch: 0.2690244850369219, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13845, "loss": 0.1573520004749298, "memory_gb": 7.721559524536133, "step_time_ms": 3344.5279598236084, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:48] (step=0013845) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.2690439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13846, "loss": 0.27150222659111023, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0634174346924, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:51] (step=0013846) Train Loss: 0.2950, Train Steps/Sec: 0.28, Epoch: 0.26906335017489313, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13847, "loss": 0.17336536943912506, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3521633148193, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:55] (step=0013847) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.26908278274387876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:50:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13848, "loss": 0.2318565398454666, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8384857177734, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:50:58] (step=0013848) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.2691022153128644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13849, "loss": 0.21917515993118286, "memory_gb": 7.721559524536133, "step_time_ms": 3360.706090927124, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:02] (step=0013849) Train Loss: 0.1950, Train Steps/Sec: 0.28, Epoch: 0.26912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13850, "loss": 0.22360283136367798, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3759326934814, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:06] (step=0013850) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.2691410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13851, "loss": 0.2324698269367218, "memory_gb": 7.721559524536133, "step_time_ms": 3352.7069091796875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:09] (step=0013851) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.26916051301982125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13852, "loss": 0.17689338326454163, "memory_gb": 7.721559524536133, "step_time_ms": 3354.957342147827, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:13] (step=0013852) Train Loss: 0.1588, Train Steps/Sec: 0.28, Epoch: 0.2691799455888068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13853, "loss": 0.35880187153816223, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9225540161133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:16] (step=0013853) Train Loss: 0.3033, Train Steps/Sec: 0.28, Epoch: 0.26919937815779243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13854, "loss": 0.35090431571006775, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2963943481445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:20] (step=0013854) Train Loss: 0.3481, Train Steps/Sec: 0.28, Epoch: 0.26921881072677806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13855, "loss": 0.1921783685684204, "memory_gb": 7.721559524536133, "step_time_ms": 3357.712507247925, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:23] (step=0013855) Train Loss: 0.2854, Train Steps/Sec: 0.28, Epoch: 0.2692382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13856, "loss": 0.36166849732398987, "memory_gb": 7.721559524536133, "step_time_ms": 3357.990264892578, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:27] (step=0013856) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.2692576758647493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13857, "loss": 0.194627583026886, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3500385284424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:31] (step=0013857) Train Loss: 0.2236, Train Steps/Sec: 0.28, Epoch: 0.2692771084337349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13858, "loss": 0.13596037030220032, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5668544769287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:34] (step=0013858) Train Loss: 0.1484, Train Steps/Sec: 0.28, Epoch: 0.26929654100272055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13859, "loss": 0.2002900093793869, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0683212280273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:38] (step=0013859) Train Loss: 0.1493, Train Steps/Sec: 0.28, Epoch: 0.26931597357170617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13860, "loss": 0.21846213936805725, "memory_gb": 7.715639114379883, "step_time_ms": 3323.6353397369385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:41] (step=0013860) Train Loss: 0.2600, Train Steps/Sec: 0.28, Epoch: 0.2693354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13861, "loss": 0.30472826957702637, "memory_gb": 7.721559524536133, "step_time_ms": 3364.191770553589, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:45] (step=0013861) Train Loss: 0.2458, Train Steps/Sec: 0.28, Epoch: 0.2693548387096774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13862, "loss": 0.31673121452331543, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2169799804688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:48] (step=0013862) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.26937427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13863, "loss": 0.18811863660812378, "memory_gb": 7.721559524536133, "step_time_ms": 3362.291097640991, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:52] (step=0013863) Train Loss: 0.1669, Train Steps/Sec: 0.28, Epoch: 0.26939370384764866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13864, "loss": 0.26412200927734375, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2893352508545, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:56] (step=0013864) Train Loss: 0.2368, Train Steps/Sec: 0.27, Epoch: 0.2694131364166343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:51:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13865, "loss": 0.24418002367019653, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0477962493896, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:51:59] (step=0013865) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.2694325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13866, "loss": 0.23833683133125305, "memory_gb": 7.721559524536133, "step_time_ms": 3355.2918434143066, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:03] (step=0013866) Train Loss: 0.2067, Train Steps/Sec: 0.28, Epoch: 0.2694520015546055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13867, "loss": 0.1945933699607849, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7658710479736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:06] (step=0013867) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.26947143412359115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13868, "loss": 0.16626161336898804, "memory_gb": 7.721559524536133, "step_time_ms": 3362.185478210449, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:10] (step=0013868) Train Loss: 0.2111, Train Steps/Sec: 0.28, Epoch: 0.26949086669257677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13869, "loss": 0.18883949518203735, "memory_gb": 7.721559524536133, "step_time_ms": 3361.557722091675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:14] (step=0013869) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.2695102992615624, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13870, "loss": 0.20208869874477386, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5942516326904, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:17] (step=0013870) Train Loss: 0.1808, Train Steps/Sec: 0.28, Epoch: 0.269529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13871, "loss": 0.20498162508010864, "memory_gb": 7.721559524536133, "step_time_ms": 3359.726905822754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:21] (step=0013871) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.26954916439953364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13872, "loss": 0.23713622987270355, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9350662231445, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:24] (step=0013872) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.26956859696851926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13873, "loss": 0.23127302527427673, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7114181518555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:28] (step=0013873) Train Loss: 0.2242, Train Steps/Sec: 0.28, Epoch: 0.2695880295375049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13874, "loss": 0.352874755859375, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6954135894775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:31] (step=0013874) Train Loss: 0.3030, Train Steps/Sec: 0.28, Epoch: 0.2696074621064905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13875, "loss": 0.1491178274154663, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6864891052246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:52:35] (step=0013875) Train Loss: 0.1726, Train Steps/Sec: 0.28, Epoch: 0.2696268946754761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:52:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13876,
"loss": 0.2486031949520111, "memory_gb": 7.721559524536133, "step_time_ms": 3501.997709274292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:52:39] (step=0013876) Train Loss: 0.2561, Train Steps/Sec: 0.28, Epoch: 0.2696463272444617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:52:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13877, "loss": 0.2026955485343933, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0761165618896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:52:42] (step=0013877) Train Loss: 0.1994, Train Steps/Sec: 0.28, Epoch: 0.2696657598134473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:52:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13878, "loss": 0.17348206043243408, "memory_gb": 7.721559524536133, "step_time_ms": 3361.269950866699, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:52:46] (step=0013878) Train Loss: 0.2398, Train Steps/Sec: 0.28, Epoch: 0.26968519238243294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:52:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13879, "loss": 0.20738834142684937, "memory_gb": 7.721559524536133, "step_time_ms": 3365.703344345093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:52:49] (step=0013879) Train Loss: 0.2453, Train Steps/Sec: 0.28, Epoch: 0.26970462495141856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:52:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13880, "loss": 0.17586424946784973, "memory_gb": 7.721559524536133, "step_time_ms": 3367.675304412842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:52:53] (step=0013880) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.2697240575204042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:52:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13881, "loss": 0.33610254526138306, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8008365631104, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:52:56] 
(step=0013881) Train Loss: 0.3676, Train Steps/Sec: 0.28, Epoch: 0.2697434900893898, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13882, "loss": 0.1622762531042099, "memory_gb": 7.721559524536133, "step_time_ms": 3367.224931716919, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:00] (step=0013882) Train Loss: 0.1811, Train Steps/Sec: 0.28, Epoch: 0.2697629226583754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13883, "loss": 0.16058188676834106, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4795417785645, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:04] (step=0013883) Train Loss: 0.1855, Train Steps/Sec: 0.28, Epoch: 0.26978235522736105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13884, "loss": 0.11891971528530121, "memory_gb": 7.721559524536133, "step_time_ms": 3365.143060684204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:07] (step=0013884) Train Loss: 0.1598, Train Steps/Sec: 0.28, Epoch: 0.26980178779634667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13885, "loss": 0.22173643112182617, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2374534606934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:11] (step=0013885) Train Loss: 0.1706, Train Steps/Sec: 0.28, Epoch: 0.2698212203653323, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13886, "loss": 0.24241121113300323, "memory_gb": 7.721559524536133, "step_time_ms": 3365.245580673218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:14] (step=0013886) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.2698406529343179, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:18] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 13887, "loss": 0.23329715430736542, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2301139831543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:18] (step=0013887) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.26986008550330354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13888, "loss": 0.3593097925186157, "memory_gb": 7.721559524536133, "step_time_ms": 3364.703416824341, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:22] (step=0013888) Train Loss: 0.2972, Train Steps/Sec: 0.28, Epoch: 0.26987951807228916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13889, "loss": 0.3119204044342041, "memory_gb": 7.721559524536133, "step_time_ms": 3361.622095108032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:25] (step=0013889) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.2698989506412748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13890, "loss": 0.20688895881175995, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9353351593018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:29] (step=0013890) Train Loss: 0.1961, Train Steps/Sec: 0.28, Epoch: 0.2699183832102604, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13891, "loss": 0.23882874846458435, "memory_gb": 7.721559524536133, "step_time_ms": 3360.684394836426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:32] (step=0013891) Train Loss: 0.2024, Train Steps/Sec: 0.28, Epoch: 0.269937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13892, "loss": 0.25426581501960754, "memory_gb": 7.715639114379883, "step_time_ms": 3326.702833175659, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 13:53:36] (step=0013892) Train Loss: 0.2837, Train Steps/Sec: 0.28, Epoch: 0.26995724834823165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13893, "loss": 0.23987352848052979, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2114658355713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:39] (step=0013893) Train Loss: 0.1924, Train Steps/Sec: 0.28, Epoch: 0.26997668091721727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13894, "loss": 0.29981738328933716, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0353775024414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:43] (step=0013894) Train Loss: 0.2407, Train Steps/Sec: 0.28, Epoch: 0.2699961134862029, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13895, "loss": 0.25525832176208496, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3766689300537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:47] (step=0013895) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.2700155460551885, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13896, "loss": 0.24615375697612762, "memory_gb": 7.721559524536133, "step_time_ms": 3355.180263519287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:50] (step=0013896) Train Loss: 0.2489, Train Steps/Sec: 0.28, Epoch: 0.27003497862417414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:53:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13897, "loss": 0.13432097434997559, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6436557769775, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:54] (step=0013897) Train Loss: 0.1353, Train Steps/Sec: 0.28, Epoch: 0.27005441119315976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
13:53:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13898, "loss": 0.1780681610107422, "memory_gb": 7.721559524536133, "step_time_ms": 3360.353946685791, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:53:57] (step=0013898) Train Loss: 0.2120, Train Steps/Sec: 0.28, Epoch: 0.2700738437621454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13899, "loss": 0.2726139426231384, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8686714172363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:01] (step=0013899) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.27009327633113095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13900, "loss": 0.253095418214798, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7914447784424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:05] (step=0013900) Train Loss: 0.2477, Train Steps/Sec: 0.28, Epoch: 0.27011270890011657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13901, "loss": 0.2475982904434204, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2170734405518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:08] (step=0013901) Train Loss: 0.2126, Train Steps/Sec: 0.28, Epoch: 0.2701321414691022, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13902, "loss": 0.23644371330738068, "memory_gb": 7.721559524536133, "step_time_ms": 3353.872537612915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:12] (step=0013902) Train Loss: 0.2036, Train Steps/Sec: 0.28, Epoch: 0.2701515740380878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13903, "loss": 0.21569013595581055, "memory_gb": 7.721559524536133, "step_time_ms": 3356.388568878174, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 13:54:15] (step=0013903) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.27017100660707344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13904, "loss": 0.2754267156124115, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0634593963623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:19] (step=0013904) Train Loss: 0.2779, Train Steps/Sec: 0.28, Epoch: 0.27019043917605906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13905, "loss": 0.2893795967102051, "memory_gb": 7.721559524536133, "step_time_ms": 3355.569839477539, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:22] (step=0013905) Train Loss: 0.2908, Train Steps/Sec: 0.28, Epoch: 0.2702098717450447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13906, "loss": 0.17419075965881348, "memory_gb": 7.721559524536133, "step_time_ms": 3357.015609741211, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:26] (step=0013906) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.2702293043140303, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13907, "loss": 0.21202345192432404, "memory_gb": 7.721559524536133, "step_time_ms": 3354.902982711792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:30] (step=0013907) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.2702487368830159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13908, "loss": 0.286102294921875, "memory_gb": 7.721559524536133, "step_time_ms": 3357.46431350708, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:33] (step=0013908) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.27026816945200155, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 13:54:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13909, "loss": 0.2869842052459717, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1769256591797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:37] (step=0013909) Train Loss: 0.2448, Train Steps/Sec: 0.28, Epoch: 0.27028760202098717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13910, "loss": 0.35659343004226685, "memory_gb": 7.721559524536133, "step_time_ms": 3355.759382247925, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:40] (step=0013910) Train Loss: 0.3070, Train Steps/Sec: 0.28, Epoch: 0.2703070345899728, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13911, "loss": 0.18975500762462616, "memory_gb": 7.721559524536133, "step_time_ms": 3351.966381072998, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:44] (step=0013911) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.2703264671589584, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13912, "loss": 0.26656919717788696, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5094470977783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:48] (step=0013912) Train Loss: 0.2702, Train Steps/Sec: 0.27, Epoch: 0.27034589972794404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13913, "loss": 0.19391292333602905, "memory_gb": 7.721559524536133, "step_time_ms": 3351.850986480713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:51] (step=0013913) Train Loss: 0.2361, Train Steps/Sec: 0.28, Epoch: 0.27036533229692966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13914, "loss": 0.1871754229068756, "memory_gb": 7.721559524536133, "step_time_ms": 
3356.6739559173584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:55] (step=0013914) Train Loss: 0.1903, Train Steps/Sec: 0.28, Epoch: 0.2703847648659153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:54:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13915, "loss": 0.22438517212867737, "memory_gb": 7.721559524536133, "step_time_ms": 3351.924180984497, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:54:58] (step=0013915) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.2704041974349009, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 13916, "loss": 0.32926201820373535, "memory_gb": 7.721559524536133, "step_time_ms": 3354.403257369995, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:02] (step=0013916) Train Loss: 0.3050, Train Steps/Sec: 0.28, Epoch: 0.27042363000388653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13917, "loss": 0.28118008375167847, "memory_gb": 7.721559524536133, "step_time_ms": 3352.952003479004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:05] (step=0013917) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.27044306257287215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 13918, "loss": 0.27227646112442017, "memory_gb": 7.721559524536133, "step_time_ms": 3354.975938796997, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:09] (step=0013918) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.2704624951418578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 13919, "loss": 0.23966199159622192, "memory_gb": 7.721559524536133, "step_time_ms": 3353.469133377075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:13] (step=0013919) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 
0.2704819277108434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13920, "loss": 0.17769764363765717, "memory_gb": 7.721559524536133, "step_time_ms": 3352.292776107788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:16] (step=0013920) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.270501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13921, "loss": 0.2689938247203827, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2166481018066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:20] (step=0013921) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.27052079284881464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13922, "loss": 0.12427160143852234, "memory_gb": 7.721559524536133, "step_time_ms": 3353.107690811157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:23] (step=0013922) Train Loss: 0.1621, Train Steps/Sec: 0.28, Epoch: 0.2705402254178002, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 13923, "loss": 0.23839876055717468, "memory_gb": 7.721559524536133, "step_time_ms": 3495.990037918091, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:27] (step=0013923) Train Loss: 0.2159, Train Steps/Sec: 0.28, Epoch: 0.27055965798678583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13924, "loss": 0.31621506810188293, "memory_gb": 7.721559524536133, "step_time_ms": 3355.88002204895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:30] (step=0013924) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.27057909055577145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13925, "loss": 0.1868726909160614, 
"memory_gb": 7.715639114379883, "step_time_ms": 3317.0015811920166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:34] (step=0013925) Train Loss: 0.2727, Train Steps/Sec: 0.28, Epoch: 0.2705985231247571, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13926, "loss": 0.25394314527511597, "memory_gb": 7.721559524536133, "step_time_ms": 3350.5194187164307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:38] (step=0013926) Train Loss: 0.2562, Train Steps/Sec: 0.28, Epoch: 0.2706179556937427, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13927, "loss": 0.24475941061973572, "memory_gb": 7.721559524536133, "step_time_ms": 3356.109380722046, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:41] (step=0013927) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.2706373882627283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13928, "loss": 0.21972614526748657, "memory_gb": 7.721559524536133, "step_time_ms": 3353.072166442871, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:45] (step=0013928) Train Loss: 0.1957, Train Steps/Sec: 0.28, Epoch: 0.27065682083171394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13929, "loss": 0.27506017684936523, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8928031921387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:48] (step=0013929) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.27067625340069956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13930, "loss": 0.24552759528160095, "memory_gb": 7.721559524536133, "step_time_ms": 3348.0172157287598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:52] (step=0013930) Train Loss: 
0.2284, Train Steps/Sec: 0.28, Epoch: 0.2706956859696852, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13931, "loss": 0.2681330740451813, "memory_gb": 7.721559524536133, "step_time_ms": 3351.4490127563477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:55] (step=0013931) Train Loss: 0.2840, Train Steps/Sec: 0.28, Epoch: 0.2707151185386708, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:55:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13932, "loss": 0.21649709343910217, "memory_gb": 7.721559524536133, "step_time_ms": 3352.926731109619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:55:59] (step=0013932) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.27073455110765643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13933, "loss": 0.21264788508415222, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5706481933594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:03] (step=0013933) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.27075398367664205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 13934, "loss": 0.18439875543117523, "memory_gb": 7.721559524536133, "step_time_ms": 3352.4930477142334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:06] (step=0013934) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.2707734162456277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13935, "loss": 0.19676655530929565, "memory_gb": 7.721559524536133, "step_time_ms": 3352.898359298706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:10] (step=0013935) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.2707928488146133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 
13936, "loss": 0.20355135202407837, "memory_gb": 7.721559524536133, "step_time_ms": 3353.705644607544, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:13] (step=0013936) Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.2708122813835989, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 13937, "loss": 0.21188032627105713, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7135334014893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:17] (step=0013937) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.27083171395258454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 13938, "loss": 0.24210453033447266, "memory_gb": 7.721559524536133, "step_time_ms": 3347.917079925537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:20] (step=0013938) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.27085114652157016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 13939, "loss": 0.2394966185092926, "memory_gb": 7.721559524536133, "step_time_ms": 3347.0301628112793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:24] (step=0013939) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 0.2708705790905558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13940, "loss": 0.1839507520198822, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5779457092285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:28] (step=0013940) Train Loss: 0.2393, Train Steps/Sec: 0.28, Epoch: 0.2708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 13941, "loss": 0.3008018136024475, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5797080993652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:31] 
(step=0013941) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.27090944422852703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13942, "loss": 0.18574713170528412, "memory_gb": 7.721559524536133, "step_time_ms": 3338.8686180114746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:35] (step=0013942) Train Loss: 0.2543, Train Steps/Sec: 0.28, Epoch: 0.27092887679751265, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 13943, "loss": 0.2597258687019348, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5215129852295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:38] (step=0013943) Train Loss: 0.2888, Train Steps/Sec: 0.28, Epoch: 0.2709483093664983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 13944, "loss": 0.25196123123168945, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6407947540283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:42] (step=0013944) Train Loss: 0.1906, Train Steps/Sec: 0.28, Epoch: 0.2709677419354839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 13945, "loss": 0.31250590085983276, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5869331359863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:45] (step=0013945) Train Loss: 0.2636, Train Steps/Sec: 0.28, Epoch: 0.27098717450446946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 13946, "loss": 0.17923086881637573, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0539741516113, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:49] (step=0013946) Train Loss: 0.1857, Train Steps/Sec: 0.28, Epoch: 0.2710066070734551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:52] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 13947, "loss": 0.3444596529006958, "memory_gb": 7.721559524536133, "step_time_ms": 3350.6290912628174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:52] (step=0013947) Train Loss: 0.3047, Train Steps/Sec: 0.28, Epoch: 0.2710260396424407, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:56:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 13948, "loss": 0.32733675837516785, "memory_gb": 7.721559524536133, "step_time_ms": 3350.7466316223145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:56:56] (step=0013948) Train Loss: 0.2580, Train Steps/Sec: 0.28, Epoch: 0.27104547221142633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:57:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13949, "loss": 0.22593024373054504, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3248176574707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:57:00] (step=0013949) Train Loss: 0.1690, Train Steps/Sec: 0.28, Epoch: 0.27106490478041195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:57:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 13950, "loss": 0.2041015774011612, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8854541778564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:57:03] (step=0013950) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.2710843373493976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:57:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 13951, "loss": 0.17849504947662354, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7333011627197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 13:57:07] (step=0013951) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.2711037699183832, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 13:57:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 13952, "loss": 0.24721677601337433, "memory_gb": 7.721559524536133, "step_time_ms": 3360.200881958008, "trainable_params": 
4718592, "method": "lora"}
[2025-07-29 13:57:10] (step=0013952) Train Loss: 0.2370, Train Steps/Sec: 0.27, Epoch: 0.2711232024873688, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 13953, "loss": 0.261671781539917, "memory_gb": 7.721559524536133, "step_time_ms": 3356.982946395874, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:14] (step=0013953) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.27114263505635444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13954, "loss": 0.2364206165075302, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4375171661377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:18] (step=0013954) Train Loss: 0.2454, Train Steps/Sec: 0.28, Epoch: 0.27116206762534006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 13955, "loss": 0.1888984739780426, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7945232391357, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:21] (step=0013955) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.2711815001943257, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13956, "loss": 0.2489566206932068, "memory_gb": 7.721559524536133, "step_time_ms": 3361.374855041504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:25] (step=0013956) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.2712009327633113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 13957, "loss": 0.20435625314712524, "memory_gb": 7.721559524536133, "step_time_ms": 3361.093759536743, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:28] (step=0013957) Train Loss: 0.2076, Train Steps/Sec: 0.28, Epoch: 0.27122036533229693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 13958, "loss": 0.2883867621421814, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5616607666016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:32] (step=0013958) Train Loss: 0.2374, Train Steps/Sec: 0.28, Epoch: 0.27123979790128255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 13959, "loss": 0.27784115076065063, "memory_gb": 7.721559524536133, "step_time_ms": 3356.868267059326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:35] (step=0013959) Train Loss: 0.2611, Train Steps/Sec: 0.28, Epoch: 0.2712592304702682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 13960, "loss": 0.20747312903404236, "memory_gb": 7.721559524536133, "step_time_ms": 3343.9056873321533, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:39] (step=0013960) Train Loss: 0.2833, Train Steps/Sec: 0.28, Epoch: 0.2712786630392538, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13961, "loss": 0.2835831046104431, "memory_gb": 7.721559524536133, "step_time_ms": 3360.034704208374, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:43] (step=0013961) Train Loss: 0.2424, Train Steps/Sec: 0.28, Epoch: 0.2712980956082394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 13962, "loss": 0.24892358481884003, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0452156066895, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:46] (step=0013962) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.27131752817722504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 13963, "loss": 0.2725992798805237, "memory_gb": 7.721559524536133, "step_time_ms": 3354.5119762420654, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:50] (step=0013963) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.27133696074621066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 13964, "loss": 0.20401372015476227, "memory_gb": 7.721559524536133, "step_time_ms": 3359.527826309204, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:53] (step=0013964) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.2713563933151963, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:57:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 13965, "loss": 0.24779373407363892, "memory_gb": 7.715639114379883, "step_time_ms": 3319.8299407958984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:57:57] (step=0013965) Train Loss: 0.2054, Train Steps/Sec: 0.28, Epoch: 0.2713758258841819, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 13966, "loss": 0.21782749891281128, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9906902313232, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:00] (step=0013966) Train Loss: 0.2023, Train Steps/Sec: 0.28, Epoch: 0.27139525845316753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 13967, "loss": 0.3900284767150879, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2464714050293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:04] (step=0013967) Train Loss: 0.3174, Train Steps/Sec: 0.28, Epoch: 0.27141469102215315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13968, "loss": 0.2154940664768219, "memory_gb": 7.721559524536133, "step_time_ms": 3359.361410140991, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:08] (step=0013968) Train Loss: 0.2053, Train Steps/Sec: 0.28, Epoch: 0.2714341235911388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 13969, "loss": 0.21602174639701843, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6981506347656, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:11] (step=0013969) Train Loss: 0.1926, Train Steps/Sec: 0.28, Epoch: 0.27145355616012434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 13970, "loss": 0.30161216855049133, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0503673553467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:15] (step=0013970) Train Loss: 0.2245, Train Steps/Sec: 0.28, Epoch: 0.27147298872910997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 13971, "loss": 0.24400536715984344, "memory_gb": 7.721559524536133, "step_time_ms": 3500.7150173187256, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:18] (step=0013971) Train Loss: 0.2089, Train Steps/Sec: 0.28, Epoch: 0.2714924212980956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 13972, "loss": 0.25761112570762634, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8628673553467, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:22] (step=0013972) Train Loss: 0.2595, Train Steps/Sec: 0.28, Epoch: 0.2715118538670812, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 13973, "loss": 0.16281136870384216, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1134033203125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:25] (step=0013973) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.27153128643606683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 13974, "loss": 0.19037121534347534, "memory_gb": 7.721559524536133, "step_time_ms": 3365.248918533325, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:29] (step=0013974) Train Loss: 0.1638, Train Steps/Sec: 0.28, Epoch: 0.27155071900505245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 13975, "loss": 0.24561405181884766, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7820529937744, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:33] (step=0013975) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.2715701515740381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 13976, "loss": 0.2136429101228714, "memory_gb": 7.721559524536133, "step_time_ms": 3362.947463989258, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:36] (step=0013976) Train Loss: 0.2047, Train Steps/Sec: 0.28, Epoch: 0.2715895841430237, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 13977, "loss": 0.21485435962677002, "memory_gb": 7.721559524536133, "step_time_ms": 3360.823631286621, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:40] (step=0013977) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.2716090167120093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 13978, "loss": 0.24099764227867126, "memory_gb": 7.721559524536133, "step_time_ms": 3366.582155227661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:43] (step=0013978) Train Loss: 0.1992, Train Steps/Sec: 0.28, Epoch: 0.27162844928099494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 13979, "loss": 0.264071524143219, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8316440582275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:47] (step=0013979) Train Loss: 0.2580, Train Steps/Sec: 0.28, Epoch: 0.27164788184998057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 13980, "loss": 0.26640385389328003, "memory_gb": 7.721559524536133, "step_time_ms": 3365.126132965088, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:51] (step=0013980) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.2716673144189662, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 13981, "loss": 0.2245059609413147, "memory_gb": 7.721559524536133, "step_time_ms": 3369.1787719726562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:54] (step=0013981) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.2716867469879518, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:58:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 13982, "loss": 0.2967170178890228, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4818325042725, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:58:58] (step=0013982) Train Loss: 0.2944, Train Steps/Sec: 0.28, Epoch: 0.27170617955693743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 13983, "loss": 0.14017771184444427, "memory_gb": 7.721559524536133, "step_time_ms": 3360.152006149292, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:01] (step=0013983) Train Loss: 0.1622, Train Steps/Sec: 0.28, Epoch: 0.27172561212592306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 13984, "loss": 0.22182916104793549, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1197681427, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:05] (step=0013984) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.2717450446949087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 13985, "loss": 0.20263925194740295, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0072135925293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:08] (step=0013985) Train Loss: 0.2048, Train Steps/Sec: 0.28, Epoch: 0.2717644772638943, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 13986, "loss": 0.15366974472999573, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8291778564453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:12] (step=0013986) Train Loss: 0.1902, Train Steps/Sec: 0.28, Epoch: 0.2717839098328799, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 13987, "loss": 0.18077872693538666, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7874336242676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:16] (step=0013987) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.27180334240186554, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 13988, "loss": 0.2514023184776306, "memory_gb": 7.721559524536133, "step_time_ms": 3363.06095123291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:19] (step=0013988) Train Loss: 0.2222, Train Steps/Sec: 0.28, Epoch: 0.27182277497085117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 13989, "loss": 0.2196345031261444, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1859760284424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:23] (step=0013989) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.2718422075398368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 13990, "loss": 0.32545989751815796, "memory_gb": 7.721559524536133, "step_time_ms": 3366.298198699951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:26] (step=0013990) Train Loss: 0.2758, Train Steps/Sec: 0.28, Epoch: 0.2718616401088224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 13991, "loss": 0.17814689874649048, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5638694763184, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:30] (step=0013991) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.27188107267780803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 13992, "loss": 0.28434962034225464, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6231422424316, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:34] (step=0013992) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.2719005052467936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 13993, "loss": 0.28292715549468994, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6364212036133, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:37] (step=0013993) Train Loss: 0.2282, Train Steps/Sec: 0.28, Epoch: 0.2719199378157792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 13994, "loss": 0.13935726881027222, "memory_gb": 7.721559524536133, "step_time_ms": 3364.867687225342, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:41] (step=0013994) Train Loss: 0.1836, Train Steps/Sec: 0.28, Epoch: 0.27193937038476484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 13995, "loss": 0.1305682361125946, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6465587615967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:44] (step=0013995) Train Loss: 0.1710, Train Steps/Sec: 0.28, Epoch: 0.27195880295375047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 13996, "loss": 0.3414064645767212, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4764728546143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:48] (step=0013996) Train Loss: 0.2761, Train Steps/Sec: 0.28, Epoch: 0.2719782355227361, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 13997, "loss": 0.2559412121772766, "memory_gb": 7.721559524536133, "step_time_ms": 3364.701747894287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:52] (step=0013997) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.2719976680917217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 13998, "loss": 0.3072201907634735, "memory_gb": 7.721559524536133, "step_time_ms": 3364.23397064209, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:55] (step=0013998) Train Loss: 0.2507, Train Steps/Sec: 0.28, Epoch: 0.27201710066070733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 13:59:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 13999, "loss": 0.25689083337783813, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2468452453613, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 13:59:59] (step=0013999) Train Loss: 0.2824, Train Steps/Sec: 0.27, Epoch: 0.27203653322969296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14000, "loss": 0.25546300411224365, "memory_gb": 7.721559524536133, "step_time_ms": 3357.194662094116, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:02] (step=0014000) Train Loss: 0.2474, Train Steps/Sec: 0.28, Epoch: 0.2720559657986786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:03] Saved checkpoint to /nvme-data/Komal/documents/results/VisualCloze/lora/depth/checkpoints/0014000/
[2025-07-29 14:00:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14001, "loss": 0.237516850233078, "memory_gb": 7.721559524536133, "step_time_ms": 3361.403226852417, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:06] (step=0014001) Train Loss: 0.2698, Train Steps/Sec: 0.27, Epoch: 0.2720753983676642, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14002, "loss": 0.2890864908695221, "memory_gb": 7.721559524536133, "step_time_ms": 3359.889030456543, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:10] (step=0014002) Train Loss: 0.2883, Train Steps/Sec: 0.28, Epoch: 0.2720948309366498, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14003, "loss": 0.17769218981266022, "memory_gb": 7.721559524536133, "step_time_ms": 3358.539581298828, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:13] (step=0014003) Train Loss: 0.2399, Train Steps/Sec: 0.28, Epoch: 0.27211426350563545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14004, "loss": 0.14307215809822083, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9272499084473, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:17] (step=0014004) Train Loss: 0.1710, Train Steps/Sec: 0.28, Epoch: 0.27213369607462107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14005, "loss": 0.3047332763671875, "memory_gb": 7.721559524536133, "step_time_ms": 3362.720012664795, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:20] (step=0014005) Train Loss: 0.3179, Train Steps/Sec: 0.28, Epoch: 0.2721531286436067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14006, "loss": 0.25136974453926086, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1810932159424, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:24] (step=0014006) Train Loss: 0.2773, Train Steps/Sec: 0.28, Epoch: 0.2721725612125923, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14007, "loss": 0.2208418846130371, "memory_gb": 7.721559524536133, "step_time_ms": 3357.102155685425, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:28] (step=0014007) Train Loss: 0.2416, Train Steps/Sec: 0.28, Epoch: 0.27219199378157793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14008, "loss": 0.29476237297058105, "memory_gb": 7.721559524536133, "step_time_ms": 3358.083724975586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:31] (step=0014008) Train Loss: 0.2274, Train Steps/Sec: 0.28, Epoch: 0.27221142635056356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14009, "loss": 0.22244864702224731, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7895889282227, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:35] (step=0014009) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.2722308589195492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14010, "loss": 0.34381431341171265, "memory_gb": 7.715639114379883, "step_time_ms": 3323.6467838287354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:38] (step=0014010) Train Loss: 0.3135, Train Steps/Sec: 0.28, Epoch: 0.2722502914885348, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14011, "loss": 0.1950111985206604, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8313846588135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:42] (step=0014011) Train Loss: 0.2092, Train Steps/Sec: 0.28, Epoch: 0.2722697240575204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14012, "loss": 0.31530633568763733, "memory_gb": 7.721559524536133, "step_time_ms": 3361.063241958618, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:46] (step=0014012) Train Loss: 0.3066, Train Steps/Sec: 0.28, Epoch: 0.27228915662650605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14013, "loss": 0.239349365234375, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1926822662354, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:49] (step=0014013) Train Loss: 0.2876, Train Steps/Sec: 0.28, Epoch: 0.27230858919549167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14014, "loss": 0.23975825309753418, "memory_gb": 7.721559524536133, "step_time_ms": 3493.6513900756836, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:53] (step=0014014) Train Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.2723280217644773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:00:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14015, "loss": 0.12010011821985245, "memory_gb": 7.721559524536133, "step_time_ms": 3352.404832839966, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:00:56] (step=0014015) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.27234745433346286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14016, "loss": 0.26515626907348633, "memory_gb": 7.721559524536133, "step_time_ms": 3350.966453552246, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:00] (step=0014016) Train Loss: 0.2374, Train Steps/Sec: 0.28, Epoch: 0.2723668869024485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14017, "loss": 0.19245001673698425, "memory_gb": 7.721559524536133, "step_time_ms": 3355.678081512451, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:04] (step=0014017) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.2723863194714341, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14018, "loss": 0.3391787111759186, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6524982452393, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:07] (step=0014018) Train Loss: 0.3131, Train Steps/Sec: 0.28, Epoch: 0.2724057520404197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14019, "loss": 0.15876361727714539, "memory_gb": 7.721559524536133, "step_time_ms": 3351.5326976776123, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:11] (step=0014019) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.27242518460940535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14020, "loss": 0.2934102714061737, "memory_gb": 7.721559524536133, "step_time_ms": 3349.623203277588, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:14] (step=0014020) Train Loss: 0.2682, Train Steps/Sec: 0.28, Epoch: 0.27244461717839097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14021, "loss": 0.2705814242362976, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3435077667236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:18] (step=0014021) Train Loss: 0.2955, Train Steps/Sec: 0.28, Epoch: 0.2724640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14022, "loss": 0.2100188136100769, "memory_gb": 7.721559524536133, "step_time_ms": 3355.653762817383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:21] (step=0014022) Train Loss: 0.2704, Train Steps/Sec: 0.28, Epoch: 0.2724834823163622, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14023, "loss": 0.20399898290634155, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1806678771973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:25] (step=0014023) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.27250291488534784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14024, "loss": 0.18432117998600006, "memory_gb": 7.721559524536133, "step_time_ms": 3352.409601211548, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:29] (step=0014024) Train Loss: 0.2135, Train Steps/Sec: 0.28, Epoch: 0.27252234745433346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14025, "loss": 0.2190362960100174, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7676544189453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:32] (step=0014025) Train Loss: 0.2139, Train Steps/Sec: 0.28, Epoch: 0.2725417800233191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14026, "loss": 0.2546684145927429, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0050983428955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:36] (step=0014026) Train Loss: 0.2323, Train Steps/Sec: 0.28, Epoch: 0.2725612125923047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14027, "loss": 0.2266903817653656, "memory_gb": 7.721559524536133, "step_time_ms": 3334.981918334961, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:39] (step=0014027) Train Loss: 0.2853, Train Steps/Sec: 0.28, Epoch: 0.2725806451612903, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14028, "loss": 0.1932649463415146, "memory_gb": 7.721559524536133, "step_time_ms": 3351.62615776062, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:43] (step=0014028) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.27260007773027595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14029, "loss": 0.17573249340057373, "memory_gb": 7.721559524536133, "step_time_ms": 3350.491762161255, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:47] (step=0014029) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.27261951029926157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14030, "loss": 0.2604835629463196, "memory_gb": 7.721559524536133, "step_time_ms": 3348.9155769348145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:50] (step=0014030) Train Loss: 0.2797, Train Steps/Sec: 0.28, Epoch: 0.2726389428682472, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14031, "loss": 0.17217481136322021, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6956310272217, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:54] (step=0014031) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.2726583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:01:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14032, "loss": 0.2826346158981323, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3613891601562, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:01:57] (step=0014032) Train Loss: 0.2868, Train Steps/Sec: 0.28, Epoch: 0.27267780800621844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14033, "loss": 0.2061639130115509, "memory_gb": 7.721559524536133, "step_time_ms": 3352.872371673584, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:01] (step=0014033) Train Loss: 0.1648, Train Steps/Sec: 0.28, Epoch: 0.27269724057520406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14034, "loss": 0.24134810268878937, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2285900115967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:04] (step=0014034) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.2727166731441897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14035, "loss": 0.2958607077598572, "memory_gb": 7.721559524536133, "step_time_ms": 3353.0235290527344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:08] (step=0014035) Train Loss: 0.2649, Train Steps/Sec: 0.28, Epoch: 0.2727361057131753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14036, "loss": 0.21831804513931274, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7821044921875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:12] (step=0014036) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.2727555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14037, "loss": 0.2901606559753418, "memory_gb": 7.721559524536133, "step_time_ms": 3357.258081436157, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:15] (step=0014037) Train Loss: 0.1824, Train Steps/Sec: 0.28, Epoch: 0.27277497085114655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14038, "loss": 0.24619176983833313, "memory_gb": 7.721559524536133, "step_time_ms": 3357.090473175049, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:19] (step=0014038) Train Loss: 0.2101, Train Steps/Sec: 0.28, Epoch: 0.2727944034201321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14039, "loss": 0.213296577334404, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4415912628174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:22] (step=0014039) Train Loss: 0.2280, Train Steps/Sec: 0.27, Epoch: 0.27281383598911774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14040, "loss": 0.1923658549785614, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8853397369385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:26] (step=0014040) Train Loss: 0.2203, Train Steps/Sec: 0.28, Epoch: 0.27283326855810336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14041, "loss": 0.24812577664852142, "memory_gb": 7.721559524536133, "step_time_ms": 3357.659339904785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:30] (step=0014041) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.272852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14042, "loss": 0.22862225770950317, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3246002197266, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:33] (step=0014042) Train Loss: 0.1944, Train Steps/Sec: 0.28, Epoch: 0.2728721336960746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14043, "loss": 0.33644378185272217, "memory_gb": 7.721559524536133, "step_time_ms": 3349.7891426086426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:37] (step=0014043) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.2728915662650602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14044, "loss": 0.2552304267883301, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4717769622803, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:40] (step=0014044) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.27291099883404585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14045, "loss": 0.34553205966949463, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9909801483154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:44] (step=0014045) Train Loss: 0.3156, Train Steps/Sec: 0.28, Epoch: 0.27293043140303147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14046, "loss": 0.22280697524547577, "memory_gb": 7.721559524536133, "step_time_ms": 3359.562873840332, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:47] (step=0014046) Train Loss: 0.1869, Train Steps/Sec: 0.28, Epoch: 0.2729498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14047, "loss": 0.2030528038740158, "memory_gb": 7.721559524536133, "step_time_ms": 3357.710123062134, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:51] (step=0014047) Train Loss: 0.2632, Train Steps/Sec: 0.28, Epoch: 0.2729692965410027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14048, "loss": 0.2487737238407135, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4157485961914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:55] (step=0014048) Train Loss: 0.1882, Train Steps/Sec: 0.28, Epoch: 0.27298872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:02:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14049, "loss": 0.2744114398956299, "memory_gb": 7.715639114379883, "step_time_ms": 3317.413806915283, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:02:58] (step=0014049) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.27300816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14050, "loss": 0.168285071849823, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4533252716064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:02] (step=0014050) Train Loss: 0.1575, Train Steps/Sec: 0.28, Epoch: 0.2730275942479596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14051, "loss": 0.310748815536499, "memory_gb": 7.721559524536133, "step_time_ms": 3351.755380630493, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:05] (step=0014051) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.2730470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14052, "loss": 0.31047388911247253, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3473224639893, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:09] (step=0014052) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.2730664593859308, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14053, "loss": 0.16098491847515106, "memory_gb": 7.721559524536133, "step_time_ms": 3361.267328262329, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:13] (step=0014053) Train Loss: 0.1627, Train Steps/Sec: 0.28, Epoch: 0.27308589195491645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14054, "loss": 0.33583611249923706, "memory_gb": 7.721559524536133, "step_time_ms": 3494.579792022705, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:16] (step=0014054) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.27310532452390207, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14055, "loss": 0.22039544582366943, "memory_gb": 7.721559524536133, "step_time_ms": 3360.743999481201, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:20] (step=0014055) Train Loss: 0.1749, Train Steps/Sec: 0.28, Epoch: 0.2731247570928877, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14056, "loss": 0.2643668055534363, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7286987304688, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:23] (step=0014056) Train Loss: 0.2811, Train Steps/Sec: 0.28, Epoch: 0.2731441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14057, "loss": 0.2524065375328064, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1937294006348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:27] (step=0014057) Train Loss: 0.1902, Train Steps/Sec: 0.28, Epoch: 0.27316362223085894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14058, "loss": 0.28941187262535095, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7635803222656, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:30] (step=0014058) Train Loss: 0.2883, Train Steps/Sec: 0.28, Epoch: 0.27318305479984456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14059, "loss": 0.29642176628112793, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7686290740967, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:34] (step=0014059) Train Loss: 0.3049, Train Steps/Sec: 0.28, Epoch: 0.2732024873688302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14060, "loss": 0.2121119201183319, "memory_gb": 7.721559524536133, "step_time_ms": 3361.478328704834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:38] (step=0014060) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.2732219199378158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14061, "loss": 0.2754385471343994, "memory_gb": 7.721559524536133, "step_time_ms": 3360.81862449646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:41] (step=0014061) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.27324135250680137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14062, "loss": 0.19119849801063538, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0668392181396, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:45] (step=0014062) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.273260785075787, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14063, "loss": 0.2714058756828308, "memory_gb": 7.721559524536133, "step_time_ms": 3359.161615371704, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:48] (step=0014063) Train Loss: 0.2280, Train Steps/Sec: 0.28, Epoch: 0.2732802176447726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14064, "loss": 0.301864355802536, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7408485412598, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:52] (step=0014064) Train Loss: 0.2897, Train Steps/Sec: 0.28, Epoch: 0.27329965021375824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14065, "loss": 0.17940038442611694, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6628131866455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:55] (step=0014065) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.27331908278274386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:03:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14066, "loss": 0.17001238465309143, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1745319366455, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:03:59] (step=0014066) Train Loss: 0.1821, Train
Steps/Sec: 0.28, Epoch: 0.2733385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14067, "loss": 0.24677439033985138, "memory_gb": 7.721559524536133, "step_time_ms": 3364.483594894409, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:03] (step=0014067) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.2733579479207151, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14068, "loss": 0.14990058541297913, "memory_gb": 7.721559524536133, "step_time_ms": 3357.72705078125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:06] (step=0014068) Train Loss: 0.1731, Train Steps/Sec: 0.28, Epoch: 0.2733773804897007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14069, "loss": 0.25236639380455017, "memory_gb": 7.721559524536133, "step_time_ms": 3365.973949432373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:10] (step=0014069) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.27339681305868635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14070, "loss": 0.22396138310432434, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8499507904053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:13] (step=0014070) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.27341624562767197, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14071, "loss": 0.12952753901481628, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2873039245605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:17] (step=0014071) Train Loss: 0.1839, Train Steps/Sec: 0.28, Epoch: 0.2734356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14072, "loss": 
0.2574078142642975, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7737712860107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:21] (step=0014072) Train Loss: 0.3017, Train Steps/Sec: 0.28, Epoch: 0.2734551107656432, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14073, "loss": 0.16947267949581146, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2916679382324, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:24] (step=0014073) Train Loss: 0.2005, Train Steps/Sec: 0.28, Epoch: 0.27347454333462884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14074, "loss": 0.2500458359718323, "memory_gb": 7.721559524536133, "step_time_ms": 3362.647771835327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:28] (step=0014074) Train Loss: 0.2081, Train Steps/Sec: 0.28, Epoch: 0.27349397590361446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14075, "loss": 0.23929719626903534, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1794662475586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:31] (step=0014075) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.2735134084726001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14076, "loss": 0.2684696912765503, "memory_gb": 7.721559524536133, "step_time_ms": 3364.180088043213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:35] (step=0014076) Train Loss: 0.2911, Train Steps/Sec: 0.28, Epoch: 0.2735328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14077, "loss": 0.1291193664073944, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8461570739746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:39] (step=0014077) 
Train Loss: 0.1802, Train Steps/Sec: 0.28, Epoch: 0.27355227361057133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14078, "loss": 0.22786682844161987, "memory_gb": 7.721559524536133, "step_time_ms": 3358.039140701294, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:42] (step=0014078) Train Loss: 0.2069, Train Steps/Sec: 0.28, Epoch: 0.27357170617955695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14079, "loss": 0.26644259691238403, "memory_gb": 7.721559524536133, "step_time_ms": 3369.826316833496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:46] (step=0014079) Train Loss: 0.2806, Train Steps/Sec: 0.28, Epoch: 0.2735911387485426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14080, "loss": 0.24006131291389465, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3100261688232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:49] (step=0014080) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.2736105713175282, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14081, "loss": 0.1740608662366867, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6158447265625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:53] (step=0014081) Train Loss: 0.2262, Train Steps/Sec: 0.28, Epoch: 0.2736300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:04:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14082, "loss": 0.2536168098449707, "memory_gb": 7.721559524536133, "step_time_ms": 3368.216037750244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:04:57] (step=0014082) Train Loss: 0.2825, Train Steps/Sec: 0.28, Epoch: 0.27364943645549944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:00] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 14083, "loss": 0.19859278202056885, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3598346710205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:00] (step=0014083) Train Loss: 0.1970, Train Steps/Sec: 0.28, Epoch: 0.27366886902448506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14084, "loss": 0.3103867769241333, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1929111480713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:04] (step=0014084) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.2736883015934707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14085, "loss": 0.20355525612831116, "memory_gb": 7.721559524536133, "step_time_ms": 3371.72532081604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:07] (step=0014085) Train Loss: 0.2371, Train Steps/Sec: 0.27, Epoch: 0.27370773416245625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14086, "loss": 0.2097354531288147, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8415546417236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:11] (step=0014086) Train Loss: 0.2304, Train Steps/Sec: 0.28, Epoch: 0.2737271667314419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14087, "loss": 0.22085805237293243, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6316528320312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:15] (step=0014087) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.2737465993004275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14088, "loss": 0.25479984283447266, "memory_gb": 7.721559524536133, "step_time_ms": 3359.142065048218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
14:05:18] (step=0014088) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.2737660318694131, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14089, "loss": 0.22873422503471375, "memory_gb": 7.721559524536133, "step_time_ms": 3359.246015548706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:22] (step=0014089) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.27378546443839874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14090, "loss": 0.2929927706718445, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5402660369873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:26] (step=0014090) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.27380489700738436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14091, "loss": 0.18413224816322327, "memory_gb": 7.721559524536133, "step_time_ms": 3362.110376358032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:29] (step=0014091) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.27382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14092, "loss": 0.23192790150642395, "memory_gb": 7.721559524536133, "step_time_ms": 3359.099864959717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:33] (step=0014092) Train Loss: 0.2741, Train Steps/Sec: 0.28, Epoch: 0.2738437621453556, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14093, "loss": 0.2542904019355774, "memory_gb": 7.721559524536133, "step_time_ms": 3364.518642425537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:36] (step=0014093) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.27386319471434123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:40] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14094, "loss": 0.2544226050376892, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0761585235596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:40] (step=0014094) Train Loss: 0.2351, Train Steps/Sec: 0.28, Epoch: 0.27388262728332685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14095, "loss": 0.2689545154571533, "memory_gb": 7.721559524536133, "step_time_ms": 3369.3301677703857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:44] (step=0014095) Train Loss: 0.2857, Train Steps/Sec: 0.28, Epoch: 0.2739020598523125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14096, "loss": 0.24622464179992676, "memory_gb": 7.721559524536133, "step_time_ms": 3360.405921936035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:47] (step=0014096) Train Loss: 0.2221, Train Steps/Sec: 0.28, Epoch: 0.2739214924212981, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14097, "loss": 0.27748167514801025, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8297576904297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:51] (step=0014097) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.2739409249902837, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14098, "loss": 0.26479554176330566, "memory_gb": 7.721559524536133, "step_time_ms": 3362.168073654175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:05:54] (step=0014098) Train Loss: 0.2051, Train Steps/Sec: 0.28, Epoch: 0.27396035755926934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:05:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14099, "loss": 0.3326510190963745, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1820678710938, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 14:05:58] (step=0014099) Train Loss: 0.3149, Train Steps/Sec: 0.28, Epoch: 0.27397979012825496, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14100, "loss": 0.22291187942028046, "memory_gb": 7.721559524536133, "step_time_ms": 3366.556406021118, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:02] (step=0014100) Train Loss: 0.1632, Train Steps/Sec: 0.28, Epoch: 0.2739992226972406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14101, "loss": 0.27917706966400146, "memory_gb": 7.721559524536133, "step_time_ms": 3506.0484409332275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:05] (step=0014101) Train Loss: 0.2759, Train Steps/Sec: 0.28, Epoch: 0.2740186552662262, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14102, "loss": 0.2669167220592499, "memory_gb": 7.721559524536133, "step_time_ms": 3347.252130508423, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:09] (step=0014102) Train Loss: 0.2535, Train Steps/Sec: 0.28, Epoch: 0.27403808783521183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14103, "loss": 0.2375912368297577, "memory_gb": 7.721559524536133, "step_time_ms": 3361.602783203125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:12] (step=0014103) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.27405752040419745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14104, "loss": 0.24024207890033722, "memory_gb": 7.721559524536133, "step_time_ms": 3356.320381164551, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:16] (step=0014104) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.2740769529731831, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 14:06:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14105, "loss": 0.3233911097049713, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8580360412598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:19] (step=0014105) Train Loss: 0.2726, Train Steps/Sec: 0.28, Epoch: 0.2740963855421687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14106, "loss": 0.19410735368728638, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3829917907715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:23] (step=0014106) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.2741158181111543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14107, "loss": 0.16221977770328522, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2681159973145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:27] (step=0014107) Train Loss: 0.2020, Train Steps/Sec: 0.28, Epoch: 0.27413525068013994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14108, "loss": 0.31849175691604614, "memory_gb": 7.721559524536133, "step_time_ms": 3353.3904552459717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:30] (step=0014108) Train Loss: 0.2834, Train Steps/Sec: 0.28, Epoch: 0.2741546832491255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14109, "loss": 0.21860335767269135, "memory_gb": 7.721559524536133, "step_time_ms": 3356.037139892578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:34] (step=0014109) Train Loss: 0.2172, Train Steps/Sec: 0.28, Epoch: 0.27417411581811113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14110, "loss": 0.3278282880783081, "memory_gb": 7.721559524536133, "step_time_ms": 
3358.6761951446533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:37] (step=0014110) Train Loss: 0.3278, Train Steps/Sec: 0.28, Epoch: 0.27419354838709675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14111, "loss": 0.28142088651657104, "memory_gb": 7.721559524536133, "step_time_ms": 3355.785131454468, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:41] (step=0014111) Train Loss: 0.2302, Train Steps/Sec: 0.28, Epoch: 0.2742129809560824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14112, "loss": 0.18887698650360107, "memory_gb": 7.721559524536133, "step_time_ms": 3356.013059616089, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:45] (step=0014112) Train Loss: 0.1582, Train Steps/Sec: 0.28, Epoch: 0.274232413525068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14113, "loss": 0.2528850734233856, "memory_gb": 7.721559524536133, "step_time_ms": 3357.079029083252, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:48] (step=0014113) Train Loss: 0.2520, Train Steps/Sec: 0.28, Epoch: 0.2742518460940536, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14114, "loss": 0.17112380266189575, "memory_gb": 7.721559524536133, "step_time_ms": 3350.1861095428467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:52] (step=0014114) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.27427127866303924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14115, "loss": 0.21362268924713135, "memory_gb": 7.721559524536133, "step_time_ms": 3354.947328567505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:55] (step=0014115) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 
0.27429071123202486, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:06:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14116, "loss": 0.28071099519729614, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2564849853516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:06:59] (step=0014116) Train Loss: 0.2291, Train Steps/Sec: 0.28, Epoch: 0.2743101438010105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14117, "loss": 0.2647809684276581, "memory_gb": 7.721559524536133, "step_time_ms": 3355.132579803467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:03] (step=0014117) Train Loss: 0.2684, Train Steps/Sec: 0.28, Epoch: 0.2743295763699961, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14118, "loss": 0.1253366768360138, "memory_gb": 7.721559524536133, "step_time_ms": 3356.537342071533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:06] (step=0014118) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.27434900893898173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14119, "loss": 0.16446912288665771, "memory_gb": 7.721559524536133, "step_time_ms": 3354.078769683838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:10] (step=0014119) Train Loss: 0.1871, Train Steps/Sec: 0.28, Epoch: 0.27436844150796735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14120, "loss": 0.2583789527416229, "memory_gb": 7.721559524536133, "step_time_ms": 3358.201503753662, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:13] (step=0014120) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.274387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14121, "loss": 0.13165651261806488, 
"memory_gb": 7.721559524536133, "step_time_ms": 3354.8378944396973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:17] (step=0014121) Train Loss: 0.1367, Train Steps/Sec: 0.28, Epoch: 0.2744073066459386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14122, "loss": 0.21245954930782318, "memory_gb": 7.721559524536133, "step_time_ms": 3354.119062423706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:21] (step=0014122) Train Loss: 0.2818, Train Steps/Sec: 0.28, Epoch: 0.2744267392149242, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14123, "loss": 0.30548304319381714, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0664863586426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:24] (step=0014123) Train Loss: 0.2969, Train Steps/Sec: 0.28, Epoch: 0.27444617178390984, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14124, "loss": 0.14067144691944122, "memory_gb": 7.721559524536133, "step_time_ms": 3351.759433746338, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:28] (step=0014124) Train Loss: 0.1388, Train Steps/Sec: 0.28, Epoch: 0.27446560435289546, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14125, "loss": 0.27885758876800537, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5170040130615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:31] (step=0014125) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.2744850369218811, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14126, "loss": 0.29285579919815063, "memory_gb": 7.721559524536133, "step_time_ms": 3357.165813446045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:35] (step=0014126) Train Loss: 
0.2712, Train Steps/Sec: 0.28, Epoch: 0.2745044694908667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14127, "loss": 0.2685946524143219, "memory_gb": 7.721559524536133, "step_time_ms": 3354.4230461120605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:38] (step=0014127) Train Loss: 0.2951, Train Steps/Sec: 0.28, Epoch: 0.27452390205985233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14128, "loss": 0.30305173993110657, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0478172302246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:42] (step=0014128) Train Loss: 0.2576, Train Steps/Sec: 0.28, Epoch: 0.27454333462883795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14129, "loss": 0.3320843577384949, "memory_gb": 7.721559524536133, "step_time_ms": 3353.623151779175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:46] (step=0014129) Train Loss: 0.3118, Train Steps/Sec: 0.28, Epoch: 0.2745627671978236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14130, "loss": 0.19201847910881042, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9536418914795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:49] (step=0014130) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.2745821997668092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14131, "loss": 0.2281697392463684, "memory_gb": 7.721559524536133, "step_time_ms": 3354.891777038574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:53] (step=0014131) Train Loss: 0.1842, Train Steps/Sec: 0.28, Epoch: 0.27460163233579477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:07:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 
14132, "loss": 0.2982231378555298, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9587421417236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:07:56] (step=0014132) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.2746210649047804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:08:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14133, "loss": 0.19112087786197662, "memory_gb": 7.721559524536133, "step_time_ms": 3358.349323272705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:08:00] (step=0014133) Train Loss: 0.2110, Train Steps/Sec: 0.27, Epoch: 0.274640497473766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14134, "loss": 0.28918007016181946, "memory_gb": 7.721559524536133, "step_time_ms": 3354.753255844116, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:08:04] (step=0014134) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.27465993004275163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:08:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14135, "loss": 0.27114126086235046, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8317375183105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:08:07] (step=0014135) Train Loss: 0.2606, Train Steps/Sec: 0.28, Epoch: 0.27467936261173725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:08:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14136, "loss": 0.2661988437175751, "memory_gb": 7.721559524536133, "step_time_ms": 3343.2013988494873, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:08:11] (step=0014136) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.2746987951807229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:08:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14137, "loss": 0.3027254343032837, "memory_gb": 7.721559524536133, "step_time_ms": 3359.516382217407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:08:14] 
(step=0014137) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.2747182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14138, "loss": 0.1930702030658722, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5500717163086, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:18] (step=0014138) Train Loss: 0.1535, Train Steps/Sec: 0.28, Epoch: 0.2747376603186941, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14139, "loss": 0.19335216283798218, "memory_gb": 7.721559524536133, "step_time_ms": 3354.0399074554443, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:22] (step=0014139) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.27475709288767974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14140, "loss": 0.26587796211242676, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0775260925293, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:25] (step=0014140) Train Loss: 0.2465, Train Steps/Sec: 0.28, Epoch: 0.27477652545666537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14141, "loss": 0.23909761011600494, "memory_gb": 7.721559524536133, "step_time_ms": 3354.062557220459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:29] (step=0014141) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 0.274795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14142, "loss": 0.19472923874855042, "memory_gb": 7.721559524536133, "step_time_ms": 3494.652271270752, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:32] (step=0014142) Train Loss: 0.2472, Train Steps/Sec: 0.28, Epoch: 0.2748153905946366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14143, "loss": 0.18297898769378662, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0643920898438, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:36] (step=0014143) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.27483482316362223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14144, "loss": 0.19979003071784973, "memory_gb": 7.721559524536133, "step_time_ms": 3349.172830581665, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:39] (step=0014144) Train Loss: 0.2439, Train Steps/Sec: 0.28, Epoch: 0.27485425573260786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14145, "loss": 0.2098952829837799, "memory_gb": 7.721559524536133, "step_time_ms": 3355.586051940918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:43] (step=0014145) Train Loss: 0.1855, Train Steps/Sec: 0.28, Epoch: 0.2748736883015935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14146, "loss": 0.19323526322841644, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0547313690186, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:47] (step=0014146) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.2748931208705791, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14147, "loss": 0.23366476595401764, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7862510681152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:50] (step=0014147) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.2749125534395647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14148, "loss": 0.20609936118125916, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4144325256348, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:54] (step=0014148) Train Loss: 0.2021, Train Steps/Sec: 0.28, Epoch: 0.27493198600855034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:08:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14149, "loss": 0.14083755016326904, "memory_gb": 7.721559524536133, "step_time_ms": 3361.572027206421, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:08:57] (step=0014149) Train Loss: 0.1695, Train Steps/Sec: 0.28, Epoch: 0.27495141857753597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14150, "loss": 0.28065162897109985, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9630546569824, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:01] (step=0014150) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.2749708511465216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14151, "loss": 0.2538852393627167, "memory_gb": 7.721559524536133, "step_time_ms": 3363.969326019287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:05] (step=0014151) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.2749902837155072, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14152, "loss": 0.24001114070415497, "memory_gb": 7.721559524536133, "step_time_ms": 3360.168933868408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:08] (step=0014152) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.27500971628449283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14153, "loss": 0.24538585543632507, "memory_gb": 7.721559524536133, "step_time_ms": 3359.099864959717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:12] (step=0014153) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.27502914885347846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14154, "loss": 0.29767152667045593, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8721237182617, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:15] (step=0014154) Train Loss: 0.2233, Train Steps/Sec: 0.28, Epoch: 0.275048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14155, "loss": 0.28518927097320557, "memory_gb": 7.721559524536133, "step_time_ms": 3344.744920730591, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:19] (step=0014155) Train Loss: 0.2336, Train Steps/Sec: 0.28, Epoch: 0.27506801399144964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14156, "loss": 0.2200477123260498, "memory_gb": 7.721559524536133, "step_time_ms": 3356.813669204712, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:22] (step=0014156) Train Loss: 0.2099, Train Steps/Sec: 0.28, Epoch: 0.27508744656043527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14157, "loss": 0.28604090213775635, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9885234832764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:26] (step=0014157) Train Loss: 0.2057, Train Steps/Sec: 0.28, Epoch: 0.2751068791294209, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14158, "loss": 0.228824183344841, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1879863739014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:30] (step=0014158) Train Loss: 0.2848, Train Steps/Sec: 0.28, Epoch: 0.2751263116984065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14159, "loss": 0.250286728143692, "memory_gb": 7.721559524536133, "step_time_ms": 3360.048294067383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:33] (step=0014159) Train Loss: 0.2329, Train Steps/Sec: 0.28, Epoch: 0.27514574426739213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14160, "loss": 0.2320832908153534, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1137866973877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:37] (step=0014160) Train Loss: 0.2907, Train Steps/Sec: 0.28, Epoch: 0.27516517683637776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14161, "loss": 0.23366889357566833, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8101444244385, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:40] (step=0014161) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.2751846094053634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14162, "loss": 0.21913321316242218, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8953742980957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:44] (step=0014162) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.275204041974349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14163, "loss": 0.28463655710220337, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7446212768555, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:47] (step=0014163) Train Loss: 0.2423, Train Steps/Sec: 0.28, Epoch: 0.2752234745433346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14164, "loss": 0.2059553563594818, "memory_gb": 7.721559524536133, "step_time_ms": 3362.79296875, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:51] (step=0014164) Train Loss: 0.2904, Train Steps/Sec: 0.28, Epoch: 0.27524290711232025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14165, "loss": 0.2474159300327301, "memory_gb": 7.721559524536133, "step_time_ms": 3354.7074794769287, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:55] (step=0014165) Train Loss: 0.2638, Train Steps/Sec: 0.28, Epoch: 0.27526233968130587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:09:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14166, "loss": 0.28587087988853455, "memory_gb": 7.721559524536133, "step_time_ms": 3355.520725250244, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:09:58] (step=0014166) Train Loss: 0.2430, Train Steps/Sec: 0.28, Epoch: 0.2752817722502915, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14167, "loss": 0.19804687798023224, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8176918029785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:02] (step=0014167) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.2753012048192771, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14168, "loss": 0.16604217886924744, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7988147735596, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:05] (step=0014168) Train Loss: 0.1485, Train Steps/Sec: 0.28, Epoch: 0.27532063738826273, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14169, "loss": 0.22103199362754822, "memory_gb": 7.721559524536133, "step_time_ms": 3355.0665378570557, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:09] (step=0014169) Train Loss: 0.1757, Train Steps/Sec: 0.28, Epoch: 0.27534006995724836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14170, "loss": 0.28993695974349976, "memory_gb": 7.721559524536133, "step_time_ms": 3357.353925704956, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:13] (step=0014170) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.275359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14171, "loss": 0.14689499139785767, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2709770202637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:16] (step=0014171) Train Loss: 0.1911, Train Steps/Sec: 0.28, Epoch: 0.2753789350952196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14172, "loss": 0.24397259950637817, "memory_gb": 7.715639114379883, "step_time_ms": 3329.3426036834717, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:20] (step=0014172) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.2753983676642052, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14173, "loss": 0.2645293176174164, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4659519195557, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:23] (step=0014173) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.27541780023319085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14174, "loss": 0.2322874814271927, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6530170440674, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:27] (step=0014174) Train Loss: 0.2661, Train Steps/Sec: 0.27, Epoch: 0.27543723280217647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14175, "loss": 0.3446548581123352, "memory_gb": 7.721559524536133, "step_time_ms": 3361.485481262207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:31] (step=0014175) Train Loss: 0.2792, Train Steps/Sec: 0.28, Epoch: 0.2754566653711621, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14176, "loss": 0.2486010491847992, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1459426879883, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:34] (step=0014176) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.2754760979401477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14177, "loss": 0.22285205125808716, "memory_gb": 7.721559524536133, "step_time_ms": 3363.028049468994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:38] (step=0014177) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 0.27549553050913334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14178, "loss": 0.23817762732505798, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0502433776855, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:41] (step=0014178) Train Loss: 0.2285, Train Steps/Sec: 0.28, Epoch: 0.2755149630781189, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14179, "loss": 0.2603302597999573, "memory_gb": 7.721559524536133, "step_time_ms": 3362.727165222168, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:45] (step=0014179) Train Loss: 0.2605, Train Steps/Sec: 0.28, Epoch: 0.2755343956471045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14180, "loss": 0.17869016528129578, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1025390625, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:49] (step=0014180) Train Loss: 0.2097, Train Steps/Sec: 0.28, Epoch: 0.27555382821609015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14181, "loss": 0.25306519865989685, "memory_gb": 7.721559524536133, "step_time_ms": 3364.305257797241, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:52] (step=0014181) Train Loss: 0.2156, Train Steps/Sec: 0.28, Epoch: 0.27557326078507577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14182, "loss": 0.30259597301483154, "memory_gb": 7.721559524536133, "step_time_ms": 3361.769199371338, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:56] (step=0014182) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.2755926933540614, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:10:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14183, "loss": 0.2352066934108734, "memory_gb": 7.721559524536133, "step_time_ms": 3363.321542739868, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:10:59] (step=0014183) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.275612125923047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14184, "loss": 0.34414711594581604, "memory_gb": 7.721559524536133, "step_time_ms": 3363.447904586792, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:03] (step=0014184) Train Loss: 0.2700, Train Steps/Sec: 0.28, Epoch: 0.27563155849203264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14185, "loss": 0.23655709624290466, "memory_gb": 7.721559524536133, "step_time_ms": 3372.128963470459, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:07] (step=0014185) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.27565099106101826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14186, "loss": 0.16121062636375427, "memory_gb": 7.721559524536133, "step_time_ms": 3368.572235107422, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:10] (step=0014186) Train Loss: 0.1957, Train Steps/Sec: 0.28, Epoch: 0.2756704236300039, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14187, "loss": 0.3462088704109192, "memory_gb": 7.721559524536133, "step_time_ms": 3362.882614135742, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:14] (step=0014187) Train Loss: 0.3056, Train Steps/Sec: 0.28, Epoch: 0.2756898561989895, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14188, "loss": 0.3371417820453644, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0104999542236, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:17] (step=0014188) Train Loss: 0.2847, Train Steps/Sec: 0.28, Epoch: 0.2757092887679751, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14189, "loss": 0.17154070734977722, "memory_gb": 7.721559524536133, "step_time_ms": 3363.567352294922, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:21] (step=0014189) Train Loss: 0.1953, Train Steps/Sec: 0.28, Epoch: 0.27572872133696075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14190, "loss": 0.1728818416595459, "memory_gb": 7.721559524536133, "step_time_ms": 3500.5710124969482, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:25] (step=0014190) Train Loss: 0.1816, Train Steps/Sec: 0.28, Epoch: 0.27574815390594637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14191, "loss": 0.19012558460235596, "memory_gb": 7.721559524536133, "step_time_ms": 3362.616539001465, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:28] (step=0014191) Train Loss: 0.2199, Train Steps/Sec: 0.28, Epoch: 0.275767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14192, "loss": 0.22020074725151062, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8411502838135, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:32] (step=0014192) Train Loss: 0.1885, Train Steps/Sec: 0.28, Epoch: 0.2757870190439176, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14193, "loss": 0.1650189757347107, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0525341033936, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:35] (step=0014193) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.27580645161290324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14194, "loss": 0.3036803901195526, "memory_gb": 7.721559524536133, "step_time_ms": 3356.586217880249, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:39] (step=0014194) Train Loss: 0.3121, Train Steps/Sec: 0.28, Epoch: 0.27582588418188886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14195, "loss": 0.29240304231643677, "memory_gb": 7.715639114379883, "step_time_ms": 3329.716920852661, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:43] (step=0014195) Train Loss: 0.2594, Train Steps/Sec: 0.28, Epoch: 0.2758453167508745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14196, "loss": 0.29702669382095337, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9468002319336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:46] (step=0014196) Train Loss: 0.3281, Train Steps/Sec: 0.28, Epoch: 0.2758647493198601, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14197, "loss": 0.2188253402709961, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1112365722656, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:50] (step=0014197) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.2758841818888457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14198, "loss": 0.17525607347488403, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7584190368652, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:53] (step=0014198) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.27590361445783135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:11:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14199, "loss": 0.2961980104446411, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1435165405273, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:11:57] (step=0014199) Train Loss: 0.2455, Train Steps/Sec: 0.28, Epoch: 0.27592304702681697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14200, "loss": 0.276211678981781, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2772064208984, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:01] (step=0014200) Train Loss: 0.2642, Train Steps/Sec: 0.28, Epoch: 0.2759424795958026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14201, "loss": 0.22880800068378448, "memory_gb": 7.721559524536133, "step_time_ms": 3364.319324493408, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:04] (step=0014201) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.27596191216478816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14202, "loss": 0.2603829503059387, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5801544189453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:08] (step=0014202) Train Loss: 0.2775, Train Steps/Sec: 0.28, Epoch: 0.2759813447337738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14203, "loss": 0.3889043629169464, "memory_gb": 7.721559524536133, "step_time_ms": 3362.079381942749, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:11] (step=0014203) Train Loss: 0.2913, Train Steps/Sec: 0.28, Epoch: 0.2760007773027594, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14204, "loss": 0.2947232127189636, "memory_gb": 7.721559524536133, "step_time_ms": 3354.1128635406494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:15] (step=0014204) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.276020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14205, "loss": 0.22345396876335144, "memory_gb": 7.721559524536133, "step_time_ms": 3361.978530883789, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:19] (step=0014205) Train Loss: 0.2192, Train Steps/Sec: 0.28, Epoch: 0.27603964244073065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14206, "loss": 0.1964285671710968, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6277027130127, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:22] (step=0014206) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.27605907500971627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14207, "loss": 0.2696945071220398, "memory_gb": 7.721559524536133, "step_time_ms": 3361.553430557251, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:26] (step=0014207) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.2760785075787019, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14208, "loss": 0.3117297291755676, "memory_gb": 7.721559524536133, "step_time_ms": 3355.046510696411, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:29] (step=0014208) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.2760979401476875, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14209, "loss": 0.2611795961856842, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2652130126953, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:33] (step=0014209) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.27611737271667314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14210, "loss": 0.12977316975593567, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5455837249756, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:37] (step=0014210) Train Loss: 0.1676, Train Steps/Sec: 0.28, Epoch: 0.27613680528565876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14211, "loss": 0.305213987827301, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9331588745117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:40] (step=0014211) Train Loss: 0.2783, Train Steps/Sec: 0.28, Epoch: 0.2761562378546444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14212, "loss": 0.23405198752880096, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7655086517334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:44] (step=0014212) Train Loss: 0.2189, Train Steps/Sec: 0.28, Epoch: 0.27617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14213, "loss": 0.20294207334518433, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2656593322754, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:47] (step=0014213) Train Loss: 0.2118, Train Steps/Sec: 0.28, Epoch: 0.2761951029926156, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14214, "loss": 0.20196834206581116, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9984645843506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:51] (step=0014214) Train Loss: 0.2025, Train Steps/Sec: 0.28, Epoch: 0.27621453556160125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14215, "loss": 0.1851814240217209, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2309017181396, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:55] (step=0014215) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.27623396813058687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:12:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14216, "loss": 0.2551545202732086, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3459644317627, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:12:58] (step=0014216) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.2762534006995725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14217, "loss": 0.26943254470825195, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4206314086914, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:02] (step=0014217) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.2762728332685581, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14218, "loss": 0.17738237977027893, "memory_gb": 7.721559524536133, "step_time_ms": 3357.3737144470215, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:05] (step=0014218) Train Loss: 0.1668, Train Steps/Sec: 0.28, Epoch: 0.27629226583754374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14219, "loss": 0.2180730104446411, "memory_gb": 7.721559524536133, "step_time_ms": 3358.440637588501, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:09] (step=0014219) Train Loss: 0.3126, Train Steps/Sec: 0.28, Epoch: 0.27631169840652936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14220, "loss": 0.33517730236053467, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8575592041016, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:13] (step=0014220) Train Loss: 0.2426, Train Steps/Sec: 0.28, Epoch: 0.276331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14221, "loss": 0.2941482365131378, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8218479156494, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:16] (step=0014221) Train Loss: 0.2447, Train Steps/Sec: 0.28, Epoch: 0.2763505635445006, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14222, "loss": 0.3323914408683777, "memory_gb": 7.715639114379883, "step_time_ms": 3317.18373298645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:20] (step=0014222) Train Loss: 0.3218, Train Steps/Sec: 0.27, Epoch: 0.2763699961134862, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14223, "loss": 0.23763424158096313, "memory_gb": 7.721559524536133, "step_time_ms": 3356.076240539551, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:23] (step=0014223) Train Loss: 0.2266, Train Steps/Sec: 0.28, Epoch: 0.27638942868247185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14224, "loss": 0.3549990653991699, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8152446746826, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:27] (step=0014224) Train Loss: 0.3049, Train Steps/Sec: 0.28, Epoch: 0.2764088612514574, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14225, "loss": 0.2656829059123993, "memory_gb": 7.721559524536133, "step_time_ms": 3347.700357437134, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:31] (step=0014225) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.27642829382044304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14226, "loss": 0.3011849522590637, "memory_gb": 7.721559524536133, "step_time_ms": 3361.640453338623, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:34] (step=0014226) Train Loss: 0.2554, Train Steps/Sec: 0.28, Epoch: 0.27644772638942866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14227, "loss": 0.18338313698768616, "memory_gb": 7.721559524536133, "step_time_ms": 3354.933738708496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:38] (step=0014227) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.2764671589584143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14228, "loss": 0.2124459147453308, "memory_gb": 7.721559524536133, "step_time_ms": 3358.781099319458, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:41] (step=0014228) Train Loss: 0.2623, Train Steps/Sec: 0.28, Epoch: 0.2764865915273999, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14229, "loss": 0.238429993391037, "memory_gb": 7.721559524536133, "step_time_ms": 3358.2375049591064, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:45] (step=0014229) Train Loss: 0.2739, Train Steps/Sec: 0.28, Epoch: 0.2765060240963855, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14230, "loss": 0.33991673588752747, "memory_gb": 7.721559524536133, "step_time_ms": 3495.2495098114014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:49] (step=0014230) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.27652545666537115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14231, "loss": 0.18737825751304626, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6018085479736, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:52] (step=0014231) Train Loss: 0.2502, Train Steps/Sec: 0.28, Epoch: 0.27654488923435677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14232, "loss": 0.3134157061576843, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3358268737793, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:56] (step=0014232) Train Loss: 0.2795, Train Steps/Sec: 0.28, Epoch: 0.2765643218033424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:13:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14233, "loss": 0.09203892946243286, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6971759796143, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:13:59] (step=0014233) Train Loss: 0.1480, Train Steps/Sec: 0.28, Epoch: 0.276583754372328, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14234, "loss": 0.2068134844303131, "memory_gb": 7.721559524536133, "step_time_ms": 3349.6034145355225, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:03] (step=0014234) Train Loss: 0.2150, Train Steps/Sec: 0.28, Epoch: 0.27660318694131364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14235, "loss": 0.2307395190000534, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2136840820312, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:07] (step=0014235) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.27662261951029926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14236, "loss": 0.34016305208206177, "memory_gb": 7.721559524536133, "step_time_ms": 3360.203742980957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:10] (step=0014236) Train Loss: 0.3378, Train Steps/Sec: 0.28, Epoch: 0.2766420520792849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14237, "loss": 0.3410465717315674, "memory_gb": 7.721559524536133, "step_time_ms": 3356.325387954712, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:14] (step=0014237) Train Loss: 0.3018, Train Steps/Sec: 0.28, Epoch: 0.2766614846482705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14238, "loss": 0.1751953363418579, "memory_gb": 7.721559524536133, "step_time_ms": 3352.8337478637695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:17] (step=0014238) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.27668091721725613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14239, "loss": 0.16528891026973724, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3340644836426, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:21] (step=0014239) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.27670034978624175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14240, "loss": 0.14577996730804443, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1109981536865, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:24] (step=0014240) Train Loss: 0.2191, Train Steps/Sec: 0.28, Epoch: 0.2767197823552274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14241, "loss": 0.2163795381784439, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1020832061768, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:28] (step=0014241) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.276739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14242, "loss": 0.3376457095146179, "memory_gb": 7.721559524536133, "step_time_ms": 3358.651638031006, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:32] (step=0014242) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.2767586474931986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14243, "loss": 0.16087044775485992, "memory_gb": 7.721559524536133, "step_time_ms": 3359.180212020874, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:35] (step=0014243) Train Loss: 0.1895, Train Steps/Sec: 0.28, Epoch: 0.27677808006218424, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14244, "loss": 0.19491547346115112, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6951541900635, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:39] (step=0014244) Train Loss: 0.2234, Train Steps/Sec: 0.28, Epoch: 0.27679751263116986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14245, "loss": 0.2739981412887573, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8667640686035, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:42] (step=0014245) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.2768169452001555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14246, "loss": 0.20470455288887024, "memory_gb": 7.721559524536133, "step_time_ms": 3354.151725769043, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:46] (step=0014246) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 0.2768363777691411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14247, "loss": 0.24578045308589935, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3718795776367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:50] (step=0014247) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.2768558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14248, "loss": 0.23095344007015228, "memory_gb": 7.721559524536133, "step_time_ms": 3353.00874710083, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:53] (step=0014248) Train Loss: 0.2626, Train Steps/Sec: 0.28, Epoch: 0.2768752429071123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:14:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14249, "loss": 0.14912211894989014, "memory_gb": 7.721559524536133, "step_time_ms": 3355.318546295166, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:14:57] (step=0014249) Train Loss: 0.1711, Train Steps/Sec: 0.28, Epoch: 0.2768946754760979, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:15:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14250, "loss": 0.17456720769405365, "memory_gb": 7.721559524536133, "step_time_ms": 3350.219488143921, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:15:00] (step=0014250) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.27691410804508354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:15:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14251, "loss": 0.24194268882274628, "memory_gb": 7.721559524536133, "step_time_ms": 3359.675168991089, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:15:04] (step=0014251) Train Loss: 0.2417, Train Steps/Sec: 0.28, Epoch: 0.27693354061406916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:15:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14252, "loss": 0.26336199045181274,
"memory_gb": 7.721559524536133, "step_time_ms": 3358.430862426758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:07] (step=0014252) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.2769529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14253, "loss": 0.10699932277202606, "memory_gb": 7.721559524536133, "step_time_ms": 3339.667320251465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:11] (step=0014253) Train Loss: 0.2056, Train Steps/Sec: 0.28, Epoch: 0.2769724057520404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14254, "loss": 0.18391242623329163, "memory_gb": 7.721559524536133, "step_time_ms": 3356.973171234131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:15] (step=0014254) Train Loss: 0.1738, Train Steps/Sec: 0.28, Epoch: 0.27699183832102603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14255, "loss": 0.3712019920349121, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7089309692383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:18] (step=0014255) Train Loss: 0.2677, Train Steps/Sec: 0.28, Epoch: 0.27701127089001165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14256, "loss": 0.147216334939003, "memory_gb": 7.721559524536133, "step_time_ms": 3346.9338417053223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:22] (step=0014256) Train Loss: 0.1493, Train Steps/Sec: 0.28, Epoch: 0.2770307034589973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14257, "loss": 0.22068165242671967, "memory_gb": 7.721559524536133, "step_time_ms": 3339.214563369751, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:25] (step=0014257) Train Loss: 0.2708, 
Train Steps/Sec: 0.28, Epoch: 0.2770501360279829, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14258, "loss": 0.21951410174369812, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6173992156982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:29] (step=0014258) Train Loss: 0.2784, Train Steps/Sec: 0.28, Epoch: 0.2770695685969685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14259, "loss": 0.2198287844657898, "memory_gb": 7.721559524536133, "step_time_ms": 3341.583728790283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:32] (step=0014259) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.27708900116595414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14260, "loss": 0.22544148564338684, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6763610839844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:36] (step=0014260) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.27710843373493976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14261, "loss": 0.22855642437934875, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8341217041016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:39] (step=0014261) Train Loss: 0.2328, Train Steps/Sec: 0.28, Epoch: 0.2771278663039254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14262, "loss": 0.1413346827030182, "memory_gb": 7.721559524536133, "step_time_ms": 3353.905439376831, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:43] (step=0014262) Train Loss: 0.2355, Train Steps/Sec: 0.27, Epoch: 0.277147298872911, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14263, "loss": 
0.19081875681877136, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0053062438965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:47] (step=0014263) Train Loss: 0.2222, Train Steps/Sec: 0.28, Epoch: 0.27716673144189663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14264, "loss": 0.34774109721183777, "memory_gb": 7.721559524536133, "step_time_ms": 3362.889528274536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:50] (step=0014264) Train Loss: 0.3153, Train Steps/Sec: 0.28, Epoch: 0.27718616401088225, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14265, "loss": 0.1326308399438858, "memory_gb": 7.721559524536133, "step_time_ms": 3355.717182159424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:54] (step=0014265) Train Loss: 0.1716, Train Steps/Sec: 0.28, Epoch: 0.2772055965798679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:15:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14266, "loss": 0.18336674571037292, "memory_gb": 7.721559524536133, "step_time_ms": 3362.639904022217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:15:58] (step=0014266) Train Loss: 0.1798, Train Steps/Sec: 0.28, Epoch: 0.2772250291488535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14267, "loss": 0.1727120280265808, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4660968780518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:01] (step=0014267) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.2772444617178391, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14268, "loss": 0.2409178465604782, "memory_gb": 7.721559524536133, "step_time_ms": 3356.538772583008, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:05] (step=0014268) 
Train Loss: 0.2613, Train Steps/Sec: 0.28, Epoch: 0.27726389428682474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14269, "loss": 0.2518855631351471, "memory_gb": 7.721559524536133, "step_time_ms": 3361.576557159424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:08] (step=0014269) Train Loss: 0.2454, Train Steps/Sec: 0.28, Epoch: 0.27728332685581036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14270, "loss": 0.24467423558235168, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3617420196533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:12] (step=0014270) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.27730275942479593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14271, "loss": 0.2568381428718567, "memory_gb": 7.721559524536133, "step_time_ms": 3367.443561553955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:15] (step=0014271) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.27732219199378155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14272, "loss": 0.2347201108932495, "memory_gb": 7.721559524536133, "step_time_ms": 3364.525079727173, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:19] (step=0014272) Train Loss: 0.2080, Train Steps/Sec: 0.28, Epoch: 0.2773416245627672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14273, "loss": 0.33844703435897827, "memory_gb": 7.721559524536133, "step_time_ms": 3365.447759628296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:23] (step=0014273) Train Loss: 0.2926, Train Steps/Sec: 0.28, Epoch: 0.2773610571317528, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:26] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 14274, "loss": 0.21759533882141113, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1514778137207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:26] (step=0014274) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 0.2773804897007384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14275, "loss": 0.25349563360214233, "memory_gb": 7.721559524536133, "step_time_ms": 3353.273391723633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:30] (step=0014275) Train Loss: 0.2052, Train Steps/Sec: 0.28, Epoch: 0.27739992226972404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14276, "loss": 0.15694095194339752, "memory_gb": 7.721559524536133, "step_time_ms": 3364.91322517395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:33] (step=0014276) Train Loss: 0.1791, Train Steps/Sec: 0.28, Epoch: 0.27741935483870966, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14277, "loss": 0.20143866539001465, "memory_gb": 7.721559524536133, "step_time_ms": 3528.212070465088, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:37] (step=0014277) Train Loss: 0.2321, Train Steps/Sec: 0.28, Epoch: 0.2774387874076953, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14278, "loss": 0.2700902819633484, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7490062713623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:41] (step=0014278) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.2774582199766809, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14279, "loss": 0.2388264536857605, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7054176330566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
14:16:44] (step=0014279) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.27747765254566653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14280, "loss": 0.263349711894989, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1252307891846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:48] (step=0014280) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.27749708511465215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14281, "loss": 0.14609259366989136, "memory_gb": 7.721559524536133, "step_time_ms": 3367.236375808716, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:51] (step=0014281) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.2775165176836378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14282, "loss": 0.22169017791748047, "memory_gb": 7.721559524536133, "step_time_ms": 3367.979049682617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:55] (step=0014282) Train Loss: 0.1851, Train Steps/Sec: 0.28, Epoch: 0.2775359502526234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:16:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14283, "loss": 0.2024400532245636, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9520149230957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:16:59] (step=0014283) Train Loss: 0.2560, Train Steps/Sec: 0.28, Epoch: 0.277555382821609, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14284, "loss": 0.15578380227088928, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3547763824463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:02] (step=0014284) Train Loss: 0.2333, Train Steps/Sec: 0.28, Epoch: 0.27757481539059464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:06] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14285, "loss": 0.19823652505874634, "memory_gb": 7.721559524536133, "step_time_ms": 3365.056276321411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:06] (step=0014285) Train Loss: 0.2415, Train Steps/Sec: 0.28, Epoch: 0.27759424795958026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14286, "loss": 0.19070670008659363, "memory_gb": 7.721559524536133, "step_time_ms": 3362.802505493164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:09] (step=0014286) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.2776136805285659, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14287, "loss": 0.24814434349536896, "memory_gb": 7.721559524536133, "step_time_ms": 3368.964195251465, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:13] (step=0014287) Train Loss: 0.2048, Train Steps/Sec: 0.28, Epoch: 0.2776331130975515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14288, "loss": 0.2506334185600281, "memory_gb": 7.721559524536133, "step_time_ms": 3367.130756378174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:17] (step=0014288) Train Loss: 0.1804, Train Steps/Sec: 0.28, Epoch: 0.27765254566653713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14289, "loss": 0.199387788772583, "memory_gb": 7.715639114379883, "step_time_ms": 3336.7581367492676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:20] (step=0014289) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.27767197823552275, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14290, "loss": 0.30473893880844116, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2280101776123, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 14:17:24] (step=0014290) Train Loss: 0.2821, Train Steps/Sec: 0.28, Epoch: 0.2776914108045084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14291, "loss": 0.22224661707878113, "memory_gb": 7.721559524536133, "step_time_ms": 3363.837718963623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:27] (step=0014291) Train Loss: 0.1922, Train Steps/Sec: 0.28, Epoch: 0.277710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14292, "loss": 0.1736728847026825, "memory_gb": 7.721559524536133, "step_time_ms": 3370.090961456299, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:31] (step=0014292) Train Loss: 0.1856, Train Steps/Sec: 0.28, Epoch: 0.2777302759424796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14293, "loss": 0.2868182957172394, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8417415618896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:35] (step=0014293) Train Loss: 0.2969, Train Steps/Sec: 0.28, Epoch: 0.27774970851146524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14294, "loss": 0.31141823530197144, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3711338043213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:38] (step=0014294) Train Loss: 0.2655, Train Steps/Sec: 0.28, Epoch: 0.2777691410804508, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14295, "loss": 0.26592910289764404, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1889095306396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:42] (step=0014295) Train Loss: 0.2093, Train Steps/Sec: 0.28, Epoch: 0.27778857364943643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 14:17:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14296, "loss": 0.18032369017601013, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9843463897705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:45] (step=0014296) Train Loss: 0.1963, Train Steps/Sec: 0.28, Epoch: 0.27780800621842205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14297, "loss": 0.16006681323051453, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7754192352295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:49] (step=0014297) Train Loss: 0.1719, Train Steps/Sec: 0.28, Epoch: 0.2778274387874077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14298, "loss": 0.2907911241054535, "memory_gb": 7.721559524536133, "step_time_ms": 3371.713161468506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:53] (step=0014298) Train Loss: 0.2791, Train Steps/Sec: 0.28, Epoch: 0.2778468713563933, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:17:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14299, "loss": 0.22861386835575104, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7168827056885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:17:56] (step=0014299) Train Loss: 0.2087, Train Steps/Sec: 0.28, Epoch: 0.2778663039253789, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14300, "loss": 0.22111362218856812, "memory_gb": 7.721559524536133, "step_time_ms": 3372.340202331543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:00] (step=0014300) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.27788573649436454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14301, "loss": 0.22475174069404602, "memory_gb": 7.721559524536133, "step_time_ms": 3373.7053871154785, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:03] (step=0014301) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.27790516906335017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14302, "loss": 0.26836860179901123, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8642768859863, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:07] (step=0014302) Train Loss: 0.2163, Train Steps/Sec: 0.28, Epoch: 0.2779246016323358, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14303, "loss": 0.29465633630752563, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0766563415527, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:11] (step=0014303) Train Loss: 0.2477, Train Steps/Sec: 0.28, Epoch: 0.2779440342013214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14304, "loss": 0.19606168568134308, "memory_gb": 7.721559524536133, "step_time_ms": 3369.999408721924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:14] (step=0014304) Train Loss: 0.1835, Train Steps/Sec: 0.28, Epoch: 0.27796346677030703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14305, "loss": 0.21223914623260498, "memory_gb": 7.721559524536133, "step_time_ms": 3367.7406311035156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:18] (step=0014305) Train Loss: 0.1936, Train Steps/Sec: 0.28, Epoch: 0.27798289933929266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14306, "loss": 0.1917726844549179, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9352321624756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:21] (step=0014306) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.2780023319082783, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14307, "loss": 0.26491835713386536, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1397857666016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:25] (step=0014307) Train Loss: 0.2476, Train Steps/Sec: 0.28, Epoch: 0.2780217644772639, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14308, "loss": 0.24793124198913574, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2818927764893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:29] (step=0014308) Train Loss: 0.2610, Train Steps/Sec: 0.28, Epoch: 0.2780411970462495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14309, "loss": 0.2224861979484558, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6137924194336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:32] (step=0014309) Train Loss: 0.2628, Train Steps/Sec: 0.27, Epoch: 0.27806062961523514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14310, "loss": 0.21559883654117584, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1132583618164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:36] (step=0014310) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.27808006218422077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14311, "loss": 0.20412540435791016, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5779361724854, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:40] (step=0014311) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.2780994947532064, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14312, "loss": 0.24002289772033691, "memory_gb": 7.721559524536133, 
"step_time_ms": 3364.008903503418, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:43] (step=0014312) Train Loss: 0.1824, Train Steps/Sec: 0.28, Epoch: 0.278118927322192, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14313, "loss": 0.30816107988357544, "memory_gb": 7.721559524536133, "step_time_ms": 3366.87970161438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:47] (step=0014313) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.27813835989117763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14314, "loss": 0.18060778081417084, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0527725219727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:50] (step=0014314) Train Loss: 0.1693, Train Steps/Sec: 0.28, Epoch: 0.27815779246016326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14315, "loss": 0.15457497537136078, "memory_gb": 7.721559524536133, "step_time_ms": 3368.088483810425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:54] (step=0014315) Train Loss: 0.1712, Train Steps/Sec: 0.28, Epoch: 0.2781772250291489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:18:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14316, "loss": 0.24426835775375366, "memory_gb": 7.721559524536133, "step_time_ms": 3367.65718460083, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:18:58] (step=0014316) Train Loss: 0.2694, Train Steps/Sec: 0.28, Epoch: 0.2781966575981345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14317, "loss": 0.2719624638557434, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1465435028076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:01] (step=0014317) Train Loss: 0.2092, Train Steps/Sec: 0.28, Epoch: 
0.27821609016712007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14318, "loss": 0.17754733562469482, "memory_gb": 7.721559524536133, "step_time_ms": 3514.3725872039795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:05] (step=0014318) Train Loss: 0.1727, Train Steps/Sec: 0.28, Epoch: 0.2782355227361057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14319, "loss": 0.1549389362335205, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4325733184814, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:08] (step=0014319) Train Loss: 0.2177, Train Steps/Sec: 0.28, Epoch: 0.2782549553050913, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14320, "loss": 0.2735217213630676, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0288581848145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:12] (step=0014320) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.27827438787407693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14321, "loss": 0.18755091726779938, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6108169555664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:16] (step=0014321) Train Loss: 0.2298, Train Steps/Sec: 0.28, Epoch: 0.27829382044306256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14322, "loss": 0.20568154752254486, "memory_gb": 7.721559524536133, "step_time_ms": 3361.73415184021, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:19] (step=0014322) Train Loss: 0.1997, Train Steps/Sec: 0.28, Epoch: 0.2783132530120482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14323, "loss": 0.13643647730350494, 
"memory_gb": 7.721559524536133, "step_time_ms": 3361.348867416382, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:23] (step=0014323) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.2783326855810338, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14324, "loss": 0.2652072310447693, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2843284606934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:26] (step=0014324) Train Loss: 0.3171, Train Steps/Sec: 0.28, Epoch: 0.2783521181500194, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14325, "loss": 0.19201144576072693, "memory_gb": 7.721559524536133, "step_time_ms": 3361.536741256714, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:30] (step=0014325) Train Loss: 0.2375, Train Steps/Sec: 0.28, Epoch: 0.27837155071900505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14326, "loss": 0.1701878309249878, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7589263916016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:34] (step=0014326) Train Loss: 0.2324, Train Steps/Sec: 0.28, Epoch: 0.27839098328799067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14327, "loss": 0.21570473909378052, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5282096862793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:37] (step=0014327) Train Loss: 0.1918, Train Steps/Sec: 0.28, Epoch: 0.2784104158569763, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14328, "loss": 0.18727438151836395, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2756004333496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:41] (step=0014328) Train Loss: 0.2006, 
Train Steps/Sec: 0.28, Epoch: 0.2784298484259619, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14329, "loss": 0.24063211679458618, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3338260650635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:44] (step=0014329) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.27844928099494753, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14330, "loss": 0.2650497555732727, "memory_gb": 7.721559524536133, "step_time_ms": 3354.081392288208, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:48] (step=0014330) Train Loss: 0.2105, Train Steps/Sec: 0.28, Epoch: 0.27846871356393316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14331, "loss": 0.1967577338218689, "memory_gb": 7.721559524536133, "step_time_ms": 3361.95707321167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:52] (step=0014331) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.2784881461329188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14332, "loss": 0.16769887506961823, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1608486175537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:55] (step=0014332) Train Loss: 0.1906, Train Steps/Sec: 0.28, Epoch: 0.2785075787019044, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:19:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14333, "loss": 0.2357006072998047, "memory_gb": 7.721559524536133, "step_time_ms": 3358.124256134033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:19:59] (step=0014333) Train Loss: 0.2142, Train Steps/Sec: 0.28, Epoch: 0.27852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14334, "loss": 
0.3395715653896332, "memory_gb": 7.721559524536133, "step_time_ms": 3362.173080444336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:02] (step=0014334) Train Loss: 0.2792, Train Steps/Sec: 0.28, Epoch: 0.27854644383987565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14335, "loss": 0.1647053360939026, "memory_gb": 7.721559524536133, "step_time_ms": 3363.973617553711, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:06] (step=0014335) Train Loss: 0.1381, Train Steps/Sec: 0.28, Epoch: 0.27856587640886127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14336, "loss": 0.16200405359268188, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1108322143555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:10] (step=0014336) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 0.2785853089778469, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14337, "loss": 0.2097005844116211, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3274097442627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:13] (step=0014337) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.2786047415468325, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14338, "loss": 0.13607671856880188, "memory_gb": 7.721559524536133, "step_time_ms": 3351.741313934326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:17] (step=0014338) Train Loss: 0.1635, Train Steps/Sec: 0.28, Epoch: 0.27862417411581814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14339, "loss": 0.2570432424545288, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9070053100586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:20] (step=0014339) 
Train Loss: 0.3029, Train Steps/Sec: 0.28, Epoch: 0.27864360668480376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14340, "loss": 0.29882147908210754, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3710403442383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:24] (step=0014340) Train Loss: 0.2210, Train Steps/Sec: 0.28, Epoch: 0.2786630392537893, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14341, "loss": 0.22465628385543823, "memory_gb": 7.721559524536133, "step_time_ms": 3342.6527976989746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:27] (step=0014341) Train Loss: 0.2709, Train Steps/Sec: 0.28, Epoch: 0.27868247182277495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14342, "loss": 0.2518768310546875, "memory_gb": 7.721559524536133, "step_time_ms": 3353.4159660339355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:31] (step=0014342) Train Loss: 0.2525, Train Steps/Sec: 0.28, Epoch: 0.27870190439176057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14343, "loss": 0.1932155340909958, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4967079162598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:35] (step=0014343) Train Loss: 0.2020, Train Steps/Sec: 0.28, Epoch: 0.2787213369607462, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14344, "loss": 0.12627080082893372, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5902404785156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:38] (step=0014344) Train Loss: 0.2327, Train Steps/Sec: 0.28, Epoch: 0.2787407695297318, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:42] EFFICIENCY_METRICS: {"epoch": 
0, "step": 14345, "loss": 0.3571082353591919, "memory_gb": 7.721559524536133, "step_time_ms": 3359.046697616577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:42] (step=0014345) Train Loss: 0.3077, Train Steps/Sec: 0.28, Epoch: 0.27876020209871744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14346, "loss": 0.3531256914138794, "memory_gb": 7.721559524536133, "step_time_ms": 3343.618631362915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:45] (step=0014346) Train Loss: 0.3349, Train Steps/Sec: 0.29, Epoch: 0.27877963466770306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14347, "loss": 0.21768797934055328, "memory_gb": 7.721559524536133, "step_time_ms": 3360.347032546997, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:49] (step=0014347) Train Loss: 0.1765, Train Steps/Sec: 0.28, Epoch: 0.2787990672366887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14348, "loss": 0.2902929186820984, "memory_gb": 7.715639114379883, "step_time_ms": 3322.2122192382812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:52] (step=0014348) Train Loss: 0.2838, Train Steps/Sec: 0.28, Epoch: 0.2788184998056743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:20:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14349, "loss": 0.14158856868743896, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3266735076904, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:20:56] (step=0014349) Train Loss: 0.2039, Train Steps/Sec: 0.28, Epoch: 0.2788379323746599, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14350, "loss": 0.27669447660446167, "memory_gb": 7.721559524536133, "step_time_ms": 3355.541944503784, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
14:21:00] (step=0014350) Train Loss: 0.2670, Train Steps/Sec: 0.27, Epoch: 0.27885736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14351, "loss": 0.3452643156051636, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1341972351074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:03] (step=0014351) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.27887679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14352, "loss": 0.2613970935344696, "memory_gb": 7.721559524536133, "step_time_ms": 3358.1321239471436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:07] (step=0014352) Train Loss: 0.2220, Train Steps/Sec: 0.28, Epoch: 0.2788962300816168, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14353, "loss": 0.2379724532365799, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3697547912598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:10] (step=0014353) Train Loss: 0.2571, Train Steps/Sec: 0.28, Epoch: 0.2789156626506024, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14354, "loss": 0.20695015788078308, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4484119415283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:14] (step=0014354) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.27893509521958804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14355, "loss": 0.24692916870117188, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5376529693604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:18] (step=0014355) Train Loss: 0.2344, Train Steps/Sec: 0.28, Epoch: 0.27895452778857366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:21] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14356, "loss": 0.35329943895339966, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3998470306396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:21] (step=0014356) Train Loss: 0.3096, Train Steps/Sec: 0.28, Epoch: 0.2789739603575593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14357, "loss": 0.20121446251869202, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9601516723633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:25] (step=0014357) Train Loss: 0.2108, Train Steps/Sec: 0.28, Epoch: 0.2789933929265449, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14358, "loss": 0.3589940369129181, "memory_gb": 7.721559524536133, "step_time_ms": 3357.968807220459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:28] (step=0014358) Train Loss: 0.2878, Train Steps/Sec: 0.28, Epoch: 0.2790128254955305, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14359, "loss": 0.20980024337768555, "memory_gb": 7.721559524536133, "step_time_ms": 3345.2839851379395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:32] (step=0014359) Train Loss: 0.2184, Train Steps/Sec: 0.28, Epoch: 0.27903225806451615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14360, "loss": 0.23845446109771729, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0517044067383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:35] (step=0014360) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.27905169063350177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14361, "loss": 0.22361615300178528, "memory_gb": 7.721559524536133, "step_time_ms": 3359.98797416687, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 14:21:39] (step=0014361) Train Loss: 0.2597, Train Steps/Sec: 0.28, Epoch: 0.2790711232024874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14362, "loss": 0.16026751697063446, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5609455108643, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:43] (step=0014362) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.279090555771473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14363, "loss": 0.26175612211227417, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5897941589355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:46] (step=0014363) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.2791099883404586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14364, "loss": 0.2330952286720276, "memory_gb": 7.721559524536133, "step_time_ms": 3359.10964012146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:50] (step=0014364) Train Loss: 0.1926, Train Steps/Sec: 0.28, Epoch: 0.2791294209094442, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14365, "loss": 0.14156177639961243, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0595722198486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:53] (step=0014365) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.2791488534784298, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:21:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14366, "loss": 0.10894481837749481, "memory_gb": 7.721559524536133, "step_time_ms": 3506.77752494812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:21:57] (step=0014366) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.27916828604741545, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 14:22:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14367, "loss": 0.26799851655960083, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8245639801025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:00] (step=0014367) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.27918771861640107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14368, "loss": 0.268028199672699, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7912063598633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:04] (step=0014368) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.2792071511853867, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14369, "loss": 0.20854921638965607, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3898334503174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:08] (step=0014369) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.2792265837543723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14370, "loss": 0.24623803794384003, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2674522399902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:11] (step=0014370) Train Loss: 0.1953, Train Steps/Sec: 0.28, Epoch: 0.27924601632335794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14371, "loss": 0.25723159313201904, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8998317718506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:15] (step=0014371) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.27926544889234356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14372, "loss": 0.16996078193187714, "memory_gb": 7.721559524536133, "step_time_ms": 
3357.6488494873047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:18] (step=0014372) Train Loss: 0.2028, Train Steps/Sec: 0.28, Epoch: 0.2792848814613292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14373, "loss": 0.24753013253211975, "memory_gb": 7.721559524536133, "step_time_ms": 3345.945119857788, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:22] (step=0014373) Train Loss: 0.2643, Train Steps/Sec: 0.28, Epoch: 0.2793043140303148, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14374, "loss": 0.1907913237810135, "memory_gb": 7.721559524536133, "step_time_ms": 3354.8078536987305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:26] (step=0014374) Train Loss: 0.2436, Train Steps/Sec: 0.28, Epoch: 0.2793237465993004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14375, "loss": 0.24678468704223633, "memory_gb": 7.721559524536133, "step_time_ms": 3356.058120727539, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:29] (step=0014375) Train Loss: 0.2813, Train Steps/Sec: 0.28, Epoch: 0.27934317916828605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14376, "loss": 0.2595420777797699, "memory_gb": 7.721559524536133, "step_time_ms": 3358.170986175537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:33] (step=0014376) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.27936261173727167, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14377, "loss": 0.31084367632865906, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0194244384766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:36] (step=0014377) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 
0.2793820443062573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14378, "loss": 0.2319795787334442, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3029460906982, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:40] (step=0014378) Train Loss: 0.2382, Train Steps/Sec: 0.28, Epoch: 0.2794014768752429, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14379, "loss": 0.19535855948925018, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1302585601807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:43] (step=0014379) Train Loss: 0.2219, Train Steps/Sec: 0.28, Epoch: 0.27942090944422854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14380, "loss": 0.28441137075424194, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8792037963867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:47] (step=0014380) Train Loss: 0.2722, Train Steps/Sec: 0.28, Epoch: 0.27944034201321416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14381, "loss": 0.2785382568836212, "memory_gb": 7.721559524536133, "step_time_ms": 3358.214855194092, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:51] (step=0014381) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.2794597745821998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14382, "loss": 0.18544673919677734, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2259464263916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:54] (step=0014382) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.2794792071511854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:22:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14383, "loss": 0.17463862895965576, 
"memory_gb": 7.721559524536133, "step_time_ms": 3363.8687133789062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:22:58] (step=0014383) Train Loss: 0.1913, Train Steps/Sec: 0.28, Epoch: 0.279498639720171, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14384, "loss": 0.2653108835220337, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8200340270996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:01] (step=0014384) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.27951807228915665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14385, "loss": 0.18105778098106384, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6949787139893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:05] (step=0014385) Train Loss: 0.2254, Train Steps/Sec: 0.28, Epoch: 0.27953750485814227, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14386, "loss": 0.288258820772171, "memory_gb": 7.721559524536133, "step_time_ms": 3361.809253692627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:09] (step=0014386) Train Loss: 0.2956, Train Steps/Sec: 0.28, Epoch: 0.2795569374271279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14387, "loss": 0.2812204957008362, "memory_gb": 7.721559524536133, "step_time_ms": 3357.757806777954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:12] (step=0014387) Train Loss: 0.3328, Train Steps/Sec: 0.28, Epoch: 0.27957636999611346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14388, "loss": 0.21877805888652802, "memory_gb": 7.721559524536133, "step_time_ms": 3362.239360809326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:16] (step=0014388) Train Loss: 0.1884, 
Train Steps/Sec: 0.28, Epoch: 0.2795958025650991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14389, "loss": 0.25991642475128174, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9064865112305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:19] (step=0014389) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.2796152351340847, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14390, "loss": 0.21884272992610931, "memory_gb": 7.721559524536133, "step_time_ms": 3362.837076187134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:23] (step=0014390) Train Loss: 0.1930, Train Steps/Sec: 0.28, Epoch: 0.2796346677030703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14391, "loss": 0.1596737504005432, "memory_gb": 7.721559524536133, "step_time_ms": 3357.936143875122, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:26] (step=0014391) Train Loss: 0.1832, Train Steps/Sec: 0.28, Epoch: 0.27965410027205595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14392, "loss": 0.2369903028011322, "memory_gb": 7.721559524536133, "step_time_ms": 3357.837677001953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:30] (step=0014392) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 0.27967353284104157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14393, "loss": 0.28437650203704834, "memory_gb": 7.721559524536133, "step_time_ms": 3357.475519180298, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:34] (step=0014393) Train Loss: 0.2136, Train Steps/Sec: 0.28, Epoch: 0.2796929654100272, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14394, "loss": 
0.27537602186203003, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4183616638184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:37] (step=0014394) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.2797123979790128, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14395, "loss": 0.20079578459262848, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5408668518066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:41] (step=0014395) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.27973183054799844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14396, "loss": 0.2470189929008484, "memory_gb": 7.721559524536133, "step_time_ms": 3368.638277053833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:44] (step=0014396) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.27975126311698406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14397, "loss": 0.21897107362747192, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4789505004883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:48] (step=0014397) Train Loss: 0.2723, Train Steps/Sec: 0.28, Epoch: 0.2797706956859697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14398, "loss": 0.22133156657218933, "memory_gb": 7.721559524536133, "step_time_ms": 3364.983081817627, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:52] (step=0014398) Train Loss: 0.2634, Train Steps/Sec: 0.27, Epoch: 0.2797901282549553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14399, "loss": 0.37067335844039917, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2570247650146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:55] 
(step=0014399) Train Loss: 0.3163, Train Steps/Sec: 0.28, Epoch: 0.27980956082394093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:23:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14400, "loss": 0.19030460715293884, "memory_gb": 7.721559524536133, "step_time_ms": 3349.1086959838867, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:23:59] (step=0014400) Train Loss: 0.1744, Train Steps/Sec: 0.28, Epoch: 0.27982899339292655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14401, "loss": 0.2213508039712906, "memory_gb": 7.721559524536133, "step_time_ms": 3364.741563796997, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:02] (step=0014401) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.2798484259619122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14402, "loss": 0.26361578702926636, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5832023620605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:06] (step=0014402) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.2798678585308978, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14403, "loss": 0.22537989914417267, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8366203308105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:10] (step=0014403) Train Loss: 0.2131, Train Steps/Sec: 0.28, Epoch: 0.2798872910998834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14404, "loss": 0.2739425301551819, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1627559661865, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:13] (step=0014404) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.27990672366886904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:17] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14405, "loss": 0.33729881048202515, "memory_gb": 7.721559524536133, "step_time_ms": 3371.8667030334473, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:17] (step=0014405) Train Loss: 0.3078, Train Steps/Sec: 0.28, Epoch: 0.27992615623785466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14406, "loss": 0.26680266857147217, "memory_gb": 7.721559524536133, "step_time_ms": 3512.6895904541016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:20] (step=0014406) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.2799455888068403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14407, "loss": 0.19632063806056976, "memory_gb": 7.721559524536133, "step_time_ms": 3363.4114265441895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:24] (step=0014407) Train Loss: 0.1578, Train Steps/Sec: 0.28, Epoch: 0.2799650213758259, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14408, "loss": 0.2677135765552521, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7874336242676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:28] (step=0014408) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.27998445394481153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14409, "loss": 0.19808879494667053, "memory_gb": 7.721559524536133, "step_time_ms": 3365.978956222534, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:31] (step=0014409) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.28000388651379715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14410, "loss": 0.17886221408843994, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9181385040283, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 14:24:35] (step=0014410) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.2800233190827827, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14411, "loss": 0.3584117293357849, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6185607910156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:38] (step=0014411) Train Loss: 0.2978, Train Steps/Sec: 0.28, Epoch: 0.28004275165176834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14412, "loss": 0.20086725056171417, "memory_gb": 7.721559524536133, "step_time_ms": 3362.372636795044, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:42] (step=0014412) Train Loss: 0.1568, Train Steps/Sec: 0.28, Epoch: 0.28006218422075396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14413, "loss": 0.2850869297981262, "memory_gb": 7.721559524536133, "step_time_ms": 3364.90535736084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:46] (step=0014413) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.2800816167897396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14414, "loss": 0.21813365817070007, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6580657958984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:49] (step=0014414) Train Loss: 0.2257, Train Steps/Sec: 0.28, Epoch: 0.2801010493587252, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:24:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14415, "loss": 0.1652640402317047, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4660453796387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:53] (step=0014415) Train Loss: 0.2267, Train Steps/Sec: 0.28, Epoch: 0.28012048192771083, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 14:24:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14416, "loss": 0.2683725655078888, "memory_gb": 7.721559524536133, "step_time_ms": 3367.741823196411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:24:56] (step=0014416) Train Loss: 0.2820, Train Steps/Sec: 0.28, Epoch: 0.28013991449669645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14417, "loss": 0.20914122462272644, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9186573028564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:00] (step=0014417) Train Loss: 0.1828, Train Steps/Sec: 0.28, Epoch: 0.2801593470656821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14418, "loss": 0.2634581923484802, "memory_gb": 7.721559524536133, "step_time_ms": 3369.853734970093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:04] (step=0014418) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.2801787796346677, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14419, "loss": 0.14850841462612152, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8950538635254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:07] (step=0014419) Train Loss: 0.1527, Train Steps/Sec: 0.28, Epoch: 0.2801982122036533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14420, "loss": 0.2707306146621704, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7212982177734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:11] (step=0014420) Train Loss: 0.2083, Train Steps/Sec: 0.28, Epoch: 0.28021764477263894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14421, "loss": 0.19154509902000427, "memory_gb": 7.721559524536133, "step_time_ms": 
3371.5286254882812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:14] (step=0014421) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.28023707734162456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14422, "loss": 0.24778473377227783, "memory_gb": 7.721559524536133, "step_time_ms": 3364.611864089966, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:18] (step=0014422) Train Loss: 0.1973, Train Steps/Sec: 0.28, Epoch: 0.2802565099106102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14423, "loss": 0.30043289065361023, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6957359313965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:22] (step=0014423) Train Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.2802759424795958, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14424, "loss": 0.19229118525981903, "memory_gb": 7.721559524536133, "step_time_ms": 3365.736484527588, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:25] (step=0014424) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.28029537504858143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14425, "loss": 0.3002017140388489, "memory_gb": 7.721559524536133, "step_time_ms": 3367.114543914795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:29] (step=0014425) Train Loss: 0.2732, Train Steps/Sec: 0.28, Epoch: 0.28031480761756705, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14426, "loss": 0.24822071194648743, "memory_gb": 7.721559524536133, "step_time_ms": 3369.1132068634033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:32] (step=0014426) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 
0.2803342401865527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14427, "loss": 0.18803556263446808, "memory_gb": 7.721559524536133, "step_time_ms": 3363.833427429199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:36] (step=0014427) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.2803536727555383, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14428, "loss": 0.32051408290863037, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2500171661377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:40] (step=0014428) Train Loss: 0.3087, Train Steps/Sec: 0.28, Epoch: 0.2803731053245239, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14429, "loss": 0.19070199131965637, "memory_gb": 7.721559524536133, "step_time_ms": 3363.8083934783936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:43] (step=0014429) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.28039253789350954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14430, "loss": 0.27157026529312134, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3973598480225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:47] (step=0014430) Train Loss: 0.2856, Train Steps/Sec: 0.28, Epoch: 0.28041197046249516, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14431, "loss": 0.27833354473114014, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3976917266846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:50] (step=0014431) Train Loss: 0.3006, Train Steps/Sec: 0.28, Epoch: 0.2804314030314808, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14432, "loss": 0.31150245666503906, 
"memory_gb": 7.721559524536133, "step_time_ms": 3368.7305450439453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:54] (step=0014432) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.2804508356004664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:25:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14433, "loss": 0.228951558470726, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8005046844482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:25:58] (step=0014433) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.280470268169452, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14434, "loss": 0.3195974826812744, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1997833251953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:01] (step=0014434) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.2804897007384376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14435, "loss": 0.28595778346061707, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9063110351562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:05] (step=0014435) Train Loss: 0.3065, Train Steps/Sec: 0.28, Epoch: 0.2805091333074232, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14436, "loss": 0.25977393984794617, "memory_gb": 7.721559524536133, "step_time_ms": 3367.302656173706, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:09] (step=0014436) Train Loss: 0.2927, Train Steps/Sec: 0.28, Epoch: 0.28052856587640884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14437, "loss": 0.27165067195892334, "memory_gb": 7.721559524536133, "step_time_ms": 3365.098237991333, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:12] (step=0014437) Train Loss: 0.2453, 
Train Steps/Sec: 0.28, Epoch: 0.28054799844539446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14438, "loss": 0.15028533339500427, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3086891174316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:16] (step=0014438) Train Loss: 0.2050, Train Steps/Sec: 0.27, Epoch: 0.2805674310143801, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14439, "loss": 0.24826082587242126, "memory_gb": 7.721559524536133, "step_time_ms": 3359.311580657959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:19] (step=0014439) Train Loss: 0.2616, Train Steps/Sec: 0.28, Epoch: 0.2805868635833657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14440, "loss": 0.2616652250289917, "memory_gb": 7.721559524536133, "step_time_ms": 3357.5618267059326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:23] (step=0014440) Train Loss: 0.2305, Train Steps/Sec: 0.28, Epoch: 0.28060629615235133, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14441, "loss": 0.24054518342018127, "memory_gb": 7.721559524536133, "step_time_ms": 3363.772392272949, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:27] (step=0014441) Train Loss: 0.2663, Train Steps/Sec: 0.28, Epoch: 0.28062572872133695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14442, "loss": 0.1885969638824463, "memory_gb": 7.721559524536133, "step_time_ms": 3358.070135116577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:30] (step=0014442) Train Loss: 0.2153, Train Steps/Sec: 0.28, Epoch: 0.2806451612903226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14443, 
"loss": 0.27476879954338074, "memory_gb": 7.721559524536133, "step_time_ms": 3359.29274559021, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:34] (step=0014443) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.2806645938593082, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14444, "loss": 0.24337314069271088, "memory_gb": 7.715639114379883, "step_time_ms": 3321.279764175415, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:37] (step=0014444) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.2806840264282938, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14445, "loss": 0.15065832436084747, "memory_gb": 7.721559524536133, "step_time_ms": 3358.7539196014404, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:41] (step=0014445) Train Loss: 0.1688, Train Steps/Sec: 0.28, Epoch: 0.28070345899727944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14446, "loss": 0.29893773794174194, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7617149353027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:45] (step=0014446) Train Loss: 0.2471, Train Steps/Sec: 0.28, Epoch: 0.28072289156626506, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14447, "loss": 0.290825217962265, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7981929779053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:48] (step=0014447) Train Loss: 0.2378, Train Steps/Sec: 0.28, Epoch: 0.2807423241352507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14448, "loss": 0.19381564855575562, "memory_gb": 7.721559524536133, "step_time_ms": 3360.888957977295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:52] 
(step=0014448) Train Loss: 0.1842, Train Steps/Sec: 0.28, Epoch: 0.2807617567042363, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14449, "loss": 0.18156737089157104, "memory_gb": 7.721559524536133, "step_time_ms": 3359.315872192383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:55] (step=0014449) Train Loss: 0.1959, Train Steps/Sec: 0.28, Epoch: 0.28078118927322193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:26:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14450, "loss": 0.30111855268478394, "memory_gb": 7.715639114379883, "step_time_ms": 3318.312883377075, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:26:59] (step=0014450) Train Loss: 0.2963, Train Steps/Sec: 0.28, Epoch: 0.28080062184220755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14451, "loss": 0.2610247731208801, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8169345855713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:03] (step=0014451) Train Loss: 0.2586, Train Steps/Sec: 0.28, Epoch: 0.2808200544111932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14452, "loss": 0.2903507649898529, "memory_gb": 7.721559524536133, "step_time_ms": 3357.384204864502, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:06] (step=0014452) Train Loss: 0.2717, Train Steps/Sec: 0.28, Epoch: 0.2808394869801788, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14453, "loss": 0.1626031994819641, "memory_gb": 7.721559524536133, "step_time_ms": 3509.25350189209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:10] (step=0014453) Train Loss: 0.2187, Train Steps/Sec: 0.28, Epoch: 0.2808589195491644, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:13] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 14454, "loss": 0.25302380323410034, "memory_gb": 7.721559524536133, "step_time_ms": 3349.323034286499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:13] (step=0014454) Train Loss: 0.2653, Train Steps/Sec: 0.28, Epoch: 0.28087835211815004, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14455, "loss": 0.2108420580625534, "memory_gb": 7.721559524536133, "step_time_ms": 3361.55366897583, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:17] (step=0014455) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.28089778468713567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14456, "loss": 0.29328653216362, "memory_gb": 7.721559524536133, "step_time_ms": 3341.5191173553467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:20] (step=0014456) Train Loss: 0.2885, Train Steps/Sec: 0.29, Epoch: 0.28091721725612123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14457, "loss": 0.23352479934692383, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4769191741943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:24] (step=0014457) Train Loss: 0.2731, Train Steps/Sec: 0.28, Epoch: 0.28093664982510685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14458, "loss": 0.26366037130355835, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9815368652344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:28] (step=0014458) Train Loss: 0.2542, Train Steps/Sec: 0.28, Epoch: 0.2809560823940925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14459, "loss": 0.27873438596725464, "memory_gb": 7.721559524536133, "step_time_ms": 3357.708215713501, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 14:27:31] (step=0014459) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.2809755149630781, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14460, "loss": 0.2613750398159027, "memory_gb": 7.721559524536133, "step_time_ms": 3363.638162612915, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:35] (step=0014460) Train Loss: 0.2876, Train Steps/Sec: 0.28, Epoch: 0.2809949475320637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14461, "loss": 0.18118123710155487, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0292205810547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:38] (step=0014461) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.28101438010104934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14462, "loss": 0.2558484375476837, "memory_gb": 7.721559524536133, "step_time_ms": 3355.360746383667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:42] (step=0014462) Train Loss: 0.2433, Train Steps/Sec: 0.28, Epoch: 0.28103381267003497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14463, "loss": 0.2788315713405609, "memory_gb": 7.721559524536133, "step_time_ms": 3358.494281768799, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:46] (step=0014463) Train Loss: 0.2629, Train Steps/Sec: 0.28, Epoch: 0.2810532452390206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14464, "loss": 0.268441379070282, "memory_gb": 7.721559524536133, "step_time_ms": 3365.955352783203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:49] (step=0014464) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.2810726778080062, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:53] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14465, "loss": 0.33860406279563904, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5264892578125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:53] (step=0014465) Train Loss: 0.2883, Train Steps/Sec: 0.28, Epoch: 0.28109211037699183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:27:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14466, "loss": 0.25976496934890747, "memory_gb": 7.721559524536133, "step_time_ms": 3357.294797897339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:27:56] (step=0014466) Train Loss: 0.2470, Train Steps/Sec: 0.28, Epoch: 0.28111154294597746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14467, "loss": 0.28545162081718445, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4811687469482, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:00] (step=0014467) Train Loss: 0.2945, Train Steps/Sec: 0.28, Epoch: 0.2811309755149631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14468, "loss": 0.1774587333202362, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8830795288086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:04] (step=0014468) Train Loss: 0.1817, Train Steps/Sec: 0.28, Epoch: 0.2811504080839487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14469, "loss": 0.26184096932411194, "memory_gb": 7.721559524536133, "step_time_ms": 3349.3480682373047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:07] (step=0014469) Train Loss: 0.2719, Train Steps/Sec: 0.28, Epoch: 0.2811698406529343, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14470, "loss": 0.2154725044965744, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6089401245117, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 14:28:11] (step=0014470) Train Loss: 0.2078, Train Steps/Sec: 0.28, Epoch: 0.28118927322191994, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14471, "loss": 0.40555018186569214, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3802242279053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:14] (step=0014471) Train Loss: 0.3342, Train Steps/Sec: 0.28, Epoch: 0.28120870579090557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14472, "loss": 0.306121826171875, "memory_gb": 7.721559524536133, "step_time_ms": 3360.617160797119, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:18] (step=0014472) Train Loss: 0.2730, Train Steps/Sec: 0.28, Epoch: 0.2812281383598912, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14473, "loss": 0.13366207480430603, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5941066741943, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:21] (step=0014473) Train Loss: 0.1533, Train Steps/Sec: 0.28, Epoch: 0.2812475709288768, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14474, "loss": 0.104371577501297, "memory_gb": 7.721559524536133, "step_time_ms": 3349.8449325561523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:25] (step=0014474) Train Loss: 0.1439, Train Steps/Sec: 0.28, Epoch: 0.28126700349786243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14475, "loss": 0.2176467776298523, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3315353393555, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:29] (step=0014475) Train Loss: 0.2479, Train Steps/Sec: 0.28, Epoch: 0.28128643606684806, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 14:28:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14476, "loss": 0.2536596655845642, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6407432556152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:32] (step=0014476) Train Loss: 0.2218, Train Steps/Sec: 0.28, Epoch: 0.2813058686358337, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14477, "loss": 0.18799063563346863, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3669452667236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:36] (step=0014477) Train Loss: 0.1967, Train Steps/Sec: 0.28, Epoch: 0.2813253012048193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14478, "loss": 0.18825232982635498, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8247299194336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:39] (step=0014478) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.2813447337738049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14479, "loss": 0.13337254524230957, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6668243408203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:43] (step=0014479) Train Loss: 0.2011, Train Steps/Sec: 0.28, Epoch: 0.2813641663427905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14480, "loss": 0.29027873277664185, "memory_gb": 7.721559524536133, "step_time_ms": 3359.511137008667, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:47] (step=0014480) Train Loss: 0.3059, Train Steps/Sec: 0.28, Epoch: 0.2813835989117761, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14481, "loss": 0.2624640464782715, "memory_gb": 7.721559524536133, "step_time_ms": 
3337.0347023010254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:50] (step=0014481) Train Loss: 0.2552, Train Steps/Sec: 0.28, Epoch: 0.28140303148076173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14482, "loss": 0.2865816354751587, "memory_gb": 7.721559524536133, "step_time_ms": 3357.125759124756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:54] (step=0014482) Train Loss: 0.2967, Train Steps/Sec: 0.28, Epoch: 0.28142246404974736, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:28:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14483, "loss": 0.1485111117362976, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5399131774902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:28:57] (step=0014483) Train Loss: 0.2018, Train Steps/Sec: 0.28, Epoch: 0.281441896618733, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14484, "loss": 0.23152869939804077, "memory_gb": 7.721559524536133, "step_time_ms": 3360.539436340332, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:01] (step=0014484) Train Loss: 0.2467, Train Steps/Sec: 0.28, Epoch: 0.2814613291877186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14485, "loss": 0.30447766184806824, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7049503326416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:05] (step=0014485) Train Loss: 0.2561, Train Steps/Sec: 0.27, Epoch: 0.2814807617567042, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14486, "loss": 0.2770135998725891, "memory_gb": 7.721559524536133, "step_time_ms": 3360.217809677124, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:08] (step=0014486) Train Loss: 0.2746, Train Steps/Sec: 0.28, Epoch: 0.28150019432568985, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14487, "loss": 0.372922420501709, "memory_gb": 7.721559524536133, "step_time_ms": 3361.553192138672, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:12] (step=0014487) Train Loss: 0.2945, Train Steps/Sec: 0.28, Epoch: 0.28151962689467547, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14488, "loss": 0.3450925648212433, "memory_gb": 7.721559524536133, "step_time_ms": 3347.634792327881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:15] (step=0014488) Train Loss: 0.3055, Train Steps/Sec: 0.28, Epoch: 0.2815390594636611, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14489, "loss": 0.15687665343284607, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8905029296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:19] (step=0014489) Train Loss: 0.1725, Train Steps/Sec: 0.28, Epoch: 0.2815584920326467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14490, "loss": 0.16430926322937012, "memory_gb": 7.721559524536133, "step_time_ms": 3369.431972503662, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:22] (step=0014490) Train Loss: 0.1720, Train Steps/Sec: 0.28, Epoch: 0.28157792460163233, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14491, "loss": 0.1766309291124344, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4013710021973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:26] (step=0014491) Train Loss: 0.2062, Train Steps/Sec: 0.28, Epoch: 0.28159735717061796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14492, "loss": 0.19750458002090454, "memory_gb": 
7.721559524536133, "step_time_ms": 3355.0193309783936, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:30] (step=0014492) Train Loss: 0.1802, Train Steps/Sec: 0.28, Epoch: 0.2816167897396036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14493, "loss": 0.2778267562389374, "memory_gb": 7.721559524536133, "step_time_ms": 3356.152057647705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:33] (step=0014493) Train Loss: 0.2149, Train Steps/Sec: 0.28, Epoch: 0.2816362223085892, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14494, "loss": 0.26016420125961304, "memory_gb": 7.721559524536133, "step_time_ms": 3506.527900695801, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:37] (step=0014494) Train Loss: 0.2001, Train Steps/Sec: 0.28, Epoch: 0.2816556548775748, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14495, "loss": 0.28565096855163574, "memory_gb": 7.721559524536133, "step_time_ms": 3361.983060836792, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:40] (step=0014495) Train Loss: 0.3099, Train Steps/Sec: 0.28, Epoch: 0.28167508744656045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14496, "loss": 0.2813347578048706, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5595569610596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:44] (step=0014496) Train Loss: 0.2571, Train Steps/Sec: 0.28, Epoch: 0.28169452001554607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14497, "loss": 0.22986167669296265, "memory_gb": 7.721559524536133, "step_time_ms": 3361.584186553955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:48] (step=0014497) Train Loss: 0.2036, Train 
Steps/Sec: 0.28, Epoch: 0.2817139525845317, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14498, "loss": 0.2400142103433609, "memory_gb": 7.721559524536133, "step_time_ms": 3355.743885040283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:51] (step=0014498) Train Loss: 0.2630, Train Steps/Sec: 0.28, Epoch: 0.2817333851535173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14499, "loss": 0.24509289860725403, "memory_gb": 7.721559524536133, "step_time_ms": 3363.077163696289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:55] (step=0014499) Train Loss: 0.2299, Train Steps/Sec: 0.28, Epoch: 0.28175281772250294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:29:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14500, "loss": 0.2584981918334961, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5681915283203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:29:58] (step=0014500) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.28177225029148856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14501, "loss": 0.34949713945388794, "memory_gb": 7.721559524536133, "step_time_ms": 3361.6862297058105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:02] (step=0014501) Train Loss: 0.2608, Train Steps/Sec: 0.28, Epoch: 0.2817916828604742, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14502, "loss": 0.14528460800647736, "memory_gb": 7.721559524536133, "step_time_ms": 3356.66561126709, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:06] (step=0014502) Train Loss: 0.1694, Train Steps/Sec: 0.28, Epoch: 0.2818111154294598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14503, "loss": 
0.23430104553699493, "memory_gb": 7.721559524536133, "step_time_ms": 3360.676050186157, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:09] (step=0014503) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.28183054799844537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14504, "loss": 0.16838029026985168, "memory_gb": 7.721559524536133, "step_time_ms": 3367.312431335449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:13] (step=0014504) Train Loss: 0.1714, Train Steps/Sec: 0.28, Epoch: 0.281849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14505, "loss": 0.276455819606781, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3355884552, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:16] (step=0014505) Train Loss: 0.2222, Train Steps/Sec: 0.28, Epoch: 0.2818694131364166, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14506, "loss": 0.15495972335338593, "memory_gb": 7.721559524536133, "step_time_ms": 3363.70849609375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:20] (step=0014506) Train Loss: 0.1901, Train Steps/Sec: 0.28, Epoch: 0.28188884570540224, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14507, "loss": 0.15274378657341003, "memory_gb": 7.721559524536133, "step_time_ms": 3348.0405807495117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:23] (step=0014507) Train Loss: 0.2202, Train Steps/Sec: 0.28, Epoch: 0.28190827827438786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14508, "loss": 0.26517945528030396, "memory_gb": 7.721559524536133, "step_time_ms": 3366.685390472412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:27] (step=0014508) 
Train Loss: 0.2524, Train Steps/Sec: 0.28, Epoch: 0.2819277108433735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14509, "loss": 0.1764662265777588, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5881366729736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:31] (step=0014509) Train Loss: 0.2121, Train Steps/Sec: 0.28, Epoch: 0.2819471434123591, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14510, "loss": 0.3033734858036041, "memory_gb": 7.721559524536133, "step_time_ms": 3366.705894470215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:34] (step=0014510) Train Loss: 0.3037, Train Steps/Sec: 0.28, Epoch: 0.2819665759813447, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14511, "loss": 0.22986125946044922, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5196685791016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:38] (step=0014511) Train Loss: 0.2749, Train Steps/Sec: 0.28, Epoch: 0.28198600855033035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14512, "loss": 0.09591260552406311, "memory_gb": 7.721559524536133, "step_time_ms": 3359.4627380371094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:41] (step=0014512) Train Loss: 0.1482, Train Steps/Sec: 0.28, Epoch: 0.28200544111931597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14513, "loss": 0.16039028763771057, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4732704162598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:45] (step=0014513) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.2820248736883016, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:49] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 14514, "loss": 0.2070760726928711, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9614791870117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:49] (step=0014514) Train Loss: 0.2275, Train Steps/Sec: 0.28, Epoch: 0.2820443062572872, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14515, "loss": 0.29564589262008667, "memory_gb": 7.721559524536133, "step_time_ms": 3365.363836288452, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:52] (step=0014515) Train Loss: 0.2362, Train Steps/Sec: 0.28, Epoch: 0.28206373882627284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14516, "loss": 0.287031352519989, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9796714782715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:56] (step=0014516) Train Loss: 0.2624, Train Steps/Sec: 0.28, Epoch: 0.28208317139525846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:30:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14517, "loss": 0.20615440607070923, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1085834503174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:30:59] (step=0014517) Train Loss: 0.2381, Train Steps/Sec: 0.28, Epoch: 0.2821026039642441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14518, "loss": 0.22038719058036804, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5081520080566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:03] (step=0014518) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 0.2821220365332297, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14519, "loss": 0.2943100929260254, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4752922058105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
14:31:07] (step=0014519) Train Loss: 0.3048, Train Steps/Sec: 0.28, Epoch: 0.2821414691022153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14520, "loss": 0.24101756513118744, "memory_gb": 7.721559524536133, "step_time_ms": 3373.995304107666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:10] (step=0014520) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.28216090167120095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14521, "loss": 0.3294728994369507, "memory_gb": 7.721559524536133, "step_time_ms": 3365.528345108032, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:14] (step=0014521) Train Loss: 0.3265, Train Steps/Sec: 0.28, Epoch: 0.28218033424018657, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14522, "loss": 0.20226135849952698, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0594482421875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:17] (step=0014522) Train Loss: 0.1819, Train Steps/Sec: 0.28, Epoch: 0.2821997668091722, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14523, "loss": 0.3029283881187439, "memory_gb": 7.721559524536133, "step_time_ms": 3369.354248046875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:21] (step=0014523) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.2822191993781578, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14524, "loss": 0.2269081473350525, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5544052124023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:25] (step=0014524) Train Loss: 0.2301, Train Steps/Sec: 0.28, Epoch: 0.28223863194714344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:28] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14525, "loss": 0.28215500712394714, "memory_gb": 7.721559524536133, "step_time_ms": 3364.776611328125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:28] (step=0014525) Train Loss: 0.2345, Train Steps/Sec: 0.28, Epoch: 0.28225806451612906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14526, "loss": 0.2722875773906708, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5500831604004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:32] (step=0014526) Train Loss: 0.2678, Train Steps/Sec: 0.27, Epoch: 0.2822774970851146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14527, "loss": 0.29248690605163574, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0825023651123, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:36] (step=0014527) Train Loss: 0.2437, Train Steps/Sec: 0.28, Epoch: 0.28229692965410025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14528, "loss": 0.2522992789745331, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1746768951416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:39] (step=0014528) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.28231636222308587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14529, "loss": 0.24559468030929565, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5056228637695, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:43] (step=0014529) Train Loss: 0.2761, Train Steps/Sec: 0.28, Epoch: 0.2823357947920715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14530, "loss": 0.31402668356895447, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2849197387695, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 14:31:46] (step=0014530) Train Loss: 0.2967, Train Steps/Sec: 0.28, Epoch: 0.2823552273610571, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14531, "loss": 0.26502203941345215, "memory_gb": 7.721559524536133, "step_time_ms": 3369.7433471679688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:50] (step=0014531) Train Loss: 0.1917, Train Steps/Sec: 0.28, Epoch: 0.28237465993004274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14532, "loss": 0.25425103306770325, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9063835144043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:54] (step=0014532) Train Loss: 0.3239, Train Steps/Sec: 0.28, Epoch: 0.28239409249902836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:31:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14533, "loss": 0.1623387634754181, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3600006103516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:31:57] (step=0014533) Train Loss: 0.2410, Train Steps/Sec: 0.28, Epoch: 0.282413525068014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14534, "loss": 0.27992895245552063, "memory_gb": 7.721559524536133, "step_time_ms": 3369.995594024658, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:01] (step=0014534) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 0.2824329576369996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14535, "loss": 0.20267319679260254, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4844970703125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:04] (step=0014535) Train Loss: 0.2779, Train Steps/Sec: 0.28, Epoch: 0.2824523902059852, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 14:32:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14536, "loss": 0.22380489110946655, "memory_gb": 7.721559524536133, "step_time_ms": 3364.762783050537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:08] (step=0014536) Train Loss: 0.2421, Train Steps/Sec: 0.28, Epoch: 0.28247182277497085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14537, "loss": 0.353238046169281, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5269145965576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:12] (step=0014537) Train Loss: 0.3049, Train Steps/Sec: 0.28, Epoch: 0.28249125534395647, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14538, "loss": 0.12143903970718384, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4254207611084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:15] (step=0014538) Train Loss: 0.2002, Train Steps/Sec: 0.28, Epoch: 0.2825106879129421, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14539, "loss": 0.24553821980953217, "memory_gb": 7.721559524536133, "step_time_ms": 3365.931510925293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:19] (step=0014539) Train Loss: 0.2616, Train Steps/Sec: 0.28, Epoch: 0.2825301204819277, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14540, "loss": 0.24049751460552216, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9558811187744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:22] (step=0014540) Train Loss: 0.2376, Train Steps/Sec: 0.28, Epoch: 0.28254955305091334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14541, "loss": 0.2108064591884613, "memory_gb": 7.721559524536133, "step_time_ms": 
3366.0213947296143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:26] (step=0014541) Train Loss: 0.2485, Train Steps/Sec: 0.28, Epoch: 0.28256898561989896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14542, "loss": 0.14995847642421722, "memory_gb": 7.721559524536133, "step_time_ms": 3502.4373531341553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:30] (step=0014542) Train Loss: 0.2010, Train Steps/Sec: 0.28, Epoch: 0.2825884181888846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14543, "loss": 0.3679952621459961, "memory_gb": 7.721559524536133, "step_time_ms": 3366.5688037872314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:33] (step=0014543) Train Loss: 0.2921, Train Steps/Sec: 0.28, Epoch: 0.2826078507578702, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14544, "loss": 0.2254064679145813, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9096488952637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:37] (step=0014544) Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.2826272833268558, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14545, "loss": 0.24986745417118073, "memory_gb": 7.721559524536133, "step_time_ms": 3366.6248321533203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:40] (step=0014545) Train Loss: 0.2483, Train Steps/Sec: 0.28, Epoch: 0.28264671589584145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14546, "loss": 0.27519842982292175, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4208908081055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:44] (step=0014546) Train Loss: 0.2069, Train Steps/Sec: 0.28, Epoch: 
0.28266614846482707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14547, "loss": 0.16911828517913818, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8923377990723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:48] (step=0014547) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.2826855810338127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14548, "loss": 0.24327045679092407, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9486770629883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:51] (step=0014548) Train Loss: 0.1826, Train Steps/Sec: 0.28, Epoch: 0.2827050136027983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14549, "loss": 0.2519896924495697, "memory_gb": 7.721559524536133, "step_time_ms": 3363.078832626343, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:55] (step=0014549) Train Loss: 0.2612, Train Steps/Sec: 0.28, Epoch: 0.2827244461717839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:32:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14550, "loss": 0.10689239948987961, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7261390686035, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:32:58] (step=0014550) Train Loss: 0.1918, Train Steps/Sec: 0.28, Epoch: 0.2827438787407695, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14551, "loss": 0.22577759623527527, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5501136779785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:02] (step=0014551) Train Loss: 0.2385, Train Steps/Sec: 0.28, Epoch: 0.2827633113097551, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14552, "loss": 0.17024768888950348, 
"memory_gb": 7.721559524536133, "step_time_ms": 3363.2686138153076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:06] (step=0014552) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.28278274387874075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14553, "loss": 0.27473610639572144, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0403747558594, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:09] (step=0014553) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.28280217644772637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14554, "loss": 0.25438565015792847, "memory_gb": 7.721559524536133, "step_time_ms": 3365.196466445923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:13] (step=0014554) Train Loss: 0.2371, Train Steps/Sec: 0.28, Epoch: 0.282821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14555, "loss": 0.17080509662628174, "memory_gb": 7.721559524536133, "step_time_ms": 3364.323854446411, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:17] (step=0014555) Train Loss: 0.2254, Train Steps/Sec: 0.28, Epoch: 0.2828410415856976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14556, "loss": 0.2668144106864929, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9417724609375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:20] (step=0014556) Train Loss: 0.2573, Train Steps/Sec: 0.28, Epoch: 0.28286047415468324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14557, "loss": 0.2550008296966553, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9323921203613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:24] (step=0014557) Train Loss: 0.2091, 
Train Steps/Sec: 0.28, Epoch: 0.28287990672366886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14558, "loss": 0.22134064137935638, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3998165130615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:27] (step=0014558) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.2828993392926545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14559, "loss": 0.23816601932048798, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5337142944336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:31] (step=0014559) Train Loss: 0.2712, Train Steps/Sec: 0.28, Epoch: 0.2829187718616401, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14560, "loss": 0.25617754459381104, "memory_gb": 7.721559524536133, "step_time_ms": 3353.221654891968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:34] (step=0014560) Train Loss: 0.2729, Train Steps/Sec: 0.28, Epoch: 0.28293820443062573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14561, "loss": 0.23469677567481995, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9155883789062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:38] (step=0014561) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.28295763699961135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14562, "loss": 0.25510311126708984, "memory_gb": 7.721559524536133, "step_time_ms": 3360.086679458618, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:42] (step=0014562) Train Loss: 0.2782, Train Steps/Sec: 0.28, Epoch: 0.282977069568597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14563, 
"loss": 0.2384229302406311, "memory_gb": 7.721559524536133, "step_time_ms": 3361.440420150757, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:45] (step=0014563) Train Loss: 0.1903, Train Steps/Sec: 0.28, Epoch: 0.2829965021375826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14564, "loss": 0.27147552371025085, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8642349243164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:49] (step=0014564) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.2830159347065682, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14565, "loss": 0.293949693441391, "memory_gb": 7.721559524536133, "step_time_ms": 3342.837333679199, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:52] (step=0014565) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.28303536727555384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:33:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14566, "loss": 0.2773040235042572, "memory_gb": 7.715639114379883, "step_time_ms": 3325.455665588379, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:33:56] (step=0014566) Train Loss: 0.2440, Train Steps/Sec: 0.28, Epoch: 0.28305479984453946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14567, "loss": 0.1862238347530365, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5640659332275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:00] (step=0014567) Train Loss: 0.1763, Train Steps/Sec: 0.28, Epoch: 0.2830742324135251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14568, "loss": 0.2406005859375, "memory_gb": 7.721559524536133, "step_time_ms": 3343.8127040863037, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:03] 
(step=0014568) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.2830936649825107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14569, "loss": 0.3403746485710144, "memory_gb": 7.721559524536133, "step_time_ms": 3357.09810256958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:07] (step=0014569) Train Loss: 0.2567, Train Steps/Sec: 0.28, Epoch: 0.28311309755149633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14570, "loss": 0.2769850492477417, "memory_gb": 7.715639114379883, "step_time_ms": 3328.533887863159, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:10] (step=0014570) Train Loss: 0.2826, Train Steps/Sec: 0.28, Epoch: 0.28313253012048195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14571, "loss": 0.2256946563720703, "memory_gb": 7.721559524536133, "step_time_ms": 3359.2655658721924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:14] (step=0014571) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.2831519626894676, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14572, "loss": 0.3133600950241089, "memory_gb": 7.715639114379883, "step_time_ms": 3318.3324337005615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:18] (step=0014572) Train Loss: 0.2883, Train Steps/Sec: 0.28, Epoch: 0.28317139525845314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14573, "loss": 0.2611667215824127, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8415851593018, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:21] (step=0014573) Train Loss: 0.2208, Train Steps/Sec: 0.28, Epoch: 0.28319082782743876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:25] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 14574, "loss": 0.20214524865150452, "memory_gb": 7.721559524536133, "step_time_ms": 3363.048315048218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:25] (step=0014574) Train Loss: 0.2328, Train Steps/Sec: 0.27, Epoch: 0.2832102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14575, "loss": 0.19897419214248657, "memory_gb": 7.721559524536133, "step_time_ms": 3357.168674468994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:28] (step=0014575) Train Loss: 0.2063, Train Steps/Sec: 0.28, Epoch: 0.28322969296541, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14576, "loss": 0.27459901571273804, "memory_gb": 7.721559524536133, "step_time_ms": 3348.574161529541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:32] (step=0014576) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.28324912553439563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14577, "loss": 0.3444497585296631, "memory_gb": 7.721559524536133, "step_time_ms": 3361.140489578247, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:36] (step=0014577) Train Loss: 0.2766, Train Steps/Sec: 0.28, Epoch: 0.28326855810338125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14578, "loss": 0.2897438406944275, "memory_gb": 7.721559524536133, "step_time_ms": 3342.9551124572754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:39] (step=0014578) Train Loss: 0.2736, Train Steps/Sec: 0.28, Epoch: 0.2832879906723669, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14579, "loss": 0.12407657504081726, "memory_gb": 7.721559524536133, "step_time_ms": 3357.9318523406982, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 14:34:43] (step=0014579) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.2833074232413525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14580, "loss": 0.2755363881587982, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5286140441895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:46] (step=0014580) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.2833268558103381, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14581, "loss": 0.35673457384109497, "memory_gb": 7.721559524536133, "step_time_ms": 3363.16180229187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:50] (step=0014581) Train Loss: 0.3099, Train Steps/Sec: 0.28, Epoch: 0.28334628837932374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14582, "loss": 0.23737646639347076, "memory_gb": 7.721559524536133, "step_time_ms": 3512.883424758911, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:54] (step=0014582) Train Loss: 0.2818, Train Steps/Sec: 0.28, Epoch: 0.28336572094830936, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:34:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14583, "loss": 0.275287926197052, "memory_gb": 7.721559524536133, "step_time_ms": 3361.73677444458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:34:57] (step=0014583) Train Loss: 0.2555, Train Steps/Sec: 0.28, Epoch: 0.283385153517295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14584, "loss": 0.3001129627227783, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6267280578613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:01] (step=0014584) Train Loss: 0.2096, Train Steps/Sec: 0.28, Epoch: 0.2834045860862806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:04] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14585, "loss": 0.20968961715698242, "memory_gb": 7.721559524536133, "step_time_ms": 3358.893394470215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:04] (step=0014585) Train Loss: 0.1963, Train Steps/Sec: 0.28, Epoch: 0.28342401865526623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14586, "loss": 0.14360299706459045, "memory_gb": 7.721559524536133, "step_time_ms": 3350.999593734741, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:08] (step=0014586) Train Loss: 0.1837, Train Steps/Sec: 0.28, Epoch: 0.28344345122425185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14587, "loss": 0.2992934286594391, "memory_gb": 7.721559524536133, "step_time_ms": 3353.8241386413574, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:11] (step=0014587) Train Loss: 0.3067, Train Steps/Sec: 0.28, Epoch: 0.2834628837932375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14588, "loss": 0.3245445787906647, "memory_gb": 7.715639114379883, "step_time_ms": 3325.6518840789795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:15] (step=0014588) Train Loss: 0.2558, Train Steps/Sec: 0.28, Epoch: 0.2834823163622231, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14589, "loss": 0.28860360383987427, "memory_gb": 7.721559524536133, "step_time_ms": 3360.138177871704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:19] (step=0014589) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.2835017489312087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14590, "loss": 0.25370222330093384, "memory_gb": 7.721559524536133, "step_time_ms": 3350.169897079468, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 14:35:22] (step=0014590) Train Loss: 0.3081, Train Steps/Sec: 0.28, Epoch: 0.28352118150019434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14591, "loss": 0.3627169132232666, "memory_gb": 7.721559524536133, "step_time_ms": 3359.9343299865723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:26] (step=0014591) Train Loss: 0.2984, Train Steps/Sec: 0.28, Epoch: 0.28354061406917996, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14592, "loss": 0.18902313709259033, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7308654785156, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:29] (step=0014592) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.2835600466381656, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14593, "loss": 0.22015298902988434, "memory_gb": 7.715639114379883, "step_time_ms": 3328.335762023926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:33] (step=0014593) Train Loss: 0.2204, Train Steps/Sec: 0.28, Epoch: 0.2835794792071512, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14594, "loss": 0.2019878625869751, "memory_gb": 7.721559524536133, "step_time_ms": 3345.3450202941895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:37] (step=0014594) Train Loss: 0.2745, Train Steps/Sec: 0.28, Epoch: 0.28359891177613683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14595, "loss": 0.251598596572876, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1958236694336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:40] (step=0014595) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.28361834434512245, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
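The EFFICIENCY_METRICS records interleaved above are JSON payloads embedded in timestamped log lines, so they can be recovered for offline analysis with a small parser. A minimal sketch, assuming the payload format shown in this log (flat JSON, no nested braces); the function name is illustrative, not part of the training code:

```python
import json
import re

# Matches the flat JSON object that follows the EFFICIENCY_METRICS tag.
# Non-greedy so each record stops at its own closing brace, even when
# several entries are concatenated on one physical line.
METRICS_RE = re.compile(r'EFFICIENCY_METRICS:\s*(\{.*?\})')

def parse_efficiency_metrics(text):
    """Extract every EFFICIENCY_METRICS JSON record from raw log text."""
    return [json.loads(payload) for payload in METRICS_RE.findall(text)]

# Example entry copied verbatim from the log above (step 14580).
sample = ('[2025-07-29 14:34:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14580, '
          '"loss": 0.2755363881587982, "memory_gb": 7.721559524536133, '
          '"step_time_ms": 3358.5286140441895, "trainable_params": 4718592, '
          '"method": "lora"}')
records = parse_efficiency_metrics(sample)
```

Feeding the whole log file through `parse_efficiency_metrics` yields one dict per step, suitable for plotting loss or step-time curves without touching the human-readable `Train Loss` lines.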
[2025-07-29 14:35:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14596, "loss": 0.25208356976509094, "memory_gb": 7.721559524536133, "step_time_ms": 3368.3061599731445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:44] (step=0014596) Train Loss: 0.2435, Train Steps/Sec: 0.28, Epoch: 0.283637776914108, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14597, "loss": 0.12909173965454102, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2309226989746, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:47] (step=0014597) Train Loss: 0.1547, Train Steps/Sec: 0.28, Epoch: 0.28365720948309364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14598, "loss": 0.22607237100601196, "memory_gb": 7.721559524536133, "step_time_ms": 3374.436378479004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:51] (step=0014598) Train Loss: 0.2259, Train Steps/Sec: 0.28, Epoch: 0.28367664205207926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14599, "loss": 0.3169339597225189, "memory_gb": 7.721559524536133, "step_time_ms": 3361.591577529907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:54] (step=0014599) Train Loss: 0.2755, Train Steps/Sec: 0.28, Epoch: 0.2836960746210649, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:35:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14600, "loss": 0.25011175870895386, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9168224334717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:35:58] (step=0014600) Train Loss: 0.2353, Train Steps/Sec: 0.28, Epoch: 0.2837155071900505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14601, "loss": 0.18041680753231049, "memory_gb": 7.721559524536133, "step_time_ms": 3365.302085876465, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:02] (step=0014601) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.28373493975903613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14602, "loss": 0.30271008610725403, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0802631378174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:05] (step=0014602) Train Loss: 0.3048, Train Steps/Sec: 0.28, Epoch: 0.28375437232802175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14603, "loss": 0.22839206457138062, "memory_gb": 7.721559524536133, "step_time_ms": 3362.0529174804688, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:09] (step=0014603) Train Loss: 0.2427, Train Steps/Sec: 0.28, Epoch: 0.2837738048970074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14604, "loss": 0.2119094580411911, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9676570892334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:12] (step=0014604) Train Loss: 0.1791, Train Steps/Sec: 0.28, Epoch: 0.283793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14605, "loss": 0.2910362482070923, "memory_gb": 7.721559524536133, "step_time_ms": 3366.992712020874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:16] (step=0014605) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.2838126700349786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14606, "loss": 0.23267175257205963, "memory_gb": 7.721559524536133, "step_time_ms": 3365.938663482666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:20] (step=0014606) Train Loss: 0.2348, Train Steps/Sec: 0.28, Epoch: 0.28383210260396424, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14607, "loss": 0.3616281747817993, "memory_gb": 7.721559524536133, "step_time_ms": 3366.971969604492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:23] (step=0014607) Train Loss: 0.3331, Train Steps/Sec: 0.28, Epoch: 0.28385153517294986, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14608, "loss": 0.21867182850837708, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6137199401855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:27] (step=0014608) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.2838709677419355, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14609, "loss": 0.2289266586303711, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8016872406006, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:31] (step=0014609) Train Loss: 0.2902, Train Steps/Sec: 0.28, Epoch: 0.2838904003109211, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14610, "loss": 0.28877848386764526, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6322536468506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:34] (step=0014610) Train Loss: 0.2695, Train Steps/Sec: 0.28, Epoch: 0.28390983287990673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14611, "loss": 0.17787596583366394, "memory_gb": 7.721559524536133, "step_time_ms": 3361.097574234009, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:38] (step=0014611) Train Loss: 0.2101, Train Steps/Sec: 0.28, Epoch: 0.28392926544889235, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14612, "loss": 0.2305411547422409, "memory_gb": 7.721559524536133, 
"step_time_ms": 3366.3134574890137, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:41] (step=0014612) Train Loss: 0.2112, Train Steps/Sec: 0.28, Epoch: 0.283948698017878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14613, "loss": 0.2019818276166916, "memory_gb": 7.721559524536133, "step_time_ms": 3366.55855178833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:45] (step=0014613) Train Loss: 0.2512, Train Steps/Sec: 0.28, Epoch: 0.2839681305868636, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14614, "loss": 0.2297893613576889, "memory_gb": 7.721559524536133, "step_time_ms": 3365.2889728546143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:49] (step=0014614) Train Loss: 0.2428, Train Steps/Sec: 0.27, Epoch: 0.2839875631558492, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14615, "loss": 0.372733473777771, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6302433013916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:52] (step=0014615) Train Loss: 0.2803, Train Steps/Sec: 0.28, Epoch: 0.28400699572483484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:36:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14616, "loss": 0.13444791734218597, "memory_gb": 7.721559524536133, "step_time_ms": 3365.701913833618, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:36:56] (step=0014616) Train Loss: 0.1562, Train Steps/Sec: 0.28, Epoch: 0.28402642829382047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14617, "loss": 0.23011232912540436, "memory_gb": 7.721559524536133, "step_time_ms": 3365.009307861328, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:00] (step=0014617) Train Loss: 0.2861, Train Steps/Sec: 0.28, Epoch: 
0.2840458608628061, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14618, "loss": 0.21324419975280762, "memory_gb": 7.721559524536133, "step_time_ms": 3361.032724380493, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:03] (step=0014618) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.2840652934317917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14619, "loss": 0.24848490953445435, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5474185943604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:07] (step=0014619) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.2840847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14620, "loss": 0.3108462393283844, "memory_gb": 7.721559524536133, "step_time_ms": 3373.269557952881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:10] (step=0014620) Train Loss: 0.2420, Train Steps/Sec: 0.28, Epoch: 0.2841041585697629, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14621, "loss": 0.25727277994155884, "memory_gb": 7.721559524536133, "step_time_ms": 3373.096227645874, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:14] (step=0014621) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.2841235911387485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14622, "loss": 0.262712299823761, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5014457702637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:18] (step=0014622) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.28414302370773414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14623, "loss": 0.19465981423854828, 
"memory_gb": 7.721559524536133, "step_time_ms": 3360.670804977417, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:21] (step=0014623) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.28416245627671977, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14624, "loss": 0.303144633769989, "memory_gb": 7.721559524536133, "step_time_ms": 3379.617691040039, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:25] (step=0014624) Train Loss: 0.3094, Train Steps/Sec: 0.28, Epoch: 0.2841818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14625, "loss": 0.15689876675605774, "memory_gb": 7.721559524536133, "step_time_ms": 3373.6298084259033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:29] (step=0014625) Train Loss: 0.1856, Train Steps/Sec: 0.28, Epoch: 0.284201321414691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14626, "loss": 0.24418199062347412, "memory_gb": 7.721559524536133, "step_time_ms": 3360.654354095459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:32] (step=0014626) Train Loss: 0.1998, Train Steps/Sec: 0.28, Epoch: 0.28422075398367663, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14627, "loss": 0.3396640419960022, "memory_gb": 7.721559524536133, "step_time_ms": 3367.28835105896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:36] (step=0014627) Train Loss: 0.2387, Train Steps/Sec: 0.28, Epoch: 0.28424018655266226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14628, "loss": 0.2808961868286133, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3063049316406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:39] (step=0014628) Train Loss: 0.2347, 
Train Steps/Sec: 0.28, Epoch: 0.2842596191216479, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14629, "loss": 0.19812984764575958, "memory_gb": 7.721559524536133, "step_time_ms": 3515.1450634002686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:43] (step=0014629) Train Loss: 0.2031, Train Steps/Sec: 0.28, Epoch: 0.2842790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14630, "loss": 0.19087181985378265, "memory_gb": 7.721559524536133, "step_time_ms": 3366.982936859131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:47] (step=0014630) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.2842984842596191, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14631, "loss": 0.22149190306663513, "memory_gb": 7.721559524536133, "step_time_ms": 3364.7119998931885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:50] (step=0014631) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.28431791682860474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14632, "loss": 0.2431962937116623, "memory_gb": 7.721559524536133, "step_time_ms": 3369.144916534424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:54] (step=0014632) Train Loss: 0.2582, Train Steps/Sec: 0.28, Epoch: 0.28433734939759037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:37:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14633, "loss": 0.17078882455825806, "memory_gb": 7.721559524536133, "step_time_ms": 3351.696252822876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:37:57] (step=0014633) Train Loss: 0.2810, Train Steps/Sec: 0.28, Epoch: 0.284356781966576, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14634, "loss": 
0.19706198573112488, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9449977874756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:01] (step=0014634) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.2843762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14635, "loss": 0.24665331840515137, "memory_gb": 7.721559524536133, "step_time_ms": 3373.9776611328125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:05] (step=0014635) Train Loss: 0.2721, Train Steps/Sec: 0.28, Epoch: 0.28439564710454723, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14636, "loss": 0.29957935214042664, "memory_gb": 7.721559524536133, "step_time_ms": 3364.527940750122, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:08] (step=0014636) Train Loss: 0.3010, Train Steps/Sec: 0.28, Epoch: 0.28441507967353286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14637, "loss": 0.1413549929857254, "memory_gb": 7.721559524536133, "step_time_ms": 3371.9773292541504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:12] (step=0014637) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.2844345122425185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14638, "loss": 0.3258446455001831, "memory_gb": 7.721559524536133, "step_time_ms": 3367.5906658172607, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:16] (step=0014638) Train Loss: 0.2966, Train Steps/Sec: 0.28, Epoch: 0.2844539448115041, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14639, "loss": 0.17239677906036377, "memory_gb": 7.721559524536133, "step_time_ms": 3363.821268081665, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:19] 
(step=0014639) Train Loss: 0.2026, Train Steps/Sec: 0.28, Epoch: 0.2844733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14640, "loss": 0.263896644115448, "memory_gb": 7.721559524536133, "step_time_ms": 3369.509696960449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:23] (step=0014640) Train Loss: 0.2963, Train Steps/Sec: 0.28, Epoch: 0.28449280994947534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14641, "loss": 0.16706041991710663, "memory_gb": 7.721559524536133, "step_time_ms": 3367.152214050293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:26] (step=0014641) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.28451224251846097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14642, "loss": 0.2430277168750763, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2260818481445, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:30] (step=0014642) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.28453167508744653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14643, "loss": 0.2738575339317322, "memory_gb": 7.721559524536133, "step_time_ms": 3373.78191947937, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:34] (step=0014643) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.28455110765643216, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14644, "loss": 0.14803817868232727, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7620372772217, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:37] (step=0014644) Train Loss: 0.1448, Train Steps/Sec: 0.28, Epoch: 0.2845705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:41] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 14645, "loss": 0.3556038439273834, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7565326690674, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:41] (step=0014645) Train Loss: 0.3092, Train Steps/Sec: 0.28, Epoch: 0.2845899727944034, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14646, "loss": 0.2376859337091446, "memory_gb": 7.721559524536133, "step_time_ms": 3366.177797317505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:45] (step=0014646) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.284609405363389, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14647, "loss": 0.19311991333961487, "memory_gb": 7.721559524536133, "step_time_ms": 3356.445789337158, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:48] (step=0014647) Train Loss: 0.2116, Train Steps/Sec: 0.28, Epoch: 0.28462883793237465, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14648, "loss": 0.16118945181369781, "memory_gb": 7.721559524536133, "step_time_ms": 3353.515625, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:52] (step=0014648) Train Loss: 0.1682, Train Steps/Sec: 0.28, Epoch: 0.28464827050136027, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14649, "loss": 0.3569166660308838, "memory_gb": 7.721559524536133, "step_time_ms": 3366.908073425293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:38:55] (step=0014649) Train Loss: 0.3101, Train Steps/Sec: 0.28, Epoch: 0.2846677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:38:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14650, "loss": 0.2873036861419678, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8689098358154, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 14:38:59] (step=0014650) Train Loss: 0.2303, Train Steps/Sec: 0.28, Epoch: 0.2846871356393315, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14651, "loss": 0.31030842661857605, "memory_gb": 7.715639114379883, "step_time_ms": 3326.777219772339, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:03] (step=0014651) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.28470656820831713, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14652, "loss": 0.3212492763996124, "memory_gb": 7.721559524536133, "step_time_ms": 3363.755702972412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:06] (step=0014652) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.28472600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14653, "loss": 0.15381501615047455, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9490604400635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:10] (step=0014653) Train Loss: 0.2315, Train Steps/Sec: 0.28, Epoch: 0.2847454333462884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14654, "loss": 0.25575554370880127, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7641410827637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:14] (step=0014654) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.284764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14655, "loss": 0.2724078595638275, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2143268585205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:17] (step=0014655) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.2847842984842596, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
14:39:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14656, "loss": 0.27294397354125977, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1509284973145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:21] (step=0014656) Train Loss: 0.2693, Train Steps/Sec: 0.28, Epoch: 0.28480373105324525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14657, "loss": 0.1789332628250122, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2817058563232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:24] (step=0014657) Train Loss: 0.2050, Train Steps/Sec: 0.28, Epoch: 0.28482316362223087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14658, "loss": 0.18225504457950592, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1670989990234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:28] (step=0014658) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.2848425961912165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14659, "loss": 0.2509367763996124, "memory_gb": 7.721559524536133, "step_time_ms": 3360.253095626831, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:32] (step=0014659) Train Loss: 0.2855, Train Steps/Sec: 0.28, Epoch: 0.2848620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14660, "loss": 0.24188919365406036, "memory_gb": 7.721559524536133, "step_time_ms": 3353.395462036133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:35] (step=0014660) Train Loss: 0.2258, Train Steps/Sec: 0.28, Epoch: 0.28488146132918774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14661, "loss": 0.3588376045227051, "memory_gb": 7.721559524536133, "step_time_ms": 3358.510971069336, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:39] (step=0014661) Train Loss: 0.2474, Train Steps/Sec: 0.27, Epoch: 0.28490089389817336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14662, "loss": 0.27785974740982056, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1581535339355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:43] (step=0014662) Train Loss: 0.2212, Train Steps/Sec: 0.28, Epoch: 0.284920326467159, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14663, "loss": 0.2853531241416931, "memory_gb": 7.721559524536133, "step_time_ms": 3356.2614917755127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:46] (step=0014663) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.2849397590361446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14664, "loss": 0.18948811292648315, "memory_gb": 7.721559524536133, "step_time_ms": 3349.142074584961, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:50] (step=0014664) Train Loss: 0.2403, Train Steps/Sec: 0.28, Epoch: 0.2849591916051302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14665, "loss": 0.3180490732192993, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4018173217773, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:53] (step=0014665) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.2849786241741158, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:39:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14666, "loss": 0.13464419543743134, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5057258605957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:39:57] (step=0014666) Train Loss: 0.1883, Train Steps/Sec: 0.28, Epoch: 0.2849980567431014, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 14:40:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14667, "loss": 0.2164415866136551, "memory_gb": 7.721559524536133, "step_time_ms": 3351.858139038086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:01] (step=0014667) Train Loss: 0.2230, Train Steps/Sec: 0.28, Epoch: 0.28501748931208704, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14668, "loss": 0.16823643445968628, "memory_gb": 7.721559524536133, "step_time_ms": 3364.319324493408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:04] (step=0014668) Train Loss: 0.1434, Train Steps/Sec: 0.28, Epoch: 0.28503692188107266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14669, "loss": 0.17290203273296356, "memory_gb": 7.721559524536133, "step_time_ms": 3359.903335571289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:08] (step=0014669) Train Loss: 0.2498, Train Steps/Sec: 0.28, Epoch: 0.2850563544500583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14670, "loss": 0.18837416172027588, "memory_gb": 7.721559524536133, "step_time_ms": 3504.5557022094727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:12] (step=0014670) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.2850757870190439, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14671, "loss": 0.1672867238521576, "memory_gb": 7.721559524536133, "step_time_ms": 3360.750675201416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:15] (step=0014671) Train Loss: 0.1906, Train Steps/Sec: 0.28, Epoch: 0.2850952195880295, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14672, "loss": 0.18991217017173767, "memory_gb": 7.721559524536133, "step_time_ms": 
3366.7078018188477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:19] (step=0014672) Train Loss: 0.2200, Train Steps/Sec: 0.28, Epoch: 0.28511465215701515, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14673, "loss": 0.24828405678272247, "memory_gb": 7.721559524536133, "step_time_ms": 3355.8247089385986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:22] (step=0014673) Train Loss: 0.2339, Train Steps/Sec: 0.28, Epoch: 0.28513408472600077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14674, "loss": 0.1988103985786438, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8694076538086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:26] (step=0014674) Train Loss: 0.2022, Train Steps/Sec: 0.28, Epoch: 0.2851535172949864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14675, "loss": 0.281952828168869, "memory_gb": 7.721559524536133, "step_time_ms": 3357.746124267578, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:30] (step=0014675) Train Loss: 0.2619, Train Steps/Sec: 0.28, Epoch: 0.285172949863972, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14676, "loss": 0.30385440587997437, "memory_gb": 7.721559524536133, "step_time_ms": 3350.620746612549, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:33] (step=0014676) Train Loss: 0.2874, Train Steps/Sec: 0.28, Epoch: 0.28519238243295764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14677, "loss": 0.141752228140831, "memory_gb": 7.721559524536133, "step_time_ms": 3359.111547470093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:37] (step=0014677) Train Loss: 0.2051, Train Steps/Sec: 0.28, Epoch: 0.28521181500194326, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 14678, "loss": 0.2970123291015625, "memory_gb": 7.721559524536133, "step_time_ms": 3362.332344055176, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:40] (step=0014678) Train Loss: 0.2343, Train Steps/Sec: 0.28, Epoch: 0.2852312475709289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14679, "loss": 0.25675827264785767, "memory_gb": 7.721559524536133, "step_time_ms": 3357.8217029571533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:44] (step=0014679) Train Loss: 0.2964, Train Steps/Sec: 0.28, Epoch: 0.2852506801399145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14680, "loss": 0.33965611457824707, "memory_gb": 7.721559524536133, "step_time_ms": 3358.478307723999, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:48] (step=0014680) Train Loss: 0.2478, Train Steps/Sec: 0.28, Epoch: 0.2852701127089001, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 14681, "loss": 0.27846992015838623, "memory_gb": 7.721559524536133, "step_time_ms": 3360.292673110962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:51] (step=0014681) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.28528954527788575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14682, "loss": 0.315303772687912, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7302436828613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:55] (step=0014682) Train Loss: 0.2664, Train Steps/Sec: 0.28, Epoch: 0.28530897784687137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:40:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14683, "loss": 0.17966438829898834, "memory_gb": 
7.721559524536133, "step_time_ms": 3352.0472049713135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:40:58] (step=0014683) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.285328410415857, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14684, "loss": 0.23418378829956055, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8176708221436, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:02] (step=0014684) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.2853478429848426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14685, "loss": 0.1980314552783966, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7260971069336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:06] (step=0014685) Train Loss: 0.1750, Train Steps/Sec: 0.28, Epoch: 0.28536727555382824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 14686, "loss": 0.17760103940963745, "memory_gb": 7.721559524536133, "step_time_ms": 3356.365442276001, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:09] (step=0014686) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.28538670812281386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14687, "loss": 0.18470004200935364, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4271926879883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:13] (step=0014687) Train Loss: 0.1509, Train Steps/Sec: 0.28, Epoch: 0.2854061406917995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14688, "loss": 0.3582078218460083, "memory_gb": 7.721559524536133, "step_time_ms": 3358.940362930298, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:16] (step=0014688) Train Loss: 0.3078, Train 
Steps/Sec: 0.28, Epoch: 0.28542557326078505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14689, "loss": 0.2109614610671997, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2120876312256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:20] (step=0014689) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.28544500582977067, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14690, "loss": 0.17301389575004578, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5822162628174, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:24] (step=0014690) Train Loss: 0.1786, Train Steps/Sec: 0.28, Epoch: 0.2854644383987563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14691, "loss": 0.277495801448822, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1698150634766, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:27] (step=0014691) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.2854838709677419, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14692, "loss": 0.251399964094162, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7256927490234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:31] (step=0014692) Train Loss: 0.2505, Train Steps/Sec: 0.28, Epoch: 0.28550330353672754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14693, "loss": 0.27239835262298584, "memory_gb": 7.721559524536133, "step_time_ms": 3361.490488052368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:34] (step=0014693) Train Loss: 0.2770, Train Steps/Sec: 0.28, Epoch: 0.28552273610571316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14694, "loss": 
0.21397918462753296, "memory_gb": 7.721559524536133, "step_time_ms": 3357.107400894165, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:38] (step=0014694) Train Loss: 0.2176, Train Steps/Sec: 0.28, Epoch: 0.2855421686746988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14695, "loss": 0.27676132321357727, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2569313049316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:42] (step=0014695) Train Loss: 0.3165, Train Steps/Sec: 0.28, Epoch: 0.2855616012436844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14696, "loss": 0.1702469289302826, "memory_gb": 7.721559524536133, "step_time_ms": 3350.483179092407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:45] (step=0014696) Train Loss: 0.1949, Train Steps/Sec: 0.28, Epoch: 0.28558103381267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14697, "loss": 0.19924326241016388, "memory_gb": 7.721559524536133, "step_time_ms": 3362.774133682251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:49] (step=0014697) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.28560046638165565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14698, "loss": 0.20000538229942322, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5030517578125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:52] (step=0014698) Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.28561989895064127, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:41:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14699, "loss": 0.19854184985160828, "memory_gb": 7.721559524536133, "step_time_ms": 3355.4606437683105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:41:56] (step=0014699) 
Train Loss: 0.2137, Train Steps/Sec: 0.28, Epoch: 0.2856393315196269, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14700, "loss": 0.24976375699043274, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7990016937256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:00] (step=0014700) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.2856587640886125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14701, "loss": 0.26701435446739197, "memory_gb": 7.721559524536133, "step_time_ms": 3362.577438354492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:03] (step=0014701) Train Loss: 0.2389, Train Steps/Sec: 0.28, Epoch: 0.28567819665759814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14702, "loss": 0.30722105503082275, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1888885498047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:07] (step=0014702) Train Loss: 0.2554, Train Steps/Sec: 0.28, Epoch: 0.28569762922658376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14703, "loss": 0.2031274139881134, "memory_gb": 7.721559524536133, "step_time_ms": 3348.7625122070312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:10] (step=0014703) Train Loss: 0.2029, Train Steps/Sec: 0.28, Epoch: 0.2857170617955694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14704, "loss": 0.1365060657262802, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5779151916504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:14] (step=0014704) Train Loss: 0.1575, Train Steps/Sec: 0.28, Epoch: 0.285736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:18] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 14705, "loss": 0.30174458026885986, "memory_gb": 7.721559524536133, "step_time_ms": 3362.825393676758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:18] (step=0014705) Train Loss: 0.2678, Train Steps/Sec: 0.28, Epoch: 0.2857559269335406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14706, "loss": 0.1768631488084793, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5611534118652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:21] (step=0014706) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.28577535950252625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14707, "loss": 0.20767977833747864, "memory_gb": 7.721559524536133, "step_time_ms": 3362.75053024292, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:25] (step=0014707) Train Loss: 0.2186, Train Steps/Sec: 0.28, Epoch: 0.28579479207151187, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14708, "loss": 0.18646371364593506, "memory_gb": 7.721559524536133, "step_time_ms": 3365.6692504882812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:28] (step=0014708) Train Loss: 0.1701, Train Steps/Sec: 0.28, Epoch: 0.2858142246404975, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14709, "loss": 0.2362256944179535, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4726276397705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:32] (step=0014709) Train Loss: 0.2083, Train Steps/Sec: 0.27, Epoch: 0.2858336572094831, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14710, "loss": 0.24118082225322723, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9610748291016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
14:42:36] (step=0014710) Train Loss: 0.2451, Train Steps/Sec: 0.28, Epoch: 0.28585308977846874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14711, "loss": 0.2025403380393982, "memory_gb": 7.721559524536133, "step_time_ms": 3362.4582290649414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:39] (step=0014711) Train Loss: 0.2148, Train Steps/Sec: 0.28, Epoch: 0.28587252234745436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14712, "loss": 0.2629684805870056, "memory_gb": 7.721559524536133, "step_time_ms": 3361.959218978882, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:43] (step=0014712) Train Loss: 0.3148, Train Steps/Sec: 0.28, Epoch: 0.2858919549164399, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14713, "loss": 0.27996981143951416, "memory_gb": 7.721559524536133, "step_time_ms": 3344.684839248657, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:47] (step=0014713) Train Loss: 0.2997, Train Steps/Sec: 0.28, Epoch: 0.28591138748542555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14714, "loss": 0.20183636248111725, "memory_gb": 7.721559524536133, "step_time_ms": 3346.9107151031494, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:50] (step=0014714) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 0.28593082005441117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14715, "loss": 0.3055623173713684, "memory_gb": 7.721559524536133, "step_time_ms": 3362.069845199585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:54] (step=0014715) Train Loss: 0.2771, Train Steps/Sec: 0.28, Epoch: 0.2859502526233968, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:42:57] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14716, "loss": 0.3240188956260681, "memory_gb": 7.721559524536133, "step_time_ms": 3364.013671875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:42:57] (step=0014716) Train Loss: 0.2508, Train Steps/Sec: 0.28, Epoch: 0.2859696851923824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14717, "loss": 0.36593878269195557, "memory_gb": 7.721559524536133, "step_time_ms": 3367.2914505004883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:01] (step=0014717) Train Loss: 0.3016, Train Steps/Sec: 0.28, Epoch: 0.28598911776136804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14718, "loss": 0.13313168287277222, "memory_gb": 7.721559524536133, "step_time_ms": 3506.1917304992676, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:05] (step=0014718) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.28600855033035366, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14719, "loss": 0.15408310294151306, "memory_gb": 7.721559524536133, "step_time_ms": 3367.687463760376, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:08] (step=0014719) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.2860279828993393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14720, "loss": 0.30247363448143005, "memory_gb": 7.721559524536133, "step_time_ms": 3369.9543476104736, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:12] (step=0014720) Train Loss: 0.2973, Train Steps/Sec: 0.28, Epoch: 0.2860474154683249, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14721, "loss": 0.2025740146636963, "memory_gb": 7.721559524536133, "step_time_ms": 3365.4186725616455, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 14:43:15] (step=0014721) Train Loss: 0.2473, Train Steps/Sec: 0.28, Epoch: 0.28606684803731053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14722, "loss": 0.21680159866809845, "memory_gb": 7.721559524536133, "step_time_ms": 3366.926670074463, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:19] (step=0014722) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.28608628060629615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14723, "loss": 0.15217402577400208, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0172691345215, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:23] (step=0014723) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.2861057131752818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14724, "loss": 0.18676653504371643, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3664169311523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:26] (step=0014724) Train Loss: 0.2162, Train Steps/Sec: 0.28, Epoch: 0.2861251457442674, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14725, "loss": 0.1959543079137802, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7466220855713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:30] (step=0014725) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.286144578313253, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14726, "loss": 0.14138157665729523, "memory_gb": 7.721559524536133, "step_time_ms": 3365.8359050750732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:34] (step=0014726) Train Loss: 0.2147, Train Steps/Sec: 0.28, Epoch: 0.28616401088223864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 14:43:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14727, "loss": 0.326884925365448, "memory_gb": 7.721559524536133, "step_time_ms": 3369.629144668579, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:37] (step=0014727) Train Loss: 0.2438, Train Steps/Sec: 0.28, Epoch: 0.28618344345122426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14728, "loss": 0.19077031314373016, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1397018432617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:41] (step=0014728) Train Loss: 0.1892, Train Steps/Sec: 0.28, Epoch: 0.2862028760202099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14729, "loss": 0.21142348647117615, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5336112976074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:44] (step=0014729) Train Loss: 0.1950, Train Steps/Sec: 0.28, Epoch: 0.2862223085891955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14730, "loss": 0.22819280624389648, "memory_gb": 7.721559524536133, "step_time_ms": 3368.5595989227295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:48] (step=0014730) Train Loss: 0.2537, Train Steps/Sec: 0.28, Epoch: 0.28624174115818113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14731, "loss": 0.23560206592082977, "memory_gb": 7.721559524536133, "step_time_ms": 3371.5693950653076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:52] (step=0014731) Train Loss: 0.3043, Train Steps/Sec: 0.28, Epoch: 0.28626117372716675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14732, "loss": 0.2358873039484024, "memory_gb": 7.721559524536133, "step_time_ms": 3376.87611579895, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:55] (step=0014732) Train Loss: 0.2115, Train Steps/Sec: 0.28, Epoch: 0.2862806062961524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:43:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14733, "loss": 0.2756239175796509, "memory_gb": 7.721559524536133, "step_time_ms": 3372.2050189971924, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:43:59] (step=0014733) Train Loss: 0.2851, Train Steps/Sec: 0.28, Epoch: 0.286300038865138, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 14734, "loss": 0.2612372934818268, "memory_gb": 7.721559524536133, "step_time_ms": 3370.668411254883, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:02] (step=0014734) Train Loss: 0.2674, Train Steps/Sec: 0.28, Epoch: 0.2863194714341236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14735, "loss": 0.18016737699508667, "memory_gb": 7.721559524536133, "step_time_ms": 3348.724603652954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:06] (step=0014735) Train Loss: 0.1970, Train Steps/Sec: 0.28, Epoch: 0.2863389040031092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14736, "loss": 0.1892842948436737, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4589862823486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:10] (step=0014736) Train Loss: 0.2084, Train Steps/Sec: 0.28, Epoch: 0.2863583365720948, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14737, "loss": 0.2396303415298462, "memory_gb": 7.721559524536133, "step_time_ms": 3373.96502494812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:13] (step=0014737) Train Loss: 0.2432, Train Steps/Sec: 0.28, Epoch: 0.28637776914108043, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 14:44:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14738, "loss": 0.18512314558029175, "memory_gb": 7.721559524536133, "step_time_ms": 3375.077247619629, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:17] (step=0014738) Train Loss: 0.2061, Train Steps/Sec: 0.28, Epoch: 0.28639720171006605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 14739, "loss": 0.3118821382522583, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4228191375732, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:20] (step=0014739) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.2864166342790517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14740, "loss": 0.18517184257507324, "memory_gb": 7.721559524536133, "step_time_ms": 3374.8505115509033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:24] (step=0014740) Train Loss: 0.2388, Train Steps/Sec: 0.28, Epoch: 0.2864360668480373, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14741, "loss": 0.2643311321735382, "memory_gb": 7.721559524536133, "step_time_ms": 3368.96014213562, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:28] (step=0014741) Train Loss: 0.2563, Train Steps/Sec: 0.28, Epoch: 0.2864554994170229, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14742, "loss": 0.26075172424316406, "memory_gb": 7.721559524536133, "step_time_ms": 3376.6844272613525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:31] (step=0014742) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.28647493198600854, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14743, "loss": 0.26659709215164185, "memory_gb": 7.721559524536133, "step_time_ms": 
3376.706600189209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:35] (step=0014743) Train Loss: 0.2651, Train Steps/Sec: 0.28, Epoch: 0.28649436455499416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14744, "loss": 0.24086356163024902, "memory_gb": 7.721559524536133, "step_time_ms": 3373.037576675415, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:39] (step=0014744) Train Loss: 0.2473, Train Steps/Sec: 0.28, Epoch: 0.2865137971239798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14745, "loss": 0.2042209506034851, "memory_gb": 7.721559524536133, "step_time_ms": 3377.164125442505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:42] (step=0014745) Train Loss: 0.2179, Train Steps/Sec: 0.28, Epoch: 0.2865332296929654, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14746, "loss": 0.25567305088043213, "memory_gb": 7.721559524536133, "step_time_ms": 3363.719940185547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:46] (step=0014746) Train Loss: 0.2551, Train Steps/Sec: 0.28, Epoch: 0.28655266226195103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14747, "loss": 0.22318914532661438, "memory_gb": 7.721559524536133, "step_time_ms": 3367.568254470825, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:49] (step=0014747) Train Loss: 0.2524, Train Steps/Sec: 0.28, Epoch: 0.28657209483093665, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14748, "loss": 0.2791035771369934, "memory_gb": 7.721559524536133, "step_time_ms": 3377.095937728882, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:53] (step=0014748) Train Loss: 0.2545, Train Steps/Sec: 0.28, Epoch: 0.2865915273999223, 
LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:44:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14749, "loss": 0.3115236163139343, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4332885742188, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:44:57] (step=0014749) Train Loss: 0.2856, Train Steps/Sec: 0.28, Epoch: 0.2866109599689079, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14750, "loss": 0.12002534419298172, "memory_gb": 7.721559524536133, "step_time_ms": 3372.941732406616, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:01] (step=0014750) Train Loss: 0.1471, Train Steps/Sec: 0.27, Epoch: 0.2866303925378935, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14751, "loss": 0.17309343814849854, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4824237823486, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:04] (step=0014751) Train Loss: 0.2079, Train Steps/Sec: 0.28, Epoch: 0.28664982510687914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14752, "loss": 0.3259349465370178, "memory_gb": 7.721559524536133, "step_time_ms": 3376.879930496216, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:08] (step=0014752) Train Loss: 0.3062, Train Steps/Sec: 0.28, Epoch: 0.28666925767586476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14753, "loss": 0.29214367270469666, "memory_gb": 7.721559524536133, "step_time_ms": 3376.220226287842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:11] (step=0014753) Train Loss: 0.2407, Train Steps/Sec: 0.27, Epoch: 0.2866886902448504, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14754, "loss": 0.22778700292110443, "memory_gb": 
7.721559524536133, "step_time_ms": 3373.269557952881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:15] (step=0014754) Train Loss: 0.2489, Train Steps/Sec: 0.27, Epoch: 0.286708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14755, "loss": 0.1917915642261505, "memory_gb": 7.721559524536133, "step_time_ms": 3375.9472370147705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:19] (step=0014755) Train Loss: 0.2229, Train Steps/Sec: 0.27, Epoch: 0.28672755538282163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14756, "loss": 0.30060482025146484, "memory_gb": 7.721559524536133, "step_time_ms": 3375.0789165496826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:22] (step=0014756) Train Loss: 0.2565, Train Steps/Sec: 0.27, Epoch: 0.28674698795180725, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14757, "loss": 0.2875157296657562, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4232444763184, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:26] (step=0014757) Train Loss: 0.2960, Train Steps/Sec: 0.27, Epoch: 0.2867664205207929, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14758, "loss": 0.28078222274780273, "memory_gb": 7.721559524536133, "step_time_ms": 3514.95361328125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:30] (step=0014758) Train Loss: 0.2338, Train Steps/Sec: 0.27, Epoch: 0.28678585308977844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 14759, "loss": 0.3195022642612457, "memory_gb": 7.721559524536133, "step_time_ms": 3370.098114013672, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:33] (step=0014759) Train Loss: 0.2189, Train 
Steps/Sec: 0.27, Epoch: 0.28680528565876406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14760, "loss": 0.3377820551395416, "memory_gb": 7.721559524536133, "step_time_ms": 3371.591329574585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:37] (step=0014760) Train Loss: 0.2859, Train Steps/Sec: 0.28, Epoch: 0.2868247182277497, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14761, "loss": 0.18659904599189758, "memory_gb": 7.721559524536133, "step_time_ms": 3374.2198944091797, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:41] (step=0014761) Train Loss: 0.2087, Train Steps/Sec: 0.27, Epoch: 0.2868441507967353, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14762, "loss": 0.24936111271381378, "memory_gb": 7.721559524536133, "step_time_ms": 3373.5556602478027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:44] (step=0014762) Train Loss: 0.1706, Train Steps/Sec: 0.27, Epoch: 0.28686358336572093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14763, "loss": 0.25623127818107605, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8438873291016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:48] (step=0014763) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.28688301593470655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14764, "loss": 0.26504939794540405, "memory_gb": 7.721559524536133, "step_time_ms": 3374.1676807403564, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:52] (step=0014764) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.2869024485036922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14765, "loss": 
0.2394738346338272, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4137802124023, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:55] (step=0014765) Train Loss: 0.2469, Train Steps/Sec: 0.28, Epoch: 0.2869218810726778, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:45:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14766, "loss": 0.18013323843479156, "memory_gb": 7.721559524536133, "step_time_ms": 3370.173931121826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:45:59] (step=0014766) Train Loss: 0.2360, Train Steps/Sec: 0.28, Epoch: 0.2869413136416634, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14767, "loss": 0.2079273760318756, "memory_gb": 7.721559524536133, "step_time_ms": 3369.380235671997, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:03] (step=0014767) Train Loss: 0.2268, Train Steps/Sec: 0.28, Epoch: 0.28696074621064904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14768, "loss": 0.23010092973709106, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6418533325195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:06] (step=0014768) Train Loss: 0.2175, Train Steps/Sec: 0.28, Epoch: 0.28698017877963466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14769, "loss": 0.1728348731994629, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0103034973145, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:10] (step=0014769) Train Loss: 0.1654, Train Steps/Sec: 0.28, Epoch: 0.2869996113486203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14770, "loss": 0.24282331764698029, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0542755126953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:13] (step=0014770) 
Train Loss: 0.1951, Train Steps/Sec: 0.28, Epoch: 0.2870190439176059, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14771, "loss": 0.19669491052627563, "memory_gb": 7.721559524536133, "step_time_ms": 3370.7315921783447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:17] (step=0014771) Train Loss: 0.2216, Train Steps/Sec: 0.27, Epoch: 0.28703847648659153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14772, "loss": 0.227888822555542, "memory_gb": 7.721559524536133, "step_time_ms": 3367.65456199646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:21] (step=0014772) Train Loss: 0.2752, Train Steps/Sec: 0.28, Epoch: 0.28705790905557715, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14773, "loss": 0.21530592441558838, "memory_gb": 7.721559524536133, "step_time_ms": 3371.110200881958, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:24] (step=0014773) Train Loss: 0.1900, Train Steps/Sec: 0.28, Epoch: 0.2870773416245628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14774, "loss": 0.2267231047153473, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4492111206055, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:28] (step=0014774) Train Loss: 0.2182, Train Steps/Sec: 0.28, Epoch: 0.2870967741935484, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14775, "loss": 0.2827579975128174, "memory_gb": 7.721559524536133, "step_time_ms": 3371.4327812194824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:32] (step=0014775) Train Loss: 0.3188, Train Steps/Sec: 0.28, Epoch: 0.287116206762534, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:35] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 14776, "loss": 0.2706453502178192, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9021129608154, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:35] (step=0014776) Train Loss: 0.2555, Train Steps/Sec: 0.28, Epoch: 0.28713563933151964, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14777, "loss": 0.234117791056633, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8602027893066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:39] (step=0014777) Train Loss: 0.2911, Train Steps/Sec: 0.28, Epoch: 0.28715507190050527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14778, "loss": 0.2111051082611084, "memory_gb": 7.721559524536133, "step_time_ms": 3367.652654647827, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:42] (step=0014778) Train Loss: 0.2510, Train Steps/Sec: 0.28, Epoch: 0.2871745044694909, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14779, "loss": 0.2209177315235138, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5552864074707, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:46] (step=0014779) Train Loss: 0.2226, Train Steps/Sec: 0.28, Epoch: 0.2871939370384765, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14780, "loss": 0.14052169024944305, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1794452667236, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:50] (step=0014780) Train Loss: 0.1439, Train Steps/Sec: 0.28, Epoch: 0.28721336960746213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14781, "loss": 0.2032208889722824, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6951446533203, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
14:46:53] (step=0014781) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.2872328021764477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:46:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14782, "loss": 0.19033189117908478, "memory_gb": 7.721559524536133, "step_time_ms": 3364.429473876953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:46:57] (step=0014782) Train Loss: 0.2601, Train Steps/Sec: 0.28, Epoch: 0.2872522347454333, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:47:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14783, "loss": 0.11297726631164551, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7001819610596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:47:00] (step=0014783) Train Loss: 0.1627, Train Steps/Sec: 0.28, Epoch: 0.28727166731441894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:47:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14784, "loss": 0.23078761994838715, "memory_gb": 7.721559524536133, "step_time_ms": 3359.229803085327, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:47:04] (step=0014784) Train Loss: 0.2188, Train Steps/Sec: 0.28, Epoch: 0.28729109988340457, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:47:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14785, "loss": 0.17945601046085358, "memory_gb": 7.721559524536133, "step_time_ms": 3360.095262527466, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:47:07] (step=0014785) Train Loss: 0.1619, Train Steps/Sec: 0.28, Epoch: 0.2873105324523902, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:47:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14786, "loss": 0.267048180103302, "memory_gb": 7.721559524536133, "step_time_ms": 3358.313798904419, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:47:11] (step=0014786) Train Loss: 0.3040, Train Steps/Sec: 0.28, Epoch: 0.2873299650213758, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:47:14] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14787, "loss": 0.18342354893684387, "memory_gb": 7.721559524536133, "step_time_ms": 3359.806537628174, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:14] (step=0014787) Train Loss: 0.2167, Train Steps/Sec: 0.28, Epoch: 0.28734939759036143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14788, "loss": 0.23916137218475342, "memory_gb": 7.721559524536133, "step_time_ms": 3361.633062362671, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:18] (step=0014788) Train Loss: 0.2206, Train Steps/Sec: 0.28, Epoch: 0.28736883015934706, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14789, "loss": 0.24619579315185547, "memory_gb": 7.721559524536133, "step_time_ms": 3360.006332397461, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:22] (step=0014789) Train Loss: 0.2119, Train Steps/Sec: 0.28, Epoch: 0.2873882627283327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14790, "loss": 0.1974477916955948, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4796447753906, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:25] (step=0014790) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.2874076952973183, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14791, "loss": 0.25322964787483215, "memory_gb": 7.721559524536133, "step_time_ms": 3358.145236968994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:29] (step=0014791) Train Loss: 0.2183, Train Steps/Sec: 0.28, Epoch: 0.2874271278663039, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14792, "loss": 0.26751214265823364, "memory_gb": 7.721559524536133, "step_time_ms": 3357.501983642578, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:32] (step=0014792) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.28744656043528954, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14793, "loss": 0.2798874080181122, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0303916931152, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:36] (step=0014793) Train Loss: 0.2738, Train Steps/Sec: 0.28, Epoch: 0.28746599300427517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14794, "loss": 0.2261490523815155, "memory_gb": 7.721559524536133, "step_time_ms": 3355.7634353637695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:39] (step=0014794) Train Loss: 0.2589, Train Steps/Sec: 0.28, Epoch: 0.2874854255732608, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14795, "loss": 0.2606508731842041, "memory_gb": 7.721559524536133, "step_time_ms": 3351.282835006714, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:43] (step=0014795) Train Loss: 0.2635, Train Steps/Sec: 0.28, Epoch: 0.2875048581422464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14796, "loss": 0.30916523933410645, "memory_gb": 7.721559524536133, "step_time_ms": 3351.78279876709, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:46] (step=0014796) Train Loss: 0.2285, Train Steps/Sec: 0.29, Epoch: 0.28752429071123203, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14797, "loss": 0.24264289438724518, "memory_gb": 7.721559524536133, "step_time_ms": 3355.5917739868164, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:50] (step=0014797) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.28754372328021766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14798, "loss": 0.2466113567352295, "memory_gb": 7.721559524536133, "step_time_ms": 3356.849431991577, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:53] (step=0014798) Train Loss: 0.2422, Train Steps/Sec: 0.28, Epoch: 0.2875631558492033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:47:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14799, "loss": 0.24360260367393494, "memory_gb": 7.721559524536133, "step_time_ms": 3339.3242359161377, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:47:57] (step=0014799) Train Loss: 0.2170, Train Steps/Sec: 0.29, Epoch: 0.2875825884181889, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14800, "loss": 0.26437437534332275, "memory_gb": 7.721559524536133, "step_time_ms": 3351.3245582580566, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:00] (step=0014800) Train Loss: 0.2199, Train Steps/Sec: 0.29, Epoch: 0.2876020209871745, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14801, "loss": 0.2820170521736145, "memory_gb": 7.721559524536133, "step_time_ms": 3347.288131713867, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:04] (step=0014801) Train Loss: 0.2629, Train Steps/Sec: 0.29, Epoch: 0.28762145355616014, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14802, "loss": 0.24055629968643188, "memory_gb": 7.721559524536133, "step_time_ms": 3352.0967960357666, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:07] (step=0014802) Train Loss: 0.2049, Train Steps/Sec: 0.29, Epoch: 0.28764088612514577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14803, "loss": 0.21343521773815155, "memory_gb": 7.721559524536133, "step_time_ms": 3347.011089324951, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:11] (step=0014803) Train Loss: 0.1964, Train Steps/Sec: 0.29, Epoch: 0.2876603186941314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14804, "loss": 0.23745927214622498, "memory_gb": 7.721559524536133, "step_time_ms": 3350.443124771118, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:14] (step=0014804) Train Loss: 0.2008, Train Steps/Sec: 0.29, Epoch: 0.287679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14805, "loss": 0.23821905255317688, "memory_gb": 7.721559524536133, "step_time_ms": 3493.9393997192383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:18] (step=0014805) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.2876991838321026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14806, "loss": 0.3011283874511719, "memory_gb": 7.721559524536133, "step_time_ms": 3345.4251289367676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:21] (step=0014806) Train Loss: 0.2547, Train Steps/Sec: 0.29, Epoch: 0.2877186164010882, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14807, "loss": 0.22162920236587524, "memory_gb": 7.721559524536133, "step_time_ms": 3348.8636016845703, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:25] (step=0014807) Train Loss: 0.2051, Train Steps/Sec: 0.29, Epoch: 0.2877380489700738, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14808, "loss": 0.22549211978912354, "memory_gb": 7.721559524536133, "step_time_ms": 3351.837396621704, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:28] (step=0014808) Train Loss: 0.2673, Train Steps/Sec: 0.29, Epoch: 0.28775748153905945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14809, "loss": 0.2479153722524643, "memory_gb": 7.721559524536133, "step_time_ms": 3348.8564491271973, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:32] (step=0014809) Train Loss: 0.2443, Train Steps/Sec: 0.29, Epoch: 0.28777691410804507, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14810, "loss": 0.23990769684314728, "memory_gb": 7.721559524536133, "step_time_ms": 3349.41029548645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:35] (step=0014810) Train Loss: 0.2661, Train Steps/Sec: 0.29, Epoch: 0.2877963466770307, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14811, "loss": 0.20953762531280518, "memory_gb": 7.721559524536133, "step_time_ms": 3346.205711364746, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:39] (step=0014811) Train Loss: 0.2079, Train Steps/Sec: 0.29, Epoch: 0.2878157792460163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14812, "loss": 0.3112639784812927, "memory_gb": 7.721559524536133, "step_time_ms": 3347.508430480957, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:42] (step=0014812) Train Loss: 0.3114, Train Steps/Sec: 0.29, Epoch: 0.28783521181500193, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14813, "loss": 0.19636432826519012, "memory_gb": 7.721559524536133, "step_time_ms": 3345.982551574707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:46] (step=0014813) Train Loss: 0.2331, Train Steps/Sec: 0.29, Epoch: 0.28785464438398756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14814, "loss": 0.2644249498844147, "memory_gb": 7.721559524536133, "step_time_ms": 3348.1814861297607, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:49] (step=0014814) Train Loss: 0.2162, Train Steps/Sec: 0.29, Epoch: 0.2878740769529732, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14815, "loss": 0.2882124185562134, "memory_gb": 7.715639114379883, "step_time_ms": 3309.3950748443604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:53] (step=0014815) Train Loss: 0.2626, Train Steps/Sec: 0.29, Epoch: 0.2878935095219588, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:48:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14816, "loss": 0.1903376430273056, "memory_gb": 7.721559524536133, "step_time_ms": 3346.571683883667, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:48:56] (step=0014816) Train Loss: 0.2174, Train Steps/Sec: 0.29, Epoch: 0.2879129420909444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14817, "loss": 0.2613489031791687, "memory_gb": 7.721559524536133, "step_time_ms": 3353.9116382598877, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:00] (step=0014817) Train Loss: 0.2252, Train Steps/Sec: 0.29, Epoch: 0.28793237465993005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14818, "loss": 0.16307269036769867, "memory_gb": 7.721559524536133, "step_time_ms": 3346.2424278259277, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:03] (step=0014818) Train Loss: 0.2066, Train Steps/Sec: 0.29, Epoch: 0.28795180722891567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14819, "loss": 0.28892213106155396, "memory_gb": 7.721559524536133, "step_time_ms": 3349.644660949707, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:07] (step=0014819) Train Loss: 0.2175, Train Steps/Sec: 0.29, Epoch: 0.2879712397979013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14820, "loss": 0.1155933067202568, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7606258392334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:10] (step=0014820) Train Loss: 0.1339, Train Steps/Sec: 0.29, Epoch: 0.2879906723668869, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14821, "loss": 0.26407039165496826, "memory_gb": 7.721559524536133, "step_time_ms": 3357.661485671997, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:14] (step=0014821) Train Loss: 0.2708, Train Steps/Sec: 0.29, Epoch: 0.28801010493587254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14822, "loss": 0.28160709142684937, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9300899505615, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:17] (step=0014822) Train Loss: 0.3241, Train Steps/Sec: 0.28, Epoch: 0.28802953750485816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14823, "loss": 0.2843122184276581, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6037673950195, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:21] (step=0014823) Train Loss: 0.2602, Train Steps/Sec: 0.28, Epoch: 0.2880489700738438, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14824, "loss": 0.2949702739715576, "memory_gb": 7.721559524536133, "step_time_ms": 3354.520797729492, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:24] (step=0014824) Train Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.2880684026428294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14825, "loss": 0.19507277011871338, "memory_gb": 7.721559524536133, "step_time_ms": 3361.392021179199, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:28] (step=0014825) Train Loss: 0.2048, Train Steps/Sec: 0.28, Epoch: 0.288087835211815, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14826, "loss": 0.30395177006721497, "memory_gb": 7.721559524536133, "step_time_ms": 3349.4067192077637, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:31] (step=0014826) Train Loss: 0.2577, Train Steps/Sec: 0.28, Epoch: 0.28810726778080065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14827, "loss": 0.3780457675457001, "memory_gb": 7.721559524536133, "step_time_ms": 3358.030080795288, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:35] (step=0014827) Train Loss: 0.3097, Train Steps/Sec: 0.28, Epoch: 0.28812670034978627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 14828, "loss": 0.24860097467899323, "memory_gb": 7.721559524536133, "step_time_ms": 3356.861352920532, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:38] (step=0014828) Train Loss: 0.2382, Train Steps/Sec: 0.28, Epoch: 0.28814613291877184, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14829, "loss": 0.2403162121772766, "memory_gb": 7.721559524536133, "step_time_ms": 3362.661361694336, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:42] (step=0014829) Train Loss: 0.2633, Train Steps/Sec: 0.28, Epoch: 0.28816556548775746, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14830, "loss": 0.2723234295845032, "memory_gb": 7.721559524536133, "step_time_ms": 3359.570264816284, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:45] (step=0014830) Train Loss: 0.2591, Train Steps/Sec: 0.28, Epoch: 0.2881849980567431, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14831, "loss": 0.1938614547252655, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4630699157715, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:49] (step=0014831) Train Loss: 0.1871, Train Steps/Sec: 0.28, Epoch: 0.2882044306257287, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14832, "loss": 0.2705044150352478, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9960079193115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:53] (step=0014832) Train Loss: 0.2824, Train Steps/Sec: 0.28, Epoch: 0.2882238631947143, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:49:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14833, "loss": 0.36535295844078064, "memory_gb": 7.721559524536133, "step_time_ms": 3362.544298171997, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:49:56] (step=0014833) Train Loss: 0.3142, Train Steps/Sec: 0.28, Epoch: 0.28824329576369995, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14834, "loss": 0.1720767319202423, "memory_gb": 7.721559524536133, "step_time_ms": 3361.164093017578, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:00] (step=0014834) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.28826272833268557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14835, "loss": 0.34156566858291626, "memory_gb": 7.721559524536133, "step_time_ms": 3359.255313873291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:03] (step=0014835) Train Loss: 0.2763, Train Steps/Sec: 0.28, Epoch: 0.2882821609016712, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14836, "loss": 0.1674976944923401, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3245277404785, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:07] (step=0014836) Train Loss: 0.2067, Train Steps/Sec: 0.28, Epoch: 0.2883015934706568, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14837, "loss": 0.1735563576221466, "memory_gb": 7.721559524536133, "step_time_ms": 3356.8620681762695, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:10] (step=0014837) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.28832102603964244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14838, "loss": 0.33702218532562256, "memory_gb": 7.721559524536133, "step_time_ms": 3364.457130432129, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:14] (step=0014838) Train Loss: 0.3025, Train Steps/Sec: 0.27, Epoch: 0.28834045860862806, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14839, "loss": 0.28108924627304077, "memory_gb": 7.721559524536133, "step_time_ms": 3356.842517852783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:18] (step=0014839) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.2883598911776137, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14840, "loss": 0.32639843225479126, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6436462402344, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:21] (step=0014840) Train Loss: 0.2806, Train Steps/Sec: 0.28, Epoch: 0.2883793237465993, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14841, "loss": 0.2735467553138733, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3127212524414, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:25] (step=0014841) Train Loss: 0.2428, Train Steps/Sec: 0.28, Epoch: 0.2883987563155849, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14842, "loss": 0.23037919402122498, "memory_gb": 7.721559524536133, "step_time_ms": 3364.3851280212402, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:29] (step=0014842) Train Loss: 0.2130, Train Steps/Sec: 0.28, Epoch: 0.28841818888457055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14843, "loss": 0.35686495900154114, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3105964660645, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:32] (step=0014843) Train Loss: 0.3116, Train Steps/Sec: 0.28, Epoch: 0.28843762145355617, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14844, "loss": 0.21565115451812744, "memory_gb": 7.721559524536133, "step_time_ms": 3365.99063873291, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:36] (step=0014844) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.2884570540225418, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14845, "loss": 0.21270428597927094, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9652004241943, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:39] (step=0014845) Train Loss: 0.1954, Train Steps/Sec: 0.28, Epoch: 0.2884764865915274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14846, "loss": 0.2343662530183792, "memory_gb": 7.721559524536133, "step_time_ms": 3366.737127304077, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:43] (step=0014846) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.28849591916051304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14847, "loss": 0.20407173037528992, "memory_gb": 7.721559524536133, "step_time_ms": 3355.3082942962646, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:47] (step=0014847) Train Loss: 0.2252, Train Steps/Sec: 0.28, Epoch: 0.28851535172949866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14848, "loss": 0.17773950099945068, "memory_gb": 7.721559524536133, "step_time_ms": 3364.238500595093, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:50] (step=0014848) Train Loss: 0.1909, Train Steps/Sec: 0.28, Epoch: 0.2885347842984843, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14849, "loss": 0.25100797414779663, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1675968170166, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:54] (step=0014849) Train Loss: 0.2040, Train Steps/Sec: 0.28, Epoch: 0.2885542168674699, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:50:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 14850, "loss": 0.24987231194972992, "memory_gb": 7.721559524536133, "step_time_ms": 3356.032371520996, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:50:58] (step=0014850) Train Loss: 0.2032, Train Steps/Sec: 0.28, Epoch: 0.2885736494364555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14851, "loss": 0.20067493617534637, "memory_gb": 7.721559524536133, "step_time_ms": 3359.309673309326, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:01] (step=0014851) Train Loss: 0.2368, Train Steps/Sec: 0.28, Epoch: 0.2885930820054411, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14852, "loss": 0.2659202516078949, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1295433044434, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:05] (step=0014852) Train Loss: 0.2942, Train Steps/Sec: 0.28, Epoch: 0.2886125145744267, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14853, "loss": 0.15938979387283325, "memory_gb": 7.721559524536133, "step_time_ms": 3501.370429992676, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:08] (step=0014853) Train Loss: 0.2556, Train Steps/Sec: 0.28, Epoch: 0.28863194714341234, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14854, "loss": 0.2893386483192444, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4986362457275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:12] (step=0014854) Train Loss: 0.2229, Train Steps/Sec: 0.28, Epoch: 0.28865137971239796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 14855, "loss": 0.09813001751899719, "memory_gb": 7.721559524536133, "step_time_ms": 3368.635892868042, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:16] (step=0014855) Train Loss: 0.1798, Train Steps/Sec: 0.28, Epoch: 0.2886708122813836, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14856, "loss": 0.2701033055782318, "memory_gb": 7.715639114379883, "step_time_ms": 3341.843605041504, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:19] (step=0014856) Train Loss: 0.2654, Train Steps/Sec: 0.28, Epoch: 0.2886902448503692, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14857, "loss": 0.33423641324043274, "memory_gb": 7.721559524536133, "step_time_ms": 3368.0522441864014, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:23] (step=0014857) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.2887096774193548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 14858, "loss": 0.18947792053222656, "memory_gb": 7.721559524536133, "step_time_ms": 3368.682861328125, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:27] (step=0014858) Train Loss: 0.2637, Train Steps/Sec: 0.28, Epoch: 0.28872910998834045, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14859, "loss": 0.19495822489261627, "memory_gb": 7.721559524536133, "step_time_ms": 3365.107774734497, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:30] (step=0014859) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.28874854255732607, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14860, "loss": 0.3296407461166382, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5382957458496, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:34] (step=0014860) Train Loss: 0.3563, Train Steps/Sec: 0.28, Epoch: 0.2887679751263117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14861, "loss": 0.2691698968410492, "memory_gb": 7.721559524536133, "step_time_ms": 3370.1322078704834, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:37] (step=0014861) Train Loss: 0.1985, Train Steps/Sec: 0.28, Epoch: 0.2887874076952973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14862, "loss": 0.2797713279724121, "memory_gb": 7.721559524536133, "step_time_ms": 3365.229845046997, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:41] (step=0014862) Train Loss: 0.2851, Train Steps/Sec: 0.28, Epoch: 0.28880684026428294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 14863, "loss": 0.2947530150413513, "memory_gb": 7.721559524536133, "step_time_ms": 3364.388942718506, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:45] (step=0014863) Train Loss: 0.3119, Train Steps/Sec: 0.28, Epoch: 0.28882627283326856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14864, "loss": 0.24143743515014648, "memory_gb": 7.721559524536133, "step_time_ms": 3370.26047706604, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:48] (step=0014864) Train Loss: 0.2088, Train Steps/Sec: 0.28, Epoch: 0.2888457054022542, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14865, "loss": 0.1889645755290985, "memory_gb": 7.721559524536133, "step_time_ms": 3372.056484222412, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:52] (step=0014865) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.2888651379712398, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14866, "loss": 0.25566786527633667, "memory_gb": 7.721559524536133, "step_time_ms": 3376.8322467803955, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:56] (step=0014866) Train Loss: 0.2239, Train Steps/Sec: 0.28, Epoch: 0.2888845705402254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:51:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14867, "loss": 0.2473953515291214, "memory_gb": 7.721559524536133, "step_time_ms": 3373.778820037842, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:51:59] (step=0014867) Train Loss: 0.2110, Train Steps/Sec: 0.28, Epoch: 0.28890400310921105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14868, "loss": 0.29851865768432617, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8619861602783, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:03] (step=0014868) Train Loss: 0.3010, Train Steps/Sec: 0.28, Epoch: 0.28892343567819667, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14869, "loss": 0.16847890615463257, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4518852233887, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:06] (step=0014869) Train Loss: 0.1860, Train Steps/Sec: 0.28, Epoch: 0.2889428682471823, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14870, "loss": 0.28475138545036316, "memory_gb": 7.721559524536133, "step_time_ms": 3359.454870223999, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:10] (step=0014870) Train Loss: 0.2197, Train Steps/Sec: 0.28, Epoch: 0.2889623008161679, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14871, "loss": 0.26072320342063904, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8956966400146, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:14] (step=0014871) Train Loss: 0.2590, Train Steps/Sec: 0.28, Epoch: 0.28898173338515354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14872, "loss": 0.20273429155349731, "memory_gb": 7.721559524536133, "step_time_ms": 3369.678497314453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:17] (step=0014872) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.28900116595413916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14873, "loss": 0.2568352222442627, "memory_gb": 7.721559524536133, "step_time_ms": 3360.560655593872, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:21] (step=0014873) Train Loss: 0.2339, Train Steps/Sec: 0.28, Epoch: 0.2890205985231248, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14874, "loss": 0.20931798219680786, "memory_gb": 7.721559524536133, "step_time_ms": 3369.619846343994, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:25] (step=0014874) Train Loss: 0.2263, Train Steps/Sec: 0.28, Epoch: 0.28904003109211035, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14875, "loss": 0.25332048535346985, "memory_gb": 7.721559524536133, "step_time_ms": 3369.817018508911, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:28] (step=0014875) Train Loss: 0.2708, Train Steps/Sec: 0.28, Epoch: 0.28905946366109597, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14876, "loss": 0.20744463801383972, "memory_gb": 7.721559524536133, "step_time_ms": 3369.455099105835, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:32] (step=0014876) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.2890788962300816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14877, "loss": 0.3025275468826294, "memory_gb": 7.721559524536133, "step_time_ms": 3369.300603866577, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:35] (step=0014877) Train Loss: 0.3068, Train Steps/Sec: 0.28, Epoch: 0.2890983287990672, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14878, "loss": 0.15159939229488373, "memory_gb": 7.721559524536133, "step_time_ms": 3364.238500595093, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:39] (step=0014878) Train Loss: 0.2098, Train Steps/Sec: 0.28, Epoch: 0.28911776136805284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14879, "loss": 0.37701478600502014, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4421520233154, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:43] (step=0014879) Train Loss: 0.2845, Train Steps/Sec: 0.28, Epoch: 0.28913719393703846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14880, "loss": 0.22382299602031708, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3104610443115, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:46] (step=0014880) Train Loss: 0.2283, Train Steps/Sec: 0.28, Epoch: 0.2891566265060241, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14881, "loss": 0.1688890904188156, "memory_gb": 7.721559524536133, "step_time_ms": 3367.304801940918, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:50] (step=0014881) Train Loss: 0.1840, Train Steps/Sec: 0.28, Epoch: 0.2891760590750097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14882, "loss": 0.16256386041641235, "memory_gb": 7.721559524536133, "step_time_ms": 3361.024856567383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:54] (step=0014882) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.28919549164399533, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:52:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14883, "loss": 0.28850653767585754, "memory_gb": 7.721559524536133, "step_time_ms": 3356.630563735962, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:52:57] (step=0014883) Train Loss: 0.2903, Train Steps/Sec: 0.28, Epoch: 0.28921492421298095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14884, "loss": 0.1773173063993454, "memory_gb": 7.721559524536133, "step_time_ms": 3366.567850112915, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:01] (step=0014884) Train Loss: 0.1892, Train Steps/Sec: 0.28, Epoch: 0.2892343567819666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 14885, "loss": 0.2460864931344986, "memory_gb": 7.721559524536133, "step_time_ms": 3367.6517009735107, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:05] (step=0014885) Train Loss: 0.2411, Train Steps/Sec: 0.27, Epoch: 0.2892537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14886, "loss": 0.3698388338088989, "memory_gb": 7.721559524536133, "step_time_ms": 3372.6866245269775, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:08] (step=0014886) Train Loss: 0.3452, Train Steps/Sec: 0.28, Epoch: 0.2892732219199378, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14887, "loss": 0.15536224842071533, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1437969207764, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:12] (step=0014887) Train Loss: 0.2171, Train Steps/Sec: 0.28, Epoch: 0.28929265448892344, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14888, "loss": 0.1886865198612213, "memory_gb": 7.721559524536133, "step_time_ms": 3365.5271530151367, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:15] (step=0014888) Train Loss: 0.1978, Train Steps/Sec: 0.28, Epoch: 0.28931208705790906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14889, "loss": 0.36450672149658203, "memory_gb": 7.721559524536133, "step_time_ms": 3363.723039627075, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:19] (step=0014889) Train Loss: 0.2672, Train Steps/Sec: 0.28, Epoch: 0.2893315196268947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14890, "loss": 0.2390005886554718, "memory_gb": 7.721559524536133, "step_time_ms": 3363.2373809814453, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:23] (step=0014890) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.2893509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14891, "loss": 0.10561992228031158, "memory_gb": 7.721559524536133, "step_time_ms": 3361.013412475586, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:26] (step=0014891) Train Loss: 0.1898, Train Steps/Sec: 0.28, Epoch: 0.28937038476486593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14892, "loss": 0.33067193627357483, "memory_gb": 7.721559524536133, "step_time_ms": 3354.6323776245117, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:30] (step=0014892) Train Loss: 0.2471, Train Steps/Sec: 0.28, Epoch: 0.28938981733385155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14893, "loss": 0.22057464718818665, "memory_gb": 7.721559524536133, "step_time_ms": 3366.950511932373, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:34] (step=0014893) Train Loss: 0.2049, Train Steps/Sec: 0.28, Epoch: 0.2894092499028372, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14894, "loss": 0.2519817352294922, "memory_gb": 7.721559524536133, "step_time_ms": 3584.4156742095947, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:37] (step=0014894) Train Loss: 0.2212, Train Steps/Sec: 0.27, Epoch: 0.2894286824718228, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14895, "loss": 0.23159442842006683, "memory_gb": 7.721559524536133, "step_time_ms": 3370.948553085327, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:41] (step=0014895) Train Loss: 0.1983, Train Steps/Sec: 0.28, Epoch: 0.2894481150408084, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14896, "loss": 0.19666865468025208, "memory_gb": 7.721559524536133, "step_time_ms": 3368.69740486145, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:44] (step=0014896) Train Loss: 0.1559, Train Steps/Sec: 0.28, Epoch: 0.28946754760979404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14897, "loss": 0.33350586891174316, "memory_gb": 7.721559524536133, "step_time_ms": 3363.0800247192383, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:48] (step=0014897) Train Loss: 0.3000, Train Steps/Sec: 0.28, Epoch: 0.2894869801787796, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14898, "loss": 0.36308738589286804, "memory_gb": 7.721559524536133, "step_time_ms": 3367.9959774017334, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:52] (step=0014898) Train Loss: 0.2818, Train Steps/Sec: 0.28, Epoch: 0.28950641274776523, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14899, "loss": 0.17847001552581787, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9727668762207, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:55] (step=0014899) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.28952584531675085, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:53:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14900, "loss": 0.16501173377037048, "memory_gb": 7.721559524536133, "step_time_ms": 3362.8599643707275, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:53:59] (step=0014900) Train Loss: 0.1733, Train Steps/Sec: 0.28, Epoch: 0.2895452778857365, LR: 0.001, Memory: 7.72GB, Params: 4,718,592
[2025-07-29 14:54:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14901, "loss": 0.27621978521347046, "memory_gb": 7.721559524536133, "step_time_ms": 3363.754987716675, "trainable_params": 4718592, "method": "lora"}
[2025-07-29 14:54:03] (step=0014901) Train Loss:
0.2542, Train Steps/Sec: 0.28, Epoch: 0.2895647104547221, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14902, "loss": 0.3247329592704773, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4209938049316, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:06] (step=0014902) Train Loss: 0.2466, Train Steps/Sec: 0.28, Epoch: 0.2895841430237077, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14903, "loss": 0.2556917071342468, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8037910461426, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:10] (step=0014903) Train Loss: 0.2775, Train Steps/Sec: 0.28, Epoch: 0.28960357559269334, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14904, "loss": 0.2942522168159485, "memory_gb": 7.721559524536133, "step_time_ms": 3362.600088119507, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:13] (step=0014904) Train Loss: 0.3082, Train Steps/Sec: 0.28, Epoch: 0.28962300816167896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14905, "loss": 0.31963568925857544, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9190196990967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:17] (step=0014905) Train Loss: 0.2587, Train Steps/Sec: 0.28, Epoch: 0.2896424407306646, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14906, "loss": 0.18566951155662537, "memory_gb": 7.721559524536133, "step_time_ms": 3363.558292388916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:21] (step=0014906) Train Loss: 0.2213, Train Steps/Sec: 0.28, Epoch: 0.2896618732996502, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14907, 
"loss": 0.3702910542488098, "memory_gb": 7.721559524536133, "step_time_ms": 3364.4068241119385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:24] (step=0014907) Train Loss: 0.3243, Train Steps/Sec: 0.28, Epoch: 0.28968130586863583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14908, "loss": 0.2946520447731018, "memory_gb": 7.721559524536133, "step_time_ms": 3366.1651611328125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:28] (step=0014908) Train Loss: 0.2341, Train Steps/Sec: 0.28, Epoch: 0.28970073843762145, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14909, "loss": 0.28501248359680176, "memory_gb": 7.721559524536133, "step_time_ms": 3365.269422531128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:32] (step=0014909) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.2897201710066071, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14910, "loss": 0.22207030653953552, "memory_gb": 7.721559524536133, "step_time_ms": 3362.102746963501, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:35] (step=0014910) Train Loss: 0.2430, Train Steps/Sec: 0.28, Epoch: 0.2897396035755927, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14911, "loss": 0.29745936393737793, "memory_gb": 7.721559524536133, "step_time_ms": 3366.037368774414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:39] (step=0014911) Train Loss: 0.2145, Train Steps/Sec: 0.28, Epoch: 0.2897590361445783, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14912, "loss": 0.12538987398147583, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2591495513916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:42] 
(step=0014912) Train Loss: 0.2181, Train Steps/Sec: 0.28, Epoch: 0.28977846871356394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14913, "loss": 0.22134293615818024, "memory_gb": 7.721559524536133, "step_time_ms": 3364.976406097412, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:46] (step=0014913) Train Loss: 0.1838, Train Steps/Sec: 0.28, Epoch: 0.28979790128254956, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14914, "loss": 0.17991721630096436, "memory_gb": 7.721559524536133, "step_time_ms": 3361.654043197632, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:50] (step=0014914) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.2898173338515352, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14915, "loss": 0.25035303831100464, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5268001556396, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:53] (step=0014915) Train Loss: 0.2845, Train Steps/Sec: 0.28, Epoch: 0.2898367664205208, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:54:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14916, "loss": 0.28804469108581543, "memory_gb": 7.721559524536133, "step_time_ms": 3371.1533546447754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:54:57] (step=0014916) Train Loss: 0.2667, Train Steps/Sec: 0.28, Epoch: 0.28985619898950643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 14917, "loss": 0.19234728813171387, "memory_gb": 7.721559524536133, "step_time_ms": 3371.7424869537354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:01] (step=0014917) Train Loss: 0.2207, Train Steps/Sec: 0.27, Epoch: 0.28987563155849205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:04] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14918, "loss": 0.23431038856506348, "memory_gb": 7.721559524536133, "step_time_ms": 3374.063014984131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:04] (step=0014918) Train Loss: 0.2119, Train Steps/Sec: 0.27, Epoch: 0.2898950641274777, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 14919, "loss": 0.2299554944038391, "memory_gb": 7.721559524536133, "step_time_ms": 3374.528408050537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:08] (step=0014919) Train Loss: 0.2024, Train Steps/Sec: 0.27, Epoch: 0.2899144966964633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 14920, "loss": 0.16325151920318604, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8315620422363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:12] (step=0014920) Train Loss: 0.2654, Train Steps/Sec: 0.27, Epoch: 0.2899339292654489, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 14921, "loss": 0.2522485852241516, "memory_gb": 7.721559524536133, "step_time_ms": 3370.0051307678223, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:15] (step=0014921) Train Loss: 0.2058, Train Steps/Sec: 0.27, Epoch: 0.2899533618344345, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 14922, "loss": 0.17105209827423096, "memory_gb": 7.721559524536133, "step_time_ms": 3369.657516479492, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:19] (step=0014922) Train Loss: 0.2190, Train Steps/Sec: 0.27, Epoch: 0.2899727944034201, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 14923, "loss": 0.1432301104068756, "memory_gb": 7.721559524536133, "step_time_ms": 3368.380546569824, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 14:55:23] (step=0014923) Train Loss: 0.1296, Train Steps/Sec: 0.28, Epoch: 0.28999222697240573, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 14924, "loss": 0.29830825328826904, "memory_gb": 7.721559524536133, "step_time_ms": 3368.260622024536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:26] (step=0014924) Train Loss: 0.2579, Train Steps/Sec: 0.27, Epoch: 0.29001165954139135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 14925, "loss": 0.14820152521133423, "memory_gb": 7.721559524536133, "step_time_ms": 3371.2713718414307, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:30] (step=0014925) Train Loss: 0.2241, Train Steps/Sec: 0.28, Epoch: 0.290031092110377, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 14926, "loss": 0.19919416308403015, "memory_gb": 7.721559524536133, "step_time_ms": 3370.668649673462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:34] (step=0014926) Train Loss: 0.2026, Train Steps/Sec: 0.27, Epoch: 0.2900505246793626, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 14927, "loss": 0.29180365800857544, "memory_gb": 7.721559524536133, "step_time_ms": 3365.952730178833, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:37] (step=0014927) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.2900699572483482, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 14928, "loss": 0.24976514279842377, "memory_gb": 7.721559524536133, "step_time_ms": 3372.4911212921143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:41] (step=0014928) Train Loss: 0.2522, Train Steps/Sec: 0.28, Epoch: 0.29008938981733384, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 14:55:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 14929, "loss": 0.2658277451992035, "memory_gb": 7.721559524536133, "step_time_ms": 3368.701457977295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:44] (step=0014929) Train Loss: 0.2570, Train Steps/Sec: 0.28, Epoch: 0.29010882238631946, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 14930, "loss": 0.28311556577682495, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8742389678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:48] (step=0014930) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.2901282549553051, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 14931, "loss": 0.17589369416236877, "memory_gb": 7.721559524536133, "step_time_ms": 3362.071752548218, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:52] (step=0014931) Train Loss: 0.2337, Train Steps/Sec: 0.28, Epoch: 0.2901476875242907, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 14932, "loss": 0.3446215093135834, "memory_gb": 7.721559524536133, "step_time_ms": 3363.759994506836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:55] (step=0014932) Train Loss: 0.3149, Train Steps/Sec: 0.28, Epoch: 0.29016712009327633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:55:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 14933, "loss": 0.20723941922187805, "memory_gb": 7.721559524536133, "step_time_ms": 3368.50643157959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:55:59] (step=0014933) Train Loss: 0.2408, Train Steps/Sec: 0.28, Epoch: 0.29018655266226195, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14934, "loss": 0.27162662148475647, "memory_gb": 7.721559524536133, "step_time_ms": 3362.6058101654053, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:03] (step=0014934) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.2902059852312476, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 14935, "loss": 0.23987601697444916, "memory_gb": 7.721559524536133, "step_time_ms": 3374.0785121917725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:06] (step=0014935) Train Loss: 0.2929, Train Steps/Sec: 0.27, Epoch: 0.2902254178002332, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14936, "loss": 0.2838405966758728, "memory_gb": 7.721559524536133, "step_time_ms": 3369.8205947875977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:10] (step=0014936) Train Loss: 0.2151, Train Steps/Sec: 0.27, Epoch: 0.2902448503692188, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 14937, "loss": 0.23671357333660126, "memory_gb": 7.721559524536133, "step_time_ms": 3371.0031509399414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:13] (step=0014937) Train Loss: 0.2384, Train Steps/Sec: 0.28, Epoch: 0.29026428293820444, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 14938, "loss": 0.3059704303741455, "memory_gb": 7.721559524536133, "step_time_ms": 3369.5201873779297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:17] (step=0014938) Train Loss: 0.2289, Train Steps/Sec: 0.28, Epoch: 0.29028371550719007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14939, "loss": 0.24492771923542023, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5970821380615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:21] (step=0014939) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.2903031480761757, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 14940, "loss": 0.23364385962486267, "memory_gb": 7.715639114379883, "step_time_ms": 3335.620403289795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:24] (step=0014940) Train Loss: 0.2666, Train Steps/Sec: 0.28, Epoch: 0.2903225806451613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14941, "loss": 0.15193161368370056, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4796352386475, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:28] (step=0014941) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.29034201321414693, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 14942, "loss": 0.16188934445381165, "memory_gb": 7.721559524536133, "step_time_ms": 3507.504940032959, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:31] (step=0014942) Train Loss: 0.2104, Train Steps/Sec: 0.28, Epoch: 0.29036144578313255, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14943, "loss": 0.18980705738067627, "memory_gb": 7.721559524536133, "step_time_ms": 3363.370418548584, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:35] (step=0014943) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.2903808783521182, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14944, "loss": 0.1996118724346161, "memory_gb": 7.721559524536133, "step_time_ms": 3362.666606903076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:39] (step=0014944) Train Loss: 0.2044, Train Steps/Sec: 0.28, Epoch: 0.29040031092110374, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14945, "loss": 0.34022578597068787, "memory_gb": 7.715639114379883, 
"step_time_ms": 3326.6549110412598, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:42] (step=0014945) Train Loss: 0.3035, Train Steps/Sec: 0.28, Epoch: 0.29041974349008937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14946, "loss": 0.33029812574386597, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9258193969727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:46] (step=0014946) Train Loss: 0.2569, Train Steps/Sec: 0.28, Epoch: 0.290439176059075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14947, "loss": 0.24382533133029938, "memory_gb": 7.721559524536133, "step_time_ms": 3360.8055114746094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:49] (step=0014947) Train Loss: 0.1899, Train Steps/Sec: 0.28, Epoch: 0.2904586086280606, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14948, "loss": 0.35988521575927734, "memory_gb": 7.721559524536133, "step_time_ms": 3359.790563583374, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:53] (step=0014948) Train Loss: 0.3362, Train Steps/Sec: 0.28, Epoch: 0.29047804119704623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:56:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14949, "loss": 0.23651957511901855, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9677295684814, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:56:56] (step=0014949) Train Loss: 0.2459, Train Steps/Sec: 0.28, Epoch: 0.29049747376603186, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14950, "loss": 0.2642887830734253, "memory_gb": 7.721559524536133, "step_time_ms": 3360.405683517456, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:00] (step=0014950) Train Loss: 0.2281, Train Steps/Sec: 0.28, Epoch: 
0.2905169063350175, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14951, "loss": 0.1076187789440155, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0357303619385, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:03] (step=0014951) Train Loss: 0.1410, Train Steps/Sec: 0.28, Epoch: 0.2905363389040031, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14952, "loss": 0.27001023292541504, "memory_gb": 7.721559524536133, "step_time_ms": 3360.001802444458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:07] (step=0014952) Train Loss: 0.2099, Train Steps/Sec: 0.28, Epoch: 0.2905557714729887, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14953, "loss": 0.1855878233909607, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9425296783447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:10] (step=0014953) Train Loss: 0.2243, Train Steps/Sec: 0.28, Epoch: 0.29057520404197434, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14954, "loss": 0.289273738861084, "memory_gb": 7.721559524536133, "step_time_ms": 3369.9095249176025, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:14] (step=0014954) Train Loss: 0.2526, Train Steps/Sec: 0.28, Epoch: 0.29059463661095997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14955, "loss": 0.20908796787261963, "memory_gb": 7.721559524536133, "step_time_ms": 3350.893259048462, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:18] (step=0014955) Train Loss: 0.2355, Train Steps/Sec: 0.28, Epoch: 0.2906140691799456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14956, "loss": 0.21678514778614044, 
"memory_gb": 7.721559524536133, "step_time_ms": 3358.916997909546, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:21] (step=0014956) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.2906335017489312, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14957, "loss": 0.186832994222641, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4304580688477, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:25] (step=0014957) Train Loss: 0.1933, Train Steps/Sec: 0.28, Epoch: 0.29065293431791683, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14958, "loss": 0.2673003673553467, "memory_gb": 7.721559524536133, "step_time_ms": 3359.7655296325684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:28] (step=0014958) Train Loss: 0.2250, Train Steps/Sec: 0.28, Epoch: 0.29067236688690246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14959, "loss": 0.23880331218242645, "memory_gb": 7.721559524536133, "step_time_ms": 3360.7540130615234, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:32] (step=0014959) Train Loss: 0.1803, Train Steps/Sec: 0.28, Epoch: 0.2906917994558881, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14960, "loss": 0.3317105770111084, "memory_gb": 7.721559524536133, "step_time_ms": 3360.402822494507, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:35] (step=0014960) Train Loss: 0.2727, Train Steps/Sec: 0.28, Epoch: 0.2907112320248737, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14961, "loss": 0.18622960150241852, "memory_gb": 7.721559524536133, "step_time_ms": 3360.469102859497, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:39] (step=0014961) Train Loss: 0.1467, 
Train Steps/Sec: 0.28, Epoch: 0.2907306645938593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14962, "loss": 0.20922335982322693, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2120876312256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:42] (step=0014962) Train Loss: 0.2539, Train Steps/Sec: 0.28, Epoch: 0.29075009716284494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14963, "loss": 0.2045697718858719, "memory_gb": 7.721559524536133, "step_time_ms": 3364.1209602355957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:46] (step=0014963) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.29076952973183057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14964, "loss": 0.23889490962028503, "memory_gb": 7.721559524536133, "step_time_ms": 3366.344690322876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:49] (step=0014964) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.2907889623008162, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14965, "loss": 0.24688459932804108, "memory_gb": 7.721559524536133, "step_time_ms": 3379.7683715820312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:53] (step=0014965) Train Loss: 0.2246, Train Steps/Sec: 0.28, Epoch: 0.2908083948698018, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:57:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 14966, "loss": 0.2960200309753418, "memory_gb": 7.721559524536133, "step_time_ms": 3391.1449909210205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:57:56] (step=0014966) Train Loss: 0.2812, Train Steps/Sec: 0.28, Epoch: 0.29082782743878743, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14967, 
"loss": 0.2245851755142212, "memory_gb": 7.721559524536133, "step_time_ms": 3380.115747451782, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:00] (step=0014967) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.290847260007773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 14968, "loss": 0.234313502907753, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6743717193604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:03] (step=0014968) Train Loss: 0.2686, Train Steps/Sec: 0.28, Epoch: 0.2908666925767586, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14969, "loss": 0.3192098140716553, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3318881988525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:07] (step=0014969) Train Loss: 0.2540, Train Steps/Sec: 0.28, Epoch: 0.29088612514574425, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 14970, "loss": 0.2767673134803772, "memory_gb": 7.721559524536133, "step_time_ms": 3366.771459579468, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:10] (step=0014970) Train Loss: 0.2625, Train Steps/Sec: 0.28, Epoch: 0.29090555771472987, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14971, "loss": 0.1987873911857605, "memory_gb": 7.721559524536133, "step_time_ms": 3362.283706665039, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:14] (step=0014971) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.2909249902837155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 14972, "loss": 0.3033055067062378, "memory_gb": 7.721559524536133, "step_time_ms": 3360.692262649536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:18] 
(step=0014972) Train Loss: 0.2789, Train Steps/Sec: 0.28, Epoch: 0.2909444228527011, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 14973, "loss": 0.32163047790527344, "memory_gb": 7.721559524536133, "step_time_ms": 3371.338129043579, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:21] (step=0014973) Train Loss: 0.2869, Train Steps/Sec: 0.28, Epoch: 0.29096385542168673, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14974, "loss": 0.15053927898406982, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3054027557373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:25] (step=0014974) Train Loss: 0.2141, Train Steps/Sec: 0.27, Epoch: 0.29098328799067236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 14975, "loss": 0.34700924158096313, "memory_gb": 7.721559524536133, "step_time_ms": 3366.9166564941406, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:28] (step=0014975) Train Loss: 0.2912, Train Steps/Sec: 0.28, Epoch: 0.291002720559658, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14976, "loss": 0.1905229538679123, "memory_gb": 7.721559524536133, "step_time_ms": 3362.394332885742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:32] (step=0014976) Train Loss: 0.1578, Train Steps/Sec: 0.28, Epoch: 0.2910221531286436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 14977, "loss": 0.23690921068191528, "memory_gb": 7.721559524536133, "step_time_ms": 3359.546661376953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:35] (step=0014977) Train Loss: 0.2349, Train Steps/Sec: 0.28, Epoch: 0.2910415856976292, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:39] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 14978, "loss": 0.1475626528263092, "memory_gb": 7.721559524536133, "step_time_ms": 3355.6056022644043, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:39] (step=0014978) Train Loss: 0.2016, Train Steps/Sec: 0.28, Epoch: 0.29106101826661485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 14979, "loss": 0.23587313294410706, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2825870513916, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:42] (step=0014979) Train Loss: 0.2456, Train Steps/Sec: 0.28, Epoch: 0.29108045083560047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 14980, "loss": 0.30098018050193787, "memory_gb": 7.721559524536133, "step_time_ms": 3359.830141067505, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:46] (step=0014980) Train Loss: 0.2575, Train Steps/Sec: 0.28, Epoch: 0.2910998834045861, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 14981, "loss": 0.24644333124160767, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3240718841553, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:50] (step=0014981) Train Loss: 0.2641, Train Steps/Sec: 0.28, Epoch: 0.2911193159735717, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 14982, "loss": 0.2678702175617218, "memory_gb": 7.721559524536133, "step_time_ms": 3506.8347454071045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:58:53] (step=0014982) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.29113874854255734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:58:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 14983, "loss": 0.2772984802722931, "memory_gb": 7.721559524536133, "step_time_ms": 3359.267473220825, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 14:58:57] (step=0014983) Train Loss: 0.2377, Train Steps/Sec: 0.28, Epoch: 0.29115818111154296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 14984, "loss": 0.29971978068351746, "memory_gb": 7.721559524536133, "step_time_ms": 3362.525463104248, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:00] (step=0014984) Train Loss: 0.2646, Train Steps/Sec: 0.28, Epoch: 0.2911776136805286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 14985, "loss": 0.2904522120952606, "memory_gb": 7.721559524536133, "step_time_ms": 3362.751245498657, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:04] (step=0014985) Train Loss: 0.3133, Train Steps/Sec: 0.28, Epoch: 0.2911970462495142, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 14986, "loss": 0.253684401512146, "memory_gb": 7.721559524536133, "step_time_ms": 3362.69474029541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:07] (step=0014986) Train Loss: 0.2041, Train Steps/Sec: 0.28, Epoch: 0.2912164788184998, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 14987, "loss": 0.2625247836112976, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0334396362305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:11] (step=0014987) Train Loss: 0.2396, Train Steps/Sec: 0.28, Epoch: 0.29123591138748545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 14988, "loss": 0.22493621706962585, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9728813171387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:14] (step=0014988) Train Loss: 0.2225, Train Steps/Sec: 0.28, Epoch: 0.29125534395647107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:18] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 14989, "loss": 0.26322802901268005, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0032329559326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:18] (step=0014989) Train Loss: 0.2996, Train Steps/Sec: 0.28, Epoch: 0.2912747765254567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 14990, "loss": 0.226670503616333, "memory_gb": 7.721559524536133, "step_time_ms": 3362.5593185424805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:22] (step=0014990) Train Loss: 0.2316, Train Steps/Sec: 0.28, Epoch: 0.29129420909444226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 14991, "loss": 0.30844736099243164, "memory_gb": 7.721559524536133, "step_time_ms": 3362.997055053711, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:25] (step=0014991) Train Loss: 0.2701, Train Steps/Sec: 0.28, Epoch: 0.2913136416634279, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 14992, "loss": 0.1513897329568863, "memory_gb": 7.721559524536133, "step_time_ms": 3360.994815826416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:29] (step=0014992) Train Loss: 0.2286, Train Steps/Sec: 0.28, Epoch: 0.2913330742324135, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 14993, "loss": 0.17410586774349213, "memory_gb": 7.721559524536133, "step_time_ms": 3361.590623855591, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:32] (step=0014993) Train Loss: 0.2032, Train Steps/Sec: 0.28, Epoch: 0.2913525068013991, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 14994, "loss": 0.11700385063886642, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5995178222656, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 14:59:36] (step=0014994) Train Loss: 0.1646, Train Steps/Sec: 0.28, Epoch: 0.29137193937038475, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 14995, "loss": 0.1811695098876953, "memory_gb": 7.721559524536133, "step_time_ms": 3359.882593154907, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:39] (step=0014995) Train Loss: 0.2043, Train Steps/Sec: 0.28, Epoch: 0.29139137193937037, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 14996, "loss": 0.25224435329437256, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0142002105713, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:43] (step=0014996) Train Loss: 0.2036, Train Steps/Sec: 0.28, Epoch: 0.291410804508356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 14997, "loss": 0.3064861297607422, "memory_gb": 7.721559524536133, "step_time_ms": 3358.9987754821777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:47] (step=0014997) Train Loss: 0.2757, Train Steps/Sec: 0.28, Epoch: 0.2914302370773416, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 14998, "loss": 0.3124523162841797, "memory_gb": 7.721559524536133, "step_time_ms": 3360.258102416992, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:50] (step=0014998) Train Loss: 0.2102, Train Steps/Sec: 0.28, Epoch: 0.29144966964632724, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 14:59:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 14999, "loss": 0.27531567215919495, "memory_gb": 7.721559524536133, "step_time_ms": 3357.888698577881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:54] (step=0014999) Train Loss: 0.2951, Train Steps/Sec: 0.28, Epoch: 0.29146910221531286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
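Each EFFICIENCY_METRICS record is a JSON object appended after a timestamp tag, so it can be recovered from a raw log line with a short parser. The sample line below is copied verbatim from this log (step 15000); the `parse_metrics` helper is a hypothetical post-processing utility, not part of the training code:

```python
import json
import re

# One EFFICIENCY_METRICS record exactly as it appears in this log (step 15000).
line = ('[2025-07-29 14:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15000, '
        '"loss": 0.24258050322532654, "memory_gb": 7.721559524536133, '
        '"step_time_ms": 3356.6391468048096, "trainable_params": 4718592, '
        '"method": "lora"}')

def parse_metrics(log_line):
    """Extract the JSON payload that follows the EFFICIENCY_METRICS tag, if present."""
    match = re.search(r'EFFICIENCY_METRICS:\s*(\{.*\})', log_line)
    return json.loads(match.group(1)) if match else None

record = parse_metrics(line)
print(record["step"], record["method"], round(record["memory_gb"], 2))
```

Applying this over the whole file would yield per-step loss, memory, and timing series suitable for plotting or aggregation.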
[2025-07-29 14:59:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15000, "loss": 0.24258050322532654, "memory_gb": 7.721559524536133, "step_time_ms": 3356.6391468048096, "trainable_params": 4718592, "method": "lora"} [2025-07-29 14:59:57] (step=0015000) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.2914885347842985, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15001, "loss": 0.21938160061836243, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4938259124756, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:01] (step=0015001) Train Loss: 0.2365, Train Steps/Sec: 0.28, Epoch: 0.2915079673532841, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15002, "loss": 0.26625752449035645, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6815128326416, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:04] (step=0015002) Train Loss: 0.2228, Train Steps/Sec: 0.28, Epoch: 0.2915273999222697, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15003, "loss": 0.3205275535583496, "memory_gb": 7.715639114379883, "step_time_ms": 3326.3776302337646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:08] (step=0015003) Train Loss: 0.2826, Train Steps/Sec: 0.28, Epoch: 0.29154683249125535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 15004, "loss": 0.21365146338939667, "memory_gb": 7.721559524536133, "step_time_ms": 3359.6155643463135, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:12] (step=0015004) Train Loss: 0.1856, Train Steps/Sec: 0.28, Epoch: 0.29156626506024097, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15005, "loss": 0.3011019229888916, "memory_gb": 7.721559524536133, "step_time_ms": 3361.71817779541, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:15] (step=0015005) Train Loss: 0.2084, Train Steps/Sec: 0.28, Epoch: 0.2915856976292266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 15006, "loss": 0.2932727336883545, "memory_gb": 7.721559524536133, "step_time_ms": 3361.219644546509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:19] (step=0015006) Train Loss: 0.2737, Train Steps/Sec: 0.28, Epoch: 0.2916051301982122, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15007, "loss": 0.3346748352050781, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4797592163086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:22] (step=0015007) Train Loss: 0.3175, Train Steps/Sec: 0.28, Epoch: 0.29162456276719784, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15008, "loss": 0.21750853955745697, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0806674957275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:26] (step=0015008) Train Loss: 0.2170, Train Steps/Sec: 0.28, Epoch: 0.29164399533618346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 15009, "loss": 0.19710899889469147, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1341972351074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:30] (step=0015009) Train Loss: 0.1879, Train Steps/Sec: 0.28, Epoch: 0.2916634279051691, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15010, "loss": 0.31255054473876953, "memory_gb": 7.721559524536133, "step_time_ms": 3357.069969177246, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:33] (step=0015010) Train Loss: 0.3173, Train Steps/Sec: 0.28, Epoch: 0.2916828604741547, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15011, "loss": 0.30412614345550537, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4235649108887, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:37] (step=0015011) Train Loss: 0.2671, Train Steps/Sec: 0.28, Epoch: 0.2917022930431403, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15012, "loss": 0.22495119273662567, "memory_gb": 7.721559524536133, "step_time_ms": 3357.208490371704, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:40] (step=0015012) Train Loss: 0.2066, Train Steps/Sec: 0.28, Epoch: 0.29172172561212595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15013, "loss": 0.32443007826805115, "memory_gb": 7.721559524536133, "step_time_ms": 3357.7351570129395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:44] (step=0015013) Train Loss: 0.2402, Train Steps/Sec: 0.28, Epoch: 0.29174115818111157, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15014, "loss": 0.16278620064258575, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5268726348877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:48] (step=0015014) Train Loss: 0.1859, Train Steps/Sec: 0.27, Epoch: 0.29176059075009714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15015, "loss": 0.23647820949554443, "memory_gb": 7.721559524536133, "step_time_ms": 3357.4814796447754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:51] (step=0015015) Train Loss: 0.2380, Train Steps/Sec: 0.28, Epoch: 0.29178002331908276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 15016, "loss": 0.3182825446128845, "memory_gb": 7.721559524536133, 
"step_time_ms": 3352.6268005371094, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:55] (step=0015016) Train Loss: 0.2491, Train Steps/Sec: 0.28, Epoch: 0.2917994558880684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:00:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 15017, "loss": 0.1942383050918579, "memory_gb": 7.721559524536133, "step_time_ms": 3370.2163696289062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:00:58] (step=0015017) Train Loss: 0.2519, Train Steps/Sec: 0.28, Epoch: 0.291818888457054, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15018, "loss": 0.26299533247947693, "memory_gb": 7.721559524536133, "step_time_ms": 3356.502056121826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:02] (step=0015018) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.2918383210260396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15019, "loss": 0.22289970517158508, "memory_gb": 7.721559524536133, "step_time_ms": 3360.00394821167, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:06] (step=0015019) Train Loss: 0.2468, Train Steps/Sec: 0.28, Epoch: 0.29185775359502525, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 15020, "loss": 0.31524717807769775, "memory_gb": 7.721559524536133, "step_time_ms": 3406.3398838043213, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:09] (step=0015020) Train Loss: 0.2758, Train Steps/Sec: 0.27, Epoch: 0.29187718616401087, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15021, "loss": 0.21863065659999847, "memory_gb": 7.721559524536133, "step_time_ms": 3363.847494125366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:13] (step=0015021) Train Loss: 0.1638, Train Steps/Sec: 0.28, Epoch: 
0.2918966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 15022, "loss": 0.22512586414813995, "memory_gb": 7.721559524536133, "step_time_ms": 3362.978219985962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:16] (step=0015022) Train Loss: 0.2238, Train Steps/Sec: 0.28, Epoch: 0.2919160513019821, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 15023, "loss": 0.3629584312438965, "memory_gb": 7.721559524536133, "step_time_ms": 3456.880807876587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:20] (step=0015023) Train Loss: 0.2702, Train Steps/Sec: 0.28, Epoch: 0.29193548387096774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15024, "loss": 0.3712090253829956, "memory_gb": 7.721559524536133, "step_time_ms": 3486.309289932251, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:24] (step=0015024) Train Loss: 0.2748, Train Steps/Sec: 0.26, Epoch: 0.29195491643995336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 15025, "loss": 0.3335803747177124, "memory_gb": 7.721559524536133, "step_time_ms": 3505.272150039673, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:28] (step=0015025) Train Loss: 0.2634, Train Steps/Sec: 0.27, Epoch: 0.291974349008939, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 15026, "loss": 0.2353682816028595, "memory_gb": 7.721559524536133, "step_time_ms": 3528.5897254943848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:31] (step=0015026) Train Loss: 0.3007, Train Steps/Sec: 0.27, Epoch: 0.2919937815779246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 15027, "loss": 0.28778666257858276, "memory_gb": 
7.721559524536133, "step_time_ms": 3619.5006370544434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:35] (step=0015027) Train Loss: 0.2075, Train Steps/Sec: 0.26, Epoch: 0.2920132141469102, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15028, "loss": 0.14256471395492554, "memory_gb": 7.721559524536133, "step_time_ms": 3488.452196121216, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:39] (step=0015028) Train Loss: 0.2263, Train Steps/Sec: 0.27, Epoch: 0.29203264671589585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15029, "loss": 0.2308729737997055, "memory_gb": 7.721559524536133, "step_time_ms": 3858.529806137085, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:43] (step=0015029) Train Loss: 0.2186, Train Steps/Sec: 0.26, Epoch: 0.29205207928488147, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15030, "loss": 0.2572254538536072, "memory_gb": 7.721559524536133, "step_time_ms": 5432.393550872803, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:48] (step=0015030) Train Loss: 0.2155, Train Steps/Sec: 0.18, Epoch: 0.2920715118538671, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 15031, "loss": 0.23457641899585724, "memory_gb": 7.721559524536133, "step_time_ms": 4737.68424987793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:53] (step=0015031) Train Loss: 0.1980, Train Steps/Sec: 0.21, Epoch: 0.2920909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:01:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15032, "loss": 0.1802571415901184, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3203506469727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:01:57] (step=0015032) Train Loss: 0.2399, Train Steps/Sec: 
0.28, Epoch: 0.29211037699183834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 15033, "loss": 0.22094303369522095, "memory_gb": 7.721559524536133, "step_time_ms": 3357.2030067443848, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:00] (step=0015033) Train Loss: 0.1844, Train Steps/Sec: 0.28, Epoch: 0.29212980956082396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15034, "loss": 0.31400322914123535, "memory_gb": 7.721559524536133, "step_time_ms": 3365.750551223755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:04] (step=0015034) Train Loss: 0.2748, Train Steps/Sec: 0.28, Epoch: 0.2921492421298096, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15035, "loss": 0.32262250781059265, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6391162872314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:08] (step=0015035) Train Loss: 0.2720, Train Steps/Sec: 0.28, Epoch: 0.2921686746987952, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 15036, "loss": 0.188226580619812, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3136234283447, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:11] (step=0015036) Train Loss: 0.1798, Train Steps/Sec: 0.28, Epoch: 0.29218810726778083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15037, "loss": 0.2688066065311432, "memory_gb": 7.721559524536133, "step_time_ms": 3364.842414855957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:15] (step=0015037) Train Loss: 0.2751, Train Steps/Sec: 0.28, Epoch: 0.2922075398367664, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 15038, "loss": 
0.2781676948070526, "memory_gb": 7.721559524536133, "step_time_ms": 3381.2129497528076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:19] (step=0015038) Train Loss: 0.2695, Train Steps/Sec: 0.27, Epoch: 0.292226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15039, "loss": 0.2860194742679596, "memory_gb": 7.721559524536133, "step_time_ms": 3369.962692260742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:22] (step=0015039) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.29224640497473764, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15040, "loss": 0.30114275217056274, "memory_gb": 7.721559524536133, "step_time_ms": 3368.283987045288, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:26] (step=0015040) Train Loss: 0.3021, Train Steps/Sec: 0.28, Epoch: 0.29226583754372326, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 15041, "loss": 0.3256891965866089, "memory_gb": 7.721559524536133, "step_time_ms": 3367.913246154785, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:29] (step=0015041) Train Loss: 0.2678, Train Steps/Sec: 0.28, Epoch: 0.2922852701127089, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15042, "loss": 0.27314209938049316, "memory_gb": 7.721559524536133, "step_time_ms": 3365.241765975952, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:33] (step=0015042) Train Loss: 0.2870, Train Steps/Sec: 0.28, Epoch: 0.2923047026816945, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15043, "loss": 0.18661893904209137, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5867881774902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:37] (step=0015043) 
Train Loss: 0.2026, Train Steps/Sec: 0.27, Epoch: 0.29232413525068013, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15044, "loss": 0.18062914907932281, "memory_gb": 7.721559524536133, "step_time_ms": 3372.8108406066895, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:40] (step=0015044) Train Loss: 0.2189, Train Steps/Sec: 0.27, Epoch: 0.29234356781966575, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15045, "loss": 0.28611984848976135, "memory_gb": 7.721559524536133, "step_time_ms": 3411.723852157593, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:44] (step=0015045) Train Loss: 0.2396, Train Steps/Sec: 0.27, Epoch: 0.2923630003886514, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15046, "loss": 0.28199517726898193, "memory_gb": 7.721559524536133, "step_time_ms": 3456.2878608703613, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:48] (step=0015046) Train Loss: 0.2719, Train Steps/Sec: 0.27, Epoch: 0.292382432957637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15047, "loss": 0.30999302864074707, "memory_gb": 7.721559524536133, "step_time_ms": 3442.3160552978516, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:51] (step=0015047) Train Loss: 0.3348, Train Steps/Sec: 0.27, Epoch: 0.2924018655266226, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 15048, "loss": 0.24059289693832397, "memory_gb": 7.721559524536133, "step_time_ms": 3419.9612140655518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:55] (step=0015048) Train Loss: 0.2344, Train Steps/Sec: 0.27, Epoch: 0.29242129809560824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:02:59] EFFICIENCY_METRICS: {"epoch": 
0, "step": 15049, "loss": 0.2759307622909546, "memory_gb": 7.721559524536133, "step_time_ms": 3553.6022186279297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:02:59] (step=0015049) Train Loss: 0.2930, Train Steps/Sec: 0.26, Epoch: 0.29244073066459386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 15050, "loss": 0.3378174304962158, "memory_gb": 7.715639114379883, "step_time_ms": 3323.929786682129, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:03] (step=0015050) Train Loss: 0.2511, Train Steps/Sec: 0.28, Epoch: 0.2924601632335795, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15051, "loss": 0.306398868560791, "memory_gb": 7.721559524536133, "step_time_ms": 3367.145299911499, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:06] (step=0015051) Train Loss: 0.3083, Train Steps/Sec: 0.28, Epoch: 0.2924795958025651, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 15052, "loss": 0.2512396574020386, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7726516723633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:10] (step=0015052) Train Loss: 0.2158, Train Steps/Sec: 0.28, Epoch: 0.29249902837155073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15053, "loss": 0.29290738701820374, "memory_gb": 7.715639114379883, "step_time_ms": 3332.4596881866455, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:13] (step=0015053) Train Loss: 0.2372, Train Steps/Sec: 0.28, Epoch: 0.29251846094053635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 15054, "loss": 0.30518513917922974, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1585578918457, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
15:03:17] (step=0015054) Train Loss: 0.2850, Train Steps/Sec: 0.28, Epoch: 0.292537893509522, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 15055, "loss": 0.2626604735851288, "memory_gb": 7.721559524536133, "step_time_ms": 3369.969606399536, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:21] (step=0015055) Train Loss: 0.2415, Train Steps/Sec: 0.27, Epoch: 0.2925573260785076, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15056, "loss": 0.25942277908325195, "memory_gb": 7.715639114379883, "step_time_ms": 3341.0065174102783, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:24] (step=0015056) Train Loss: 0.2504, Train Steps/Sec: 0.27, Epoch: 0.2925767586474932, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 15057, "loss": 0.2719837725162506, "memory_gb": 7.721559524536133, "step_time_ms": 3369.692325592041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:28] (step=0015057) Train Loss: 0.2403, Train Steps/Sec: 0.27, Epoch: 0.29259619121647884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 15058, "loss": 0.2645743489265442, "memory_gb": 7.721559524536133, "step_time_ms": 3382.2314739227295, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:32] (step=0015058) Train Loss: 0.2519, Train Steps/Sec: 0.27, Epoch: 0.29261562378546446, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 15059, "loss": 0.3583154082298279, "memory_gb": 7.721559524536133, "step_time_ms": 3363.339900970459, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:35] (step=0015059) Train Loss: 0.3162, Train Steps/Sec: 0.28, Epoch: 0.2926350563544501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:39] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 15060, "loss": 0.3120349049568176, "memory_gb": 7.721559524536133, "step_time_ms": 3368.912696838379, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:39] (step=0015060) Train Loss: 0.2665, Train Steps/Sec: 0.28, Epoch: 0.29265448892343565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15061, "loss": 0.21797378361225128, "memory_gb": 7.721559524536133, "step_time_ms": 3367.384433746338, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:43] (step=0015061) Train Loss: 0.2565, Train Steps/Sec: 0.26, Epoch: 0.2926739214924213, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15062, "loss": 0.257532000541687, "memory_gb": 7.721559524536133, "step_time_ms": 3360.377550125122, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:46] (step=0015062) Train Loss: 0.2559, Train Steps/Sec: 0.28, Epoch: 0.2926933540614069, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 15063, "loss": 0.19115899503231049, "memory_gb": 7.721559524536133, "step_time_ms": 3369.689702987671, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:50] (step=0015063) Train Loss: 0.1675, Train Steps/Sec: 0.28, Epoch: 0.2927127866303925, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15064, "loss": 0.3338540494441986, "memory_gb": 7.721559524536133, "step_time_ms": 3368.6063289642334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:03:54] (step=0015064) Train Loss: 0.3079, Train Steps/Sec: 0.28, Epoch: 0.29273221919937814, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:03:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15065, "loss": 0.14584168791770935, "memory_gb": 7.721559524536133, "step_time_ms": 3362.823009490967, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 15:03:57] (step=0015065) Train Loss: 0.2211, Train Steps/Sec: 0.28, Epoch: 0.29275165176836376, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15066, "loss": 0.19352523982524872, "memory_gb": 7.721559524536133, "step_time_ms": 3349.262475967407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:01] (step=0015066) Train Loss: 0.2195, Train Steps/Sec: 0.28, Epoch: 0.2927710843373494, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15067, "loss": 0.32012444734573364, "memory_gb": 7.721559524536133, "step_time_ms": 3371.054172515869, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:04] (step=0015067) Train Loss: 0.2930, Train Steps/Sec: 0.28, Epoch: 0.292790516906335, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15068, "loss": 0.3272135853767395, "memory_gb": 7.721559524536133, "step_time_ms": 3368.9095973968506, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:08] (step=0015068) Train Loss: 0.3137, Train Steps/Sec: 0.28, Epoch: 0.29280994947532063, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 15069, "loss": 0.23836351931095123, "memory_gb": 7.721559524536133, "step_time_ms": 3370.6741333007812, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:12] (step=0015069) Train Loss: 0.2557, Train Steps/Sec: 0.28, Epoch: 0.29282938204430625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15070, "loss": 0.16279278695583344, "memory_gb": 7.721559524536133, "step_time_ms": 3516.2744522094727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:15] (step=0015070) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.2928488146132919, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
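The reported Train Steps/Sec is consistent with the per-step `step_time_ms` values: a ~3,360 ms step corresponds to roughly 0.30 raw steps per second, slightly above the ~0.28 the trainer prints, presumably because the printed rate includes overhead outside the timed region (an assumption). A sketch of the conversion, using step times copied from this log:

```python
# step_time_ms values copied from three nearby EFFICIENCY_METRICS records in this log.
step_times_ms = [3356.6391468048096, 3361.4938259124756, 3357.6815128326416]

mean_ms = sum(step_times_ms) / len(step_times_ms)
steps_per_sec = 1000.0 / mean_ms

print(f"mean step time: {mean_ms:.1f} ms, derived throughput: {steps_per_sec:.2f} steps/sec")
```

The same conversion explains the transient dip to 0.18 steps/sec at step 15030, where `step_time_ms` spiked to ~5,432 ms.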
[2025-07-29 15:04:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 15071, "loss": 0.3448579013347626, "memory_gb": 7.721559524536133, "step_time_ms": 3366.675615310669, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:19] (step=0015071) Train Loss: 0.2993, Train Steps/Sec: 0.28, Epoch: 0.2928682471822775, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 15072, "loss": 0.23479242622852325, "memory_gb": 7.721559524536133, "step_time_ms": 3370.535373687744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:23] (step=0015072) Train Loss: 0.2216, Train Steps/Sec: 0.28, Epoch: 0.2928876797512631, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15073, "loss": 0.3214854300022125, "memory_gb": 7.721559524536133, "step_time_ms": 3370.13578414917, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:26] (step=0015073) Train Loss: 0.2572, Train Steps/Sec: 0.28, Epoch: 0.29290711232024874, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 15074, "loss": 0.20172348618507385, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9492263793945, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:30] (step=0015074) Train Loss: 0.2046, Train Steps/Sec: 0.28, Epoch: 0.29292654488923436, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15075, "loss": 0.23284682631492615, "memory_gb": 7.721559524536133, "step_time_ms": 3366.827964782715, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:33] (step=0015075) Train Loss: 0.2124, Train Steps/Sec: 0.28, Epoch: 0.29294597745822, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15076, "loss": 0.2362334132194519, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7095642089844, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:37] (step=0015076) Train Loss: 0.2190, Train Steps/Sec: 0.28, Epoch: 0.2929654100272056, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15077, "loss": 0.31696051359176636, "memory_gb": 7.721559524536133, "step_time_ms": 3368.201971054077, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:41] (step=0015077) Train Loss: 0.3276, Train Steps/Sec: 0.28, Epoch: 0.29298484259619123, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15078, "loss": 0.2716834545135498, "memory_gb": 7.721559524536133, "step_time_ms": 3371.2868690490723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:44] (step=0015078) Train Loss: 0.2322, Train Steps/Sec: 0.28, Epoch: 0.29300427516517685, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15079, "loss": 0.19854959845542908, "memory_gb": 7.721559524536133, "step_time_ms": 3375.9682178497314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:48] (step=0015079) Train Loss: 0.1756, Train Steps/Sec: 0.28, Epoch: 0.2930237077341625, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15080, "loss": 0.20424020290374756, "memory_gb": 7.721559524536133, "step_time_ms": 3368.755340576172, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:51] (step=0015080) Train Loss: 0.1928, Train Steps/Sec: 0.28, Epoch: 0.2930431403031481, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 15081, "loss": 0.21528282761573792, "memory_gb": 7.721559524536133, "step_time_ms": 3363.255739212036, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:55] (step=0015081) Train Loss: 0.2152, Train Steps/Sec: 0.28, Epoch: 0.2930625728721337, LR: 0.001, 
Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:04:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 15082, "loss": 0.2070859670639038, "memory_gb": 7.721559524536133, "step_time_ms": 3361.8736267089844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:04:59] (step=0015082) Train Loss: 0.1876, Train Steps/Sec: 0.28, Epoch: 0.29308200544111934, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15083, "loss": 0.2327679544687271, "memory_gb": 7.721559524536133, "step_time_ms": 3365.41485786438, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:02] (step=0015083) Train Loss: 0.2710, Train Steps/Sec: 0.28, Epoch: 0.2931014380101049, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15084, "loss": 0.15375418961048126, "memory_gb": 7.721559524536133, "step_time_ms": 3363.129138946533, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:06] (step=0015084) Train Loss: 0.1785, Train Steps/Sec: 0.28, Epoch: 0.29312087057909053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 15085, "loss": 0.270030677318573, "memory_gb": 7.721559524536133, "step_time_ms": 3357.860326766968, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:09] (step=0015085) Train Loss: 0.2359, Train Steps/Sec: 0.28, Epoch: 0.29314030314807615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15086, "loss": 0.252998948097229, "memory_gb": 7.721559524536133, "step_time_ms": 3364.283561706543, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:13] (step=0015086) Train Loss: 0.2391, Train Steps/Sec: 0.28, Epoch: 0.2931597357170618, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 15087, "loss": 0.2516356408596039, "memory_gb": 7.721559524536133, 
"step_time_ms": 3361.386299133301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:16] (step=0015087) Train Loss: 0.2290, Train Steps/Sec: 0.28, Epoch: 0.2931791682860474, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 15088, "loss": 0.23500458896160126, "memory_gb": 7.721559524536133, "step_time_ms": 3358.548641204834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:20] (step=0015088) Train Loss: 0.2695, Train Steps/Sec: 0.28, Epoch: 0.293198600855033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15089, "loss": 0.23329919576644897, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9325885772705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:24] (step=0015089) Train Loss: 0.2799, Train Steps/Sec: 0.28, Epoch: 0.29321803342401864, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 15090, "loss": 0.3105359673500061, "memory_gb": 7.721559524536133, "step_time_ms": 3366.702079772949, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:27] (step=0015090) Train Loss: 0.2722, Train Steps/Sec: 0.28, Epoch: 0.29323746599300426, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 15091, "loss": 0.25927257537841797, "memory_gb": 7.721559524536133, "step_time_ms": 3363.399028778076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:31] (step=0015091) Train Loss: 0.2401, Train Steps/Sec: 0.28, Epoch: 0.2932568985619899, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 15092, "loss": 0.1437027007341385, "memory_gb": 7.721559524536133, "step_time_ms": 3359.8086833953857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:34] (step=0015092) Train Loss: 0.1826, Train Steps/Sec: 0.28, Epoch: 
0.2932763311309755, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 15093, "loss": 0.25735437870025635, "memory_gb": 7.721559524536133, "step_time_ms": 3360.6629371643066, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:38] (step=0015093) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.29329576369996113, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15094, "loss": 0.285768985748291, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2663021087646, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:41] (step=0015094) Train Loss: 0.2755, Train Steps/Sec: 0.28, Epoch: 0.29331519626894675, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 15095, "loss": 0.2839202880859375, "memory_gb": 7.721559524536133, "step_time_ms": 3376.737117767334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:45] (step=0015095) Train Loss: 0.2465, Train Steps/Sec: 0.28, Epoch: 0.2933346288379324, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 15096, "loss": 0.261196106672287, "memory_gb": 7.721559524536133, "step_time_ms": 3416.546583175659, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:49] (step=0015096) Train Loss: 0.2169, Train Steps/Sec: 0.28, Epoch: 0.293354061406918, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 15097, "loss": 0.383385568857193, "memory_gb": 7.721559524536133, "step_time_ms": 3450.5269527435303, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:52] (step=0015097) Train Loss: 0.3152, Train Steps/Sec: 0.28, Epoch: 0.2933734939759036, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:05:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 15098, "loss": 0.23676609992980957, "memory_gb": 
7.721559524536133, "step_time_ms": 3431.4627647399902, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:05:56] (step=0015098) Train Loss: 0.2194, Train Steps/Sec: 0.27, Epoch: 0.29339292654488924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 15099, "loss": 0.2604464888572693, "memory_gb": 7.721559524536133, "step_time_ms": 3491.4021492004395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:00] (step=0015099) Train Loss: 0.1999, Train Steps/Sec: 0.27, Epoch: 0.29341235911387487, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15100, "loss": 0.25862205028533936, "memory_gb": 7.721559524536133, "step_time_ms": 6032.358646392822, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:06] (step=0015100) Train Loss: 0.2550, Train Steps/Sec: 0.16, Epoch: 0.2934317916828605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 15101, "loss": 0.2592010796070099, "memory_gb": 7.721559524536133, "step_time_ms": 6238.667011260986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:12] (step=0015101) Train Loss: 0.2560, Train Steps/Sec: 0.16, Epoch: 0.2934512242518461, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15102, "loss": 0.30385953187942505, "memory_gb": 7.721559524536133, "step_time_ms": 5899.861097335815, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:18] (step=0015102) Train Loss: 0.2441, Train Steps/Sec: 0.17, Epoch: 0.29347065682083173, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15103, "loss": 0.27422818541526794, "memory_gb": 7.721559524536133, "step_time_ms": 6203.510522842407, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:24] (step=0015103) Train Loss: 0.2551, Train 
Steps/Sec: 0.16, Epoch: 0.29349008938981735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 15104, "loss": 0.253034770488739, "memory_gb": 7.721559524536133, "step_time_ms": 5706.186294555664, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:31] (step=0015104) Train Loss: 0.2363, Train Steps/Sec: 0.15, Epoch: 0.293509521958803, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15105, "loss": 0.16547349095344543, "memory_gb": 7.721559524536133, "step_time_ms": 6223.9110469818115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:37] (step=0015105) Train Loss: 0.1884, Train Steps/Sec: 0.16, Epoch: 0.2935289545277886, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15106, "loss": 0.15609128773212433, "memory_gb": 7.721559524536133, "step_time_ms": 6134.334564208984, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:44] (step=0015106) Train Loss: 0.2537, Train Steps/Sec: 0.15, Epoch: 0.29354838709677417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15107, "loss": 0.1858363151550293, "memory_gb": 7.721559524536133, "step_time_ms": 3712.007761001587, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:51] (step=0015107) Train Loss: 0.2500, Train Steps/Sec: 0.15, Epoch: 0.2935678196657598, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:06:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15108, "loss": 0.24894854426383972, "memory_gb": 7.721559524536133, "step_time_ms": 4426.729440689087, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:06:57] (step=0015108) Train Loss: 0.2038, Train Steps/Sec: 0.16, Epoch: 0.2935872522347454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15109, "loss": 
0.35610347986221313, "memory_gb": 7.721559524536133, "step_time_ms": 5379.869699478149, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:02] (step=0015109) Train Loss: 0.2619, Train Steps/Sec: 0.18, Epoch: 0.29360668480373103, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15110, "loss": 0.22717970609664917, "memory_gb": 7.721559524536133, "step_time_ms": 3356.9905757904053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:06] (step=0015110) Train Loss: 0.2444, Train Steps/Sec: 0.28, Epoch: 0.29362611737271666, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 15111, "loss": 0.3673498034477234, "memory_gb": 7.721559524536133, "step_time_ms": 3357.1150302886963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:10] (step=0015111) Train Loss: 0.3071, Train Steps/Sec: 0.28, Epoch: 0.2936455499417023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15112, "loss": 0.27460891008377075, "memory_gb": 7.721559524536133, "step_time_ms": 3339.137315750122, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:13] (step=0015112) Train Loss: 0.2762, Train Steps/Sec: 0.28, Epoch: 0.2936649825106879, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 15113, "loss": 0.21961921453475952, "memory_gb": 7.721559524536133, "step_time_ms": 3356.1148643493652, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:17] (step=0015113) Train Loss: 0.1963, Train Steps/Sec: 0.28, Epoch: 0.2936844150796735, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 15114, "loss": 0.29756537079811096, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5282306671143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:20] 
(step=0015114) Train Loss: 0.2906, Train Steps/Sec: 0.28, Epoch: 0.29370384764865914, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15115, "loss": 0.18438224494457245, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1716804504395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:24] (step=0015115) Train Loss: 0.1722, Train Steps/Sec: 0.28, Epoch: 0.29372328021764477, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 15116, "loss": 0.15623031556606293, "memory_gb": 7.721559524536133, "step_time_ms": 3362.1439933776855, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:28] (step=0015116) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.2937427127866304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 15117, "loss": 0.2925318479537964, "memory_gb": 7.721559524536133, "step_time_ms": 3358.163595199585, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:31] (step=0015117) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.293762145355616, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 15118, "loss": 0.2514154016971588, "memory_gb": 7.721559524536133, "step_time_ms": 3497.7855682373047, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:35] (step=0015118) Train Loss: 0.2531, Train Steps/Sec: 0.28, Epoch: 0.29378157792460163, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15119, "loss": 0.25132858753204346, "memory_gb": 7.721559524536133, "step_time_ms": 3362.741708755493, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:39] (step=0015119) Train Loss: 0.2442, Train Steps/Sec: 0.28, Epoch: 0.29380101049358726, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:42] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 15120, "loss": 0.434116393327713, "memory_gb": 7.721559524536133, "step_time_ms": 3352.3828983306885, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:42] (step=0015120) Train Loss: 0.3276, Train Steps/Sec: 0.28, Epoch: 0.2938204430625729, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15121, "loss": 0.2694874703884125, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1413497924805, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:46] (step=0015121) Train Loss: 0.2270, Train Steps/Sec: 0.28, Epoch: 0.2938398756315585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 15122, "loss": 0.2208743393421173, "memory_gb": 7.721559524536133, "step_time_ms": 3367.3408031463623, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:49] (step=0015122) Train Loss: 0.2176, Train Steps/Sec: 0.28, Epoch: 0.2938593082005441, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 15123, "loss": 0.17395150661468506, "memory_gb": 7.721559524536133, "step_time_ms": 3366.0831451416016, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:53] (step=0015123) Train Loss: 0.1775, Train Steps/Sec: 0.28, Epoch: 0.29387874076952974, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:07:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15124, "loss": 0.18034648895263672, "memory_gb": 7.721559524536133, "step_time_ms": 3364.417791366577, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:07:57] (step=0015124) Train Loss: 0.2338, Train Steps/Sec: 0.28, Epoch: 0.29389817333851537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 15125, "loss": 0.2949170768260956, "memory_gb": 7.721559524536133, "step_time_ms": 3363.333225250244, "trainable_params": 4718592, 
"method": "lora"} [2025-07-29 15:08:00] (step=0015125) Train Loss: 0.2441, Train Steps/Sec: 0.28, Epoch: 0.293917605907501, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15126, "loss": 0.16634896397590637, "memory_gb": 7.721559524536133, "step_time_ms": 3350.998878479004, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:04] (step=0015126) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.2939370384764866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 15127, "loss": 0.29120540618896484, "memory_gb": 7.721559524536133, "step_time_ms": 3360.5499267578125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:07] (step=0015127) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.29395647104547223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 15128, "loss": 0.26299285888671875, "memory_gb": 7.721559524536133, "step_time_ms": 3366.145610809326, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:11] (step=0015128) Train Loss: 0.2584, Train Steps/Sec: 0.28, Epoch: 0.29397590361445786, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15129, "loss": 0.3631919026374817, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0871047973633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:15] (step=0015129) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.2939953361834435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15130, "loss": 0.1954198181629181, "memory_gb": 7.721559524536133, "step_time_ms": 3361.268997192383, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:18] (step=0015130) Train Loss: 0.2264, Train Steps/Sec: 0.28, Epoch: 0.29401476875242905, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 
[2025-07-29 15:08:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15131, "loss": 0.26725172996520996, "memory_gb": 7.721559524536133, "step_time_ms": 3360.745668411255, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:22] (step=0015131) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.29403420132141467, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15132, "loss": 0.15572938323020935, "memory_gb": 7.721559524536133, "step_time_ms": 3360.053777694702, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:26] (step=0015132) Train Loss: 0.1712, Train Steps/Sec: 0.28, Epoch: 0.2940536338904003, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 15133, "loss": 0.2152547836303711, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2004051208496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:29] (step=0015133) Train Loss: 0.2892, Train Steps/Sec: 0.28, Epoch: 0.2940730664593859, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15134, "loss": 0.28460752964019775, "memory_gb": 7.721559524536133, "step_time_ms": 3351.7072200775146, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:33] (step=0015134) Train Loss: 0.2319, Train Steps/Sec: 0.28, Epoch: 0.29409249902837153, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 15135, "loss": 0.22062933444976807, "memory_gb": 7.715639114379883, "step_time_ms": 3322.5257396698, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:36] (step=0015135) Train Loss: 0.1917, Train Steps/Sec: 0.28, Epoch: 0.29411193159735716, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15136, "loss": 0.17779813706874847, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2964458465576, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:40] (step=0015136) Train Loss: 0.1879, Train Steps/Sec: 0.28, Epoch: 0.2941313641663428, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15137, "loss": 0.2063884735107422, "memory_gb": 7.721559524536133, "step_time_ms": 3353.259801864624, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:44] (step=0015137) Train Loss: 0.2414, Train Steps/Sec: 0.28, Epoch: 0.2941507967353284, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 15138, "loss": 0.15456968545913696, "memory_gb": 7.721559524536133, "step_time_ms": 3360.3830337524414, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:47] (step=0015138) Train Loss: 0.1780, Train Steps/Sec: 0.28, Epoch: 0.294170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15139, "loss": 0.3453334867954254, "memory_gb": 7.715639114379883, "step_time_ms": 3323.4434127807617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:51] (step=0015139) Train Loss: 0.2546, Train Steps/Sec: 0.28, Epoch: 0.29418966187329965, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15140, "loss": 0.1700938642024994, "memory_gb": 7.721559524536133, "step_time_ms": 3353.7700176239014, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:54] (step=0015140) Train Loss: 0.1972, Train Steps/Sec: 0.28, Epoch: 0.29420909444228527, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:08:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 15141, "loss": 0.3236929178237915, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3154678344727, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:08:58] (step=0015141) Train Loss: 0.3061, Train Steps/Sec: 0.28, Epoch: 0.2942285270112709, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 15:09:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15142, "loss": 0.2954225242137909, "memory_gb": 7.721559524536133, "step_time_ms": 3352.003335952759, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:02] (step=0015142) Train Loss: 0.2779, Train Steps/Sec: 0.28, Epoch: 0.2942479595802565, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 15143, "loss": 0.17105969786643982, "memory_gb": 7.721559524536133, "step_time_ms": 3354.2654514312744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:05] (step=0015143) Train Loss: 0.2310, Train Steps/Sec: 0.28, Epoch: 0.29426739214924214, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 15144, "loss": 0.3523027300834656, "memory_gb": 7.721559524536133, "step_time_ms": 3353.6102771759033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:09] (step=0015144) Train Loss: 0.3250, Train Steps/Sec: 0.28, Epoch: 0.29428682471822776, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 15145, "loss": 0.15810415148735046, "memory_gb": 7.721559524536133, "step_time_ms": 3347.9483127593994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:12] (step=0015145) Train Loss: 0.1842, Train Steps/Sec: 0.28, Epoch: 0.2943062572872134, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 15146, "loss": 0.25033366680145264, "memory_gb": 7.721559524536133, "step_time_ms": 3352.9763221740723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:16] (step=0015146) Train Loss: 0.2618, Train Steps/Sec: 0.28, Epoch: 0.294325689856199, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 15147, "loss": 0.2729593515396118, "memory_gb": 7.721559524536133, "step_time_ms": 
3350.719451904297, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:19] (step=0015147) Train Loss: 0.2811, Train Steps/Sec: 0.28, Epoch: 0.2943451224251846, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 15148, "loss": 0.3373766541481018, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2097339630127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:23] (step=0015148) Train Loss: 0.2609, Train Steps/Sec: 0.28, Epoch: 0.29436455499417025, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15149, "loss": 0.2159263789653778, "memory_gb": 7.721559524536133, "step_time_ms": 3351.1931896209717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:26] (step=0015149) Train Loss: 0.2564, Train Steps/Sec: 0.28, Epoch: 0.29438398756315587, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 15150, "loss": 0.2959231734275818, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0246906280518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:30] (step=0015150) Train Loss: 0.2341, Train Steps/Sec: 0.27, Epoch: 0.2944034201321415, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 15151, "loss": 0.1818205714225769, "memory_gb": 7.721559524536133, "step_time_ms": 3357.805013656616, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:34] (step=0015151) Train Loss: 0.1860, Train Steps/Sec: 0.28, Epoch: 0.2944228527011271, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15152, "loss": 0.30447763204574585, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5523624420166, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:37] (step=0015152) Train Loss: 0.2340, Train Steps/Sec: 0.28, Epoch: 
0.29444228527011274, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15153, "loss": 0.2817012071609497, "memory_gb": 7.721559524536133, "step_time_ms": 3355.1571369171143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:41] (step=0015153) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.2944617178390983, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15154, "loss": 0.12074614316225052, "memory_gb": 7.721559524536133, "step_time_ms": 3354.992389678955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:44] (step=0015154) Train Loss: 0.2095, Train Steps/Sec: 0.28, Epoch: 0.2944811504080839, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15155, "loss": 0.30126604437828064, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0613136291504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:48] (step=0015155) Train Loss: 0.2801, Train Steps/Sec: 0.28, Epoch: 0.29450058297706955, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15156, "loss": 0.11559297889471054, "memory_gb": 7.721559524536133, "step_time_ms": 3361.4563941955566, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:51] (step=0015156) Train Loss: 0.1547, Train Steps/Sec: 0.28, Epoch: 0.29452001554605517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 15157, "loss": 0.29833802580833435, "memory_gb": 7.721559524536133, "step_time_ms": 3358.973503112793, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:55] (step=0015157) Train Loss: 0.2143, Train Steps/Sec: 0.28, Epoch: 0.2945394481150408, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:09:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 15158, "loss": 0.2730107307434082, 
"memory_gb": 7.721559524536133, "step_time_ms": 3502.5808811187744, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:09:59] (step=0015158) Train Loss: 0.2446, Train Steps/Sec: 0.28, Epoch: 0.2945588806840264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15159, "loss": 0.28491613268852234, "memory_gb": 7.721559524536133, "step_time_ms": 3364.319324493408, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:02] (step=0015159) Train Loss: 0.2463, Train Steps/Sec: 0.28, Epoch: 0.29457831325301204, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15160, "loss": 0.16148445010185242, "memory_gb": 7.721559524536133, "step_time_ms": 3357.0117950439453, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:06] (step=0015160) Train Loss: 0.1960, Train Steps/Sec: 0.28, Epoch: 0.29459774582199766, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 15161, "loss": 0.1512637734413147, "memory_gb": 7.721559524536133, "step_time_ms": 3354.3713092803955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:09] (step=0015161) Train Loss: 0.1939, Train Steps/Sec: 0.28, Epoch: 0.2946171783909833, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15162, "loss": 0.256126344203949, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3034534454346, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:13] (step=0015162) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.2946366109599689, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 15163, "loss": 0.33273568749427795, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2892627716064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:16] (step=0015163) Train Loss: 0.2998, 
Train Steps/Sec: 0.28, Epoch: 0.2946560435289545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 15164, "loss": 0.3260191082954407, "memory_gb": 7.721559524536133, "step_time_ms": 3362.813949584961, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:20] (step=0015164) Train Loss: 0.3024, Train Steps/Sec: 0.28, Epoch: 0.29467547609794015, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15165, "loss": 0.3296910524368286, "memory_gb": 7.721559524536133, "step_time_ms": 3359.09366607666, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:24] (step=0015165) Train Loss: 0.2894, Train Steps/Sec: 0.28, Epoch: 0.29469490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 15166, "loss": 0.27065858244895935, "memory_gb": 7.721559524536133, "step_time_ms": 3358.574151992798, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:27] (step=0015166) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.2947143412359114, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 15167, "loss": 0.17087262868881226, "memory_gb": 7.721559524536133, "step_time_ms": 3359.818935394287, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:31] (step=0015167) Train Loss: 0.2045, Train Steps/Sec: 0.28, Epoch: 0.294733773804897, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 15168, "loss": 0.2625565826892853, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6277961730957, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:34] (step=0015168) Train Loss: 0.2898, Train Steps/Sec: 0.28, Epoch: 0.29475320637388264, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 15169, "loss": 
0.24438101053237915, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8874340057373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:38] (step=0015169) Train Loss: 0.2117, Train Steps/Sec: 0.28, Epoch: 0.29477263894286826, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15170, "loss": 0.23604752123355865, "memory_gb": 7.721559524536133, "step_time_ms": 3358.3335876464844, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:41] (step=0015170) Train Loss: 0.2106, Train Steps/Sec: 0.28, Epoch: 0.2947920715118539, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 15171, "loss": 0.1981135457754135, "memory_gb": 7.721559524536133, "step_time_ms": 3346.543312072754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:45] (step=0015171) Train Loss: 0.2563, Train Steps/Sec: 0.28, Epoch: 0.2948115040808395, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 15172, "loss": 0.2969263792037964, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3039932250977, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:49] (step=0015172) Train Loss: 0.2912, Train Steps/Sec: 0.28, Epoch: 0.2948309366498251, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 15173, "loss": 0.2814443111419678, "memory_gb": 7.721559524536133, "step_time_ms": 3365.600824356079, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:52] (step=0015173) Train Loss: 0.2517, Train Steps/Sec: 0.28, Epoch: 0.29485036921881075, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 15174, "loss": 0.2928985357284546, "memory_gb": 7.721559524536133, "step_time_ms": 3359.945058822632, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:56] (step=0015174) 
Train Loss: 0.2248, Train Steps/Sec: 0.28, Epoch: 0.29486980178779637, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:10:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 15175, "loss": 0.1311020851135254, "memory_gb": 7.721559524536133, "step_time_ms": 3369.845390319824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:10:59] (step=0015175) Train Loss: 0.1504, Train Steps/Sec: 0.28, Epoch: 0.294889234356782, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 15176, "loss": 0.2907068431377411, "memory_gb": 7.721559524536133, "step_time_ms": 3362.3647689819336, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:03] (step=0015176) Train Loss: 0.2530, Train Steps/Sec: 0.28, Epoch: 0.29490866692576756, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 15177, "loss": 0.18925827741622925, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1324253082275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:07] (step=0015177) Train Loss: 0.1789, Train Steps/Sec: 0.28, Epoch: 0.2949280994947532, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 15178, "loss": 0.2760152816772461, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7782287597656, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:10] (step=0015178) Train Loss: 0.2099, Train Steps/Sec: 0.28, Epoch: 0.2949475320637388, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 15179, "loss": 0.17264944314956665, "memory_gb": 7.721559524536133, "step_time_ms": 3365.64302444458, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:14] (step=0015179) Train Loss: 0.2205, Train Steps/Sec: 0.28, Epoch: 0.2949669646327244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:17] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 15180, "loss": 0.17362841963768005, "memory_gb": 7.721559524536133, "step_time_ms": 3364.9821281433105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:17] (step=0015180) Train Loss: 0.2699, Train Steps/Sec: 0.28, Epoch: 0.29498639720171005, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 15181, "loss": 0.27689269185066223, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8410778045654, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:21] (step=0015181) Train Loss: 0.2477, Train Steps/Sec: 0.28, Epoch: 0.29500582977069567, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 15182, "loss": 0.2842220067977905, "memory_gb": 7.721559524536133, "step_time_ms": 3366.905450820923, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:25] (step=0015182) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.2950252623396813, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 15183, "loss": 0.17581507563591003, "memory_gb": 7.721559524536133, "step_time_ms": 3366.590738296509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:28] (step=0015183) Train Loss: 0.2074, Train Steps/Sec: 0.28, Epoch: 0.2950446949086669, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 15184, "loss": 0.2867025136947632, "memory_gb": 7.721559524536133, "step_time_ms": 3365.278482437134, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:32] (step=0015184) Train Loss: 0.2293, Train Steps/Sec: 0.28, Epoch: 0.29506412747765254, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 15185, "loss": 0.2561902403831482, "memory_gb": 7.721559524536133, "step_time_ms": 3364.381790161133, "trainable_params": 4718592, "method": "lora"} [2025-07-29 
15:11:35] (step=0015185) Train Loss: 0.2346, Train Steps/Sec: 0.28, Epoch: 0.29508356004663816, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15186, "loss": 0.21948131918907166, "memory_gb": 7.721559524536133, "step_time_ms": 3369.2333698272705, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:39] (step=0015186) Train Loss: 0.1701, Train Steps/Sec: 0.28, Epoch: 0.2951029926156238, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15187, "loss": 0.20937830209732056, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0734424591064, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:43] (step=0015187) Train Loss: 0.2227, Train Steps/Sec: 0.28, Epoch: 0.2951224251846094, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15188, "loss": 0.22551509737968445, "memory_gb": 7.721559524536133, "step_time_ms": 3356.32586479187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:46] (step=0015188) Train Loss: 0.1746, Train Steps/Sec: 0.28, Epoch: 0.295141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 15189, "loss": 0.3197900950908661, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5217208862305, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:50] (step=0015189) Train Loss: 0.2623, Train Steps/Sec: 0.28, Epoch: 0.29516129032258065, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15190, "loss": 0.25366562604904175, "memory_gb": 7.721559524536133, "step_time_ms": 3367.708206176758, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:54] (step=0015190) Train Loss: 0.2148, Train Steps/Sec: 0.27, Epoch: 0.29518072289156627, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:11:57] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 15191, "loss": 0.21867763996124268, "memory_gb": 7.721559524536133, "step_time_ms": 3364.5260334014893, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:11:57] (step=0015191) Train Loss: 0.2434, Train Steps/Sec: 0.28, Epoch: 0.2952001554605519, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15192, "loss": 0.2380875051021576, "memory_gb": 7.721559524536133, "step_time_ms": 3367.147445678711, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:01] (step=0015192) Train Loss: 0.2709, Train Steps/Sec: 0.28, Epoch: 0.2952195880295375, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15193, "loss": 0.18624773621559143, "memory_gb": 7.721559524536133, "step_time_ms": 3365.828275680542, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:04] (step=0015193) Train Loss: 0.1980, Train Steps/Sec: 0.28, Epoch: 0.29523902059852314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15194, "loss": 0.17632633447647095, "memory_gb": 7.721559524536133, "step_time_ms": 3365.0853633880615, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:08] (step=0015194) Train Loss: 0.1607, Train Steps/Sec: 0.28, Epoch: 0.29525845316750876, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 15195, "loss": 0.1741591989994049, "memory_gb": 7.721559524536133, "step_time_ms": 3358.4532737731934, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:12] (step=0015195) Train Loss: 0.1683, Train Steps/Sec: 0.28, Epoch: 0.2952778857364944, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15196, "loss": 0.2624521255493164, "memory_gb": 7.721559524536133, "step_time_ms": 3364.8641109466553, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 15:12:15] (step=0015196) Train Loss: 0.2670, Train Steps/Sec: 0.28, Epoch: 0.29529731830548, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 15197, "loss": 0.1673070341348648, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9891147613525, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:19] (step=0015197) Train Loss: 0.2109, Train Steps/Sec: 0.28, Epoch: 0.29531675087446563, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 15198, "loss": 0.33685821294784546, "memory_gb": 7.721559524536133, "step_time_ms": 3364.069938659668, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:23] (step=0015198) Train Loss: 0.2596, Train Steps/Sec: 0.28, Epoch: 0.29533618344345125, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15199, "loss": 0.19313865900039673, "memory_gb": 7.721559524536133, "step_time_ms": 3365.205764770508, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:26] (step=0015199) Train Loss: 0.1854, Train Steps/Sec: 0.28, Epoch: 0.2953556160124368, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 15200, "loss": 0.1753355860710144, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2817993164062, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:30] (step=0015200) Train Loss: 0.1943, Train Steps/Sec: 0.28, Epoch: 0.29537504858142244, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 15201, "loss": 0.2692730128765106, "memory_gb": 7.721559524536133, "step_time_ms": 3361.792802810669, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:34] (step=0015201) Train Loss: 0.2644, Train Steps/Sec: 0.28, Epoch: 0.29539448115040806, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 15:12:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15202, "loss": 0.23456503450870514, "memory_gb": 7.721559524536133, "step_time_ms": 3363.5923862457275, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:37] (step=0015202) Train Loss: 0.2157, Train Steps/Sec: 0.28, Epoch: 0.2954139137193937, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15203, "loss": 0.19433346390724182, "memory_gb": 7.721559524536133, "step_time_ms": 3342.44441986084, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:41] (step=0015203) Train Loss: 0.2193, Train Steps/Sec: 0.28, Epoch: 0.2954333462883793, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15204, "loss": 0.24139955639839172, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5919609069824, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:44] (step=0015204) Train Loss: 0.3121, Train Steps/Sec: 0.28, Epoch: 0.29545277885736493, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15205, "loss": 0.3152073621749878, "memory_gb": 7.721559524536133, "step_time_ms": 3503.830671310425, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:48] (step=0015205) Train Loss: 0.3224, Train Steps/Sec: 0.28, Epoch: 0.29547221142635055, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 15206, "loss": 0.2504385709762573, "memory_gb": 7.721559524536133, "step_time_ms": 3342.1194553375244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:52] (step=0015206) Train Loss: 0.2462, Train Steps/Sec: 0.28, Epoch: 0.2954916439953362, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 15207, "loss": 0.3454424738883972, "memory_gb": 7.721559524536133, "step_time_ms": 
3361.086130142212, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:55] (step=0015207) Train Loss: 0.3083, Train Steps/Sec: 0.28, Epoch: 0.2955110765643218, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:12:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 15208, "loss": 0.19832679629325867, "memory_gb": 7.721559524536133, "step_time_ms": 3352.5919914245605, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:12:59] (step=0015208) Train Loss: 0.2379, Train Steps/Sec: 0.28, Epoch: 0.2955305091333074, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15209, "loss": 0.2550012469291687, "memory_gb": 7.721559524536133, "step_time_ms": 3352.2579669952393, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:02] (step=0015209) Train Loss: 0.2515, Train Steps/Sec: 0.28, Epoch: 0.29554994170229304, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15210, "loss": 0.28959453105926514, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7704849243164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:06] (step=0015210) Train Loss: 0.2761, Train Steps/Sec: 0.28, Epoch: 0.29556937427127866, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 15211, "loss": 0.13818922638893127, "memory_gb": 7.721559524536133, "step_time_ms": 3357.783555984497, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:10] (step=0015211) Train Loss: 0.2037, Train Steps/Sec: 0.28, Epoch: 0.2955888068402643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15212, "loss": 0.23366661369800568, "memory_gb": 7.721559524536133, "step_time_ms": 3365.623950958252, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:13] (step=0015212) Train Loss: 0.2369, Train Steps/Sec: 0.28, Epoch: 
0.2956082394092499, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 15213, "loss": 0.23469695448875427, "memory_gb": 7.721559524536133, "step_time_ms": 3361.3431453704834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:17] (step=0015213) Train Loss: 0.2525, Train Steps/Sec: 0.28, Epoch: 0.29562767197823553, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 15214, "loss": 0.30350184440612793, "memory_gb": 7.721559524536133, "step_time_ms": 3356.7779064178467, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:20] (step=0015214) Train Loss: 0.3136, Train Steps/Sec: 0.28, Epoch: 0.29564710454722115, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15215, "loss": 0.28758347034454346, "memory_gb": 7.721559524536133, "step_time_ms": 3364.000082015991, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:24] (step=0015215) Train Loss: 0.2534, Train Steps/Sec: 0.28, Epoch: 0.2956665371162068, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 15216, "loss": 0.2602527141571045, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3880615234375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:28] (step=0015216) Train Loss: 0.2265, Train Steps/Sec: 0.28, Epoch: 0.2956859696851924, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 15217, "loss": 0.22624531388282776, "memory_gb": 7.721559524536133, "step_time_ms": 3356.192111968994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:31] (step=0015217) Train Loss: 0.2460, Train Steps/Sec: 0.28, Epoch: 0.295705402254178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 15218, "loss": 0.3181741237640381, 
"memory_gb": 7.721559524536133, "step_time_ms": 3362.612247467041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:35] (step=0015218) Train Loss: 0.3368, Train Steps/Sec: 0.28, Epoch: 0.29572483482316364, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15219, "loss": 0.19407306611537933, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5176677703857, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:39] (step=0015219) Train Loss: 0.2065, Train Steps/Sec: 0.28, Epoch: 0.29574426739214926, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 15220, "loss": 0.2887049615383148, "memory_gb": 7.721559524536133, "step_time_ms": 3365.971088409424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:42] (step=0015220) Train Loss: 0.3001, Train Steps/Sec: 0.28, Epoch: 0.2957636999611349, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15221, "loss": 0.22288194298744202, "memory_gb": 7.721559524536133, "step_time_ms": 3368.1609630584717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:46] (step=0015221) Train Loss: 0.2386, Train Steps/Sec: 0.28, Epoch: 0.2957831325301205, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 15222, "loss": 0.21803011000156403, "memory_gb": 7.721559524536133, "step_time_ms": 3360.961437225342, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:49] (step=0015222) Train Loss: 0.2863, Train Steps/Sec: 0.28, Epoch: 0.29580256509910613, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 15223, "loss": 0.16971687972545624, "memory_gb": 7.721559524536133, "step_time_ms": 3364.2587661743164, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:53] (step=0015223) Train Loss: 
0.1706, Train Steps/Sec: 0.28, Epoch: 0.2958219976680917, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:13:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15224, "loss": 0.163224458694458, "memory_gb": 7.721559524536133, "step_time_ms": 3351.9809246063232, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:13:57] (step=0015224) Train Loss: 0.1968, Train Steps/Sec: 0.28, Epoch: 0.2958414302370773, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 15225, "loss": 0.12340596318244934, "memory_gb": 7.721559524536133, "step_time_ms": 3360.957145690918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:00] (step=0015225) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.29586086280606294, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15226, "loss": 0.30811989307403564, "memory_gb": 7.721559524536133, "step_time_ms": 3358.8130474090576, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:04] (step=0015226) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.29588029537504856, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15227, "loss": 0.2375006377696991, "memory_gb": 7.721559524536133, "step_time_ms": 3352.213144302368, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:08] (step=0015227) Train Loss: 0.2166, Train Steps/Sec: 0.28, Epoch: 0.2958997279440342, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 15228, "loss": 0.1396569311618805, "memory_gb": 7.721559524536133, "step_time_ms": 3355.853319168091, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:11] (step=0015228) Train Loss: 0.1615, Train Steps/Sec: 0.28, Epoch: 0.2959191605130198, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15229, 
"loss": 0.32733744382858276, "memory_gb": 7.721559524536133, "step_time_ms": 3353.346347808838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:15] (step=0015229) Train Loss: 0.2579, Train Steps/Sec: 0.28, Epoch: 0.29593859308200543, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15230, "loss": 0.2608746886253357, "memory_gb": 7.721559524536133, "step_time_ms": 3344.268321990967, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:18] (step=0015230) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.29595802565099105, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15231, "loss": 0.2544410824775696, "memory_gb": 7.721559524536133, "step_time_ms": 3361.5190982818604, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:22] (step=0015231) Train Loss: 0.2306, Train Steps/Sec: 0.28, Epoch: 0.2959774582199767, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15232, "loss": 0.21069550514221191, "memory_gb": 7.721559524536133, "step_time_ms": 3363.607168197632, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:26] (step=0015232) Train Loss: 0.2354, Train Steps/Sec: 0.28, Epoch: 0.2959968907889623, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 15233, "loss": 0.17999494075775146, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0153408050537, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:29] (step=0015233) Train Loss: 0.2086, Train Steps/Sec: 0.28, Epoch: 0.2960163233579479, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15234, "loss": 0.2375715672969818, "memory_gb": 7.721559524536133, "step_time_ms": 3363.9984130859375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:33] 
(step=0015234) Train Loss: 0.1806, Train Steps/Sec: 0.28, Epoch: 0.29603575592693354, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 15235, "loss": 0.24391712248325348, "memory_gb": 7.721559524536133, "step_time_ms": 3363.6412620544434, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:36] (step=0015235) Train Loss: 0.2072, Train Steps/Sec: 0.28, Epoch: 0.29605518849591916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15236, "loss": 0.12879785895347595, "memory_gb": 7.721559524536133, "step_time_ms": 3363.3246421813965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:40] (step=0015236) Train Loss: 0.1997, Train Steps/Sec: 0.28, Epoch: 0.2960746210649048, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15237, "loss": 0.22414398193359375, "memory_gb": 7.721559524536133, "step_time_ms": 3360.729455947876, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:44] (step=0015237) Train Loss: 0.2048, Train Steps/Sec: 0.27, Epoch: 0.2960940536338904, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 15238, "loss": 0.18479064106941223, "memory_gb": 7.721559524536133, "step_time_ms": 3373.744010925293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:47] (step=0015238) Train Loss: 0.1903, Train Steps/Sec: 0.28, Epoch: 0.29611348620287603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15239, "loss": 0.2092421054840088, "memory_gb": 7.721559524536133, "step_time_ms": 3362.67352104187, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:51] (step=0015239) Train Loss: 0.2471, Train Steps/Sec: 0.28, Epoch: 0.29613291877186165, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:55] 
EFFICIENCY_METRICS: {"epoch": 0, "step": 15240, "loss": 0.15264005959033966, "memory_gb": 7.715639114379883, "step_time_ms": 3395.1151371002197, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:55] (step=0015240) Train Loss: 0.1865, Train Steps/Sec: 0.27, Epoch: 0.2961523513408473, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:14:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 15241, "loss": 0.22077986598014832, "memory_gb": 7.721559524536133, "step_time_ms": 3426.0127544403076, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:14:58] (step=0015241) Train Loss: 0.2530, Train Steps/Sec: 0.27, Epoch: 0.2961717839098329, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15242, "loss": 0.24182164669036865, "memory_gb": 7.721559524536133, "step_time_ms": 3421.593427658081, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:02] (step=0015242) Train Loss: 0.2258, Train Steps/Sec: 0.27, Epoch: 0.2961912164788185, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 15243, "loss": 0.16886936128139496, "memory_gb": 7.721559524536133, "step_time_ms": 4428.789377212524, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:07] (step=0015243) Train Loss: 0.2105, Train Steps/Sec: 0.22, Epoch: 0.29621064904780414, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15244, "loss": 0.11583300679922104, "memory_gb": 7.721559524536133, "step_time_ms": 6329.7436237335205, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:13] (step=0015244) Train Loss: 0.1642, Train Steps/Sec: 0.15, Epoch: 0.29623008161678976, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 15245, "loss": 0.25970786809921265, "memory_gb": 7.721559524536133, "step_time_ms": 5531.094789505005, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 15:15:20] (step=0015245) Train Loss: 0.2350, Train Steps/Sec: 0.14, Epoch: 0.2962495141857754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 15246, "loss": 0.2400842308998108, "memory_gb": 7.721559524536133, "step_time_ms": 5585.722208023071, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:27] (step=0015246) Train Loss: 0.2738, Train Steps/Sec: 0.15, Epoch: 0.29626894675476095, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 15247, "loss": 0.2372531145811081, "memory_gb": 7.721559524536133, "step_time_ms": 5674.100160598755, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:34] (step=0015247) Train Loss: 0.2089, Train Steps/Sec: 0.14, Epoch: 0.2962883793237466, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15248, "loss": 0.141059011220932, "memory_gb": 7.721559524536133, "step_time_ms": 6231.762886047363, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:40] (step=0015248) Train Loss: 0.1855, Train Steps/Sec: 0.16, Epoch: 0.2963078118927322, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15249, "loss": 0.24654945731163025, "memory_gb": 7.721559524536133, "step_time_ms": 6072.200059890747, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:46] (step=0015249) Train Loss: 0.2677, Train Steps/Sec: 0.16, Epoch: 0.2963272444617178, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:15:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 15250, "loss": 0.26702767610549927, "memory_gb": 7.721559524536133, "step_time_ms": 5524.350166320801, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:52] (step=0015250) Train Loss: 0.2272, Train Steps/Sec: 0.18, Epoch: 0.29634667703070344, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 15:15:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15251, "loss": 0.26603028178215027, "memory_gb": 7.721559524536133, "step_time_ms": 5279.375791549683, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:15:57] (step=0015251) Train Loss: 0.2125, Train Steps/Sec: 0.19, Epoch: 0.29636610959968906, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 15252, "loss": 0.29182207584381104, "memory_gb": 7.721559524536133, "step_time_ms": 5069.95964050293, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:03] (step=0015252) Train Loss: 0.2469, Train Steps/Sec: 0.18, Epoch: 0.2963855421686747, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 15253, "loss": 0.20306837558746338, "memory_gb": 7.721559524536133, "step_time_ms": 4556.715488433838, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:09] (step=0015253) Train Loss: 0.1791, Train Steps/Sec: 0.18, Epoch: 0.2964049747376603, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 15254, "loss": 0.17458149790763855, "memory_gb": 7.721559524536133, "step_time_ms": 3379.7402381896973, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:12] (step=0015254) Train Loss: 0.2275, Train Steps/Sec: 0.27, Epoch: 0.29642440730664593, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:16] EFFICIENCY_METRICS: {"epoch": 0, "step": 15255, "loss": 0.24506784975528717, "memory_gb": 7.715639114379883, "step_time_ms": 3330.9872150421143, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:16] (step=0015255) Train Loss: 0.2391, Train Steps/Sec: 0.28, Epoch: 0.29644383987563155, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:20] EFFICIENCY_METRICS: {"epoch": 0, "step": 15256, "loss": 0.14239299297332764, "memory_gb": 7.721559524536133, "step_time_ms": 
3368.8137531280518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:20] (step=0015256) Train Loss: 0.1660, Train Steps/Sec: 0.28, Epoch: 0.2964632724446172, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 15257, "loss": 0.3331018090248108, "memory_gb": 7.721559524536133, "step_time_ms": 3361.2849712371826, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:23] (step=0015257) Train Loss: 0.3112, Train Steps/Sec: 0.28, Epoch: 0.2964827050136028, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 15258, "loss": 0.19656690955162048, "memory_gb": 7.721559524536133, "step_time_ms": 3370.7826137542725, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:27] (step=0015258) Train Loss: 0.2053, Train Steps/Sec: 0.28, Epoch: 0.2965021375825884, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 15259, "loss": 0.2634344696998596, "memory_gb": 7.721559524536133, "step_time_ms": 3368.122339248657, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:30] (step=0015259) Train Loss: 0.2786, Train Steps/Sec: 0.28, Epoch: 0.29652157015157404, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:34] EFFICIENCY_METRICS: {"epoch": 0, "step": 15260, "loss": 0.1904803216457367, "memory_gb": 7.721559524536133, "step_time_ms": 3363.7685775756836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:34] (step=0015260) Train Loss: 0.2094, Train Steps/Sec: 0.28, Epoch: 0.29654100272055967, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 15261, "loss": 0.12277528643608093, "memory_gb": 7.721559524536133, "step_time_ms": 3373.3067512512207, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:38] (step=0015261) Train Loss: 0.1446, Train Steps/Sec: 0.28, Epoch: 
0.2965604352895453, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15262, "loss": 0.17975591123104095, "memory_gb": 7.721559524536133, "step_time_ms": 3374.056816101074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:41] (step=0015262) Train Loss: 0.1896, Train Steps/Sec: 0.28, Epoch: 0.2965798678585309, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 15263, "loss": 0.20791590213775635, "memory_gb": 7.721559524536133, "step_time_ms": 3373.0862140655518, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:45] (step=0015263) Train Loss: 0.1762, Train Steps/Sec: 0.28, Epoch: 0.29659930042751653, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15264, "loss": 0.14538592100143433, "memory_gb": 7.721559524536133, "step_time_ms": 3375.5064010620117, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:48] (step=0015264) Train Loss: 0.1954, Train Steps/Sec: 0.28, Epoch: 0.29661873299650215, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 15265, "loss": 0.32536908984184265, "memory_gb": 7.721559524536133, "step_time_ms": 3378.725528717041, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:52] (step=0015265) Train Loss: 0.2991, Train Steps/Sec: 0.27, Epoch: 0.2966381655654878, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 15266, "loss": 0.236395001411438, "memory_gb": 7.721559524536133, "step_time_ms": 3375.7739067077637, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:56] (step=0015266) Train Loss: 0.2639, Train Steps/Sec: 0.28, Epoch: 0.2966575981344734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:16:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 15267, "loss": 0.24085965752601624, 
"memory_gb": 7.721559524536133, "step_time_ms": 3377.5954246520996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:16:59] (step=0015267) Train Loss: 0.2768, Train Steps/Sec: 0.27, Epoch: 0.296677030703459, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:03] EFFICIENCY_METRICS: {"epoch": 0, "step": 15268, "loss": 0.33085566759109497, "memory_gb": 7.721559524536133, "step_time_ms": 3376.981496810913, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:03] (step=0015268) Train Loss: 0.2639, Train Steps/Sec: 0.28, Epoch: 0.29669646327244464, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 15269, "loss": 0.30220356583595276, "memory_gb": 7.721559524536133, "step_time_ms": 3379.345178604126, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:07] (step=0015269) Train Loss: 0.3233, Train Steps/Sec: 0.28, Epoch: 0.2967158958414302, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 15270, "loss": 0.15230774879455566, "memory_gb": 7.721559524536133, "step_time_ms": 3369.863510131836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:10] (step=0015270) Train Loss: 0.2003, Train Steps/Sec: 0.28, Epoch: 0.29673532841041583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 15271, "loss": 0.3019298315048218, "memory_gb": 7.721559524536133, "step_time_ms": 3386.064291000366, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:14] (step=0015271) Train Loss: 0.2697, Train Steps/Sec: 0.28, Epoch: 0.29675476097940146, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15272, "loss": 0.24064816534519196, "memory_gb": 7.721559524536133, "step_time_ms": 3390.1398181915283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:18] (step=0015272) Train Loss: 0.2218, 
Train Steps/Sec: 0.27, Epoch: 0.2967741935483871, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 15273, "loss": 0.17676663398742676, "memory_gb": 7.721559524536133, "step_time_ms": 3387.796401977539, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:21] (step=0015273) Train Loss: 0.2146, Train Steps/Sec: 0.28, Epoch: 0.2967936261173727, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 15274, "loss": 0.20215338468551636, "memory_gb": 7.721559524536133, "step_time_ms": 3390.592336654663, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:25] (step=0015274) Train Loss: 0.2030, Train Steps/Sec: 0.27, Epoch: 0.2968130586863583, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 15275, "loss": 0.2633441090583801, "memory_gb": 7.721559524536133, "step_time_ms": 3369.3387508392334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:28] (step=0015275) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.29683249125534394, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 15276, "loss": 0.3368711471557617, "memory_gb": 7.721559524536133, "step_time_ms": 3383.718729019165, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:32] (step=0015276) Train Loss: 0.2956, Train Steps/Sec: 0.28, Epoch: 0.29685192382432957, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 15277, "loss": 0.25745660066604614, "memory_gb": 7.721559524536133, "step_time_ms": 3383.578062057495, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:36] (step=0015277) Train Loss: 0.2815, Train Steps/Sec: 0.28, Epoch: 0.2968713563933152, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15278, "loss": 
0.1867455095052719, "memory_gb": 7.721559524536133, "step_time_ms": 3383.906364440918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:39] (step=0015278) Train Loss: 0.1533, Train Steps/Sec: 0.28, Epoch: 0.2968907889623008, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15279, "loss": 0.3002568185329437, "memory_gb": 7.721559524536133, "step_time_ms": 3379.328489303589, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:43] (step=0015279) Train Loss: 0.2659, Train Steps/Sec: 0.28, Epoch: 0.29691022153128643, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15280, "loss": 0.1930118352174759, "memory_gb": 7.721559524536133, "step_time_ms": 3373.274564743042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:46] (step=0015280) Train Loss: 0.2136, Train Steps/Sec: 0.28, Epoch: 0.29692965410027206, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 15281, "loss": 0.231153666973114, "memory_gb": 7.721559524536133, "step_time_ms": 3368.466377258301, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:50] (step=0015281) Train Loss: 0.2073, Train Steps/Sec: 0.28, Epoch: 0.2969490866692577, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15282, "loss": 0.22290939092636108, "memory_gb": 7.721559524536133, "step_time_ms": 3370.8276748657227, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:54] (step=0015282) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.2969685192382433, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:17:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15283, "loss": 0.2776256501674652, "memory_gb": 7.721559524536133, "step_time_ms": 3366.999864578247, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:17:57] (step=0015283) Train 
Loss: 0.2284, Train Steps/Sec: 0.28, Epoch: 0.2969879518072289, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15284, "loss": 0.21957996487617493, "memory_gb": 7.721559524536133, "step_time_ms": 3366.014242172241, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:01] (step=0015284) Train Loss: 0.2366, Train Steps/Sec: 0.28, Epoch: 0.29700738437621454, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15285, "loss": 0.22718022763729095, "memory_gb": 7.721559524536133, "step_time_ms": 3362.131357192993, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:04] (step=0015285) Train Loss: 0.2686, Train Steps/Sec: 0.27, Epoch: 0.29702681694520017, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15286, "loss": 0.22192981839179993, "memory_gb": 7.721559524536133, "step_time_ms": 3348.0215072631836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:08] (step=0015286) Train Loss: 0.2218, Train Steps/Sec: 0.29, Epoch: 0.2970462495141858, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 15287, "loss": 0.12724515795707703, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9027137756348, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:11] (step=0015287) Train Loss: 0.2260, Train Steps/Sec: 0.28, Epoch: 0.2970656820831714, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15288, "loss": 0.3513481318950653, "memory_gb": 7.721559524536133, "step_time_ms": 3361.752510070801, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:15] (step=0015288) Train Loss: 0.2923, Train Steps/Sec: 0.28, Epoch: 0.29708511465215703, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:18] EFFICIENCY_METRICS: {"epoch": 0, 
"step": 15289, "loss": 0.24790948629379272, "memory_gb": 7.721559524536133, "step_time_ms": 3361.7961406707764, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:18] (step=0015289) Train Loss: 0.2484, Train Steps/Sec: 0.28, Epoch: 0.29710454722114266, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15290, "loss": 0.18144488334655762, "memory_gb": 7.721559524536133, "step_time_ms": 3359.1248989105225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:22] (step=0015290) Train Loss: 0.2004, Train Steps/Sec: 0.28, Epoch: 0.2971239797901283, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 15291, "loss": 0.28347182273864746, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6630821228027, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:25] (step=0015291) Train Loss: 0.2028, Train Steps/Sec: 0.28, Epoch: 0.2971434123591139, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 15292, "loss": 0.2278130054473877, "memory_gb": 7.721559524536133, "step_time_ms": 3359.3666553497314, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:29] (step=0015292) Train Loss: 0.2060, Train Steps/Sec: 0.28, Epoch: 0.29716284492809947, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15293, "loss": 0.14845603704452515, "memory_gb": 7.721559524536133, "step_time_ms": 3358.0734729766846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:33] (step=0015293) Train Loss: 0.1866, Train Steps/Sec: 0.28, Epoch: 0.2971822774970851, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 15294, "loss": 0.16914421319961548, "memory_gb": 7.721559524536133, "step_time_ms": 3504.746198654175, "trainable_params": 4718592, "method": "lora"} 
[2025-07-29 15:18:36] (step=0015294) Train Loss: 0.2010, Train Steps/Sec: 0.28, Epoch: 0.2972017100660707, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15295, "loss": 0.16182935237884521, "memory_gb": 7.721559524536133, "step_time_ms": 3347.1062183380127, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:40] (step=0015295) Train Loss: 0.1418, Train Steps/Sec: 0.28, Epoch: 0.29722114263505633, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15296, "loss": 0.19450528919696808, "memory_gb": 7.721559524536133, "step_time_ms": 3361.0188961029053, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:43] (step=0015296) Train Loss: 0.2214, Train Steps/Sec: 0.28, Epoch: 0.29724057520404196, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 15297, "loss": 0.21195754408836365, "memory_gb": 7.721559524536133, "step_time_ms": 3358.125925064087, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:47] (step=0015297) Train Loss: 0.2015, Train Steps/Sec: 0.28, Epoch: 0.2972600077730276, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 15298, "loss": 0.250779390335083, "memory_gb": 7.721559524536133, "step_time_ms": 3361.159324645996, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:50] (step=0015298) Train Loss: 0.1956, Train Steps/Sec: 0.28, Epoch: 0.2972794403420132, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:18:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15299, "loss": 0.22682394087314606, "memory_gb": 7.721559524536133, "step_time_ms": 3359.414577484131, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:54] (step=0015299) Train Loss: 0.1992, Train Steps/Sec: 0.28, Epoch: 0.2972988729109988, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
15:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15300, "loss": 0.15577656030654907, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5238456726074, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:18:57] (step=0015300) Train Loss: 0.1989, Train Steps/Sec: 0.28, Epoch: 0.29731830547998445, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15301, "loss": 0.2652985453605652, "memory_gb": 7.721559524536133, "step_time_ms": 3358.78849029541, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:01] (step=0015301) Train Loss: 0.2810, Train Steps/Sec: 0.28, Epoch: 0.29733773804897007, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15302, "loss": 0.26799681782722473, "memory_gb": 7.721559524536133, "step_time_ms": 3360.335111618042, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:04] (step=0015302) Train Loss: 0.2253, Train Steps/Sec: 0.28, Epoch: 0.2973571706179557, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15303, "loss": 0.3514549732208252, "memory_gb": 7.721559524536133, "step_time_ms": 3357.6176166534424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:08] (step=0015303) Train Loss: 0.3343, Train Steps/Sec: 0.28, Epoch: 0.2973766031869413, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 15304, "loss": 0.20460033416748047, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0783348083496, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:11] (step=0015304) Train Loss: 0.2544, Train Steps/Sec: 0.28, Epoch: 0.29739603575592694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15305, "loss": 0.3437066674232483, "memory_gb": 7.721559524536133, "step_time_ms": 3358.5612773895264, 
"trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:15] (step=0015305) Train Loss: 0.2891, Train Steps/Sec: 0.28, Epoch: 0.29741546832491256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15306, "loss": 0.2771798372268677, "memory_gb": 7.721559524536133, "step_time_ms": 3356.4200401306152, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:18] (step=0015306) Train Loss: 0.2805, Train Steps/Sec: 0.28, Epoch: 0.2974349008938982, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15307, "loss": 0.21342405676841736, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5592765808105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:22] (step=0015307) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.2974543334628838, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15308, "loss": 0.305523157119751, "memory_gb": 7.721559524536133, "step_time_ms": 3356.618881225586, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:26] (step=0015308) Train Loss: 0.2392, Train Steps/Sec: 0.28, Epoch: 0.2974737660318694, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 15309, "loss": 0.2519451081752777, "memory_gb": 7.721559524536133, "step_time_ms": 3360.2678775787354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:29] (step=0015309) Train Loss: 0.2397, Train Steps/Sec: 0.28, Epoch: 0.29749319860085505, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15310, "loss": 0.2629469037055969, "memory_gb": 7.721559524536133, "step_time_ms": 3356.3942909240723, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:33] (step=0015310) Train Loss: 0.2614, Train Steps/Sec: 0.28, Epoch: 0.29751263116984067, LR: 0.001, Memory: 
7.72GB, Params: 4,718,592 [2025-07-29 15:19:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 15311, "loss": 0.11835827678442001, "memory_gb": 7.721559524536133, "step_time_ms": 3353.2466888427734, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:36] (step=0015311) Train Loss: 0.1309, Train Steps/Sec: 0.28, Epoch: 0.2975320637388263, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15312, "loss": 0.1526225507259369, "memory_gb": 7.721559524536133, "step_time_ms": 3355.63588142395, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:40] (step=0015312) Train Loss: 0.2180, Train Steps/Sec: 0.28, Epoch: 0.2975514963078119, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15313, "loss": 0.23823995888233185, "memory_gb": 7.721559524536133, "step_time_ms": 3355.9930324554443, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:43] (step=0015313) Train Loss: 0.2400, Train Steps/Sec: 0.28, Epoch: 0.29757092887679754, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:47] EFFICIENCY_METRICS: {"epoch": 0, "step": 15314, "loss": 0.20280298590660095, "memory_gb": 7.721559524536133, "step_time_ms": 3356.5571308135986, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:47] (step=0015314) Train Loss: 0.1953, Train Steps/Sec: 0.28, Epoch: 0.29759036144578316, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 15315, "loss": 0.29373812675476074, "memory_gb": 7.721559524536133, "step_time_ms": 3374.392509460449, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:50] (step=0015315) Train Loss: 0.2550, Train Steps/Sec: 0.28, Epoch: 0.2976097940147687, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15316, "loss": 0.26596593856811523, "memory_gb": 7.721559524536133, 
"step_time_ms": 3366.80006980896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:54] (step=0015316) Train Loss: 0.2802, Train Steps/Sec: 0.28, Epoch: 0.29762922658375435, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:19:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 15317, "loss": 0.18565690517425537, "memory_gb": 7.721559524536133, "step_time_ms": 3425.478219985962, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:19:58] (step=0015317) Train Loss: 0.2501, Train Steps/Sec: 0.28, Epoch: 0.29764865915273997, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15318, "loss": 0.3284914791584015, "memory_gb": 7.721559524536133, "step_time_ms": 3424.3085384368896, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:01] (step=0015318) Train Loss: 0.2804, Train Steps/Sec: 0.28, Epoch: 0.2976680917217256, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 15319, "loss": 0.21792840957641602, "memory_gb": 7.721559524536133, "step_time_ms": 3430.8202266693115, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:05] (step=0015319) Train Loss: 0.2352, Train Steps/Sec: 0.27, Epoch: 0.2976875242907112, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 15320, "loss": 0.2940099239349365, "memory_gb": 7.721559524536133, "step_time_ms": 3400.968551635742, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:09] (step=0015320) Train Loss: 0.2518, Train Steps/Sec: 0.28, Epoch: 0.29770695685969684, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 15321, "loss": 0.2443968802690506, "memory_gb": 7.721559524536133, "step_time_ms": 4932.098150253296, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:14] (step=0015321) Train Loss: 0.2186, Train Steps/Sec: 0.19, Epoch: 
0.29772638942868246, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15322, "loss": 0.3192799389362335, "memory_gb": 7.721559524536133, "step_time_ms": 4282.147645950317, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:18] (step=0015322) Train Loss: 0.2573, Train Steps/Sec: 0.23, Epoch: 0.2977458219976681, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 15323, "loss": 0.2648699879646301, "memory_gb": 7.721559524536133, "step_time_ms": 4225.78501701355, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:25] (step=0015323) Train Loss: 0.2724, Train Steps/Sec: 0.14, Epoch: 0.2977652545666537, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 15324, "loss": 0.3737904131412506, "memory_gb": 7.721559524536133, "step_time_ms": 4504.538536071777, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:32] (step=0015324) Train Loss: 0.2890, Train Steps/Sec: 0.14, Epoch: 0.2977846871356393, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15325, "loss": 0.2843354344367981, "memory_gb": 7.721559524536133, "step_time_ms": 4940.248250961304, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:39] (step=0015325) Train Loss: 0.2575, Train Steps/Sec: 0.15, Epoch: 0.29780411970462495, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:45] EFFICIENCY_METRICS: {"epoch": 0, "step": 15326, "loss": 0.24713164567947388, "memory_gb": 7.721559524536133, "step_time_ms": 5700.357913970947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:45] (step=0015326) Train Loss: 0.3106, Train Steps/Sec: 0.16, Epoch: 0.29782355227361057, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:52] EFFICIENCY_METRICS: {"epoch": 0, "step": 15327, "loss": 0.25643959641456604, "memory_gb": 
7.721559524536133, "step_time_ms": 5544.849395751953, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:52] (step=0015327) Train Loss: 0.2791, Train Steps/Sec: 0.15, Epoch: 0.2978429848425962, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:20:59] EFFICIENCY_METRICS: {"epoch": 0, "step": 15328, "loss": 0.2678789496421814, "memory_gb": 7.721559524536133, "step_time_ms": 4936.980724334717, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:20:59] (step=0015328) Train Loss: 0.2491, Train Steps/Sec: 0.14, Epoch: 0.2978624174115818, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15329, "loss": 0.23904259502887726, "memory_gb": 7.721559524536133, "step_time_ms": 3899.6927738189697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:06] (step=0015329) Train Loss: 0.2583, Train Steps/Sec: 0.14, Epoch: 0.29788184998056744, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 15330, "loss": 0.25258868932724, "memory_gb": 7.721559524536133, "step_time_ms": 5176.928997039795, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:11] (step=0015330) Train Loss: 0.2245, Train Steps/Sec: 0.19, Epoch: 0.29790128254955306, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15331, "loss": 0.19099682569503784, "memory_gb": 7.721559524536133, "step_time_ms": 3457.253932952881, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:15] (step=0015331) Train Loss: 0.1995, Train Steps/Sec: 0.27, Epoch: 0.2979207151185387, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15332, "loss": 0.2461755871772766, "memory_gb": 7.721559524536133, "step_time_ms": 3358.633518218994, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:18] (step=0015332) Train Loss: 0.2729, Train Steps/Sec: 
0.28, Epoch: 0.2979401476875243, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:22] EFFICIENCY_METRICS: {"epoch": 0, "step": 15333, "loss": 0.2654058337211609, "memory_gb": 7.721559524536133, "step_time_ms": 3360.983371734619, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:22] (step=0015333) Train Loss: 0.2669, Train Steps/Sec: 0.28, Epoch: 0.2979595802565099, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15334, "loss": 0.23745445907115936, "memory_gb": 7.721559524536133, "step_time_ms": 3506.3416957855225, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:26] (step=0015334) Train Loss: 0.2679, Train Steps/Sec: 0.28, Epoch: 0.29797901282549555, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 15335, "loss": 0.16194990277290344, "memory_gb": 7.721559524536133, "step_time_ms": 3365.1413917541504, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:29] (step=0015335) Train Loss: 0.2016, Train Steps/Sec: 0.27, Epoch: 0.29799844539448117, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15336, "loss": 0.254285991191864, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9635105133057, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:33] (step=0015336) Train Loss: 0.1993, Train Steps/Sec: 0.28, Epoch: 0.2980178779634668, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15337, "loss": 0.22440844774246216, "memory_gb": 7.721559524536133, "step_time_ms": 3360.0311279296875, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:37] (step=0015337) Train Loss: 0.2705, Train Steps/Sec: 0.28, Epoch: 0.2980373105324524, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:40] EFFICIENCY_METRICS: {"epoch": 0, "step": 15338, "loss": 
0.22051364183425903, "memory_gb": 7.721559524536133, "step_time_ms": 3360.966444015503, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:40] (step=0015338) Train Loss: 0.2068, Train Steps/Sec: 0.28, Epoch: 0.29805674310143804, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15339, "loss": 0.25738662481307983, "memory_gb": 7.721559524536133, "step_time_ms": 3365.3478622436523, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:44] (step=0015339) Train Loss: 0.2668, Train Steps/Sec: 0.28, Epoch: 0.2980761756704236, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15340, "loss": 0.15209244191646576, "memory_gb": 7.721559524536133, "step_time_ms": 3370.4373836517334, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:48] (step=0015340) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.2980956082394092, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:51] EFFICIENCY_METRICS: {"epoch": 0, "step": 15341, "loss": 0.21656224131584167, "memory_gb": 7.721559524536133, "step_time_ms": 3364.881753921509, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:51] (step=0015341) Train Loss: 0.1946, Train Steps/Sec: 0.28, Epoch: 0.29811504080839485, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 15342, "loss": 0.15100190043449402, "memory_gb": 7.721559524536133, "step_time_ms": 3369.60506439209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:55] (step=0015342) Train Loss: 0.1600, Train Steps/Sec: 0.28, Epoch: 0.29813447337738047, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:21:58] EFFICIENCY_METRICS: {"epoch": 0, "step": 15343, "loss": 0.3064503073692322, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0666217803955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:21:58] 
(step=0015343) Train Loss: 0.2952, Train Steps/Sec: 0.28, Epoch: 0.2981539059463661, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:02] EFFICIENCY_METRICS: {"epoch": 0, "step": 15344, "loss": 0.2564738988876343, "memory_gb": 7.721559524536133, "step_time_ms": 3369.013786315918, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:02] (step=0015344) Train Loss: 0.2503, Train Steps/Sec: 0.28, Epoch: 0.2981733385153517, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15345, "loss": 0.1811356395483017, "memory_gb": 7.721559524536133, "step_time_ms": 3378.941297531128, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:06] (step=0015345) Train Loss: 0.2317, Train Steps/Sec: 0.28, Epoch: 0.29819277108433734, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:09] EFFICIENCY_METRICS: {"epoch": 0, "step": 15346, "loss": 0.30939316749572754, "memory_gb": 7.721559524536133, "step_time_ms": 3363.1463050842285, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:09] (step=0015346) Train Loss: 0.2774, Train Steps/Sec: 0.28, Epoch: 0.29821220365332296, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:13] EFFICIENCY_METRICS: {"epoch": 0, "step": 15347, "loss": 0.19590452313423157, "memory_gb": 7.721559524536133, "step_time_ms": 3366.852045059204, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:13] (step=0015347) Train Loss: 0.2308, Train Steps/Sec: 0.28, Epoch: 0.2982316362223086, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 15348, "loss": 0.1951492577791214, "memory_gb": 7.721559524536133, "step_time_ms": 3360.841751098633, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:17] (step=0015348) Train Loss: 0.2201, Train Steps/Sec: 0.28, Epoch: 0.2982510687912942, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:20] EFFICIENCY_METRICS: 
{"epoch": 0, "step": 15349, "loss": 0.2565383315086365, "memory_gb": 7.721559524536133, "step_time_ms": 3368.793249130249, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:20] (step=0015349) Train Loss: 0.2660, Train Steps/Sec: 0.28, Epoch: 0.2982705013602798, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:24] EFFICIENCY_METRICS: {"epoch": 0, "step": 15350, "loss": 0.23308664560317993, "memory_gb": 7.721559524536133, "step_time_ms": 3368.988275527954, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:24] (step=0015350) Train Loss: 0.2313, Train Steps/Sec: 0.28, Epoch: 0.29828993392926545, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:27] EFFICIENCY_METRICS: {"epoch": 0, "step": 15351, "loss": 0.1931694895029068, "memory_gb": 7.721559524536133, "step_time_ms": 3365.7712936401367, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:27] (step=0015351) Train Loss: 0.1713, Train Steps/Sec: 0.28, Epoch: 0.29830936649825107, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:31] EFFICIENCY_METRICS: {"epoch": 0, "step": 15352, "loss": 0.17164312303066254, "memory_gb": 7.721559524536133, "step_time_ms": 3367.783308029175, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:31] (step=0015352) Train Loss: 0.2457, Train Steps/Sec: 0.28, Epoch: 0.2983287990672367, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 15353, "loss": 0.2903705835342407, "memory_gb": 7.721559524536133, "step_time_ms": 3359.5080375671387, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:35] (step=0015353) Train Loss: 0.2883, Train Steps/Sec: 0.28, Epoch: 0.2983482316362223, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:38] EFFICIENCY_METRICS: {"epoch": 0, "step": 15354, "loss": 0.22241009771823883, "memory_gb": 7.721559524536133, "step_time_ms": 3362.9226684570312, "trainable_params": 4718592, "method": "lora"} 
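The `EFFICIENCY_METRICS` entries above are flat JSON payloads embedded in the log text. A minimal sketch of extracting and summarizing them (field names are taken from the log itself; the helper name and sample string are illustrative, not part of the training code):

```python
import json
import re
from statistics import mean

# Matches the flat JSON object that follows each "EFFICIENCY_METRICS:" marker.
# Non-greedy {.*?} is safe here because the payloads contain no nested braces.
PATTERN = re.compile(r'EFFICIENCY_METRICS:\s*(\{.*?\})')

def parse_efficiency_metrics(text):
    """Yield one dict per EFFICIENCY_METRICS record found in the log text."""
    for match in PATTERN.finditer(text):
        yield json.loads(match.group(1))

# Sample record copied from the log above.
sample = (
    '[2025-07-29 15:18:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15300, '
    '"loss": 0.15577656030654907, "memory_gb": 7.721559524536133, '
    '"step_time_ms": 3358.5238456726074, "trainable_params": 4718592, '
    '"method": "lora"}'
)

records = list(parse_efficiency_metrics(sample))
avg_step_s = mean(r["step_time_ms"] for r in records) / 1000.0
print(records[0]["step"], records[0]["method"], round(avg_step_s, 2))
```

Run over the whole log, this recovers per-step loss, memory, and step-time series without touching the human-readable summary lines.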
[2025-07-29 15:22:38] (step=0015354) Train Loss: 0.2067, Train Steps/Sec: 0.28, Epoch: 0.29836766420520794, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:42] EFFICIENCY_METRICS: {"epoch": 0, "step": 15355, "loss": 0.1705862432718277, "memory_gb": 7.721559524536133, "step_time_ms": 3366.3980960845947, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:42] (step=0015355) Train Loss: 0.2256, Train Steps/Sec: 0.28, Epoch: 0.29838709677419356, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15356, "loss": 0.18798236548900604, "memory_gb": 7.721559524536133, "step_time_ms": 3371.6800212860107, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:46] (step=0015356) Train Loss: 0.2373, Train Steps/Sec: 0.28, Epoch: 0.2984065293431792, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 15357, "loss": 0.24927569925785065, "memory_gb": 7.721559524536133, "step_time_ms": 3368.736743927002, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:49] (step=0015357) Train Loss: 0.2307, Train Steps/Sec: 0.28, Epoch: 0.2984259619121648, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:53] EFFICIENCY_METRICS: {"epoch": 0, "step": 15358, "loss": 0.3125106394290924, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4747219085693, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:53] (step=0015358) Train Loss: 0.2787, Train Steps/Sec: 0.28, Epoch: 0.29844539448115043, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:22:56] EFFICIENCY_METRICS: {"epoch": 0, "step": 15359, "loss": 0.11499005556106567, "memory_gb": 7.721559524536133, "step_time_ms": 3367.8925037384033, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:22:56] (step=0015359) Train Loss: 0.2075, Train Steps/Sec: 0.28, Epoch: 0.29846482705013605, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 
15:23:00] EFFICIENCY_METRICS: {"epoch": 0, "step": 15360, "loss": 0.2702149748802185, "memory_gb": 7.721559524536133, "step_time_ms": 3373.0881214141846, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:00] (step=0015360) Train Loss: 0.2822, Train Steps/Sec: 0.28, Epoch: 0.2984842596191217, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15361, "loss": 0.2059195339679718, "memory_gb": 7.721559524536133, "step_time_ms": 3370.107650756836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:04] (step=0015361) Train Loss: 0.2113, Train Steps/Sec: 0.28, Epoch: 0.2985036921881073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:07] EFFICIENCY_METRICS: {"epoch": 0, "step": 15362, "loss": 0.277572900056839, "memory_gb": 7.721559524536133, "step_time_ms": 3360.4743480682373, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:07] (step=0015362) Train Loss: 0.2621, Train Steps/Sec: 0.28, Epoch: 0.29852312475709286, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:11] EFFICIENCY_METRICS: {"epoch": 0, "step": 15363, "loss": 0.36632493138313293, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1414432525635, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:11] (step=0015363) Train Loss: 0.3223, Train Steps/Sec: 0.28, Epoch: 0.2985425573260785, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 15364, "loss": 0.2220020294189453, "memory_gb": 7.721559524536133, "step_time_ms": 3358.6747646331787, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:14] (step=0015364) Train Loss: 0.2129, Train Steps/Sec: 0.28, Epoch: 0.2985619898950641, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:18] EFFICIENCY_METRICS: {"epoch": 0, "step": 15365, "loss": 0.2990211844444275, "memory_gb": 7.721559524536133, "step_time_ms": 3361.1137866973877, "trainable_params": 
4718592, "method": "lora"} [2025-07-29 15:23:18] (step=0015365) Train Loss: 0.3213, Train Steps/Sec: 0.28, Epoch: 0.29858142246404973, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 15366, "loss": 0.3649861514568329, "memory_gb": 7.715639114379883, "step_time_ms": 3329.0045261383057, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:21] (step=0015366) Train Loss: 0.3055, Train Steps/Sec: 0.28, Epoch: 0.29860085503303535, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 15367, "loss": 0.26791995763778687, "memory_gb": 7.721559524536133, "step_time_ms": 3359.375238418579, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:25] (step=0015367) Train Loss: 0.2675, Train Steps/Sec: 0.28, Epoch: 0.298620287602021, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:29] EFFICIENCY_METRICS: {"epoch": 0, "step": 15368, "loss": 0.24704957008361816, "memory_gb": 7.721559524536133, "step_time_ms": 3362.2233867645264, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:29] (step=0015368) Train Loss: 0.2390, Train Steps/Sec: 0.28, Epoch: 0.2986397201710066, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 15369, "loss": 0.2825302481651306, "memory_gb": 7.721559524536133, "step_time_ms": 3366.8224811553955, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:32] (step=0015369) Train Loss: 0.2413, Train Steps/Sec: 0.28, Epoch: 0.2986591527399922, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:36] EFFICIENCY_METRICS: {"epoch": 0, "step": 15370, "loss": 0.30643224716186523, "memory_gb": 7.721559524536133, "step_time_ms": 3365.821123123169, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:36] (step=0015370) Train Loss: 0.2688, Train Steps/Sec: 0.28, Epoch: 0.29867858530897784, LR: 0.001, Memory: 7.72GB, Params: 
4,718,592 [2025-07-29 15:23:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15371, "loss": 0.25766944885253906, "memory_gb": 7.721559524536133, "step_time_ms": 3360.944986343384, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:39] (step=0015371) Train Loss: 0.2521, Train Steps/Sec: 0.28, Epoch: 0.29869801787796346, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15372, "loss": 0.16539029777050018, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4302101135254, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:43] (step=0015372) Train Loss: 0.2309, Train Steps/Sec: 0.28, Epoch: 0.2987174504469491, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15373, "loss": 0.2748441696166992, "memory_gb": 7.721559524536133, "step_time_ms": 3380.706548690796, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:46] (step=0015373) Train Loss: 0.2911, Train Steps/Sec: 0.28, Epoch: 0.2987368830159347, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 15374, "loss": 0.28076332807540894, "memory_gb": 7.721559524536133, "step_time_ms": 3365.9963607788086, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:50] (step=0015374) Train Loss: 0.2728, Train Steps/Sec: 0.27, Epoch: 0.29875631558492033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15375, "loss": 0.2517397403717041, "memory_gb": 7.721559524536133, "step_time_ms": 3470.499038696289, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:54] (step=0015375) Train Loss: 0.2116, Train Steps/Sec: 0.27, Epoch: 0.29877574815390595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:23:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15376, "loss": 0.2342892289161682, "memory_gb": 7.721559524536133, "step_time_ms": 
3422.673225402832, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:23:57] (step=0015376) Train Loss: 0.2356, Train Steps/Sec: 0.28, Epoch: 0.2987951807228916, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15377, "loss": 0.2585550844669342, "memory_gb": 7.721559524536133, "step_time_ms": 3427.226781845093, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:01] (step=0015377) Train Loss: 0.2364, Train Steps/Sec: 0.28, Epoch: 0.2988146132918772, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:05] EFFICIENCY_METRICS: {"epoch": 0, "step": 15378, "loss": 0.166113018989563, "memory_gb": 7.721559524536133, "step_time_ms": 3409.900188446045, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:05] (step=0015378) Train Loss: 0.1694, Train Steps/Sec: 0.27, Epoch: 0.2988340458608628, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 15379, "loss": 0.1517319530248642, "memory_gb": 7.721559524536133, "step_time_ms": 4960.809707641602, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:10] (step=0015379) Train Loss: 0.2438, Train Steps/Sec: 0.19, Epoch: 0.29885347842984844, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 15380, "loss": 0.15756329894065857, "memory_gb": 7.721559524536133, "step_time_ms": 6082.041025161743, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:17] (step=0015380) Train Loss: 0.2334, Train Steps/Sec: 0.15, Epoch: 0.29887291099883406, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 15381, "loss": 0.22642135620117188, "memory_gb": 7.721559524536133, "step_time_ms": 6176.799774169922, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:23] (step=0015381) Train Loss: 0.2213, Train Steps/Sec: 0.15, Epoch: 0.2988923435678197, LR: 
0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 15382, "loss": 0.14869211614131927, "memory_gb": 7.721559524536133, "step_time_ms": 3985.215902328491, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:30] (step=0015382) Train Loss: 0.1742, Train Steps/Sec: 0.15, Epoch: 0.2989117761368053, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15383, "loss": 0.16446126997470856, "memory_gb": 7.721559524536133, "step_time_ms": 6201.413631439209, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:37] (step=0015383) Train Loss: 0.2054, Train Steps/Sec: 0.16, Epoch: 0.29893120870579093, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15384, "loss": 0.20030350983142853, "memory_gb": 7.721559524536133, "step_time_ms": 6156.52871131897, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:43] (step=0015384) Train Loss: 0.2489, Train Steps/Sec: 0.16, Epoch: 0.29895064127477655, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:49] EFFICIENCY_METRICS: {"epoch": 0, "step": 15385, "loss": 0.22709104418754578, "memory_gb": 7.721559524536133, "step_time_ms": 5793.418645858765, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:49] (step=0015385) Train Loss: 0.1959, Train Steps/Sec: 0.17, Epoch: 0.2989700738437621, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:24:55] EFFICIENCY_METRICS: {"epoch": 0, "step": 15386, "loss": 0.13402049243450165, "memory_gb": 7.721559524536133, "step_time_ms": 6238.09814453125, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:24:55] (step=0015386) Train Loss: 0.1861, Train Steps/Sec: 0.16, Epoch: 0.29898950641274774, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15387, "loss": 0.20521219074726105, "memory_gb": 7.721559524536133, 
"step_time_ms": 5418.314456939697, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:01] (step=0015387) Train Loss: 0.2644, Train Steps/Sec: 0.17, Epoch: 0.29900893898173336, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:06] EFFICIENCY_METRICS: {"epoch": 0, "step": 15388, "loss": 0.2441803514957428, "memory_gb": 7.721559524536133, "step_time_ms": 5127.280712127686, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:06] (step=0015388) Train Loss: 0.2461, Train Steps/Sec: 0.19, Epoch: 0.299028371550719, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:10] EFFICIENCY_METRICS: {"epoch": 0, "step": 15389, "loss": 0.22011345624923706, "memory_gb": 7.721559524536133, "step_time_ms": 3368.8652515411377, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:10] (step=0015389) Train Loss: 0.2658, Train Steps/Sec: 0.28, Epoch: 0.2990478041197046, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:14] EFFICIENCY_METRICS: {"epoch": 0, "step": 15390, "loss": 0.24950900673866272, "memory_gb": 7.721559524536133, "step_time_ms": 3372.9238510131836, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:14] (step=0015390) Train Loss: 0.2294, Train Steps/Sec: 0.28, Epoch: 0.29906723668869023, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:17] EFFICIENCY_METRICS: {"epoch": 0, "step": 15391, "loss": 0.15618100762367249, "memory_gb": 7.721559524536133, "step_time_ms": 3364.27903175354, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:17] (step=0015391) Train Loss: 0.1859, Train Steps/Sec: 0.28, Epoch: 0.29908666925767585, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:21] EFFICIENCY_METRICS: {"epoch": 0, "step": 15392, "loss": 0.353102445602417, "memory_gb": 7.721559524536133, "step_time_ms": 3367.4142360687256, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:21] (step=0015392) Train Loss: 0.3216, Train Steps/Sec: 0.28, Epoch: 
0.2991061018266615, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:25] EFFICIENCY_METRICS: {"epoch": 0, "step": 15393, "loss": 0.3128633499145508, "memory_gb": 7.721559524536133, "step_time_ms": 3360.9466552734375, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:25] (step=0015393) Train Loss: 0.2653, Train Steps/Sec: 0.28, Epoch: 0.2991255343956471, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:28] EFFICIENCY_METRICS: {"epoch": 0, "step": 15394, "loss": 0.12751340866088867, "memory_gb": 7.721559524536133, "step_time_ms": 3367.0217990875244, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:28] (step=0015394) Train Loss: 0.2071, Train Steps/Sec: 0.28, Epoch: 0.2991449669646327, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:32] EFFICIENCY_METRICS: {"epoch": 0, "step": 15395, "loss": 0.28417861461639404, "memory_gb": 7.721559524536133, "step_time_ms": 3368.4158325195312, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:32] (step=0015395) Train Loss: 0.2237, Train Steps/Sec: 0.28, Epoch: 0.29916439953361834, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:35] EFFICIENCY_METRICS: {"epoch": 0, "step": 15396, "loss": 0.23013126850128174, "memory_gb": 7.721559524536133, "step_time_ms": 3367.431163787842, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:35] (step=0015396) Train Loss: 0.1761, Train Steps/Sec: 0.28, Epoch: 0.29918383210260396, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:39] EFFICIENCY_METRICS: {"epoch": 0, "step": 15397, "loss": 0.16620200872421265, "memory_gb": 7.721559524536133, "step_time_ms": 3368.7775135040283, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:39] (step=0015397) Train Loss: 0.1940, Train Steps/Sec: 0.28, Epoch: 0.2992032646715896, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:43] EFFICIENCY_METRICS: {"epoch": 0, "step": 15398, "loss": 0.3136511445045471, 
"memory_gb": 7.721559524536133, "step_time_ms": 3374.3369579315186, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:43] (step=0015398) Train Loss: 0.2959, Train Steps/Sec: 0.28, Epoch: 0.2992226972405752, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:46] EFFICIENCY_METRICS: {"epoch": 0, "step": 15399, "loss": 0.2771787643432617, "memory_gb": 7.721559524536133, "step_time_ms": 3374.3271827697754, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:46] (step=0015399) Train Loss: 0.2548, Train Steps/Sec: 0.28, Epoch: 0.29924212980956083, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:50] EFFICIENCY_METRICS: {"epoch": 0, "step": 15400, "loss": 0.24340130388736725, "memory_gb": 7.721559524536133, "step_time_ms": 3364.0081882476807, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:50] (step=0015400) Train Loss: 0.2320, Train Steps/Sec: 0.28, Epoch: 0.29926156237854645, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:54] EFFICIENCY_METRICS: {"epoch": 0, "step": 15401, "loss": 0.2266361266374588, "memory_gb": 7.721559524536133, "step_time_ms": 3366.564989089966, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:54] (step=0015401) Train Loss: 0.2439, Train Steps/Sec: 0.28, Epoch: 0.2992809949475321, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:25:57] EFFICIENCY_METRICS: {"epoch": 0, "step": 15402, "loss": 0.31275004148483276, "memory_gb": 7.721559524536133, "step_time_ms": 3369.6532249450684, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:25:57] (step=0015402) Train Loss: 0.3034, Train Steps/Sec: 0.28, Epoch: 0.2993004275165177, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:01] EFFICIENCY_METRICS: {"epoch": 0, "step": 15403, "loss": 0.25420576333999634, "memory_gb": 7.721559524536133, "step_time_ms": 3368.288516998291, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:01] (step=0015403) Train Loss: 0.2465, 
Train Steps/Sec: 0.28, Epoch: 0.2993198600855033, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:04] EFFICIENCY_METRICS: {"epoch": 0, "step": 15404, "loss": 0.2853851318359375, "memory_gb": 7.721559524536133, "step_time_ms": 3366.332530975342, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:04] (step=0015404) Train Loss: 0.2613, Train Steps/Sec: 0.28, Epoch: 0.29933929265448894, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:08] EFFICIENCY_METRICS: {"epoch": 0, "step": 15405, "loss": 0.22337597608566284, "memory_gb": 7.721559524536133, "step_time_ms": 3368.2234287261963, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:08] (step=0015405) Train Loss: 0.2271, Train Steps/Sec: 0.28, Epoch: 0.29935872522347456, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:12] EFFICIENCY_METRICS: {"epoch": 0, "step": 15406, "loss": 0.31251850724220276, "memory_gb": 7.721559524536133, "step_time_ms": 3367.1793937683105, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:12] (step=0015406) Train Loss: 0.2937, Train Steps/Sec: 0.28, Epoch: 0.2993781577924602, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:15] EFFICIENCY_METRICS: {"epoch": 0, "step": 15407, "loss": 0.26549777388572693, "memory_gb": 7.721559524536133, "step_time_ms": 3360.1181507110596, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:15] (step=0015407) Train Loss: 0.2277, Train Steps/Sec: 0.28, Epoch: 0.2993975903614458, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:19] EFFICIENCY_METRICS: {"epoch": 0, "step": 15408, "loss": 0.2662068009376526, "memory_gb": 7.721559524536133, "step_time_ms": 3361.9279861450195, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:19] (step=0015408) Train Loss: 0.2318, Train Steps/Sec: 0.28, Epoch: 0.2994170229304314, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:23] EFFICIENCY_METRICS: {"epoch": 0, "step": 15409, 
"loss": 0.2030162513256073, "memory_gb": 7.721559524536133, "step_time_ms": 3367.680072784424, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:23] (step=0015409) Train Loss: 0.2036, Train Steps/Sec: 0.28, Epoch: 0.299436455499417, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:26] EFFICIENCY_METRICS: {"epoch": 0, "step": 15410, "loss": 0.33064189553260803, "memory_gb": 7.715639114379883, "step_time_ms": 3327.446460723877, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:26] (step=0015410) Train Loss: 0.2884, Train Steps/Sec: 0.28, Epoch: 0.2994558880684026, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:30] EFFICIENCY_METRICS: {"epoch": 0, "step": 15411, "loss": 0.21336591243743896, "memory_gb": 7.721559524536133, "step_time_ms": 3366.4512634277344, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:30] (step=0015411) Train Loss: 0.1915, Train Steps/Sec: 0.28, Epoch: 0.29947532063738824, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:33] EFFICIENCY_METRICS: {"epoch": 0, "step": 15412, "loss": 0.1503259241580963, "memory_gb": 7.721559524536133, "step_time_ms": 3364.935874938965, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:33] (step=0015412) Train Loss: 0.1622, Train Steps/Sec: 0.28, Epoch: 0.29949475320637386, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:37] EFFICIENCY_METRICS: {"epoch": 0, "step": 15413, "loss": 0.20225116610527039, "memory_gb": 7.721559524536133, "step_time_ms": 3366.7783737182617, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:37] (step=0015413) Train Loss: 0.2481, Train Steps/Sec: 0.28, Epoch: 0.2995141857753595, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:41] EFFICIENCY_METRICS: {"epoch": 0, "step": 15414, "loss": 0.23575788736343384, "memory_gb": 7.721559524536133, "step_time_ms": 3365.933418273926, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:41] 
(step=0015414) Train Loss: 0.2521, Train Steps/Sec: 0.27, Epoch: 0.2995336183443451, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:44] EFFICIENCY_METRICS: {"epoch": 0, "step": 15415, "loss": 0.21141397953033447, "memory_gb": 7.721559524536133, "step_time_ms": 3359.0526580810547, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:44] (step=0015415) Train Loss: 0.2196, Train Steps/Sec: 0.28, Epoch: 0.29955305091333073, LR: 0.001, Memory: 7.72GB, Params: 4,718,592 [2025-07-29 15:26:48] EFFICIENCY_METRICS: {"epoch": 0, "step": 15416, "loss": 0.3134481608867645, "memory_gb": 7.721559524536133, "step_time_ms": 3364.6390438079834, "trainable_params": 4718592, "method": "lora"} [2025-07-29 15:26:48] (step=0015416) Train Loss: 0.3188, Train Steps/Sec: 0.28, Epoch: 0.29957248348231635, LR: 0.001, Memory: 7.72GB, Params: 4,718,592