mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-08 12:48:28 +00:00
Commit Graph
-
e25873d557
fix: docs are missing osd packages for tesseract on RHEL (#1905)
VIktor Kuropiantnyk
2025-07-07 17:06:26 +02:00 -
b8813eea80
feat(vlm): Dynamic prompts (#1808)
Shkarupa Alex
2025-07-07 17:58:42 +03:00 -
edd4356aac
fix: use only backend for picture classifier (#1904)
Michele Dolfi
2025-07-07 16:23:16 +02:00 -
dd8fde7f19
fix: typo in asr options (#1902)
Michele Dolfi
2025-07-07 08:59:14 +02:00 -
f4a1c06937
chore: bump version to 2.40.0 [skip ci]
v2.40.0
github-actions[bot]
2025-07-04 15:31:36 +00:00 -
ec6cf6f7e8
feat: Introduce LayoutOptions to control layout postprocessing behaviour (#1870)
Christoph Auer
2025-07-04 15:36:13 +02:00 -
598c9c53d4
fix: Secure torch model inits with global locks (#1884)
Christoph Auer
2025-07-04 07:27:26 +02:00 -
13865c06f5
perf(msexcel): _find_table_bounds use iter_rows/iter_cols instead of Worksheet.cell (#1875)
Qiefan Jiang
2025-07-03 19:12:06 +08:00 -
3089cf2d26
perf: Move expensive imports closer to usage (#1863)
William Easton
2025-07-01 15:27:17 -05:00 -
56a0e104f7
feat: Integrate ListItemMarkerProcessor into document assembly (#1825)
Christoph Auer
2025-07-01 10:04:58 +02:00 -
bdfee4e2d0
chore: Safer unloading of DPv4 backend (#1867)
Christoph Auer
2025-06-30 14:41:21 +02:00 -
ae39a9411a
fix: Ensure that TesseractOcrModel does not crash in case OSD is not installed (#1866)
Nikos Livathinos
2025-06-30 10:55:56 +02:00 -
bb99be6c24
chore: bump version to 2.39.0 [skip ci]
v2.39.0
github-actions[bot]
2025-06-27 15:37:53 +00:00 -
0533da1923
feat: leverage new list modeling, capture default markers (#1856)
Panos Vagenas
2025-06-27 16:37:15 +02:00 -
6beec77788
update backends to leverage new list modeling
remodel-lists-2
Panos Vagenas
2025-06-27 10:21:57 +02:00 -
23dc50ee8f
chore: update docling-core & regenerate test data
Panos Vagenas
2025-06-27 06:52:31 +02:00 -
e79e4f0ab6
fix(markdown): make parsing of rich table cells valid (#1821)
Michael Honaker
2025-06-26 13:50:45 -04:00 -
ee4781075a
chore: bump version to 2.38.1 [skip ci]
v2.38.1
github-actions[bot]
2025-06-25 16:27:46 +00:00 -
d337825b8e
fix: updated granite vision model version for picture description (#1852)
pranaymiri
2025-06-25 21:19:56 +05:30 -
7c5614a37a
fix(markdown): fix single-formatted headings & list items (#1820)
Panos Vagenas
2025-06-25 13:05:06 +02:00 -
41e8cae26b
fix: fix response type of ollama (#1850)
Michele Dolfi
2025-06-25 04:33:09 -05:00 -
4002de1f92
fix: Handle missing runs to avoid out of range exception (#1844)
Allen N.
2025-06-24 22:55:27 -07:00 -
1dc63d0aa9
chore: bump version to 2.38.0 [skip ci]
v2.38.0
github-actions[bot]
2025-06-23 18:14:24 +00:00 -
f3ae3029b8
docs: update readme and add ASR example (#1836)
Peter W. J. Staar
2025-06-23 18:55:16 +02:00 -
1557e7ce3e
feat: Support audio input (#1763)
Peter W. J. Staar
2025-06-23 14:47:26 +02:00 -
d26dac61a8
fix(docx): ensure list items have a list parent (#1827)
Cesar Berrospi Ramis
2025-06-20 14:47:25 +02:00 -
1350a8d3e5
fix(msword_backend): Identify text in the same line after an image #1425 (#1610)
mkrssg
2025-06-20 10:55:30 +02:00 -
90da15f611
initial reference to granite-doclong
dev/add-granite-docling-preview
Peter Staar
2025-06-20 07:47:12 +02:00 -
64ac043786
docs: support running examples from root or subfolder (#1816)
Michele Dolfi
2025-06-19 04:10:40 -05:00 -
dd7f64ff28
fix: Ensure uninitialized pages are removed before assembling document (#1812)
Christoph Auer
2025-06-19 07:33:25 +02:00 -
861abcdcb0
feat(markdown): add formatting & improve inline support (#1804)
Panos Vagenas
2025-06-18 15:57:57 +02:00 -
215b540f6c
feat: Maximum image size for Vlm models (#1802)
Shkarupa Alex
2025-06-18 13:57:37 +03:00 -
dbab30e92c
fix: formula conversion with page_range param set (#1791)
Mahafuzur Rahman
2025-06-17 17:58:45 +06:00 -
c2ef69718a
chore: dco advisor (#1795)
Michele Dolfi
2025-06-17 02:45:56 -05:00 -
7bae3b6c06
chore: bump version to 2.37.0 [skip ci]
v2.37.0
github-actions[bot]
2025-06-16 11:02:54 +00:00 -
f28d23cf03
fix: pptx line break and space handling (#1664)
Martin Wind
2025-06-16 10:44:30 +02:00 -
b886e4df31
fix(asciidoc): set default size when missing in image directive (#1769)
Cesar Berrospi Ramis
2025-06-16 10:38:46 +02:00 -
7d3302cb48
feat: Make Page.parsed_page the only source of truth for text cells, add OCR cells to it (#1745)
Christoph Auer
2025-06-13 19:01:55 +02:00 -
0432a31b2f
docs: update vlm models api examples with LM Studio (#1759)
Michele Dolfi
2025-06-12 05:58:44 -05:00 -
7a275c7637
fix: Handle NoneType error in MsPowerpointDocumentBackend (#1747)
Bruno Rigal
2025-06-10 19:43:20 +02:00 -
df140227c3
feat: support xlsm files (#1520)
Ayraf
2025-06-10 20:25:59 +05:30 -
6613b9e98b
fix: prov for merged-elems (#1728)
Peter W. J. Staar
2025-06-10 11:22:42 +02:00 -
e979750ce9
fix(tesseract): initialize df_osd to avoid uninitialized variable error (#1718)
Maras Ioannis
2025-06-10 11:57:45 +03:00 -
f7f31137f1
fix: allow custom torch_dtype in vlm models (#1735)
Michele Dolfi
2025-06-10 03:52:15 -05:00 -
3a76433b83
Update test files
dev/fix_msword_backend_identify_text_after_image
Christoph Auer
2025-06-10 09:52:31 +02:00 -
5fac357995
Merge branch 'main' of github.com:docling-project/docling into dev/fix_msword_backend_identify_text_after_image
Christoph Auer
2025-06-10 09:52:15 +02:00 -
49b10e7419
docs: add open webui (#1734)
Michele Dolfi
2025-06-10 02:35:20 -05:00 -
52b8b9163f
Merge branch 'main' of https://github.com/docling-project/docling into dev/fix_msword_backend_identify_text_after_image
Michael Krissgau
2025-06-06 20:53:40 +02:00 -
9dbcb3d7d4
fix: Improve extraction from textboxes in Word docs (#1701)
AndrewTsai0406
2025-06-06 17:37:46 +08:00 -
2bc564ccef
Merge branch 'main' of https://github.com/docling-project/docling into dev/fix_msword_backend_identify_text_after_image
Michael Krissgau
2025-06-05 22:20:09 +02:00 -
a2b83fe4ae
fix: Add WEBP to the list of image file extensions (#1711)
Eugene
2025-06-05 11:09:27 +04:00 -
40df0d74ad
chore: bump version to 2.36.1 [skip ci]
v2.36.1
github-actions[bot]
2025-06-04 11:43:13 +00:00 -
8846f1a393
fix: remove typer and click constraints (#1707)
Michele Dolfi
2025-06-04 13:06:23 +02:00 -
be42b03f9b
docs: flash-attn usage and install (#1706)
Michele Dolfi
2025-06-04 11:09:54 +02:00 -
96c54dba91
chore: bump version to 2.36.0 [skip ci]
v2.36.0
github-actions[bot]
2025-06-03 13:54:25 +00:00 -
cdd401847a
feat: simplify dependencies, switch to uv (#1700)
Michele Dolfi
2025-06-03 15:18:54 +02:00 -
61d0d6c755
test: mark flaky test (#1698)
Panos Vagenas
2025-06-03 13:13:44 +02:00 -
cfdf4cea25
feat: new vlm-models support (#1570)
Peter W. J. Staar
2025-06-02 17:01:06 +02:00 -
08dcacc5cb
chore: bump version to 2.35.0 [skip ci]
v2.35.0
github-actions[bot]
2025-06-02 12:30:26 +00:00 -
11ca4f7a7b
docs: fix typo in index.md (#1676)
Edgar Hipp
2025-06-02 12:35:59 +02:00 -
1c8a1283c4
test: ensure utf-8 in test data utils (#1691)
Panos Vagenas
2025-06-02 12:13:19 +02:00 -
984cb137f6
fix: guess HTML content starting with script tag (#1673)
cp_main_20250602
Cesar Berrospi Ramis
2025-06-02 08:43:24 +02:00 -
fa561170f6
chore: Update lock with the dependencies for D-FINE
nli/layout_dfine
Nikos Livathinos
2025-05-31 16:57:09 +02:00 -
dcc63ae00b
Merge branch 'main' into nli/layout_rtdetr_v2
nli/layout_rtdetr_v2
Nikos Livathinos
2025-05-31 16:55:06 +02:00 -
7aa2be93d6
Merge branch 'main' into nli/layout_dfine
Nikos Livathinos
2025-05-31 16:48:28 +02:00 -
30dafd976d
chore: Update dependencies to docling-ibm-models and transformers to support D-FINE layout model
Nikos Livathinos
2025-05-31 16:39:25 +02:00 -
93d98dfa63
test: added groundtruth test files for fix(msword_backend): Identify text in the same line after an image / image anchor #1425
Michael Krissgau
2025-05-29 15:12:55 +02:00 -
84dc120d39
Merge branch 'main' of https://github.com/docling-project/docling into dev/fix_msword_backend_identify_text_after_image
Michael Krissgau
2025-05-29 15:04:06 +02:00 -
3942923125
chore: fix or ignore runtime and deprecation warnings (#1660)
Cesar Berrospi Ramis
2025-05-28 17:55:31 +02:00 -
b3e0042813
chore: exclude data from GH Linguist (#1671)
Panos Vagenas
2025-05-28 15:42:34 +02:00 -
106951e71e
test: add missing ground truth files (#1667)
Cesar Berrospi Ramis
2025-05-28 13:26:49 +02:00 -
b356b33059
feat: Add visualization of bbox on page with html export. (#1663)
Peter W. J. Staar
2025-05-28 13:10:38 +02:00 -
51d3450915
fix: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte (#1665)
DavidLee
2025-05-27 20:06:05 +08:00 -
2579d89510
chore: bump version to 2.34.0 [skip ci]
v2.34.0
github-actions[bot]
2025-05-22 18:44:45 +00:00 -
fffa865014
test: add test file and case for fix(msword_backend): Identify text in the same line after an image / image anchor #1425
Michael Krissgau
2025-05-22 19:02:59 +02:00 -
af4aaa28af
fix(msword_backend): Identify text in the same line after an image / image anchor #1425
Michael Krissgau
2025-05-22 17:45:15 +02:00 -
c2f595d283
fix: fix ZeroDivisionError for cell_bbox.area() (#1636)
Said Gürbüz
2025-05-22 13:43:33 +02:00 -
45265bf8b1
feat(ocr): auto-detect rotated pages in Tesseract (#1167)
Clément Doumouro
2025-05-21 18:12:33 +02:00 -
90875247e5
feat: Establish confidence estimation for document and pages (#1313)
Christoph Auer
2025-05-21 12:32:49 +02:00 -
14d4f5b109
fix(integration): update the Apify Actor integration (#1619)
Václav Vančura
2025-05-21 02:47:55 +02:00 -
84d0889829
chore: bump version to 2.33.0 [skip ci]
v2.33.0
github-actions[bot]
2025-05-20 19:54:51 +00:00 -
f4d9d4111b
fix: Fix issue with detecting docx files, and files with upper case extensions (#1609)
MoheyElDin Badr
2025-05-20 20:42:37 +03:00 -
0e00a263fa
fix: load_from_doctags static usage (#1617)
Said Gürbüz
2025-05-20 15:06:12 +02:00 -
f2e9c0784c
fix: incorrect force_backend_text behaviour for VLM DocTag pipelines (#1371)
Krishnan
2025-05-20 13:29:38 +05:30 -
98b5eeb844
fix(pypdfium): resolve overlapping text when merging bounding boxes (#1549)
Pedro Ribeiro
2025-05-19 14:26:00 +01:00 -
12a0e64892
feat: add textbox content extraction in msword_backend (#1538)
AndrewTsai0406
2025-05-19 21:01:36 +08:00 -
7c4c356e76
chore: fix chunking example data link (#1596)
Panos Vagenas
2025-05-16 08:44:47 +02:00 -
aeb0716bbb
chore: bump version to 2.32.0 [skip ci]
v2.32.0
github-actions[bot]
2025-05-14 14:28:21 +00:00 -
3a04f2a367
feat: Improve parallelization for remote services API calls (#1548)
Vinay R Damodaran
2025-05-14 06:47:55 -07:00 -
9f8b479f17
fix(ocr): orig field in TesseractOcrCliModel as str (#1553)
jimkarag02
2025-05-14 16:05:52 +03:00 -
9f28abf061
docs: add advanced chunking & serialization example (#1589)
Panos Vagenas
2025-05-14 13:35:07 +01:00 -
2efb7a7c06
fix(settings): fix nested settings load via environment variables (#1551)
Alex Sokolov
2025-05-14 14:42:10 +03:00 -
12dab0a1e8
feat: support image/webp file type (#1415)
Elwin
2025-05-14 15:47:28 +08:00 -
23238c241f
chore: bump version to 2.31.2 [skip ci]
v2.31.2
github-actions[bot]
2025-05-13 10:09:19 +00:00 -
4046d0b2f3
fix: AsciiDoc header identification (#1562) (#1563)
Marco Fargetta
2025-05-13 11:17:26 +02:00 -
8baa85a49d
fix: restrict click version and update lock file (#1582)
Michele Dolfi
2025-05-13 10:40:08 +02:00 -
0d0fa6cbe3
chore: bump version to 2.31.1 [skip ci]
v2.31.1
github-actions[bot]
2025-05-12 09:44:26 +00:00 -
127e38646f
fix: add smoldocling in download utils (#1577)
Michele Dolfi
2025-05-12 10:48:07 +02:00 -
76501331d2
need to fix ruff linter
dev/add-asr-pipeline
Peter Staar
2025-05-12 07:34:24 +02:00 -
32ad65cb9f
work in progress: slowly adding ASR pipeline and its derivatives
Peter Staar
2025-05-12 07:33:38 +02:00