mirror of
https://github.com/DS4SD/docling.git
synced 2025-12-08 12:48:28 +00:00
Commit Graph
add-json-export-indentation
adr-model-stages
cau/dpv4-test-updates
cau/fix-layout-vlm-pipeline-artifacts-path
cau/layout-vlm-pipeline-page-images
cau/multi-stage-vlm-pipeline
cau/new-layout-processing
cau/pin-docling-parse-pre-3.2
cau/test-dp-word-lines
cau/test-pypdfium2-beta
copilot/fix-document-timeout-bug
copilot/fix-keyerror-in-docling
copilot/fix-page-range-bug
cp_main_20250602
demo
dev-granite-docling-table
dev/add-asr-pipeline
dev/add-granite-docling-extension
dev/add-granite-docling-preview
dev/add-r2l-tests
dev/add-reading-order-model
dev/add-two-stage-vlm
dev/analysis-for-granite-docling
dev/doctag_backend
dev/fix_msword_backend_identify_text_after_image
dev/table-orientation
dev/update-html-parser-with-h1
dev/update-to-latest-docling-parse-again
docs/add-extraction-script
elh/update_2stage_inference
extend-metadata-in-examples
gh-pages
main
mao/doctags
mly/smol-docling-integration
nli/fix_glm_utils
nli/fix_ocr_tests
nli/layout_dfine
nli/layout_heron2
nli/layout_rtdetr_v2
nli/layoutmodel_improvements
nli/tesseract_ocr_models
ocr-enrichment
pretest-core-2-51-0
propagate-core-fixes-20250502
remodel-lists-2
revert-803-refactor_viz
rtdl/docx_latex
rtdl/drawingml_import
vku/uspto_meta
#1
#10
#100
#101
#1010
#1015
#1017
#102
#1021
#1024
#1027
#103
#1038
#1039
#1040
#1041
#1051
#1052
#1053
#1054
#1055
#1057
#1061
#1062
#1077
#1096
#1097
#1098
#11
#110
#1100
#1106
#1107
#111
#1114
#1115
#1118
#1124
#1130
#1140
#1141
#1147
#1150
#1152
#1154
#1156
#1158
#1160
#1165
#1167
#117
#1173
#118
#1182
#1183
#1194
#1196
#1197
#1199
#12
#120
#1201
#121
#1210
#122
#1220
#1222
#1223
#123
#1231
#1238
#1239
#1241
#1244
#1247
#1248
#1261
#1263
#1268
#1270
#1286
#129
#1294
#1295
#13
#131
#1313
#1315
#1316
#1319
#132
#1320
#1326
#1328
#1332
#1334
#1337
#134
#1340
#1346
#135
#1350
#1355
#1359
#1363
#1371
#1375
#1377
#1378
#1379
#138
#1381
#1382
#1383
#1389
#139
#1392
#1399
#14
#140
#1400
#1402
#141
#1411
#1415
#1416
#1419
#1427
#1428
#143
#1430
#1436
#1442
#1449
#145
#1458
#1459
#1463
#1465
#1486
#149
#1490
#1492
#1494
#1496
#15
#150
#1500
#151
#1511
#1512
#152
#1520
#1523
#1524
#1525
#1526
#1527
#1528
#153
#1530
#1536
#1538
#154
#1548
#1549
#155
#1551
#1553
#1556
#1559
#156
#1560
#1561
#1563
#1566
#157
#1570
#1576
#1577
#158
#1582
#1583
#1587
#1589
#159
#1593
#1596
#16
#160
#1600
#1609
#161
#1610
#1615
#1617
#1619
#162
#1636
#164
#1658
#1659
#1660
#1663
#1664
#1665
#1667
#1671
#1673
#1676
#1679
#168
#1683
#1684
#1688
#1689
#169
#1691
#1698
#17
#170
#1700
#1701
#1706
#1707
#171
#1711
#1717
#1718
#1723
#1724
#1725
#1728
#173
#1734
#1735
#1745
#1746
#1747
#175
#1759
#1763
#1769
#177
#1772
#1775
#178
#179
#1791
#1795
#18
#180
#1802
#1804
#1808
#1810
#1812
#1815
#1816
#1819
#182
#1820
#1821
#1824
#1825
#1827
#183
#1836
#1838
#184
#1844
#1850
#1851
#1852
#1856
#1857
#186
#1863
#1866
#1867
#187
#1870
#1874
#1875
#1876
#188
#1884
#189
#1897
#1898
#1898
#19
#190
#1902
#1904
#1905
#1907
#1908
#1910
#1912
#1914
#1917
#1923
#1925
#1926
#1928
#193
#1931
#1934
#1937
#194
#1940
#1943
#1948
#1951
#1952
#196
#1960
#1969
#1970
#1971
#1975
#1981
#1982
#1984
#1986
#1988
#1989
#1992
#1995
#2
#20
#2001
#2002
#2006
#2011
#2017
#2018
#2024
#203
#2031
#2039
#2042
#2048
#2061
#2068
#2069
#2078
#2079
#2083
#2084
#2084
#2088
#2093
#2094
#2095
#2095
#21
#2100
#2105
#2106
#2110
#2111
#2112
#2113
#2114
#2114
#2122
#2123
#2124
#2126
#2131
#2132
#2133
#2138
#214
#2141
#2146
#2154
#2155
#2165
#2166
#2169
#217
#217
#2171
#2178
#218
#218
#2183
#2185
#2187
#219
#2199
#22
#2200
#2208
#2212
#2218
#2219
#2227
#2227
#2231
#2234
#2237
#2238
#224
#2242
#2244
#2251
#2252
#226
#2262
#2264
#2265
#2266
#2272
#228
#2281
#2284
#2284
#2287
#2288
#229
#2291
#2294
#2304
#2309
#2313
#2315
#2322
#2323
#2324
#233
#2339
#234
#2340
#2341
#235
#2357
#2359
#2361
#2365
#2366
#2371
#2372
#2373
#2378
#2378
#2382
#2383
#2388
#2391
#2394
#240
#2401
#2403
#2403
#2404
#2407
#2409
#2409
#241
#2410
#2411
#2413
#2415
#2418
#2420
#2421
#2422
#2423
#2424
#2425
#2426
#2427
#2429
#2430
#2431
#2433
#2436
#2441
#2442
#2445
#2445
#2447
#2452
#2453
#2454
#2458
#2459
#2468
#2473
#2474
#248
#2484
#2486
#2488
#2488
#2489
#2498
#2499
#2501
#2502
#2503
#251
#2511
#2512
#2513
#2517
#2519
#2520
#2521
#2526
#2527
#2530
#2531
#2533
#2541
#2543
#2546
#2548
#2549
#2553
#2563
#2569
#2571
#2573
#2578
#2582
#2585
#2587
#2587
#2588
#2589
#259
#2596
#2599
#26
#2600
#2605
#2613
#2618
#2622
#2622
#2624
#2627
#2636
#2637
#2638
#2639
#2640
#2641
#2644
#2645
#2645
#2648
#2649
#2651
#2653
#2656
#2658
#2659
#2660
#2662
#2664
#2665
#2669
#2671
#2674
#2676
#2676
#2678
#2678
#2682
#2682
#2689
#2692
#2693
#27
#2706
#2707
#2708
#2712
#2716
#2717
#2720
#2721
#2721
#2723
#2723
#2728
#2735
#2738
#2738
#2739
#2740
#2740
#2741
#2741
#275
#276
#279
#28
#282
#286
#29
#290
#3
#302
#305
#307
#31
#310
#312
#314
#315
#316
#319
#32
#320
#322
#323
#325
#33
#330
#332
#334
#339
#34
#340
#341
#349
#35
#350
#36
#37
#371
#374
#375
#378
#379
#38
#384
#388
#39
#392
#393
#396
#4
#40
#401
#407
#408
#409
#415
#416
#42
#429
#43
#430
#432
#44
#442
#449
#45
#451
#456
#457
#46
#466
#467
#468
#47
#472
#474
#475
#482
#484
#487
#49
#490
#492
#495
#496
#497
#5
#50
#500
#501
#502
#504
#51
#511
#512
#513
#514
#517
#52
#528
#53
#530
#531
#532
#533
#534
#537
#54
#544
#549
#550
#551
#552
#555
#556
#557
#558
#56
#569
#57
#58
#59
#593
#6
#604
#606
#608
#613
#615
#616
#618
#624
#628
#63
#630
#631
#633
#642
#65
#650
#655
#656
#662
#675
#679
#68
#69
#691
#693
#694
#695
#697
#698
#7
#70
#700
#701
#702
#708
#71
#716
#717
#718
#719
#72
#733
#735
#739
#742
#75
#752
#759
#769
#772
#777
#783
#786
#788
#79
#793
#8
#80
#800
#801
#803
#804
#805
#808
#81
#811
#814
#815
#816
#817
#818
#819
#82
#820
#821
#824
#825
#826
#827
#83
#830
#831
#832
#837
#839
#84
#841
#842
#843
#850
#852
#853
#854
#855
#856
#857
#86
#862
#868
#869
#872
#873
#874
#875
#876
#878
#88
#880
#881
#883
#896
#897
#90
#901
#903
#905
#91
#910
#912
#916
#919
#92
#929
#93
#932
#935
#940
#941
#945
#948
#949
#95
#951
#958
#96
#965
#966
#967
#98
#99
#999
v0.1.1
v0.2.0
v0.3.0
v0.3.1
v0.4.0
v1.0.0
v1.0.1
v1.0.2
v1.1.0
v1.1.1
v1.1.2
v1.10.0
v1.11.0
v1.12.0
v1.12.1
v1.12.2
v1.13.0
v1.13.1
v1.14.0
v1.15.0
v1.16.0
v1.16.1
v1.17.0
v1.18.0
v1.19.0
v1.19.1
v1.2.0
v1.2.1
v1.20.0
v1.3.0
v1.4.0
v1.5.0
v1.6.0
v1.6.1
v1.6.2
v1.6.3
v1.7.0
v1.7.1
v1.8.0
v1.8.1
v1.8.2
v1.8.3
v1.8.4
v1.8.5
v1.9.0
v2.0.0
v2.1.0
v2.10.0
v2.11.0
v2.12.0
v2.13.0
v2.14.0
v2.15.0
v2.15.1
v2.16.0
v2.17.0
v2.18.0
v2.19.0
v2.2.0
v2.2.1
v2.20.0
v2.21.0
v2.22.0
v2.23.0
v2.23.1
v2.24.0
v2.25.0
v2.25.1
v2.25.2
v2.26.0
v2.27.0
v2.28.0
v2.28.1
v2.28.2
v2.28.3
v2.28.4
v2.29.0
v2.3.0
v2.3.1
v2.30.0
v2.31.0
v2.31.1
v2.31.2
v2.32.0
v2.33.0
v2.34.0
v2.35.0
v2.36.0
v2.36.1
v2.37.0
v2.38.0
v2.38.1
v2.39.0
v2.4.0
v2.4.1
v2.4.2
v2.40.0
v2.41.0
v2.42.0
v2.42.1
v2.42.2
v2.43.0
v2.44.0
v2.45.0
v2.46.0
v2.47.0
v2.47.1
v2.48.0
v2.49.0
v2.5.0
v2.5.1
v2.5.2
v2.50.0
v2.51.0
v2.52.0
v2.53.0
v2.54.0
v2.55.0
v2.55.1
v2.56.0
v2.56.1
v2.57.0
v2.58.0
v2.59.0
v2.6.0
v2.60.0
v2.60.1
v2.61.0
v2.61.1
v2.61.2
v2.62.0
v2.63.0
v2.64.0
v2.7.0
v2.7.1
v2.8.0
v2.8.1
v2.8.2
v2.8.3
v2.9.0
-
2d24faecd9
docs: add integrations, revamp docs (#693)
Panos Vagenas
2025-01-07 14:15:54 +01:00 -
d49650c54f
fix(mspowerpoint): handle invalid images in PowerPoint slides (#650)
Jinfeng Sun
2025-01-07 20:58:10 +08:00 -
0ee849e8bc
feat: added http header support for document converter and cli (#642)
Luke Harrison
2025-01-07 04:15:14 -05:00 -
569038df42
docs: Add OpenContracts as an integration (#679)
JSIV
2025-01-07 04:14:42 -05:00 -
2b591f9872
docs: add Weaviate RAG recipe notebook (#451)
m-newhauser
2024-12-19 14:57:40 -06:00 -
fc645ea531
docs: document Haystack & Vectara support (#628)
Panos Vagenas
2024-12-19 13:33:02 +01:00 -
1418fa1488
chore: bump version to 2.14.0 [skip ci]
v2.14.0
github-actions[bot]
2024-12-18 07:04:47 +00:00 -
fd034802b6
feat: Create a backend to transform PubMed XML files to DoclingDocument (#557)
Lucas Morin
2024-12-17 19:27:09 +01:00 -
e31f09f71f
chore: bump version to 2.13.0 [skip ci]
v2.13.0
github-actions[bot]
2024-12-17 17:01:04 +00:00 -
60dc852f16
feat: Updated Layout processing with forms and key-value areas (#530)
Christoph Auer
2024-12-17 17:32:24 +01:00 -
00dec7a2f3
test: generate file from CLI in a temporary directory (#618)
Cesar Berrospi Ramis
2024-12-17 16:35:42 +01:00 -
4e087504cc
feat: create a backend to parse USPTO patents into DoclingDocument (#606)
Cesar Berrospi Ramis
2024-12-17 16:35:23 +01:00 -
3e599c7bbe
docs: add Haystack RAG example (#615)
Panos Vagenas
2024-12-17 14:24:40 +01:00 -
b7f94183f1
Merge branch 'main' of github.com:DS4SD/docling into release_v3
cau/new-layout-processing
Christoph Auer
2024-12-17 14:07:58 +01:00 -
ec554cb4f2
Adjust confidence in EasyOcr
Christoph Auer
2024-12-17 13:45:59 +01:00 -
3b53bd38c8
feat: Add Easyocr parameter recog_network (#613)
itsainii
2024-12-17 16:47:18 +08:00 -
1f5b1d46ab
feat: Add Easyocr parameter recog_network (#613)
itsainii
2024-12-17 16:47:18 +08:00 -
3bb3bf5715
docs: Fix the path to the run_with_accelerator.py example (#608)
Nikos Livathinos
2024-12-16 15:03:06 +01:00 -
cf2606825a
docs: Fix the path to the run_with_accelerator.py example (#608)
Nikos Livathinos
2024-12-16 15:03:06 +01:00 -
0fd50e53be
Fix form and key value area groups
Christoph Auer
2024-12-16 15:01:27 +01:00 -
efc25225ac
Introduce OCR confidence, propagate to orphan in post-processing
Christoph Auer
2024-12-16 14:42:01 +01:00 -
c020f2cba3
Rebase from main
Christoph Auer
2024-12-16 11:26:24 +01:00 -
a2db5fbd0f
chore: bump version to 2.12.0 [skip ci]
v2.12.0
github-actions[bot]
2024-12-13 18:27:00 +00:00 -
31184ad516
chore: bump version to 2.12.0 [skip ci]
github-actions[bot]
2024-12-13 18:27:00 +00:00 -
19fad9261c
feat: Introduce support for GPU Accelerators (#593)
Nikos Livathinos
2024-12-13 17:45:22 +01:00 -
16bd38cbf4
feat: Introduce support for GPU Accelerators (#593)
Nikos Livathinos
2024-12-13 17:45:22 +01:00 -
8cb7d8327a
Fixes for cluster pre-ordering
Christoph Auer
2024-12-13 14:17:21 +01:00 -
d972a29f2a
Fix table box snapping
Christoph Auer
2024-12-13 08:44:22 +01:00 -
12ccf20ddc
Update test GT
Christoph Auer
2024-12-12 20:37:48 +01:00 -
1aaf34056f
Merge from main
Christoph Auer
2024-12-12 20:17:24 +01:00 -
ccab2db1d4
Update pinnings to docling-core
Christoph Auer
2024-12-12 20:15:15 +01:00 -
365a1e7b98
chore: bump version to 2.11.0 [skip ci]
v2.11.0
github-actions[bot]
2024-12-12 08:16:05 +00:00 -
d1d0ddd924
chore: bump version to 2.11.0 [skip ci]
github-actions[bot]
2024-12-12 08:16:05 +00:00 -
57d51ede04
Many layout processing improvements, add document index type
Christoph Auer
2024-12-11 17:08:35 +01:00 -
3da166eafa
feat: Add timeout limit to document parsing job. DS4SD#270 (#552)
Abhishek Kumar
2024-12-11 19:36:10 +05:30 -
f407f68716
feat: Add timeout limit to document parsing job. DS4SD#270 (#552)
Abhishek Kumar
2024-12-11 19:36:10 +05:30 -
d094c4990a
Repin to release package versions
Christoph Auer
2024-12-11 13:16:35 +01:00 -
038791a25f
Rebase from main
Christoph Auer
2024-12-11 12:30:45 +01:00 -
aee9c0b324
fix: Do not import python modules from deepsearch-glm (#569)
Christoph Auer
2024-12-11 12:29:06 +01:00 -
443c28557c
fix: Do not import python modules from deepsearch-glm (#569)
Christoph Auer
2024-12-11 12:29:06 +01:00 -
05c8cb0fba
Update HF model ref, reset test generate
Christoph Auer
2024-12-10 20:02:19 +01:00 -
1de42bef6a
Update tests
Christoph Auer
2024-12-10 16:47:58 +01:00 -
5e013294f9
Update lockfile
Christoph Auer
2024-12-10 16:42:57 +01:00 -
76a6b13a92
Rebase from main
Christoph Auer
2024-12-10 16:32:48 +01:00 -
b66fb830c9
Merge pull request #556 from DS4SD/cau/layout-processing-improvement
Christoph Auer
2024-12-10 16:29:07 +01:00 -
184eed4095
Merge pull request #514 from DS4SD/nli/performance
Christoph Auer
2024-12-10 16:26:27 +01:00 -
f45499ce93
fix: Handle no result from RapidOcr reader (#558)
Christoph Auer
2024-12-10 16:25:05 +01:00 -
861e6fa90c
fix: Handle no result from RapidOcr reader (#558)
Christoph Auer
2024-12-10 16:25:05 +01:00 -
5c69081453
fix: Ocr AccleratorDevice
Nikos Livathinos
2024-12-10 15:23:56 +00:00 -
6bc1bd2ec4
fix: Correct the way to set GPU for EasyOCR, RapidOCR
Nikos Livathinos
2024-12-10 15:05:00 +00:00 -
d0c9e8e508
docs: update chunking usage docs, minor reorg (#550)
Panos Vagenas
2024-12-10 16:03:02 +01:00 -
6f986d26e1
docs: update chunking usage docs, minor reorg (#550)
Panos Vagenas
2024-12-10 16:03:02 +01:00 -
99ccb69a47
fix: Do proper check to set the device in EasyOCR, RapidOCR.
Nikos Livathinos
2024-12-10 14:46:21 +00:00 -
a7df337654
fix: make enum serializable with human-readable value (#555)
Michele Dolfi
2024-12-10 13:12:44 +01:00 -
1a3daf2ffb
fix: make enum serializable with human-readable value (#555)
Michele Dolfi
2024-12-10 13:12:44 +01:00 -
eb30c4f763
chore: bump version to 2.10.0 [skip ci]
v2.10.0
github-actions[bot]
2024-12-09 16:28:46 +00:00 -
ca83a1f0c9
chore: bump version to 2.10.0 [skip ci]
github-actions[bot]
2024-12-09 16:28:46 +00:00 -
7972d47f88
fix: Call into docling-core for legacy document transform (#551)
Christoph Auer
2024-12-09 17:06:47 +01:00 -
440c16ff20
fix: Call into docling-core for legacy document transform (#551)
Christoph Auer
2024-12-09 17:06:47 +01:00 -
ce82e23b66
Merge branch 'release_v3' into nli/performance
Christoph Auer
2024-12-09 16:52:54 +01:00 -
d006b937ad
Rebase from main
Christoph Auer
2024-12-09 16:52:26 +01:00 -
78f61a8522
fix: Introduce Image format options in CLI. Silence the tqdm downloading messages. (#544)
Nikos Livathinos
2024-12-09 15:57:37 +01:00 -
c21ada4b22
fix: Introduce Image format options in CLI. Silence the tqdm downloading messages. (#544)
Nikos Livathinos
2024-12-09 15:57:37 +01:00 -
fbb28b851d
Updated test ground-truth (again), bugfix for empty layout
Christoph Auer
2024-12-09 13:50:04 +01:00 -
aca57f0527
feat: docling-parse v2 as default PDF backend (#549)
Christoph Auer
2024-12-09 13:26:17 +01:00 -
840f5e15ed
feat: docling-parse v2 as default PDF backend (#549)
Christoph Auer
2024-12-09 13:26:17 +01:00 -
731e48ea43
Updated test ground-truth
Christoph Auer
2024-12-09 13:19:38 +01:00 -
1149d3ae08
fix: TableStructureModel: Refactor the artifacts path to use the new structure for fast/accurate model
Nikos Livathinos
2024-12-09 11:12:28 +01:00 -
9fd2cf847a
chore: bump version to 2.9.0 [skip ci]
v2.9.0
github-actions[bot]
2024-12-09 09:33:55 +00:00 -
d15d656c39
chore: bump version to 2.9.0 [skip ci]
github-actions[bot]
2024-12-09 09:33:55 +00:00 -
c8ecdd987e
feat: expose new hybrid chunker, update docs (#384)
Panos Vagenas
2024-12-09 08:28:29 +01:00 -
48d2cb3505
feat: expose new hybrid chunker, update docs (#384)
Panos Vagenas
2024-12-09 08:28:29 +01:00 -
eb7ffcdd1c
fix: Correcting DefaultText ID for MS Word backend (#537)
Maxim Lysak
2024-12-06 15:48:35 +01:00 -
dc71b8c004
fix: Correcting DefaultText ID for MS Word backend (#537)
Maxim Lysak
2024-12-06 15:48:35 +01:00 -
3e073dfbeb
feat(MS Word backend): Make detection of headers and other styles localization agnostic (#534)
Maxim Lysak
2024-12-06 15:17:56 +01:00 -
c31d9f032e
feat(MS Word backend): Make detection of headers and other styles localization agnostic (#534)
Maxim Lysak
2024-12-06 15:17:56 +01:00 -
f63e5ef3b5
fix: Improve the pydantic objects in the pipeline_options and imports.
Nikos Livathinos
2024-12-06 14:56:35 +01:00 -
53039a8367
ci: allow ! in conventionalcommits (#533)
Michele Dolfi
2024-12-06 14:50:10 +01:00 -
a38f57efce
ci: allow ! in conventionalcommits (#533)
Michele Dolfi
2024-12-06 14:50:10 +01:00 -
9102fe1adc
fix: Add py.typed marker file (#531)
Sander Maijers
2024-12-06 13:42:14 +01:00 -
ba32fb8637
fix: Add py.typed marker file (#531)
Sander Maijers
2024-12-06 13:42:14 +01:00 -
eb02a3235f
merged with main
dev/update-html-parser-with-h1
Peter Staar
2024-12-06 13:23:53 +01:00 -
6f7b128867
docs: document new integrations (#532)
Panos Vagenas
2024-12-06 13:18:14 +01:00 -
e780333440
docs: document new integrations (#532)
Panos Vagenas
2024-12-06 13:18:14 +01:00 -
54b4daa2dd
fix: Enable HTML export in CLI and add options for image mode (#513)
Peter W. J. Staar
2024-12-06 12:37:57 +01:00 -
0d11e30dd8
fix: Enable HTML export in CLI and add options for image mode (#513)
Peter W. J. Staar
2024-12-06 12:37:57 +01:00 -
63f1125d5c
fix: Missing text in docx (t tag) when embedded in a table (#528)
Maxim Lysak
2024-12-06 12:37:25 +01:00 -
b730b2d7a0
fix: Missing text in docx (t tag) when embedded in a table (#528)
Maxim Lysak
2024-12-06 12:37:25 +01:00 -
71f3a7ac3c
Rebase from release_v3
Christoph Auer
2024-12-06 12:33:38 +01:00 -
b0da1a2127
Merge pull request #504 from DS4SD/cau/layout-postprocessing
Christoph Auer
2024-12-06 12:26:34 +01:00 -
bed92b766f
fix: restore pydantic version pin after fixes (#512)
Michele Dolfi
2024-12-06 09:33:39 +01:00 -
c830b92b2e
fix: restore pydantic version pin after fixes (#512)
Michele Dolfi
2024-12-06 09:33:39 +01:00 -
3bb7df66ca
feat(Accelerator): Introduce options to control the num_threads and device from API, envvars, CLI.
- Introduce the AcceleratorOptions, AcceleratorDevice and use them to set the device where the models run.
- Introduce the accelerator_utils with function to decide the device and resolve the AUTO setting.
- Refactor the way how the docling-ibm-models are called to match the new init signature of models.
- Translate the accelerator options to the specific inputs for third-party models.
- Extend the docling CLI with parameters to set the num_threads and device.
- Add new unit tests.
- Write new example how to use the accelerator options.
Nikos Livathinos
2024-12-02 18:27:44 +01:00 -
84f3548d30
Clean up imports again
Christoph Auer
2024-12-04 15:22:43 +01:00 -
e36f7d82f6
fix: folder input in cli (#511)
Michele Dolfi
2024-12-04 14:22:00 +01:00 -
8ada0bccc7
fix: folder input in cli (#511)
Michele Dolfi
2024-12-04 14:22:00 +01:00 -
e97688cd3d
Merge branch 'release_v3' of github.com:DS4SD/docling into cau/layout-postprocessing
Christoph Auer
2024-12-04 14:21:09 +01:00 -
11c7c43bad
Move to_docling_document from ds-glm to this repo
Christoph Auer
2024-12-04 13:11:41 +01:00 -
9c788ae778
chore: bump version to 2.8.3 [skip ci]
v2.8.3
github-actions[bot]
2024-12-03 15:16:47 +00:00 -
78fad801fe
chore: bump version to 2.8.3 [skip ci]
github-actions[bot]
2024-12-03 15:16:47 +00:00