mirror of
https://github.com/DS4SD/docling.git
synced 2025-07-31 14:34:40 +00:00
* fix(html-backend): improve accordion extraction and hidden content handling

  - Add specialized handlers for Bootstrap accordion components to properly extract questions from panel-title elements
  - Implement is_hidden_element() method to detect and skip content with hidden classes, styles, and attributes
  - Update walk(), analyze_tag(), and extract_text_recursively() to filter out hidden elements
  - Add comprehensive test suite with direct method tests and example HTML files

  This fixes two issues:
  1. Missing questions in accordion components
  2. Unwanted extraction of hidden metadata content

  Tests: tests/test_html_enhanced.py

  Signed-off-by: Ulan.Yisaev <ulan.yisaev@nortal.com>

* + html-backend itelsd

  Signed-off-by: Ulan.Yisaev <ulan.yisaev@nortal.com>

* run pre-commit run --all-files

---------

Signed-off-by: Ulan.Yisaev <ulan.yisaev@nortal.com>
Co-authored-by: Ulan.Yisaev <ulan.yisaev@nortal.com>
45 lines
2.4 KiB
HTML
<!DOCTYPE html>
<html>
<head>
<title>Accordion Test</title>
</head>
<body>
<div class="row">
<div class="col-xs-12">
<h3>Account Information FAQ</h3>
<div class="row">
<div class="col-xs-12">
<div class="panel-group" id="accordion-36" role="tablist" aria-multiselectable="true">
<div class="panel panel-default">
<div class="panel-heading" id="accordion-36h0kkk">
<div class="panel-title">
<a class="collapsed" role="button" data-toggle="collapse" data-path="faq/account-information" data-id="digitally-signed-statement" data-target="#accordion-36h0kkk.panel-collapse" aria-controls="accordion-36h0kkk">1. How can I get a digitally signed bank statement?</a>
</div>
</div>
<div class="panel-collapse collapse" id="accordion-36h0kkk" role="tabpanel" aria-labelledby="accordion-36h0kkk">
<div class="panel-body">
<p>You can download your statement from the online banking portal.</p>
<div class="keywords hidden">Account Information FAQ</div>
</div>
</div>
</div>
<div class="panel panel-default">
<div class="panel-heading" id="accordion-36h1kkk">
<div class="panel-title">
<a class="collapsed" role="button" data-toggle="collapse" data-path="faq/account-information" data-id="change-contact-details" data-target="#accordion-36h1kkk.panel-collapse" aria-controls="accordion-36h1kkk">2. How do I update my contact information?</a>
</div>
</div>
<div class="panel-collapse collapse" id="accordion-36h1kkk" role="tabpanel" aria-labelledby="accordion-36h1kkk">
<div class="panel-body">
<p>You can update your contact details through the online banking portal.</p>
<div class="keywords hidden">Account Information FAQ</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
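The commit message says hidden content is detected through classes, styles, and attributes and then skipped during extraction. The sketch below illustrates that idea on a trimmed fragment modeled on this fixture, using only Python's standard `html.parser`. It is not the docling implementation: `HIDDEN_CLASSES`, `is_hidden`, and `VisibleTextParser` are hypothetical names, the class list is an assumption, and the parser assumes well-formed markup with balanced tags.

```python
from html.parser import HTMLParser

# Assumed list of "hidden" class names; the real backend may use a different set.
# "collapse" is deliberately excluded so collapsed accordion answers stay visible.
HIDDEN_CLASSES = {"hidden", "d-none", "hide", "invisible"}

# Void elements never get an end tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def is_hidden(attrs):
    """Return True if an attribute dict marks the element as hidden."""
    classes = set((attrs.get("class") or "").split())
    style = (attrs.get("style") or "").replace(" ", "").lower()
    return (
        bool(classes & HIDDEN_CLASSES)
        or "hidden" in attrs
        or "display:none" in style
        or "visibility:hidden" in style
    )


class VisibleTextParser(HTMLParser):
    """Collect text nodes that are not inside a hidden subtree."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._stack = []  # one flag per open element: inside a hidden subtree?

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        parent_hidden = self._stack[-1] if self._stack else False
        self._stack.append(parent_hidden or is_hidden(dict(attrs)))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and not (self._stack and self._stack[-1]):
            self.texts.append(text)


SAMPLE = """
<div class="panel-title">
  <a class="collapsed">1. How can I get a digitally signed bank statement?</a>
</div>
<div class="panel-collapse collapse">
  <div class="panel-body">
    <p>You can download your statement from the online banking portal.</p>
    <div class="keywords hidden">Account Information FAQ</div>
  </div>
</div>
"""

parser = VisibleTextParser()
parser.feed(SAMPLE)
parser.close()
print(parser.texts)
```

With this fragment, the question from the `panel-title` element and the answer paragraph are collected, while the `keywords hidden` block is dropped. Propagating the hidden flag down a stack means any descendant of a hidden element is skipped as well, which is the behavior the commit message describes for the recursive text extraction.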