fix(markdown): make parsing of rich table cells valid (#1821)

* fix: update md table classification

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

* Fix ground truth header changes

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

* Fix merge issues

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

* Fix minor ground truth errors

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>

---------

Signed-off-by: Michael Honaker <Michael.Honaker@ibm.com>
Michael Honaker
2025-06-26 13:50:45 -04:00
committed by GitHub
parent ee4781075a
commit e79e4f0ab6
4 changed files with 322 additions and 24 deletions

@@ -335,7 +335,7 @@ class MarkdownDocumentBackend(DeclarativeDocumentBackend):
 _log.debug(f" - Paragraph (raw text): {element.children}")
 snippet_text = element.children.strip()
 # Detect start of the table:
-if "|" in snippet_text:
+if "|" in snippet_text or self.in_table:
     # most likely part of the markdown table
     self.in_table = True
     if len(self.md_table_buffer) > 0:
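
The effect of the added `or self.in_table` condition can be illustrated with a small standalone sketch. It is not the backend's code: the class `TableFragmentClassifier`, its `sticky` switch, the `plain_text` list, and the `end_block` reset are hypothetical names introduced here, and the assumption is that a table row with rich (for example bold) cell content may reach the handler as several raw-text fragments, some of which contain no `|`.

```python
from typing import List


class TableFragmentClassifier:
    """Hypothetical stand-in for the table-detection state in the diff above."""

    def __init__(self, sticky: bool) -> None:
        # sticky=False mimics the old check, sticky=True the new one.
        self.sticky = sticky
        self.in_table = False
        self.md_table_buffer: List[str] = []
        self.plain_text: List[str] = []

    def feed(self, fragment: str) -> None:
        snippet_text = fragment.strip()
        # New behaviour: once a table has started, later fragments stay in it
        # even when they contain no pipe character.
        if "|" in snippet_text or (self.sticky and self.in_table):
            self.in_table = True
            self.md_table_buffer.append(snippet_text)
        else:
            self.plain_text.append(snippet_text)

    def end_block(self) -> None:
        # Assumption: some other event (e.g. a non-paragraph element)
        # closes the table and resets the flag.
        self.in_table = False


# One table row whose middle cell arrives as a separate, pipe-free fragment.
fragments = ["| name | ", "bold", " |"]

old = TableFragmentClassifier(sticky=False)
new = TableFragmentClassifier(sticky=True)
for fragment in fragments:
    old.feed(fragment)
    new.feed(fragment)

print(old.md_table_buffer, old.plain_text)
print(new.md_table_buffer, new.plain_text)
```

With the old check, the pipe-free fragment falls out of the table buffer and is treated as ordinary text; with the sticky check, all three fragments stay together until the table is closed.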