Skip to content

Conversation

@GeneralUser01
Copy link
Contributor

@GeneralUser01 GeneralUser01 commented Nov 27, 2024

The Markdown implementation was here changed to add many Markdown features, such as lists, tables and images, as well as to reuse more code in a flexible manner.

Initial change (outdated, inaccurate as to the behavior of code blocks here):

The way code-blocks are handled was here changed to support multiple code-blocks per message.

This allows content that is not formatted as code between code-blocks, whereas previously everything between the first start of a code-block and the last end of a code-block would become part of one code-block. The only other change is to only end a code-block on the end-signature when it is right after a new line, so triple backticks may also be included as long as they are not at the start of a line.

For explanation on how the regex functions, see perl - Regex to match any character including new lines

Checks

@Krzmbrzl
Copy link
Member

Krzmbrzl commented Nov 27, 2024

only end a code-block on the end-signature when it is right after a new line, so triple backticks may also be included as long as they are not at the start of a line.

I assume this is in accordance with the Markdown spec (or at least what most dialects use)?


Bla```

EDIT: seems that at least GitHub agrees that triple backticks may be part of code blocks

@GeneralUser01
Copy link
Contributor Author

GeneralUser01 commented Nov 28, 2024

Including triple backticks that are not at the start of a line deviates slightly from GitHub flavored Markdown, which requires triple backticks to on the same line be preceded by either four spaces or a non-whitespace character. I suppose it is designed this way to make it more visible if the triple backticks really are at the start of a line or close to it and thereby work as the mark to end the code-block. Still, this change to include triple backticks at all is more flexible in terms of what content it supports when formatted as usual with at least one new line both when starting and ending the content section, though I will consider changing this again to more closely match the GitHub flavored Markdown when it comes to code-blocks. I'm not sure what exact code-block formatting most Markdown flavors use, if anyone here could elaborate on that and/or has some sources on it then they may share it. However, this is a smaller change that the main change of supporting multiple code-blocks per message is independent from.

Regarding the info string right after a code-block's initial triple backticks, it is currently accepted but ignored. This would usually be used for specifying a language to provide syntax highlighting for and it could be accessed from the regex via another capture group. If it were implemented, would syntax highlighting be a welcome addition? If so, are there any particular conditions that feature should fulfill?

@Hartmnt
Copy link
Member

Hartmnt commented Feb 27, 2025

Hi, to make the CI work with FreeBSD again, please rebase your branch against current master :)

Copy link
Member

@Hartmnt Hartmnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh dear, it would have been nice to have more than one giant commit on the new refactor of the entire markdown parsing :D

This might take a while to review

@GeneralUser01 GeneralUser01 changed the title FEAT(client,markdown): Support multiple code-blocks per message FEAT(client,markdown): Add further Markdown support Apr 5, 2025
@GeneralUser01 GeneralUser01 force-pushed the patch-3 branch 2 times, most recently from 4d73917 to 2b57273 Compare April 7, 2025 21:58
@Hartmnt
Copy link
Member

Hartmnt commented Apr 19, 2025

Would you consider this PR ready for review, yet?

@GeneralUser01
Copy link
Contributor Author

Thanks for asking, I was about to push my last patches for this PR, so I would consider it ready for review once that is done. I estimate that I will have added the patches tomorrow at the latest.

@GeneralUser01
Copy link
Contributor Author

The patches are applied and I consider this PR ready for review now :)

@Hartmnt
Copy link
Member

Hartmnt commented Apr 24, 2025

Don't have the time to review Markdown.cpp yet, but one thing I noticed immediately: Since the definitions you added for Dark.scss and Lite.scss are the same, they should live in Base Theme.scss which both Dark.scss and Lite.scss already inherit :)

@GeneralUser01
Copy link
Contributor Author

Right, indeed these are the same aside from the colors which are already handled by variables set in Dark Definitions.scss and Lite Definitions.scss, so I will remove the duplicate and move the rest into Base Theme.scss.

@GeneralUser01
Copy link
Contributor Author

Would you have time to review the GIF PR sooner? It is even larger in terms of the amount of lines of code so I guess not. I have yet to finish the last details on it before the review but will have within a few days.

@Hartmnt
Copy link
Member

Hartmnt commented Apr 24, 2025

@GeneralUser01 I think the markdown MR has a higher chance of being reviewed sooner as it is mostly self-contained in Markdown.cpp where as the gif thing interacts with the log view more directly

@Hartmnt
Copy link
Member

Hartmnt commented Apr 24, 2025

Also the gif thing will need a rebase after the pending image paste fixes are merged

@GeneralUser01
Copy link
Contributor Author

@Hartmnt Speaking of image pasting, I have an upcoming PR to paste Markdown images instead. This is based on the Markdown feature and the GIF feature (due to the need to distinguish which formats to process as a QImage before converting the file data to base64) which are currently not merged. How would you prefer this to be handled?

Another PR would be adding a message preview like it is provided on GitHub. This may be a standalone change given that the container of the chat bar is already disabled while preparing images to send so that the "Write" and "Preview" buttons are as well when placed within the same container.

@GeneralUser01 GeneralUser01 force-pushed the patch-3 branch 4 times, most recently from e4a6b18 to 2f8f71f Compare May 30, 2025 16:32
Adds the following features:
- Underline text
- Reference links and reference images
- Combinable formatting, such as emphasis on links and images in lists or tables
- Strikethrough text changed to require having no more surrounding tildes
- Block quotes support subquotes and have a container with padding and colors in both Lite and Dark theme
- Superscript and subscript text
- Unordered and ordered lists
- Description lists
- Images and linked images
- Tables
- Thematic breaks
- Centered blocks
- Multiple code blocks per message
- Code blocks and inline code may use additional signature characters to include more such characters in the text
- Code blocks may use triple tildes or indents on each line with at least 4 spaces instead
- Code blocks and inline code have padding (with just fixed spaces for inline code) and colors in both Lite and Dark theme as well as support for adding syntax highlighting in the future
- Colored text via LaTeX or HTML
- Fixed spaces via HTML
- Linebreaks via Markdown or HTML and regular linebreaks come as double or none (single linebreaks are collapsed into one space)
- Removal of HTML comments in output (as with the rest, text remains as-is in inline code or code blocks)
…es and make various adjustments

Made the following changes:
- Add a newline to the end of each used `.scss` file for Lite and Dark theme
- Remove spacing between cells in tables
- Use the border style `ridge` in tables
- Use shorter syntax instead of `std::make_tuple`
- Use shorter regex pattern for HTML comments
- Use `[]` universally instead of `at()` to access content by index due to there being no clear advantage to using the alternative for getting a value
- Store `width` and `height` attributes as `int`s since they cannot come with a unit
- Change some wording
- Make lists separate when another item marker is used regardless of whether the list is ordered
- Remove check before using regex to find a list starting at the current offset which can seldom be ruled out anyway
- Limit `static const` declarations to heavy- and frequently used objects, which is the `QRegularExpression` objects with repeated calls per message here
When the input string was separated from the HTML output, the function for escaping a character no longer advanced the offset correctly since it now always remains one character in the input instead of the multiple characters the escaped character may consist of. This change corrects it by always incrementing the offset by one.
… before it

Some emphasis, namely italic style as well as superscript and subscript, could wrap text over multiple lines which was inappropriate Markdown. This change restricts emphasis to wrap no more than one line in terms of explicit linebreaks and lists are processed before emphasis in order to distinguish list declarations using asterisks (*).
…ions

Extends the removal of reference definitions in the text to their surrounding empty lines.
Ensures any content that starts indented by at least 4 spaces is processed as a code block. This change also simplifies the code for some comparisons.
…he string's memory beforehand

This change tries to further improve the performance of the text-to-Markdown conversion by allocating much of the output string's memory before appending to it for each character and matching pattern in the input string.
…re closing

Adds a space before closing img-tags to make it safer, avoiding having the slash become a part of an unquoted attribute value, and for consistency.
…th "http"

Enables the use of references with file URIs by not prefixing the protocol "http" in either case where the reference value already starts with "http" or "file".
… dimensions

When limiting values for image resizes set via Markdown to that which is allowed, and scaling the length on the other axis when only one is set, the maximum area sizes for messages and images are required. Previously these were declared locally where they were used but now that these constants are used in multiple functions across multiple classes I created globals for them, making it easier to make consistent changes. The values the globals are set to match the upstream change for images to increase the max width from 600 to 1600 and max height from 400 to 1000.
…mats without animation support

Since animated images require that their image format is preserved and some formats only have read-support, this change only limits the given image via `QImage` if its format does not support animated images and otherwise just encodes the data as base64 without modifications.

Also limiting the given image via `QImage` if its format has write-support would be done in another commit where the method for it first tries to write the modified image back to the given format instead of always to JPG (may enable transparency as well as reducing the dimensions and file size of more types of images to fit).
@Hartmnt Hartmnt added client feature-request This issue or PR deals with a new feature labels Aug 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

client feature-request This issue or PR deals with a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants