Fix a minor issue in expected anchor after recent PR. The tests were written before the improvement that squashes non-alphanumeric characters into a single dash, and does not include dashes at the beginning and end. This updates the test case to match that behavior so that tests pass and Travis is green.
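As a rough illustration of the behavior the updated test expects, the sketch below collapses each run of non-alphanumeric characters into one dash and never emits a leading or trailing dash. The function name `sanitizedAnchorName` and the lowercasing step are assumptions made for this example; this is not the library's actual code.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// sanitizedAnchorName is a hypothetical stand-in for the real sanitizer.
// It lowercases letters, keeps digits, and turns every run of other
// characters into a single dash, with no dash at the start or the end.
func sanitizedAnchorName(text string) string {
	var b strings.Builder
	pendingDash := false // true once a run of non-alphanumerics has been seen
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsNumber(r) {
			// Only write the dash between two alphanumeric runs.
			if pendingDash && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingDash = false
			b.WriteRune(unicode.ToLower(r))
			continue
		}
		pendingDash = true
	}
	return b.String()
}

func main() {
	fmt.Println(sanitizedAnchorName("  Hello, World!  ")) // hello-world
	fmt.Println(sanitizedAnchorName("Header -- 1"))       // header-1
}
```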
This is specifically driven by the Hugo use case where multiple documents
are often rendered into the same ultimate HTML page.
When a header ID is written to the output HTML format (through any of
`HTML_TOC`, `EXTENSION_HEADER_IDS`, or `EXTENSION_AUTO_HEADER_IDS`), it
is possible that multiple documents will have identical header IDs. To
permit validation to pass, it is useful to have a per-document prefix or
suffix (in our case, an MD5 of the content filename, and we will be
using it as a suffix).
That is, two documents (`A` and `B`) that have the same header ID (`#
Reason {#reason}`), will end up having an actual header ID of the form
`#reason-DOCID` (e.g., `#reason-A`, `#reason-B`) with these HTML
parameters.
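The idea can be sketched as follows. The helper names `headerID` and `documentSuffix`, and the choice to make the dash part of the suffix, are assumptions for illustration only; the actual renderer parameters may be named and combined differently.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// headerID (hypothetical) wraps a header ID with a per-document prefix
// and suffix. Either may be empty, in which case the ID is unchanged.
func headerID(prefix, id, suffix string) string {
	return prefix + id + suffix
}

// documentSuffix (hypothetical) derives a suffix from the MD5 of the
// content filename, as described above. The leading dash is an assumption.
func documentSuffix(filename string) string {
	sum := md5.Sum([]byte(filename))
	return "-" + hex.EncodeToString(sum[:])
}

func main() {
	// Two documents with the same "reason" header no longer collide.
	fmt.Println(headerID("", "reason", "-A"))
	fmt.Println(headerID("", "reason", "-B"))
	fmt.Println(headerID("", "reason", documentSuffix("content/a.md")))
}
```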
This is built on top of #126 (more intelligent collision detection for
`EXTENSION_AUTO_HEADER_IDS`).
> This is a rework of an earlier version of this code.
The automatic header ID generation code submitted in #125 has a subtle
bug where it will use the same ID for multiple headers with identical
text. In the case below, all the headers are rendered as `<h1
id="header">Header</h1>`.
```markdown
# Header
# Header
# Header
# Header
```
This change is a simple but robust approach that uses an incrementing
counter and pre-checking to prevent header collision. (The above would
be rendered as `header`, `header-1`, `header-2`, and `header-3`.) In
more complex cases, it will append a new counter suffix (`-1`), like so:
```markdown
# Header
# Header 1
# Header
# Header
```
This will generate `header`, `header-1`, `header-1-1`, and `header-1-2`.
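A minimal sketch of a counter-plus-pre-check scheme that reproduces both sequences above is shown here. The type `uniqueIDs` and method `ensure` are names invented for this example, and the code is an approximation of the approach rather than the renderer's exact implementation.

```go
package main

import "fmt"

// uniqueIDs (hypothetical) records, for every ID already emitted,
// the last counter value that was appended to it.
type uniqueIDs map[string]int

// ensure returns id unchanged if it is unused; otherwise it appends an
// incrementing counter, checking first that the candidate is itself free.
func (u uniqueIDs) ensure(id string) string {
	for count, found := u[id]; found; count, found = u[id] {
		candidate := fmt.Sprintf("%s-%d", id, count+1)
		if _, taken := u[candidate]; !taken {
			u[id] = count + 1
			id = candidate
		} else {
			// The candidate collides with an existing ID, so start a
			// new counter suffix on top of the "-1" form.
			id += "-1"
		}
	}
	u[id] = 0
	return id
}

func main() {
	simple := uniqueIDs{}
	for i := 0; i < 4; i++ {
		fmt.Println(simple.ensure("header")) // header, header-1, header-2, header-3
	}

	mixed := uniqueIDs{}
	for _, id := range []string{"header", "header-1", "header", "header"} {
		fmt.Println(mixed.ensure(id)) // header, header-1, header-1-1, header-1-2
	}
}
```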
This code has two additional changes over the prior version:
1. Rather than reimplementing @shurcooL’s anchor sanitization code, I
have imported it from
`github.com/shurcooL/go/github_flavored_markdown/sanitized_anchor_name`.
2. The markdown block parser is now only interested in *generating* a
sanitized anchor name, not in ensuring its uniqueness. That code
has been moved to the HTML renderer. This means that if the HTML
renderer is modified to identify all unique headers prior to
rendering, the hackish nature of the collision detection can be
eliminated.
The flag `HTML_SMARTYPANTS_ANGLED_QUOTES` combined with `HTML_USE_SMARTYPANTS` configures rendering of double quotes as angled left and right quotes (« »).
The SmartyPants documentation mentions a special syntax for these, `<<>>`, which is neither pretty nor user-friendly.
The typical use case is one quote style or the other per document; a site may combine both styles, but never within the same document. For example, a person from Norway who keeps a blog in both English and Norwegian (his native tongue) could configure Blackfriday to use angled quotes for the Norwegian section but keep regular double quotes for the English one.
If the flag `HTML_SMARTYPANTS_ANGLED_QUOTES` is not provided, everything works as before this commit.
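To make the effect of the flag concrete, here is a deliberately simplified sketch. The function `smartQuotes` is hypothetical, and it simply alternates between opening and closing quotes, whereas a real SmartyPants implementation decides from the surrounding context. Emitting HTML entities is also an assumption of this example.

```go
package main

import (
	"fmt"
	"strings"
)

// smartQuotes (hypothetical) replaces straight double quotes with
// opening/closing quote entities. With angled set, it uses guillemets.
// Simplification: quotes are assumed to be balanced and are alternated.
func smartQuotes(text string, angled bool) string {
	opening, closing := "&ldquo;", "&rdquo;"
	if angled {
		opening, closing = "&laquo;", "&raquo;"
	}
	var b strings.Builder
	inQuote := false
	for _, r := range text {
		if r != '"' {
			b.WriteRune(r)
			continue
		}
		if inQuote {
			b.WriteString(closing)
		} else {
			b.WriteString(opening)
		}
		inQuote = !inQuote
	}
	return b.String()
}

func main() {
	fmt.Println(smartQuotes(`Han sa "hei"`, true))   // angled, for Norwegian
	fmt.Println(smartQuotes(`He said "hi"`, false)) // regular curly quotes
}
```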
- Fixes #51, #101, and #102.
- Uses the [code][gfm] mentioned by @shurcooL from his GitHub
Flavored Markdown parser extension in a [comment on #102][comment].
Since this was mentioned, I assumed that @shurcooL would be OK with
this being included under the licence provided by blackfriday (there
is no licence comment on his code).
- I’ve added it behind another flag, `EXTENSION_AUTO_HEADER_IDS`, that
would need to be turned on for it to work. It works with both prefix
and underline headers.
[gfm]: 3bec0366a8/github_flavored_markdown/main.go (L90-L102)
[comment]: https://github.com/russross/blackfriday/issues/102#issuecomment-51272260
Add tests to make sure we don't break relative URLs again.
Extracted common html flags and common extensions for easy access from
tests.
Closes issue #104, which was fixed as a side effect of cf6bfc9.
For code blocks that specify the language of their code, the recommended
attribute structure is `<pre><code class="language-foo">`. This also
corresponds to the behavior expected by various JS syntax highlighters.
The GitHub code block implementation was obsolete, and identical to the
normal implementation except for its attribute structure, so it was
removed.
Closes #108.
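The attribute structure described above can be sketched like this. `renderCodeBlock` is a hypothetical helper written for this example, not the renderer's real method.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// renderCodeBlock (hypothetical) emits <pre><code class="language-foo">
// when a language is given, and a plain <pre><code> otherwise.
func renderCodeBlock(code, lang string) string {
	var b strings.Builder
	b.WriteString("<pre><code")
	if lang != "" {
		b.WriteString(` class="language-` + html.EscapeString(lang) + `"`)
	}
	b.WriteString(">")
	b.WriteString(html.EscapeString(code)) // escape <, >, &, and quotes
	b.WriteString("</code></pre>\n")
	return b.String()
}

func main() {
	fmt.Print(renderCodeBlock("a < b\n", "go"))
	fmt.Print(renderCodeBlock("plain text\n", ""))
}
```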
When checking if it's a newline preceded by two spaces, look at the input data rather than the output, since the output depends on the renderer implementation.
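A check of this kind against the input buffer might look like the sketch below. The function name `isHardBreak` and its signature are assumptions for illustration.

```go
package main

import "fmt"

// isHardBreak (hypothetical) reports whether data[i] is a newline that is
// preceded by two spaces in the input, independent of any renderer output.
func isHardBreak(data []byte, i int) bool {
	return i >= 2 && i < len(data) &&
		data[i] == '\n' && data[i-1] == ' ' && data[i-2] == ' '
}

func main() {
	input := []byte("first line  \nsecond line\n")
	fmt.Println(isHardBreak(input, 12)) // newline after two spaces
	fmt.Println(isHardBreak(input, 24)) // ordinary trailing newline
}
```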