* Accept info strings in code fences
According to the common mark standard, code fence info strings can be anything,
not just single words. Update the tests and parser accordingly.
The formatter already expected an info string with a language and HTML classes,
so this does not need to change. Update the LaTeX formatter to take the first
word of the info string as the language.
Fixes#410 (in v1).
* Don't output whole info string as code classes
This follows the common mark specification.
* run go fmt
This drops the naive approach at <script> tag stripping and resorts to
full sanitization of html. The general idea (and the regexps) is grabbed
from Stack Exchange's PageDown JavaScript Markdown processor[1]. Like in
PageDown, it's implemented as a separate pass over resulting html.
Includes a metric ton (but not all) of test cases from here[2]. Several
are commented out since they don't pass yet.
Stronger (but still incomplete) fix for #11.
[1] http://code.google.com/p/pagedown/wiki/PageDown
[2] https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet