GreenCloth Bugs

Security problems with GreenCloth

GreenCloth has filter_html turned on, which means that no direct HTML should make it through.

Examples of things that should be safe:

the raw code as rendered in greencloth
<!--[if gte IE 4]><SCRIPT>alert('XSS');</SCRIPT><![endif]-->
<script>alert(1)</script> alert(1)
<a href='javascript:alert(2)'>hello</a> hello
[ hello -> javascript:alert(3) ] hello
<img src=javascript:alert('XSS')>
!javascript:alert('XSS')! 'XSS'
<kode onmouseover="javascript:alert(4)">blah</kode> blah x
[nasty <script>alert(5)</script> -> link] nasty <script>alert(5)</script>
<notextile><script>alert(6)</script></notextile> alert(6)

Formatting problems

Cyrillic character bugs

The russian text of this riseup newsletter causes major problems unless escaped. The problem starts at this line:

Мы боролись с законом и победили
-----------------------------------

The following code block does something goofy with the CSS (at least in firefox), note how it continues off the page all the way to the right

This should be fixable by giving the surrounding box an overflow=scroll style attribute.

mysql> select * from subscriber_table where list_subscriber='freieburg';
+-----------------+------------------+---------------------+---------------------+-----------------------+----------------------+-------------------+--------------------+-----------------------+---------------------+----------------------------+-------------------------+---------------------------+------------------+-------------------+
| list_subscriber | user_subscriber  | date_subscriber     | update_subscriber   | visibility_subscriber | reception_subscriber | bounce_subscriber | comment_subscriber | subscribed_subscriber | included_subscriber | include_sources_subscriber | bounce_score_subscriber | bounce_address_subscriber | robot_subscriber | topics_subscriber |
+-----------------+------------------+---------------------+---------------------+-----------------------+----------------------+-------------------+--------------------+-----------------------+---------------------+----------------------------+-------------------------+---------------------------+------------------+-------------------+
| somelist       | micah@riseup.net | 2007-11-18 09:56:42 | 2007-11-18 09:56:42 | noconceal             | mail                 | NULL              | NULL               |                     1 |                   0 | NULL                       |                    NULL | NULL                      | lists.riseup.net | NULL              | 
+-----------------+------------------+---------------------+---------------------+-----------------------+----------------------+-------------------+--------------------+-----------------------+---------------------+----------------------------+-------------------------+---------------------------+------------------+-------------------+
1 row in set (0.01 sec)

it should be possible to make pre-formatted blocks that do not line-wrap

When dealing with pasted transcripts of console output, or languages that treat line breaks as syntactically meaningful, asking the user’s browser to line wrap them at word boundaries is a Bad Idea. There should be a simple wiki way to create blocks without linewrap (probably with overflow=scroll to prevent page-widening attacks)

Why is it a bad idea? The particular way that < pre > blocks are currently wrapped makes it so that if you copy and paste you will get the original without the extra line breaks. -e

I think it’s a bad idea because people trying to make sense of the page by reading it will be confused. I don’t want to have to copy and paste a console dump into another window just to see where the actual linebreaks fall. But if you don’t mind that, i don’t want to force my preferences on you, either. Is there a way to make this a user-settable (per-reader, not per-author) preference? Could we supply our own stylesheets, for example? —dkg

how about something like <pre nowrap>this will not get wrapped</pre>

bugs that won’t be fixed

  • carets in URLS. the ^ is a totally illegal character to be in a url. so there should be no need to make it work with urls.
   

I think that commit a20619499a5ef6371e5a4186217e63ff1adf6768 on 09-21 by Robin Davis introduced some rendering problems to WholeCloth – i guess that that is what greencloth is called now.
The commit seems to mainly add newlines to some of the html.
However this replacement:

if arg.any?
-      ret = "<div class=\"#{tagname}title\">#{arg}</div>\n#{ret}"
+      ret = "<div class=\"#{tagname}title\">#{arg}</div>\n"
     end
     ret
   end

seems to remove the actuall content from code blocks. (see the testpage i created at we.riseup.net/rails/testwiki )

Will ask robin in a mail what this is supposed to do.