author ben <ben@36083f99-b078-4883-b0ff-0f9b5a30f544> 2008-07-09 09:55:42 +0000
committer ben <ben@36083f99-b078-4883-b0ff-0f9b5a30f544> 2008-07-09 09:55:42 +0000
commit 2cab677427f7fd462f35432d4a83fe89a26d7595
tree cefe9fa9a867e133a57c7d0b6df41c1dcf10f328 /vendors/kses/README
parent db507314bc38957a23189f3af696473b0edb0c83
Elgg 1.0, meet kses. Kses, Elgg 1.0.
git-svn-id: https://code.elgg.org/elgg/trunk@1344 36083f99-b078-4883-b0ff-0f9b5a30f544
Diffstat (limited to 'vendors/kses/README')
-rw-r--r-- vendors/kses/README 206
1 files changed, 206 insertions, 0 deletions
diff --git a/vendors/kses/README b/vendors/kses/README
new file mode 100644
index 000000000..192524c9f
--- /dev/null
+++ b/vendors/kses/README
@@ -0,0 +1,206 @@
+kses 0.2.2 README [kses strips evil scripts!]
+=================
+
+
+* INTRODUCTION *
+
+
+Welcome to kses - an HTML/XHTML filter written in PHP. It removes all unwanted
+HTML elements and attributes, no matter how malformed the HTML input is.
+It also does several checks on attribute values. kses can be used to avoid
+Cross-Site Scripting (XSS), Buffer Overflows and Denial of Service attacks,
+among other things.
+
+The program is released under the terms of the GNU General Public License. You
+should look into what that means, before using kses in your programs. You can
+find the full text of the license in the file COPYING.
+
+
+* FEATURES *
+
+
+Some of kses' current features are:
+
+* It will only allow the HTML elements and attributes that it was explicitly
+told to allow.
+
+* Element and attribute names are case-insensitive (a href vs A HREF).
+
+* It will understand and process whitespace correctly.
+
+* Attribute values can be surrounded with quotes, apostrophes or nothing.
+
+* It will accept valueless attributes with just names and no values (selected).
+
+* It will accept XHTML's closing " /" marks.
+
+* Attribute values that are not quoted get quotes added, to avoid producing
+HTML that does not conform to W3C standards
+(<a href=http://sourceforge.net/projects/kses> works but isn't valid HTML).
+
+* It handles many kinds of malformed HTML, by interpreting the existing
+code the best it can and then rebuilding new code from it. That's a better
+approach than trying to repair the existing code in place, as you're bound to
+forget about some weird special case somewhere. It handles problems like
+never-ending quotes and tags gracefully.
+
+* It will remove additional "<" and ">" characters that people may try to
+sneak in somewhere.
+
+* It supports checking attribute values for minimum/maximum length and
+minimum/maximum value, to protect against Buffer Overflows and Denial of
+Service attacks against WWW clients and various servers. For instance, you can
+stop <iframe src= width= height=> from having excessively large values for
+width and height.
+
+* It has a system for whitelisting URL protocols. You can say that
+attribute values may only start with http:, https:, ftp: and gopher:, but no
+other URL protocols (javascript:, java:, about:, telnet:..). The functions that
+do this work handle whitespace, upper/lower case, HTML entities
+("jav&#97;script:") and repeated entries ("javascript:javascript:alert(57)").
+It also normalizes HTML entities as a nice side effect.
+
+* It removes Netscape 4's JavaScript entities ("&{alert(57)};").
+
+* It handles NULL bytes and Opera's chr(173) whitespace characters.
+
+* There is a procedural version and two object-oriented versions (for PHP 4
+ and PHP 5) of kses.
+
+
+* USE IT *
+
+
+It's very easy to use kses in your own PHP web application! Basic usage looks
+like this:
+
+
+<?php
+
+include 'kses.php';
+
+$allowed = array('b' => array(),
+                 'i' => array(),
+                 'a' => array('href' => 1, 'title' => 1),
+                 'p' => array('align' => 1),
+                 'br' => array());
+
+$val = $_POST['val'];
+# Strip slashes from magic quotes (gone in PHP 8), or kses gets confused.
+if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc())
+    $val = stripslashes($val);
+
+$val = kses($val, $allowed); # The filtering takes place here.
+
+# Do something with $val.
+
+?>
+
+
+This definition of $allowed means that only the elements B, I, A, P and BR are
+allowed (along with their closing tags /B, /I, /A, /P and /BR). B, I and BR
+may not have any attributes. A may only have the attributes HREF and TITLE,
+while P may only have the attribute ALIGN. You can list the elements and
+attributes in the array in any mixture of upper and lower case. kses will also
+recognize HTML code that uses both lower and upper case.
+
+It's important to select the right allowed attributes, so you won't open up
+an XSS hole by mistake. Some important attributes that you mustn't allow
+include but are not limited to: 1) style, and 2) all intrinsic event
+attributes (onMouseOver and so on, on* really). I'll write more about this in
+the documentation that will be distributed with future versions of kses.
+
+It's also important to note that kses' HTML input must be cleaned of all
+slashes coming from magic quotes. If the rest of your code requires these
+slashes to be present, you can always add them back with a simple
+addslashes() call after calling kses.
+
+You should take a look at the documentation in the docs/ directory and the
+examples in the examples/ directory, to get more information on how to use
+kses. The object-oriented versions of kses are also worth checking out, and
+they're included in the oop/ directory.
+
+
+* UPGRADING TO 0.2.2 *
+
+
+kses 0.2.2 is backwards compatible with all previous releases, so upgrading
+should just be a matter of using a new version of kses.php instead of an old
+one.
+
+
+* NEW VERSIONS, MAILING LISTS AND BUG REPORTS *
+
+
+If you want to download new versions, subscribe to the kses-general mailing
+list or even take part in the development of kses, we refer you to its
+homepage at http://sourceforge.net/projects/kses . New developers and beta
+testers are more than welcome!
+
+If you have any bug reports, suggestions for improvement or simply want to tell
+us that you use kses for some project, feel free to post to the kses-general
+mailing list. If you have found any security problems (particularly XSS,
+naturally) in kses, please contact Ulf privately at metaur at users dot
+sourceforge dot net so he can correct it before you or someone else tells the
+public about it.
+
+(No, it's not a security problem in kses if some program that uses it allows a
+bad attribute, silly. If kses is told to accept the element body with the
+attributes style and onLoad, it will accept them, even if that's a really bad
+idea, security-wise.)
+
+
+* OTHER HTML FILTERS *
+
+
+Here are the other stand-alone, open-source HTML filters that we currently know
+of:
+
+* Htmlfilter for PHP - the filter from Squirrelmail
+ PHP
+ Konstantin Riabitsev
+ http://linux.duke.edu/projects/mini/htmlfilter/
+
+* HTML::StripScripts and related CPAN modules
+ Perl
+ Nick Cleaton
+ http://search.cpan.org/perldoc?HTML%3A%3AStripScripts
+
+* SafeHtmlChecker [is this really open source?]
+ PHP
+ Simon Willison
+ http://simon.incutio.com/archive/2003/02/23/safeHtmlChecker
+
+There are also a lot of HTML filters that were written specifically for some
+program. Some of them are better than others.
+
+Please write to the kses-general mailing list if you know of any other
+stand-alone, open-source filters.
+
+
+* DEDICATION *
+
+
+kses 0.2.2 is dedicated to Audrey Tautou and Jean-Pierre Jeunet.
+
+
+* MISC *
+
+
+The kses code is based on an HTML filter that Ulf wrote on his own back in 2002
+for the open-source project Gnuheter
+( http://savannah.nongnu.org/projects/gnuheter ). Gnuheter is a fork of
+PHP-Nuke. The HTML filter has been improved a lot since then.
+
+To stop people from having sleepless nights, we feel the urgent need to state
+that kses doesn't have anything to do with the KDE project, despite having a
+name that starts with a K.
+
+In case someone was wondering, Ulf is available for kses-related consulting.
+
+Finally, the name kses comes from the terms XSS and access. It's also a
+recursive acronym (every open-source project should have one!) for "kses
+strips evil scripts".
+
+
+// Ulf and the kses development group, February 2005
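
Note on the URL protocol whitelisting described under FEATURES: the following is a minimal, self-contained sketch of the idea in PHP, not kses's actual code. The function name sketch_protocol_ok() is hypothetical, and so is its behaviour of rejecting a bad value outright; the README says kses cleans the value instead. The sketch also does not handle Opera's chr(173) characters.

```php
<?php
# Hypothetical helper, NOT part of kses: returns true if $url starts with
# an allowed protocol or has no protocol at all (a relative URL).
function sketch_protocol_ok($url, $allowed_protocols)
{
    # Decode entities so that "jav&#97;script:" is seen as "javascript:".
    $url = html_entity_decode($url, ENT_QUOTES, 'UTF-8');

    # Drop NULL bytes, whitespace and other control characters that may
    # have been inserted to hide the protocol name.
    $url = preg_replace('/[\x00-\x20]+/', '', $url);

    # No colon before the first "/", "?" or "#" means there is no protocol.
    if (!preg_match('/^([^\/?#:]*):/', $url, $m)) {
        return true;
    }

    # Compare case-insensitively against the whitelist.
    return in_array(strtolower($m[1]), $allowed_protocols, true);
}

$protocols = array('http', 'https', 'ftp', 'gopher');
var_dump(sketch_protocol_ok('jav&#97;script:alert(57)', $protocols)); # bool(false)
var_dump(sketch_protocol_ok('HTTP://sourceforge.net/', $protocols));  # bool(true)
```

Because the check only looks at the first protocol after decoding, a repeated entry such as "javascript:javascript:alert(57)" is rejected as well. A filter that strips bad protocols instead of rejecting the value would have to repeat the stripping until nothing changes.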