To compare the effectiveness of the Apache Commons Lang StringEscapeUtils class and the OWASP Reform library, I created a JSP page that encodes every character value from 0 to 255. I examined how well each library encodes HTML and JavaScript values for the purpose of preventing cross-site scripting (XSS) attacks. The results are shown at the bottom of this post.
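The test loop can be sketched roughly as follows. This is a minimal stand-alone Java illustration, not the JSP page used for this post: the class name `EncoderComparison`, the `row` helper, and the two placeholder encoders are assumptions made for the example. To reproduce the comparison, the placeholders would be swapped for the library methods listed in the legend.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class EncoderComparison {

    /** Builds one table row: the character value followed by each encoder's output. */
    static String row(int value, Map<String, UnaryOperator<String>> encoders) {
        String input = String.valueOf((char) value);
        StringBuilder sb = new StringBuilder().append(value);
        for (UnaryOperator<String> encoder : encoders.values()) {
            sb.append(" | ").append(encoder.apply(input));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in insertion order.
        Map<String, UnaryOperator<String>> encoders = new LinkedHashMap<>();
        // Placeholder encoders (assumptions); replace with the methods under test.
        encoders.put("Identity", s -> s);
        encoders.put("Hex", s -> String.format("\\x%02X", (int) s.charAt(0)));

        System.out.println("Value | " + String.join(" | ", encoders.keySet()));
        for (int i = 0; i <= 255; i++) {
            System.out.println(row(i, encoders));
        }
    }
}
```

When the rows are written into an HTML page rather than to the console, each encoder's output has to be HTML-escaped once more for display; otherwise the browser renders the encoded value instead of showing it.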
In general, both the ESAPI and Reform libraries encode any value other than a-z, A-Z, and 0-9. There are some exceptions: a few punctuation characters, such as the comma and period, are left as-is in the JavaScript results. This whitelist approach is a sound way to ensure that user-supplied input cannot be interpreted as HTML or JavaScript when it is redisplayed in the browser.
ESAPI is under active development and offers a variety of other security-related functionality that may benefit an organization. I encourage everyone to take a look at the OWASP ESAPI project.
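To make the "encode everything except alphanumerics" idea concrete, here is a minimal sketch in plain Java. It is not the Reform or ESAPI implementation; the class `WhitelistEncoder` and its method names are hypothetical, and it ignores the per-library exceptions as well as characters that are illegal in HTML (such as NUL), which a real encoder needs to handle.

```java
public class WhitelistEncoder {

    // Only these characters pass through unencoded.
    private static boolean isAlphanumeric(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    /** Encodes every non-alphanumeric code point as a decimal numeric character reference. */
    public static String encodeForHtml(String input) {
        StringBuilder out = new StringBuilder();
        input.codePoints().forEach(cp -> {
            if (isAlphanumeric(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    /** Encodes every non-alphanumeric UTF-16 unit as a two-digit hex or four-digit Unicode escape. */
    public static String encodeForJavaScript(String input) {
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (isAlphanumeric(c)) {
                out.append(c);
            } else if (c <= 0xFF) {
                out.append(String.format("\\x%02X", (int) c));
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String attack = "<script>alert('xss')</script>";
        System.out.println(encodeForHtml(attack));
        System.out.println(encodeForJavaScript(attack));
    }
}
```

Because nothing outside the whitelist survives unencoded, the angle brackets, quotes, and parentheses in the sample input lose their meaning to the HTML parser or the JavaScript interpreter.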
The rest of this post is provided as a reference.
Apache Commons Lang StringEscapeUtils Methods:
- escapeHtml
- escapeJava
- escapeJavaScript
- escapeSql
- escapeXml
- unescapeHtml
- unescapeJava
- unescapeJavaScript
- unescapeXml
OWASP Reform Methods:
- HtmlEncode
- HtmlAttributeEncode
- XmlEncode
- XmlAttributeEncode
- JsString
- VbsString
OWASP ESAPI Encoder Methods:
- canonicalize
- normalize
- encodeForCSS
- encodeForHTML
- encodeForHTMLAttribute
- encodeForJavaScript
- encodeForVBScript
- encodeForSQL
- encodeForLDAP
- encodeForDN
- encodeForXPath
- encodeForXML
- encodeForXMLAttribute
- encodeForURL
- decodeFromURL
- encodeForBase64
- decodeFromBase64
Legend:
ASCII - The numerical ASCII value
Char - The symbol or character associated with that ASCII value
SEU HTML - org.apache.commons.lang.StringEscapeUtils.escapeHtml
Reform HTML - org.owasp.reform.Reform.HtmlEncode
ESAPI HTML - org.owasp.esapi.Encoder.encodeForHTML
SEU JS - org.apache.commons.lang.StringEscapeUtils.escapeJavaScript
Reform JS - org.owasp.reform.Reform.JsString
ESAPI JS - org.owasp.esapi.Encoder.encodeForJavaScript
ASCII | Char | SEU HTML | Reform HTML | ESAPI HTML | SEU JS | Reform JS | ESAPI JS |
---|---|---|---|---|---|---|---|
0 | � | | | | \u0000 | '\x00' | \0 |
1 | | | | | \u0001 | '\x01' | \x01 |
2 | | | | | \u0002 | '\x02' | \x02 |
3 | | | | | \u0003 | '\x03' | \x03 |
4 | | | | | \u0004 | '\x04' | \x04 |
5 | | | | | \u0005 | '\x05' | \x05 |
6 | | | | | \u0006 | '\x06' | \x06 |
7 | | | | | \u0007 | '\x07' | \x07 |
8 | | | | | \b | '\x08' | \b |
9 | | | | | \t | '\x09' | \t |
10 | | | | | \n | '\x0a' | \n |
11 | | | | | \u000B | '\x0b' | \v |
12 | | | | | \f | '\x0c' | \f |
13 | | | | | \r | '\x0d' | \r |
14 | | | | | \u000E | '\x0e' | \x0E |
15 | | | | | \u000F | '\x0f' | \x0F |
16 | | | | | \u0010 | '\x10' | \x10 |
17 | | | | | \u0011 | '\x11' | \x11 |
18 | | | | | \u0012 | '\x12' | \x12 |
19 | | | | | \u0013 | '\x13' | \x13 |
20 | | | | | \u0014 | '\x14' | \x14 |
21 | | | | | \u0015 | '\x15' | \x15 |
22 | | | | | \u0016 | '\x16' | \x16 |
23 | | | | | \u0017 | '\x17' | \x17 |
24 | | | | | \u0018 | '\x18' | \x18 |
25 | | | | | \u0019 | '\x19' | \x19 |
26 | | | | | \u001A | '\x1a' | \x1A |
27 | | | | | \u001B | '\x1b' | \x1B |
28 | | | | | \u001C | '\x1c' | \x1C |
29 | | | | | \u001D | '\x1d' | \x1D |
30 | | | | | \u001E | '\x1e' | \x1E |
31 | | | | | \u001F | '\x1f' | \x1F |
32 | ' ' | ||||||
33 | ! | ! | ! | ! | ! | '\x21' | \x21 |
34 | " | " | " | " | \" | '\x22' | \" |
35 | # | # | # | # | # | '\x23' | \x23 |
36 | $ | $ | $ | $ | $ | '\x24' | \x24 |
37 | % | % | % | % | % | '\x25' | \x25 |
38 | & | & | & | & | & | '\x26' | \x26 |
39 | ' | ' | ' | ' | \' | '\x27' | \' |
40 | ( | ( | ( | ( | ( | '\x28' | \x28 |
41 | ) | ) | ) | ) | ) | '\x29' | \x29 |
42 | * | * | * | * | * | '\x2a' | \x2A |
43 | + | + | + | + | + | '\x2b' | \x2B |
44 | , | , | , | , | , | ',' | , |
45 | - | - | - | - | - | '\x2d' | - |
46 | . | . | . | . | . | '.' | . |
47 | / | / | / | / | \/ | '\x2f' | \x2F |
48 | 0 | 0 | 0 | 0 | 0 | '0' | 0 |
49 | 1 | 1 | 1 | 1 | 1 | '1' | 1 |
50 | 2 | 2 | 2 | 2 | 2 | '2' | 2 |
51 | 3 | 3 | 3 | 3 | 3 | '3' | 3 |
52 | 4 | 4 | 4 | 4 | 4 | '4' | 4 |
53 | 5 | 5 | 5 | 5 | 5 | '5' | 5 |
54 | 6 | 6 | 6 | 6 | 6 | '6' | 6 |
55 | 7 | 7 | 7 | 7 | 7 | '7' | 7 |
56 | 8 | 8 | 8 | 8 | 8 | '8' | 8 |
57 | 9 | 9 | 9 | 9 | 9 | '9' | 9 |
58 | : | : | : | : | : | '\x3a' | \x3A |
59 | ; | ; | ; | ; | ; | '\x3b' | \x3B |
60 | < | < | < | < | < | '\x3c' | \x3C |
61 | = | = | = | = | = | '\x3d' | \x3D |
62 | > | > | > | > | > | '\x3e' | \x3E |
63 | ? | ? | ? | ? | ? | '\x3f' | \x3F |
64 | @ | @ | @ | @ | @ | '\x40' | \x40 |
65 | A | A | A | A | A | 'A' | A |
66 | B | B | B | B | B | 'B' | B |
67 | C | C | C | C | C | 'C' | C |
68 | D | D | D | D | D | 'D' | D |
69 | E | E | E | E | E | 'E' | E |
70 | F | F | F | F | F | 'F' | F |
71 | G | G | G | G | G | 'G' | G |
72 | H | H | H | H | H | 'H' | H |
73 | I | I | I | I | I | 'I' | I |
74 | J | J | J | J | J | 'J' | J |
75 | K | K | K | K | K | 'K' | K |
76 | L | L | L | L | L | 'L' | L |
77 | M | M | M | M | M | 'M' | M |
78 | N | N | N | N | N | 'N' | N |
79 | O | O | O | O | O | 'O' | O |
80 | P | P | P | P | P | 'P' | P |
81 | Q | Q | Q | Q | Q | 'Q' | Q |
82 | R | R | R | R | R | 'R' | R |
83 | S | S | S | S | S | 'S' | S |
84 | T | T | T | T | T | 'T' | T |
85 | U | U | U | U | U | 'U' | U |
86 | V | V | V | V | V | 'V' | V |
87 | W | W | W | W | W | 'W' | W |
88 | X | X | X | X | X | 'X' | X |
89 | Y | Y | Y | Y | Y | 'Y' | Y |
90 | Z | Z | Z | Z | Z | 'Z' | Z |
91 | [ | [ | [ | [ | [ | '\x5b' | \x5B |
92 | \ | \ | \ | \ | \\ | '\x5c' | \\ |
93 | ] | ] | ] | ] | ] | '\x5d' | \x5D |
94 | ^ | ^ | ^ | ^ | ^ | '\x5e' | \x5E |
95 | _ | _ | _ | _ | _ | '\x5f' | _ |
96 | ` | ` | ` | ` | ` | '\x60' | \x60 |
97 | a | a | a | a | a | 'a' | a |
98 | b | b | b | b | b | 'b' | b |
99 | c | c | c | c | c | 'c' | c |
100 | d | d | d | d | d | 'd' | d |
101 | e | e | e | e | e | 'e' | e |
102 | f | f | f | f | f | 'f' | f |
103 | g | g | g | g | g | 'g' | g |
104 | h | h | h | h | h | 'h' | h |
105 | i | i | i | i | i | 'i' | i |
106 | j | j | j | j | j | 'j' | j |
107 | k | k | k | k | k | 'k' | k |
108 | l | l | l | l | l | 'l' | l |
109 | m | m | m | m | m | 'm' | m |
110 | n | n | n | n | n | 'n' | n |
111 | o | o | o | o | o | 'o' | o |
112 | p | p | p | p | p | 'p' | p |
113 | q | q | q | q | q | 'q' | q |
114 | r | r | r | r | r | 'r' | r |
115 | s | s | s | s | s | 's' | s |
116 | t | t | t | t | t | 't' | t |
117 | u | u | u | u | u | 'u' | u |
118 | v | v | v | v | v | 'v' | v |
119 | w | w | w | w | w | 'w' | w |
120 | x | x | x | x | x | 'x' | x |
121 | y | y | y | y | y | 'y' | y |
122 | z | z | z | z | z | 'z' | z |
123 | { | { | { | { | { | '\x7b' | \x7B |
124 | \| | \| | \| | \| | \| | '\x7c' | \x7C |
125 | } | } | } | } | } | '\x7d' | \x7D |
126 | ~ | ~ | ~ | ~ | ~ | '\x7e' | \x7E |
127 | | |  | | '\x7f' | \x7F | |
128 | € | € | € | \u0080 | '\u0080' | \x80 | |
129 | � |  |  | \u0081 | '\u0081' | \x81 | |
130 | ‚ | ‚ | ‚ | \u0082 | '\u0082' | \x82 | |
131 | ƒ | ƒ | ƒ | \u0083 | '\u0083' | \x83 | |
132 | „ | „ | „ | \u0084 | '\u0084' | \x84 | |
133 | … | … | … | \u0085 | '\u0085' | \x85 | |
134 | † | † | † | \u0086 | '\u0086' | \x86 | |
135 | ‡ | ‡ | ‡ | \u0087 | '\u0087' | \x87 | |
136 | ˆ | ˆ | ˆ | \u0088 | '\u0088' | \x88 | |
137 | ‰ | ‰ | ‰ | \u0089 | '\u0089' | \x89 | |
138 | Š | Š | Š | \u008A | '\u008a' | \x8A | |
139 | ‹ | ‹ | ‹ | \u008B | '\u008b' | \x8B | |
140 | Œ | Œ | Œ | \u008C | '\u008c' | \x8C | |
141 | � |  |  | \u008D | '\u008d' | \x8D | |
142 | Ž | Ž | Ž | \u008E | '\u008e' | \x8E | |
143 | � |  |  | \u008F | '\u008f' | \x8F | |
144 | � |  |  | \u0090 | '\u0090' | \x90 | |
145 | ‘ | ‘ | ‘ | \u0091 | '\u0091' | \x91 | |
146 | ’ | ’ | ’ | \u0092 | '\u0092' | \x92 | |
147 | “ | “ | “ | \u0093 | '\u0093' | \x93 | |
148 | ” | ” | ” | \u0094 | '\u0094' | \x94 | |
149 | • | • | • | \u0095 | '\u0095' | \x95 | |
150 | – | – | – | \u0096 | '\u0096' | \x96 | |
151 | — | — | — | \u0097 | '\u0097' | \x97 | |
152 | ˜ | ˜ | ˜ | \u0098 | '\u0098' | \x98 | |
153 | ™ | ™ | ™ | \u0099 | '\u0099' | \x99 | |
154 | š | š | š | \u009A | '\u009a' | \x9A | |
155 | › | › | › | \u009B | '\u009b' | \x9B | |
156 | œ | œ | œ | \u009C | '\u009c' | \x9C | |
157 | � |  |  | \u009D | '\u009d' | \x9D | |
158 | ž | ž | ž | \u009E | '\u009e' | \x9E | |
159 | Ÿ | Ÿ | Ÿ | \u009F | '\u009f' | \x9F | |
160 | | |   | | \u00A0 | '\u00a0' | \xA0 |
161 | ¡ | ¡ | ¡ | ¡ | \u00A1 | '\u00a1' | \xA1 |
162 | ¢ | ¢ | ¢ | ¢ | \u00A2 | '\u00a2' | \xA2 |
163 | £ | £ | £ | £ | \u00A3 | '\u00a3' | \xA3 |
164 | ¤ | ¤ | ¤ | ¤ | \u00A4 | '\u00a4' | \xA4 |
165 | ¥ | ¥ | ¥ | ¥ | \u00A5 | '\u00a5' | \xA5 |
166 | ¦ | ¦ | ¦ | ¦ | \u00A6 | '\u00a6' | \xA6 |
167 | § | § | § | § | \u00A7 | '\u00a7' | \xA7 |
168 | ¨ | ¨ | ¨ | ¨ | \u00A8 | '\u00a8' | \xA8 |
169 | © | © | © | © | \u00A9 | '\u00a9' | \xA9 |
170 | ª | ª | ª | ª | \u00AA | '\u00aa' | \xAA |
171 | « | « | « | « | \u00AB | '\u00ab' | \xAB |
172 | ¬ | ¬ | ¬ | ¬ | \u00AC | '\u00ac' | \xAC |
173 | | ­ | ­ | ­ | \u00AD | '\u00ad' | \xAD |
174 | ® | ® | ® | ® | \u00AE | '\u00ae' | \xAE |
175 | ¯ | ¯ | ¯ | ¯ | \u00AF | '\u00af' | \xAF |
176 | ° | ° | ° | ° | \u00B0 | '\u00b0' | \xB0 |
177 | ± | ± | ± | ± | \u00B1 | '\u00b1' | \xB1 |
178 | ² | ² | ² | ² | \u00B2 | '\u00b2' | \xB2 |
179 | ³ | ³ | ³ | ³ | \u00B3 | '\u00b3' | \xB3 |
180 | ´ | ´ | ´ | ´ | \u00B4 | '\u00b4' | \xB4 |
181 | µ | µ | µ | µ | \u00B5 | '\u00b5' | \xB5 |
182 | ¶ | ¶ | ¶ | ¶ | \u00B6 | '\u00b6' | \xB6 |
183 | · | · | · | · | \u00B7 | '\u00b7' | \xB7 |
184 | ¸ | ¸ | ¸ | ¸ | \u00B8 | '\u00b8' | \xB8 |
185 | ¹ | ¹ | ¹ | ¹ | \u00B9 | '\u00b9' | \xB9 |
186 | º | º | º | º | \u00BA | '\u00ba' | \xBA |
187 | » | » | » | » | \u00BB | '\u00bb' | \xBB |
188 | ¼ | ¼ | ¼ | ¼ | \u00BC | '\u00bc' | \xBC |
189 | ½ | ½ | ½ | ½ | \u00BD | '\u00bd' | \xBD |
190 | ¾ | ¾ | ¾ | ¾ | \u00BE | '\u00be' | \xBE |
191 | ¿ | ¿ | ¿ | ¿ | \u00BF | '\u00bf' | \xBF |
192 | À | À | À | À | \u00C0 | '\u00c0' | \xC0 |
193 | Á | Á | Á | Á | \u00C1 | '\u00c1' | \xC1 |
194 | Â | Â | Â | Â | \u00C2 | '\u00c2' | \xC2 |
195 | Ã | Ã | Ã | Ã | \u00C3 | '\u00c3' | \xC3 |
196 | Ä | Ä | Ä | Ä | \u00C4 | '\u00c4' | \xC4 |
197 | Å | Å | Å | Å | \u00C5 | '\u00c5' | \xC5 |
198 | Æ | Æ | Æ | Æ | \u00C6 | '\u00c6' | \xC6 |
199 | Ç | Ç | Ç | Ç | \u00C7 | '\u00c7' | \xC7 |
200 | È | È | È | È | \u00C8 | '\u00c8' | \xC8 |
201 | É | É | É | É | \u00C9 | '\u00c9' | \xC9 |
202 | Ê | Ê | Ê | Ê | \u00CA | '\u00ca' | \xCA |
203 | Ë | Ë | Ë | Ë | \u00CB | '\u00cb' | \xCB |
204 | Ì | Ì | Ì | Ì | \u00CC | '\u00cc' | \xCC |
205 | Í | Í | Í | Í | \u00CD | '\u00cd' | \xCD |
206 | Î | Î | Î | Î | \u00CE | '\u00ce' | \xCE |
207 | Ï | Ï | Ï | Ï | \u00CF | '\u00cf' | \xCF |
208 | Ð | Ð | Ð | Ð | \u00D0 | '\u00d0' | \xD0 |
209 | Ñ | Ñ | Ñ | Ñ | \u00D1 | '\u00d1' | \xD1 |
210 | Ò | Ò | Ò | Ò | \u00D2 | '\u00d2' | \xD2 |
211 | Ó | Ó | Ó | Ó | \u00D3 | '\u00d3' | \xD3 |
212 | Ô | Ô | Ô | Ô | \u00D4 | '\u00d4' | \xD4 |
213 | Õ | Õ | Õ | Õ | \u00D5 | '\u00d5' | \xD5 |
214 | Ö | Ö | Ö | Ö | \u00D6 | '\u00d6' | \xD6 |
215 | × | × | × | × | \u00D7 | '\u00d7' | \xD7 |
216 | Ø | Ø | Ø | Ø | \u00D8 | '\u00d8' | \xD8 |
217 | Ù | Ù | Ù | Ù | \u00D9 | '\u00d9' | \xD9 |
218 | Ú | Ú | Ú | Ú | \u00DA | '\u00da' | \xDA |
219 | Û | Û | Û | Û | \u00DB | '\u00db' | \xDB |
220 | Ü | Ü | Ü | Ü | \u00DC | '\u00dc' | \xDC |
221 | Ý | Ý | Ý | Ý | \u00DD | '\u00dd' | \xDD |
222 | Þ | Þ | Þ | Þ | \u00DE | '\u00de' | \xDE |
223 | ß | ß | ß | ß | \u00DF | '\u00df' | \xDF |
224 | à | à | à | à | \u00E0 | '\u00e0' | \xE0 |
225 | á | á | á | á | \u00E1 | '\u00e1' | \xE1 |
226 | â | â | â | â | \u00E2 | '\u00e2' | \xE2 |
227 | ã | ã | ã | ã | \u00E3 | '\u00e3' | \xE3 |
228 | ä | ä | ä | ä | \u00E4 | '\u00e4' | \xE4 |
229 | å | å | å | å | \u00E5 | '\u00e5' | \xE5 |
230 | æ | æ | æ | æ | \u00E6 | '\u00e6' | \xE6 |
231 | ç | ç | ç | ç | \u00E7 | '\u00e7' | \xE7 |
232 | è | è | è | è | \u00E8 | '\u00e8' | \xE8 |
233 | é | é | é | é | \u00E9 | '\u00e9' | \xE9 |
234 | ê | ê | ê | ê | \u00EA | '\u00ea' | \xEA |
235 | ë | ë | ë | ë | \u00EB | '\u00eb' | \xEB |
236 | ì | ì | ì | ì | \u00EC | '\u00ec' | \xEC |
237 | í | í | í | í | \u00ED | '\u00ed' | \xED |
238 | î | î | î | î | \u00EE | '\u00ee' | \xEE |
239 | ï | ï | ï | ï | \u00EF | '\u00ef' | \xEF |
240 | ð | ð | ð | ð | \u00F0 | '\u00f0' | \xF0 |
241 | ñ | ñ | ñ | ñ | \u00F1 | '\u00f1' | \xF1 |
242 | ò | ò | ò | ò | \u00F2 | '\u00f2' | \xF2 |
243 | ó | ó | ó | ó | \u00F3 | '\u00f3' | \xF3 |
244 | ô | ô | ô | ô | \u00F4 | '\u00f4' | \xF4 |
245 | õ | õ | õ | õ | \u00F5 | '\u00f5' | \xF5 |
246 | ö | ö | ö | ö | \u00F6 | '\u00f6' | \xF6 |
247 | ÷ | ÷ | ÷ | ÷ | \u00F7 | '\u00f7' | \xF7 |
248 | ø | ø | ø | ø | \u00F8 | '\u00f8' | \xF8 |
249 | ù | ù | ù | ù | \u00F9 | '\u00f9' | \xF9 |
250 | ú | ú | ú | ú | \u00FA | '\u00fa' | \xFA |
251 | û | û | û | û | \u00FB | '\u00fb' | \xFB |
252 | ü | ü | ü | ü | \u00FC | '\u00fc' | \xFC |
253 | ý | ý | ý | ý | \u00FD | '\u00fd' | \xFD |
254 | þ | þ | þ | þ | \u00FE | '\u00fe' | \xFE |
255 | ÿ | ÿ | ÿ | ÿ | \u00FF | '\u00ff' | \xFF |
4 comments:
Hi - this is great work. It looks like there are some serious inconsistencies here in how output encoding is handled. Would you be willing to include OWASP ESAPI in the test? The latest version has codecs for all of these schemes and more, including CSS, MySQL, Oracle, etc. ESAPI also handles *decoding* (including double-encoding), which is quite complex. Thanks for this work!
I am looking into including ESAPI in the tests shortly. I'm not sure how I missed this project. It has a lot of really cool stuff in it beyond output encoding. I may write a post just about ESAPI in general in the near future.
Thanks Nick! We've studied all the specs to try to get these right in ESAPI (they're linked in the javadocs). It's important to note that some characters are illegal in certain encoding schemes. If anyone notices any issues with the ESAPI encodings, please let us know: http://www.owasp.org/index.php/ESAPI.
Hi Nick, we've updated ESAPI to make sure that illegal characters are replaced with the official U+FFFD replacement character. Replacing them with whitespace may allow an attacker more freedom.