Security | Closure Templates | Google Developers

Many web applications suffer from cross-site scripting (XSS) vulnerabilities. XSS was number 3 in OWASP's top 10 application security risks in 2013. Closure Templates has features to prevent XSS in your application.

Autoescaping in Closure Templates

XSS vulnerabilities typically occur when dynamic text from an untrusted source is embedded into an HTML document. To prevent these vulnerabilities, escaping is used. Escaping is the process of converting text so that it is displayed properly in its context, such as turning angle brackets into &lt; and &gt; so they are not interpreted as tags.

The type of escaping needed depends on the context in the document where the value appears. For example, a value that appears inside a <style> tag needs to be escaped differently than a value that appears in a URI.

Closure Templates' autoescaping ensures that every dynamic value is escaped in a context-appropriate way.
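
To make the dependence on context concrete, here is a minimal JavaScript sketch that escapes one untrusted value for two different contexts. The helper `escapeHtml` is a simplified stand-in written for this illustration, not the escaper that Closure Templates uses internally.

```javascript
// Illustrative only: the same untrusted value needs different escaping
// depending on where it lands in the document.
function escapeHtml(text) {
  return String(text)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/"/g, '&quot;')
      .replace(/'/g, '&#39;');
}

const untrusted = '<b>Tom & Jerry</b>';

// HTML text context: angle brackets and ampersands become entities.
const asHtml = escapeHtml(untrusted);
// URI query context: reserved characters are percent-encoded instead.
const asQuery = encodeURIComponent(untrusted);

console.log(asHtml);   // &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
console.log(asQuery);  // %3Cb%3ETom%20%26%20Jerry%3C%2Fb%3E
```

Neither output is safe in the other context, which is why the escaping has to be chosen from the point of use rather than from the value alone.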

Strict Autoescaping

The most secure way to use Closure Templates is with strict autoescaping. Strict templates are recursively guaranteed not to underescape the output. Every last dynamic value is printed with the correct escaping technique.

The output of a strict template is not a plain string, but a SanitizedContent object, which associates a content kind with the text. The content kind represents how the content is intended to be used, and the type of escaping, if any, that has already been applied to it. This information is particularly important in cases where the output of one template is used as an input parameter to another template.

For every dynamic value that appears in the output of a template, Closure Templates identifies the output context at the point of use, determined by the surrounding text.

These two factors (content kind and output context) determine what kind of escaping is applied to the text. For example, if the text has already been URI-escaped, and it's being used in a URI context, then there's no need to escape it again. This prevents "double-escaping" of the text.

Content Kinds

The different content kinds are:

Content kind	Description	Example	Notes
`html`	HTML markup	`<div>Hello!</div>`
`attributes`	HTML attribute-value pairs	`class="foo" width="100%"`	Represents the combination of both attribute names and attribute values, and must include the quotation marks around the attribute value. If the template output is intended to be just an attribute value alone (the part inside the quotes) then use either the `text` or `html` content kind.
`text`	Plain text, not yet escaped	`Hello!`
`uri`	URIs	`http://www.google.com/search?q=android`
`css`	Stylesheet text	`.myClass{ color: red; display: block; }`
`js`	JavaScript or JSON	`{"a": 1, "b": 2}`
`trusted_resource_uri`	A URL which is under application control and from which script, CSS, and other resources that represent executable code can be fetched.	`https://www.google.com/test.js`	Currently Soy requires trusted_resource_uri for script srcs. In the future, this may apply to other kinds of resources, such as stylesheets.

The content kind isn't a compiler type; you won't get an error or warning if you use a text kind in a css context or vice versa. Rather, the content kind is an indication that the text is safe for a given context and therefore does not need additional escaping.

For input values that are not SanitizedContent objects, a strict template coerces the value to a text string, and then applies escaping based on the context.
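
The interplay between content kind and output context can be sketched roughly as follows. The class `Sanitized` and the function `printInContext` are hypothetical names invented for this example, and only two contexts are modeled; the real SanitizedContent handling covers more kinds and more conversions.

```javascript
// Hypothetical stand-in for SanitizedContent: text plus a content kind.
class Sanitized {
  constructor(content, kind) {
    this.content = content;
    this.kind = kind;
  }
}

function escapeHtml(text) {
  return String(text)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/"/g, '&quot;')
      .replace(/'/g, '&#39;');
}

// Emit `value` in `context`, which is 'html' or 'uri' in this sketch.
function printInContext(value, context) {
  // Content whose kind already matches the context is emitted as-is,
  // which is what prevents double-escaping.
  if (value instanceof Sanitized && value.kind === context) {
    return value.content;
  }
  // Simplification: everything else is coerced to plain text and escaped.
  const text = value instanceof Sanitized ? value.content : String(value);
  return context === 'uri' ? encodeURIComponent(text) : escapeHtml(text);
}

// Already URI-escaped content is left alone in a URI context...
console.log(printInContext(new Sanitized('a%20b', 'uri'), 'uri'));  // a%20b
// ...while the same characters as a plain string are escaped again.
console.log(printInContext('a%20b', 'uri'));                        // a%2520b
```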

Usage

Strict autoescaping is on by default for all templates. (You can also explicitly declare it by adding autoescape="strict" to your namespace or template declarations.)

By default, the output of a strict template has kind html. If your template produces a different kind of content, you must add kind attributes to your template. For example, a strict template that produces a URI might look like this:

{template .googleUri autoescape="strict" kind="uri"}
  http://www.google.com/
{/template}

The kind attribute can be added to the following Closure Templates commands:

Command	Notes
`template`	Optional. Assumed to be `kind="html"` if omitted.
`deltemplate`	Optional. Assumed to be `kind="html"` if omitted. All matching delegates must have the same `kind`.
`let`	Required only for block form let statements.
`param`	Required only for block form param statements.

The following example illustrates the usage of the kind attribute:

{template .foo autoescape="strict" kind="html"}
  // Block-form 'let' command, 'kind' is required.
  {let $message kind="text"}
    {msg}Hi, {$name}!{/msg}
  {/let}

  // Short form 'let', no 'kind' attribute.
  {let $category: $categoryList[0] /}

  {call .bar}
    // Block-form 'param' command, kind is required.
    {param attributes kind="attributes"}
      title="{$message}"{sp}
      onclick="foo('{$message}')"
    {/param}
    {param content kind="html"}
      <b>{$message}</b>
    {/param}

    // Short-form 'param' command, no 'kind' attribute.
    {param visible: true /}
  {/call}
{/template}

Short-form commands don't need the kind attribute because they pass values rather than constructing strings, and values keep whatever kind they already have.

Strict autoescaping can be turned on for some templates and not for others, so you do not need to change all your templates at once. However, it is a good idea to eventually make all your templates use strict autoescaping.

Passing parameters to strict templates

For ordinary content that doesn't contain markup, you can just pass in the string values as template parameters as before, and they will get escaped.

For trusted content that has markup that you don't want re-escaped, wrap the content in the appropriate SanitizedContent object. It is your responsibility to make sure that the content is really safe; otherwise you will introduce exactly the kind of XSS vulnerability that strict mode was designed to prevent.

For JavaScript, the function to wrap an HTML content string is soydata.VERY_UNSAFE.ordainSanitizedHtml. In Java, the equivalent function is com.google.template.soy.data.UnsafeSanitizedContentOrdainer#ordainAsSafe.

You might want to place restrictions in your project that limit where and when these wrapper functions can be called, such as limiting these calls to a specific class or package that can easily be searched and audited. Otherwise, it becomes tempting to simply wrap arbitrary, untrusted strings whenever it's convenient in the code, which defeats the whole purpose of strict autoescaping.

Anatomy of an XSS Hack (and its prevention)

Template systems make it easy to compose content from static HTML and dynamic values. Closure Templates' autoescaping makes it even easier by letting you use the same values in many contexts without having to explicitly specify encoding.

An enterprising hacker might try to sneak a malicious value into your template to take it over via XSS, for example by supplying the following input:

{ x: 'javascript:/*</style></script>/**/
  /<script>1/(alert(1337))//</script>' }

If we pass this to a naive template, like

{template .foo autoescape="deprecated-contextual"}
  <a href="{$x|noAutoescape}"
   onclick="{$x|noAutoescape}"
   >{$x|noAutoescape}</a>
  <script>var x = '{$x|noAutoescape}'</script>
  <style>
    p {
      font-family: "{$x|noAutoescape}";
      background: url(/images?q={$x|noAutoescape});
      left: {$x|noAutoescape}
    }
  </style>
{/template}

then the attack succeeds. That template produces

  <a href="javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script>"
   onclick="javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script>"
   >javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script></a>
  <script>var x = 'javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script>'</script>
  <style>
    p {
      font-family: "javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script>";
      background: url(/images?q=javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script>);
      left: javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script>
    }
  </style>

which pops up "1337" 6 times, and a seventh if you click the link.

Let's take another look at that malicious input to figure out why:

`javascript:`	At the beginning of a URL, this changes the rest of the content into JavaScript. In a script statement, this is just an unused label.
`/*</style></script>/**/`	This breaks out of any `style` or `script` element. If already in a script attribute value, this just looks like a comment. It prematurely ends any unquoted attribute value and its containing tag.
`/<script>1/`	If outside a script, this starts a script tag with a useless division. Inside a script, this is a self-contained regular expression literal.
`(alert(1337))`	If preceded by a regular expression literal, this tries to call it, but only after executing the real malicious code, `alert(1337)`.
`//</script>`	If inside a script tag, this closes it correctly. If inside a javascript: URL attribute or event handler attribute, this is a harmless comment.

Many of the pieces of that malicious input depend on being interpreted different ways by different parts of a browser. Autoescaping defangs this and other malicious inputs by choosing a single consistent meaning for a dynamic value, and choosing an escaping scheme that makes sure the browser will interpret it the same way.

So if we pass that same malicious input to an autoescaped template: (Note that only the |noAutoescape's have been removed.)

{template .foo autoescape="deprecated-contextual"}
  <a href="{$x}"
   onclick="{$x}"
   >{$x}</a>
  <script>var x = '{$x}'</script>
  <style>
    p {
      font-family: "{$x}";
      background: url(/images?q={$x});
      left: {$x}
    }
  </style>
{/template}

We get a very different output; one that is altogether saner:

  <a href="#zSoyz"
   onclick="'javascript:/*&lt;/style&gt;&lt;/script&gt;/**/ /&lt;script&gt;1/(alert(1337))//&lt;/script&gt;'"
   >javascript:/*&lt;/style&gt;&lt;/script&gt;/**/ /&lt;script&gt;1/(alert(1337))//&lt;/script&gt;</a>
  <script>var x = 'javascript:/*\x3c/style\x3e\x3c/script\x3e/**/ /\x3cscript\x3e1/(alert(1337))//\x3c/script\x3e'</script>
  <style>
    p {
      font-family: "javascript:/*\3c /style\3e \3c /script\3e /**/ /\3c script\3e 1/(alert(1337))//\3c /script\3e ";
      background: url(/images?q=javascript%3A%2F%2A%3C%2Fstyle%3E%3C%2Fscript%3E%2F%2A%2A%2F%20%2F%3Cscript%3E1%2F%28alert%281337%29%29%2F%2F%3C%2Fscript%3E);
      left: zSoyz
    }
  </style>

When {$x} appeared inside HTML text, we entity-encoded it (< → &lt;).
When {$x} appeared inside a URL or as a CSS quantity, we rejected it because it had a protocol javascript: that was not http or https, and instead output a safe value #zSoyz. Had {$x} appeared in the query portion of a URL, we would have percent-encoded it instead of rejecting it outright (< → %3C).
When {$x} appeared in JavaScript, we wrapped it in quotes (if not already inside quotes) and escaped HTML special characters (< → \x3c).
When {$x} appeared inside CSS quotes, we did something similar to JavaScript, but using CSS escaping conventions (< → \3c ).

The malicious output was defanged.
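
As a rough illustration of the JavaScript case, the sketch below runs the malicious input through a simplified string-literal escaper. The name `escapeJsString` and the exact set of escaped characters are assumptions made for this example; the escaper built into Closure Templates covers more cases.

```javascript
// Simplified sketch of a JavaScript string-literal escaper.
const JS_ESCAPES = {
  '\\': '\\\\',
  '\'': '\\x27',
  '"': '\\x22',
  '<': '\\x3c',
  '>': '\\x3e',
  '&': '\\x26',
  '\n': '\\n',
  '\r': '\\r',
  '\u2028': '\\u2028',
  '\u2029': '\\u2029',
};

function escapeJsString(text) {
  return String(text).replace(
      /[\\'"<>&\n\r\u2028\u2029]/g, (ch) => JS_ESCAPES[ch]);
}

const x =
    'javascript:/*</style></script>/**/ /<script>1/(alert(1337))//</script>';

// The body of <script>var x = '{$x}'</script> after escaping.
const scriptBody = "var x = '" + escapeJsString(x) + "'";

console.log(scriptBody);
// The escaped body no longer contains a literal closing script tag.
console.log(scriptBody.includes('</script>'));  // false
```

Because every `<` and `>` is written as a hex escape, the HTML parser never sees a closing tag inside the script, while the JavaScript parser still reads the original characters back as plain string data.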

Escaping: the fine details

Substitutions in HTML

When a print command appears where normal HTML text could appear, then the result is HTML entity-escaped. For example, in

  <div title="{$shortMessage}">{$longMessage}</div>

given ({ "shortMessage": "I <3 ponies!", "longMessage": "OMG! <3 <3 <3!" }) produces

  <div title="I &lt;3 ponies!">OMG! &lt;3 &lt;3 &lt;3!</div>

You can safely substitute data anywhere a tag can appear or in a plain attribute value. It's good practice to quote all your attributes, but if you do forget quotes, the autoescaper makes sure the attribute value cannot be split by spaces in the dynamic value. Given the input above, <div title={$shortMessage}> → <div title=I&#32;&lt;3&#32;ponies!>. Spaces, which would normally end an unquoted attribute value, are encoded to keep the value together.

To avoid over-escaping of known safe HTML, you can use sanitized content. The template <div>{$foo}</div> given { foo: new soydata.SanitizedHtml("<b>Foo</b>") } produces output that is not re-escaped: <div><b>Foo</b></div>, instead of the over-escaped version that would have been produced if the soydata.SanitizedHtml wrapper were not there: <div>&lt;b&gt;Foo&lt;/b&gt;</div>.

Sanitized content is safe to use with attributes and with elements that cannot contain tags such as TEXTAREA. The template <div title="{$foo}">{$foo}</div> given the input above produces a sensible output: <div title="Foo"><b>Foo</b></div>. When embedded in an HTML attribute, sanitized content will have tags stripped first.

Substitutions in Tag and Attribute Names

Substitutions in tag and attribute names are sanity-checked rather than entity-encoded.

<h{$headerLevel}>Foo</h{$headerLevel}> → <h3>Foo</h3> for headerLevel=3 but for headerLevel='><script>alert(1337)<script' you get <hzSoyz>Foo</hzSoyz>. You'll also get a log message in Java, and in JavaScript, if you're running with closure asserts enabled, you get an assert.

Don't try to dynamically specify special tag names (like script or style) or special attribute names (like href, style, or onclick). Trying to use <{$name}>{$content}</{$name}> with ({ "name": "script", "content": "alert(1337)" }) or <a {$name}="{$content}"> with ({ "name": "onmouseover", "content": "alert(1337)" }) is asking for trouble. Since the autoescaper cannot distinguish JavaScript, CSS, or URLs from plain HTML with those tag and attribute names, it must reject them.

Substitutions in URLs

Values that are substituted into different parts of URIs are treated differently. Substitutions in the query part are URI-escaped.

`<a href="{$x}">`		Entity-escape and filter out bad protocols.
	`({ "x": "http://foo/" })`	→	`<a href="http://foo/">`
	`({ "x": "/foo?a=b&c=d" })`	→	`<a href="/foo?a=b&amp;c=d">`
	`({ "x": "javascript:alert(1337)" })`	→	`<a href="#zSoyz">`
`<a href="/foo/{$x}">`		Just entity-escape.
	`({ "x": "bar" })`	→	`<a href="/foo/bar">`
	`({ "x": "bar&baz/boo" })`	→	`<a href="/foo/bar&amp;baz/boo">`
`<a href="/foo?q={$x}">`		Percent encode inside query.
	`({ "x": "bar&baz=boo" })`	→	`<a href="/foo?q=bar%26baz%3dboo">`
	`({ "x": soydata.VERY_UNSAFE.ordainSanitizedUri("bar&baz=boo") })`	→	`<a href="/foo?q=bar&amp;baz=boo">`
	`({ "x": "A is #1" })`	→	`<a href="/foo?q=A%20is%20%231">`
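
The three treatments in the table above can be approximated in a few lines of JavaScript. This is a sketch under simplifying assumptions: `filterUri`, `hrefAttribute`, and `queryValue` are names made up for the example, the protocol check accepts only http, https, and scheme-less URLs, and `encodeURIComponent` leaves a few characters (such as parentheses) unencoded that a stricter escaper would encode.

```javascript
function escapeHtml(text) {
  return String(text)
      .replace(/&/g, '&amp;')
      .replace(/</g, '&lt;')
      .replace(/>/g, '&gt;')
      .replace(/"/g, '&quot;')
      .replace(/'/g, '&#39;');
}

// Simplified protocol filter: anything with a scheme other than http or
// https is replaced by an innocuous placeholder.
function filterUri(value) {
  const uri = String(value);
  const ok = /^(?:https?:|[^&:/?#]*(?:[/?#]|$))/i.test(uri);
  return ok ? uri : '#zSoyz';
}

// Whole-URL position, as in <a href="{$x}">: filter, then entity-escape.
function hrefAttribute(value) {
  return escapeHtml(filterUri(value));
}

// Query position, as in <a href="/foo?q={$x}">: percent-encode.
function queryValue(value) {
  return encodeURIComponent(String(value));
}

console.log(hrefAttribute('http://foo/'));             // http://foo/
console.log(hrefAttribute('/foo?a=b&c=d'));            // /foo?a=b&amp;c=d
console.log(hrefAttribute('javascript:alert(1337)'));  // #zSoyz
console.log(queryValue('A is #1'));                    // A%20is%20%231
```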

As long as you stick to standard HTML attribute names, the autoescaper figures out which attributes contain URLs, which contain CSS, etc. If you do decide to define custom attributes such as data-… attributes, you can still use a naming convention to tell the autoescaper which attributes have URL content: Names that start or end with "URL" or "URI", ignoring case, will be treated as having URL values. For example, the autoescaper treats data-secondaryUrl, foo:urlForLogin, and data-thesauri as having URL content; but not data-curliewurly. Precisely, /\bur[il]|ur[il]s?$/i is the set of custom attribute names with URL values.
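
The naming convention can be checked directly with the regular expression quoted above. The wrapper function `hasUrlValue` is a name chosen for this sketch only.

```javascript
// The pattern from the text: a word-initial "uri"/"url", or a trailing
// "uri"/"url" with an optional plural "s", ignoring case.
const URL_ATTR_NAME = /\bur[il]|ur[il]s?$/i;

function hasUrlValue(attrName) {
  return URL_ATTR_NAME.test(attrName);
}

const names = [
  'data-secondaryUrl',
  'foo:urlForLogin',
  'data-thesauri',
  'data-curliewurly',
];
for (const name of names) {
  console.log(name, hasUrlValue(name));
}
```

Note that `data-thesauri` matches only because it happens to end in "uri", so it is worth checking custom attribute names against the pattern before relying on them.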

Substitutions in Trusted Resource URLs

Values substituted into trusted resource URIs are handled almost the same way as values in ordinary URIs, except that the value must be a TrustedResourceUrl.

`<script src="{$x}">`		Entity-escape and filter out non-TrustedResourceUri.
	`({ "x": "foo" })`	→	`<script src="about:invalid#zSoyz">`
	`({ "x": goog.html.TrustedResourceUrl.fromConstant(goog.string.Const.from("http://foo/")) })`	→	`<script src="http://foo/">`
	`({ "x": goog.html.TrustedResourceUrl.fromConstant(goog.string.Const.from("/foo?a=b&c=d")) })`	→	`<script src="/foo?a=b&amp;c=d">`
	`({ "x": goog.html.TrustedResourceUrl.fromConstant(goog.string.Const.from("javascript:alert(1337)")) })`	→	`<script src="javascript:alert(1337)">`
`<script src="/foo/{$x}">`		Entity-escape and filter out non-TrustedResourceUri.
	`({ "x": goog.html.TrustedResourceUrl.fromConstant(goog.string.Const.from("bar")) })`	→	`<script src="/foo/bar">`
	`({ "x": goog.html.TrustedResourceUrl.fromConstant(goog.string.Const.from("bar&baz/boo")) })`	→	`<script src="/foo/bar&amp;baz/boo">`
`<script src="/foo?q={$x}">`		Entity-escape and filter out non-TrustedResourceUri.
	`({ "x": goog.html.TrustedResourceUrl.fromConstant(goog.string.Const.from("bar&baz=boo")) })`	→	`<script src="/foo?q=bar&amp;baz=boo">`
	`({ "x": goog.html.TrustedResourceUrl.fromConstant(goog.string.Const.from("A is #1")) })`	→	`<script src="/foo?q=A is #1">`

Substitutions in JavaScript

Values in JavaScript that are inside quotes are dealt with differently from those outside quotes.

`<script>alert('{$x}');</script>`		Escaped inside quotes.
	`({ "x": "O'Reilly Books" })`	→	`<script>alert('O\'Reilly Books');</script>`
	`({ "x": new soydata.SanitizedJsStrChars("O\\'Reilly Books") })`	→	`<script>alert('O\'Reilly Books');</script>`
`<script>alert({$x});</script>`		Without quotes, treated as a value.
	`({ "x": "O'Reilly Books" })`	→	`<script>alert('O\'Reilly Books');</script>`
	`({ "x": 42 })`	→	`<script>alert( 42 );</script>`
	`({ "x": true })`	→	`<script>alert( true );</script>`
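
For the unquoted case, one way to picture the behavior is a function that turns a runtime value into a JavaScript expression. The sketch below is not the Closure Templates implementation: `jsValue` is a hypothetical helper, it handles only strings, numbers, and booleans, and it relies on `JSON.stringify`, so strings come out double-quoted rather than single-quoted as in the table.

```javascript
// Sketch: emit a value where a JavaScript expression is expected.
function jsValue(value) {
  if (typeof value === 'number' || typeof value === 'boolean') {
    // Surrounding spaces keep the token from fusing with its neighbors.
    return ' ' + value + ' ';
  }
  // JSON.stringify yields a valid double-quoted string literal; the extra
  // replacements keep HTML-significant characters out of the script body.
  return JSON.stringify(String(value))
      .replace(/</g, '\\u003c')
      .replace(/>/g, '\\u003e')
      .replace(/&/g, '\\u0026')
      .replace(/\u2028/g, '\\u2028')
      .replace(/\u2029/g, '\\u2029');
}

console.log('alert(' + jsValue("O'Reilly Books") + ');');
console.log('alert(' + jsValue(42) + ');');    // alert( 42 );
console.log('alert(' + jsValue(true) + ');');  // alert( true );
```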

Substitutions in CSS

Values in CSS can be parts of classes, IDs, quantities, colors, or URLs.

`<style>div#{$id} {lb} {rb}</style>`		Classes and IDs
	`({ "id": "foo-bar" })`	→	`<style>div#foo-bar { }</style>`
`<div style="color: {$x}">`		Quantities
	`({ "x": "red" })`	→	`<div style="color: red">`
	`({ "x": "#f00" })`	→	`<div style="color: #f00">`
	`({ "x": "expression('alert(1337)')" })`	→	`<div style="color: zSoyz">`
`<div style="margin-{$ltr-dir}: 1em">`		Property Names
	`({ "ltr-dir": "left" })`	→	`<div style="margin-left: 1em">`
	`({ "ltr-dir": "right" })`	→	`<div style="margin-right: 1em">`
`<style>p {lb} font-family: '{$x}' {rb}</style>`		Quoted Values
	`({ "x": "Arial" })`	→	`<style>p { font-family: 'Arial' }</style>`
	`({ "x": "</style>" })`	→	`<style>p { font-family: '\3c \2f style\3e ' }</style>`
`<div style="background: url({$x})">`		URLs in CSS are handled as in attributes above
	`({ "x": "/foo/bar" })`	→	`<div style="background: url(/foo/bar)">`
	`({ "x": "javascript:alert(1337)" })`	→	`<div style="background: url(#zSoyz)">`
	`({ "x": "?q=(O'Reilly) OR Books" })`	→	`<div style="background: url(?q=%28O%27Reilly%29%20OR%20Books)">`
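
Filtering of CSS quantities, as opposed to escaping, can be pictured as an allowlist check. The pattern below is a deliberately simplified assumption for this sketch (identifiers, hex colors, and plain numbers with an optional unit); the filter in Closure Templates is more detailed.

```javascript
// Simplified allowlist: an identifier or hash/class token, or a number with
// an optional unit. Anything else is replaced by a harmless placeholder.
const SAFE_CSS_VALUE =
    /^(?:[.#]?[a-z0-9_-]+|-?(?:\d+(?:\.\d*)?|\.\d+)(?:[a-z]{1,4}|%)?)$/i;

function filterCssValue(value) {
  const text = String(value);
  return SAFE_CSS_VALUE.test(text) ? text : 'zSoyz';
}

console.log(filterCssValue('red'));                         // red
console.log(filterCssValue('#f00'));                        // #f00
console.log(filterCssValue('1.5em'));                       // 1.5em
console.log(filterCssValue("expression('alert(1337)')"));   // zSoyz
```

Rejecting the value outright is safer than trying to escape it here, because an unquoted CSS quantity has no escaping scheme that would keep something like an `expression(...)` call inert.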

Print Directives

Autoescaping works by automatically adding print directives to templates, so you can remove the print directives that you explicitly added, including |escapeJs, |escapeUri, |escapeHtml, and especially those dangerous |noAutoescape directives.

In case you have defined custom print directives, the autoescaper does not interfere with any {print …} command containing a directive that returns true from shouldCancelAutoescape(). Thus, if the escape directive transforms plain text to the expected content type, then override shouldCancelAutoescape() to return true. If your custom directive expects already-escaped input instead of plain text, you can implement SanitizedContentOperator to get the autoescaper to insert escaping directives before your directive so they produce the already-escaped input and pipe it to your directive.

Guarantees

Autoescaping augments Closure Templates to choose an appropriate encoding for each dynamic value so that even if a particular dynamic value can be controlled by an attacker, certain safety properties hold.

Specifically, if a template, and all the templates that it calls have autoescape="deprecated-contextual" or autoescape="strict", and have no manual escaping overrides such as |noAutoescape, then the following properties hold:

Structure is preserved

If you, the Closure Templates author, write <b>{$x}</b>, then the tags <b> and </b> always correspond to matched tags in the template output regardless of the value of $x.

No dynamic value can change the meaning of an HTML, CSS, or JavaScript token in the template, or correspondences between pairs of matched tokens.

Only code in the template is executed

Dynamic values cannot specify unsafe code. Any code hidden in dynamic values (whether via <script> elements, javascript: URIs, or some other mechanism) is treated as plain text and encoded properly on output instead of being rendered as code.

Dynamic values that appear in JavaScript (e.g. $message in <script>alert('{$message}')</script>) are encoded to expressions without side effects or free variables (to preserve privacy constraints). Given { "message": "'//\ndoEvil()//" }, the template produces <script>alert('\x27//\ndoEvil()//');</script>, which alerts the garbage string passed in instead of calling doEvil.

All code in the template is executed

A dynamic value cannot cause code to fail to parse. Some applications have security-critical code that they need to run if JavaScript is enabled. Take for example the following template:

<script>
  var s = '{$s}';
  doSecurityCriticalStuff();
</script>

If the value of the variable s is a newline character "\n", then a non-autoescaped template would produce the following output:

<script>
  var s = '
';
  doSecurityCriticalStuff();
</script>

The autoescaped version of the template instead produces:

<script>
  var s = '\n';
  doSecurityCriticalStuff();
</script>

which parses properly.
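
The difference between the two outputs can be checked mechanically. The following sketch builds both script bodies and asks the JavaScript engine whether each one parses; `parses` is a helper defined here for the demonstration, and the code is only parsed, never run, so `doSecurityCriticalStuff` does not need to exist.

```javascript
// Returns true when `source` parses as a JavaScript function body.
function parses(source) {
  try {
    new Function(source);  // Parses the body without executing it.
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

const s = '\n';  // Attacker-controlled value: a single newline.

// Naive interpolation puts a raw line break inside the string literal.
const unescaped = "var s = '" + s + "';\ndoSecurityCriticalStuff();";
// Escaping the newline keeps the literal on one line.
const escaped =
    "var s = '" + s.replace(/\n/g, '\\n') + "';\ndoSecurityCriticalStuff();";

console.log(parses(unescaped));  // false
console.log(parses(escaped));    // true
```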

If a template or the templates that it calls do not have autoescaping enabled, or use explicit escaping directives like |noAutoescape incorrectly, then the autoescaper makes a best effort to preserve these properties but might fail.

Content Security Policy

Closure Templates has an optional pass that supports Content Security Policy nonces. CSP nonces are a defense-in-depth technique for restricting the execution of <script> and <style> blocks. With CSP nonces, even if an attacker can inject scripts into your document, those scripts will not execute unless the attacker can also guess the CSP nonce.

When CSP nonces are enabled in Closure Templates, autoescaped templates have nonce="..." added to <script> and <style> elements declared inside them:

<script>...</script>

becomes

<script {if $ij.csp_nonce} nonce="{$ij.csp_nonce}"{/if}>...</script>

There are three steps to configuring CSP nonces with Closure Templates:

Configure your web server to compute nonces and send them in CSP response headers.
Configure Closure Templates to add nonces to autoescaped templates.
Make the nonces computed in step 1 available to the templates from step 2.

Step 1 is outside the scope of this document. General considerations for nonces include generating strong random numbers and not reusing nonces.

Step 2 is backend specific: see Tofu and JavaScript below.

For step 3, render with an injected data bundle that includes an $ij.csp_nonce value that is a valid nonce.
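
Although step 1 depends on your server, the general shape of steps 1 and 3 might look like the following Node.js sketch. The function names and the exact policy string are assumptions for illustration; only `crypto.randomBytes` and the `csp_nonce` key come from Node.js and from the text above, respectively. How the injected data is handed to the renderer depends on the backend and is not shown.

```javascript
const crypto = require('crypto');

// Generate a fresh, unpredictable nonce. Call this once per HTTP response
// and never reuse the result.
function newCspNonce() {
  return crypto.randomBytes(16).toString('base64');
}

// Build the Content-Security-Policy header value and the injected data
// for a single response, so both carry the same nonce.
function cspForResponse() {
  const nonce = newCspNonce();
  return {
    header:
        "script-src 'nonce-" + nonce + "'; style-src 'nonce-" + nonce + "'",
    injectedData: {csp_nonce: nonce},
  };
}

const response = cspForResponse();
console.log('Content-Security-Policy: ' + response.header);
console.log(response.injectedData);
```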

Tofu

Enable CSP support by calling

    mySoyFileSetBuilder
        .getGeneralOptions()
        .setSupportContentSecurityPolicy(true)

JavaScript

Pass the --supportContentSecurityPolicy=true command line flag to SoyToJsSrcCompiler to enable CSP support. Enabling this will increase the size of generated code for templates that include embedded scripts or styles.