org.unbescape.html
Class HtmlEscape

Object
  extended by org.unbescape.html.HtmlEscape

public final class HtmlEscape
extends Object

Utility class for performing HTML escape/unescape operations.

Configuration of escape/unescape operations

Escape operations can be (optionally) configured by means of two arguments: a type (see HtmlEscapeType) and a level (see HtmlEscapeLevel).

Unescape operations need no configuration parameters. They always perform a complete unescape of NCRs (the whole HTML5 set is supported) as well as of decimal and hexadecimal references.

Features

Specific features of the HTML escape/unescape operations performed by means of this class:

Input/Output

There are two different input/output modes that can be used in escape/unescape operations: String input with String output, and char[] input with java.io.Writer output.

Glossary

NCR
Named Character Reference or Character Entity Reference: textual representation of a Unicode codepoint: &aacute;
DCR
Decimal Character Reference: base-10 numerical representation of a Unicode codepoint: &#225;
HCR
Hexadecimal Character Reference: hexadecimal numerical representation of a Unicode codepoint: &#xE1;
Unicode Codepoint
Each of the int values that make up the Unicode code space. A codepoint normally corresponds to a single Java char primitive value (codepoint <= \uFFFF), but codepoints from U+10000 to U+10FFFF are represented by two chars: a high surrogate (\uD800 to \uDBFF) followed by a low surrogate (\uDC00 to \uDFFF).
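
The glossary terms can be illustrated with plain JDK calls. The following is a minimal sketch, not part of this class: CharRefForms and its two helper methods are hypothetical names introduced only for this example. It builds the decimal and hexadecimal references for U+00E1 and shows how a codepoint above U+FFFF occupies two chars.

```java
final class CharRefForms {

    /** Builds a decimal character reference (DCR), e.g. 225 becomes "&#225;". */
    static String decimalRef(int codepoint) {
        return "&#" + codepoint + ";";
    }

    /** Builds a hexadecimal character reference (HCR), e.g. 225 becomes "&#xe1;". */
    static String hexRef(int codepoint) {
        return "&#x" + Integer.toHexString(codepoint) + ";";
    }

    public static void main(String[] args) {
        // U+00E1 fits in a single Java char.
        int aAcute = "\u00E1".codePointAt(0);
        System.out.println(decimalRef(aAcute)); // &#225;
        System.out.println(hexRef(aAcute));     // &#xe1;

        // U+1D11E lies above U+FFFF, so Java stores it as a surrogate pair.
        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(clef.length());                             // 2
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(clef.charAt(1)));  // true
        System.out.println(decimalRef(clef.codePointAt(0)));           // &#119070;
    }
}
```

Reading the input with codePointAt rather than charAt is what keeps a surrogate pair together as one reference.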

Since:
1.0
Author:
Daniel Fernández

Method Summary
static void escapeHtml(char[] text, int offset, int len, Writer writer, HtmlEscapeType type, HtmlEscapeLevel level)
           Perform a (configurable) HTML escape operation on a char[] input.
static String escapeHtml(String text, HtmlEscapeType type, HtmlEscapeLevel level)
           Perform a (configurable) HTML escape operation on a String input.
static void escapeHtml4(char[] text, int offset, int len, Writer writer)
           Perform an HTML 4 level 2 (result is ASCII) escape operation on a char[] input.
static String escapeHtml4(String text)
           Perform an HTML 4 level 2 (result is ASCII) escape operation on a String input.
static void escapeHtml4Xml(char[] text, int offset, int len, Writer writer)
           Perform an HTML 4 level 1 (XML-style) escape operation on a char[] input.
static String escapeHtml4Xml(String text)
           Perform an HTML 4 level 1 (XML-style) escape operation on a String input.
static void escapeHtml5(char[] text, int offset, int len, Writer writer)
           Perform an HTML5 level 2 (result is ASCII) escape operation on a char[] input.
static String escapeHtml5(String text)
           Perform an HTML5 level 2 (result is ASCII) escape operation on a String input.
static void escapeHtml5Xml(char[] text, int offset, int len, Writer writer)
           Perform an HTML5 level 1 (XML-style) escape operation on a char[] input.
static String escapeHtml5Xml(String text)
           Perform an HTML5 level 1 (XML-style) escape operation on a String input.
static void unescapeHtml(char[] text, int offset, int len, Writer writer)
           Perform an HTML unescape operation on a char[] input.
static String unescapeHtml(String text)
           Perform an HTML unescape operation on a String input.
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

escapeHtml5

public static String escapeHtml5(String text)

Perform an HTML5 level 2 (result is ASCII) escape operation on a String input.

Level 2 means this method will escape the five markup-significant characters (<, >, &, " and ') and all non-ASCII characters.

This escape will be performed by replacing those chars with the corresponding HTML5 Named Character References (e.g. '&acute;') when such an NCR exists for the replaced character, and with a decimal character reference (e.g. '&#8345;') when there is no NCR for the replaced character.

This method calls escapeHtml(String, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the String to be escaped.
Returns:
The escaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no escaping modifications were required (and no additional String objects will be created during processing). Will return null if text is null.
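
The behaviour described above (ASCII-only output, NCR first with a decimal fallback, and returning the same String object when nothing changes) can be sketched in plain Java. This is an illustration under stated assumptions, not the library's implementation: Level2Sketch, its escape method, and the six-entry NCR table are hypothetical, and the real HTML5 NCR set is far larger.

```java
import java.util.Map;

final class Level2Sketch {

    // Tiny illustrative subset of named references (assumption for this sketch only).
    private static final Map<Integer, String> NCRS = Map.of(
            (int) '<', "&lt;", (int) '>', "&gt;", (int) '&', "&amp;",
            (int) '"', "&quot;", (int) '\'', "&apos;", 0xB4, "&acute;");

    static String escape(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = null; // created lazily, only once a change is needed
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            String ncr = NCRS.get(cp);
            if (ncr != null || cp > 0x7F) {
                if (out == null) {
                    // Copy the untouched prefix the first time an escape is required.
                    out = new StringBuilder(text.length() + 16).append(text, 0, i);
                }
                // Prefer the named reference, fall back to a decimal reference.
                out.append(ncr != null ? ncr : "&#" + cp + ";");
            } else if (out != null) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        // No modification needed: hand back the very same object.
        return out == null ? text : out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("x\u2099 \u00B4 <")); // x&#8345; &acute; &lt;
        String plain = "plain";
        System.out.println(escape(plain) == plain);     // true
    }
}
```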

escapeHtml5Xml

public static String escapeHtml5Xml(String text)

Perform an HTML5 level 1 (XML-style) escape operation on a String input.

Level 1 means this method will only escape the five markup-significant characters: <, >, &, " and '. It is called XML-style in order to link it with JSP's escapeXml attribute in JSTL's <c:out ... /> tags.

Note this method may not produce the same results as escapeHtml4Xml(String) because it will escape the apostrophe as &apos;, whereas in HTML 4 such NCR does not exist (the decimal numeric reference &#39; is used instead).

This method calls escapeHtml(String, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the String to be escaped.
Returns:
The escaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no escaping modifications were required (and no additional String objects will be created during processing). Will return null if text is null.
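
The apostrophe difference between the HTML5 and HTML 4 XML-style variants can be seen in a small standalone sketch. Level1Sketch and its html5 flag are hypothetical and exist only for this illustration. The sketch also always builds a new String, which the methods documented here avoid when no escaping is needed.

```java
final class Level1Sketch {

    /** Escapes only the five markup-significant characters. */
    static String escape(String text, boolean html5) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '"':  out.append("&quot;"); break;
                // HTML5 defines &apos;, HTML 4 does not, so a decimal reference is used there.
                case '\'': out.append(html5 ? "&apos;" : "&#39;"); break;
                default:   out.append(c); // everything else, including non-ASCII, is left as is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("it's <b>", true));  // it&apos;s &lt;b&gt;
        System.out.println(escape("it's <b>", false)); // it&#39;s &lt;b&gt;
    }
}
```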

escapeHtml4

public static String escapeHtml4(String text)

Perform an HTML 4 level 2 (result is ASCII) escape operation on a String input.

Level 2 means this method will escape the five markup-significant characters (<, >, &, " and ') and all non-ASCII characters.

This escape will be performed by replacing those chars with the corresponding HTML 4 Named Character References (e.g. '&acute;') when such an NCR exists for the replaced character, and with a decimal character reference (e.g. '&#8345;') when there is no NCR for the replaced character.

This method calls escapeHtml(String, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the String to be escaped.
Returns:
The escaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no escaping modifications were required (and no additional String objects will be created during processing). Will return null if text is null.

escapeHtml4Xml

public static String escapeHtml4Xml(String text)

Perform an HTML 4 level 1 (XML-style) escape operation on a String input.

Level 1 means this method will only escape the five markup-significant characters: <, >, &, " and '. It is called XML-style in order to link it with JSP's escapeXml attribute in JSTL's <c:out ... /> tags.

Note this method may not produce the same results as escapeHtml5Xml(String) because it will escape the apostrophe as &#39;, whereas in HTML5 there is a specific NCR for such character (&apos;).

This method calls escapeHtml(String, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the String to be escaped.
Returns:
The escaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no escaping modifications were required (and no additional String objects will be created during processing). Will return null if text is null.

escapeHtml

public static String escapeHtml(String text,
                                HtmlEscapeType type,
                                HtmlEscapeLevel level)

Perform a (configurable) HTML escape operation on a String input.

This method will perform an escape operation according to the specified HtmlEscapeType and HtmlEscapeLevel argument values.

All other String-based escapeHtml*(...) methods call this one with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the String to be escaped.
type - the type of escape operation to be performed, see HtmlEscapeType.
level - the escape level to be applied, see HtmlEscapeLevel.
Returns:
The escaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no escaping modifications were required (and no additional String objects will be created during processing). Will return null if text is null.

escapeHtml5

public static void escapeHtml5(char[] text,
                               int offset,
                               int len,
                               Writer writer)
                        throws IOException

Perform an HTML5 level 2 (result is ASCII) escape operation on a char[] input.

Level 2 means this method will escape the five markup-significant characters (<, >, &, " and ') and all non-ASCII characters.

This escape will be performed by replacing those chars with the corresponding HTML5 Named Character References (e.g. '&acute;') when such an NCR exists for the replaced character, and with a decimal character reference (e.g. '&#8345;') when there is no NCR for the replaced character.

This method calls escapeHtml(char[], int, int, java.io.Writer, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the char[] to be escaped.
offset - the position in text at which the escape operation should start.
len - the number of characters in text that should be escaped.
writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if text is null.
Throws:
IOException

escapeHtml5Xml

public static void escapeHtml5Xml(char[] text,
                                  int offset,
                                  int len,
                                  Writer writer)
                           throws IOException

Perform an HTML5 level 1 (XML-style) escape operation on a char[] input.

Level 1 means this method will only escape the five markup-significant characters: <, >, &, " and '. It is called XML-style in order to link it with JSP's escapeXml attribute in JSTL's <c:out ... /> tags.

Note this method may not produce the same results as escapeHtml4Xml(char[], int, int, java.io.Writer) because it will escape the apostrophe as &apos;, whereas in HTML 4 such NCR does not exist (the decimal numeric reference &#39; is used instead).

This method calls escapeHtml(char[], int, int, java.io.Writer, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the char[] to be escaped.
offset - the position in text at which the escape operation should start.
len - the number of characters in text that should be escaped.
writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if text is null.
Throws:
IOException

escapeHtml4

public static void escapeHtml4(char[] text,
                               int offset,
                               int len,
                               Writer writer)
                        throws IOException

Perform an HTML 4 level 2 (result is ASCII) escape operation on a char[] input.

Level 2 means this method will escape the five markup-significant characters (<, >, &, " and ') and all non-ASCII characters.

This escape will be performed by replacing those chars with the corresponding HTML 4 Named Character References (e.g. '&acute;') when such an NCR exists for the replaced character, and with a decimal character reference (e.g. '&#8345;') when there is no NCR for the replaced character.

This method calls escapeHtml(char[], int, int, java.io.Writer, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the char[] to be escaped.
offset - the position in text at which the escape operation should start.
len - the number of characters in text that should be escaped.
writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if text is null.
Throws:
IOException

escapeHtml4Xml

public static void escapeHtml4Xml(char[] text,
                                  int offset,
                                  int len,
                                  Writer writer)
                           throws IOException

Perform an HTML 4 level 1 (XML-style) escape operation on a char[] input.

Level 1 means this method will only escape the five markup-significant characters: <, >, &, " and '. It is called XML-style in order to link it with JSP's escapeXml attribute in JSTL's <c:out ... /> tags.

Note this method may not produce the same results as escapeHtml5Xml(char[], int, int, java.io.Writer) because it will escape the apostrophe as &#39;, whereas in HTML5 there is a specific NCR for such character (&apos;).

This method calls escapeHtml(char[], int, int, java.io.Writer, HtmlEscapeType, HtmlEscapeLevel) with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the char[] to be escaped.
offset - the position in text at which the escape operation should start.
len - the number of characters in text that should be escaped.
writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if text is null.
Throws:
IOException

escapeHtml

public static void escapeHtml(char[] text,
                              int offset,
                              int len,
                              Writer writer,
                              HtmlEscapeType type,
                              HtmlEscapeLevel level)
                       throws IOException

Perform a (configurable) HTML escape operation on a char[] input.

This method will perform an escape operation according to the specified HtmlEscapeType and HtmlEscapeLevel argument values.

All other char[]-based escapeHtml*(...) methods call this one with preconfigured type and level values.

This method is thread-safe.

Parameters:
text - the char[] to be escaped.
offset - the position in text at which the escape operation should start.
len - the number of characters in text that should be escaped.
writer - the java.io.Writer to which the escaped result will be written. Nothing will be written at all to this writer if text is null.
type - the type of escape operation to be performed, see HtmlEscapeType.
level - the escape level to be applied, see HtmlEscapeLevel.
Throws:
IOException
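
The char[] and Writer mode documented above can be sketched independently of the library. In this hypothetical example (CharArrayModeSketch is not part of unbescape), only the region selected by offset and len is processed, the result is streamed to the Writer, and nothing is written when text is null. To keep it short, the escaping rule is reduced to the five markup-significant characters.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

final class CharArrayModeSketch {

    static void escape(char[] text, int offset, int len, Writer writer) throws IOException {
        if (text == null) {
            return; // nothing at all is written for null input
        }
        int end = offset + len;
        for (int i = offset; i < end; i++) {
            char c = text[i];
            switch (c) {
                case '<':  writer.write("&lt;");   break;
                case '>':  writer.write("&gt;");   break;
                case '&':  writer.write("&amp;");  break;
                case '"':  writer.write("&quot;"); break;
                case '\'': writer.write("&#39;");  break;
                default:   writer.write(c);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        char[] buffer = "<p>Tom & Jerry</p>".toCharArray();
        StringWriter out = new StringWriter();
        // Escape only the 11 chars of "Tom & Jerry", skipping the surrounding tags.
        escape(buffer, 3, 11, out);
        System.out.println(out); // Tom &amp; Jerry
    }
}
```

Writing straight to a Writer avoids building an intermediate String, which is why this mode suits callers that already hold a character buffer and an output stream.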

unescapeHtml

public static String unescapeHtml(String text)

Perform an HTML unescape operation on a String input.

No additional configuration arguments are required. Unescape operations always perform a complete unescape of NCRs (the whole HTML5 set is supported) as well as of decimal and hexadecimal references.

This method is thread-safe.

Parameters:
text - the String to be unescaped.
Returns:
The unescaped result String. As a memory-performance improvement, will return the exact same object as the text input argument if no unescaping modifications were required (and no additional String objects will be created during processing). Will return null if text is null.
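
To make the three reference forms concrete, here is a minimal unescape sketch in plain Java. It is an assumption-laden illustration, not the library's algorithm: UnescapeSketch is a hypothetical class, its NCR table holds only six entries, and it recognises only references terminated by a semicolon.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnescapeSketch {

    // Tiny illustrative subset of named references (assumption for this sketch only).
    private static final Map<String, Integer> NCRS = Map.of(
            "lt", (int) '<', "gt", (int) '>', "amp", (int) '&',
            "quot", (int) '"', "apos", (int) '\'', "aacute", 0xE1);

    // Group 1: decimal digits, group 2: hexadecimal digits, group 3: reference name.
    private static final Pattern REF = Pattern.compile(
            "&(?:#(\\d+)|#[xX]([0-9a-fA-F]+)|([A-Za-z][A-Za-z0-9]*));");

    static String unescape(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = REF.matcher(text);
        StringBuilder out = null; // created lazily, only once a change is needed
        int last = 0;
        while (m.find()) {
            Integer cp = decode(m);
            if (cp == null) {
                continue; // unknown or invalid reference: leave the text untouched
            }
            if (out == null) {
                out = new StringBuilder(text.length());
            }
            out.append(text, last, m.start()).appendCodePoint(cp);
            last = m.end();
        }
        return out == null ? text : out.append(text, last, text.length()).toString();
    }

    private static Integer decode(Matcher m) {
        try {
            if (m.group(1) != null) {
                return validOrNull(Integer.parseInt(m.group(1)));
            }
            if (m.group(2) != null) {
                return validOrNull(Integer.parseInt(m.group(2), 16));
            }
        } catch (NumberFormatException e) {
            return null; // number too large to be a codepoint
        }
        return NCRS.get(m.group(3));
    }

    private static Integer validOrNull(int cp) {
        if (Character.isValidCodePoint(cp)) {
            return cp;
        }
        return null;
    }

    public static void main(String[] args) {
        // Prints the character U+00E1 three times, separated by spaces, then <b>.
        System.out.println(unescape("&aacute; &#225; &#xE1; &lt;b&gt;"));
    }
}
```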

unescapeHtml

public static void unescapeHtml(char[] text,
                                int offset,
                                int len,
                                Writer writer)
                         throws IOException

Perform an HTML unescape operation on a char[] input.

No additional configuration arguments are required. Unescape operations always perform a complete unescape of NCRs (the whole HTML5 set is supported) as well as of decimal and hexadecimal references.

This method is thread-safe.

Parameters:
text - the char[] to be unescaped.
offset - the position in text at which the unescape operation should start.
len - the number of characters in text that should be unescaped.
writer - the java.io.Writer to which the unescaped result will be written. Nothing will be written at all to this writer if text is null.
Throws:
IOException


Copyright © 2014 The UNBESCAPE team. All rights reserved.