A DOM node is an object whose properties hold information about the node itself and its contents. Some of these properties are read-only, and some can be updated on the fly.
Structure and content properties
nodeType
All nodes are typed. There are 12 node types in total, described in DOM Level 1.
interface Node {
// NodeType
const unsigned short ELEMENT_NODE = 1;
const unsigned short ATTRIBUTE_NODE = 2;
const unsigned short TEXT_NODE = 3;
const unsigned short CDATA_SECTION_NODE = 4;
const unsigned short ENTITY_REFERENCE_NODE = 5;
const unsigned short ENTITY_NODE = 6;
const unsigned short PROCESSING_INSTRUCTION_NODE = 7;
const unsigned short COMMENT_NODE = 8;
const unsigned short DOCUMENT_NODE = 9;
const unsigned short DOCUMENT_TYPE_NODE = 10;
const unsigned short DOCUMENT_FRAGMENT_NODE = 11;
const unsigned short NOTATION_NODE = 12;
...
}
The most important ones are ELEMENT_NODE with number 1 and TEXT_NODE, which has number 3. Other types are rarely used.
For example, to list all nodes while skipping non-elements, one can iterate over childNodes and check childNodes[i].nodeType != 1.
That is demonstrated in the example below:
<body>
<div>Allowed readers:</div>
<ul>
<li>John</li>
<li>Bob</li>
</ul>
<!-- a comment node -->
<script>
var childNodes = document.body.childNodes
for(var i=0; i<childNodes.length; i++) {
if (childNodes[i].nodeType != 1) continue
alert(childNodes[i])
}
</script>
</body>
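The same nodeType check can be wrapped in a small reusable helper. The sketch below is only an illustration: the name getElementChildren is hypothetical, and the function accepts any object with a childNodes list, so it can also be tried outside a browser.

```javascript
// Hypothetical helper: returns only the element children of a node.
function getElementChildren(parent) {
  var result = []
  var childNodes = parent.childNodes
  for (var i = 0; i < childNodes.length; i++) {
    // 1 is ELEMENT_NODE; text nodes (3) and comments (8) are skipped
    if (childNodes[i].nodeType != 1) continue
    result.push(childNodes[i])
  }
  return result
}

// In a browser: show the names of the element children of <body>
if (typeof document != 'undefined') {
  var elems = getElementChildren(document.body)
  for (var i = 0; i < elems.length; i++) alert(elems[i].nodeName)
}
```

Returning an array instead of alerting inside the loop keeps the filtering separate from whatever is done with the elements afterwards.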
What does the page alert?
<!DOCTYPE HTML>
<html>
<body>
<script>
alert(document.body.lastChild.nodeType)
</script>
</body>
</html>
The minor pitfall is that at the time of script execution, the last child is the SCRIPT itself.
So, the result is 1, the element node.
nodeName, tagName
Both nodeName and tagName contain the name of an element node.
For document.body:
alert( document.body.nodeName ) // BODY
In HTML any nodeName is uppercased, no matter which case you use in the document.
There is a rare, exceptional case when nodeName is not uppercased. Read on only if you’re curious.
As you probably know, a browser has two modes of parsing: HTML-mode and XML-mode. Usually, HTML-mode is used, but for XML documents received via XMLHttpRequest (an AJAX technique), XML-mode is enabled.
In Firefox, XML-mode is also used when an XHTML document is served with an XML Content-Type (for example, application/xhtml+xml).
In XML-mode nodeName preserves case, so there may appear “body” and “bOdY” nodeNames.
So, if one loads XML from the server with XMLHttpRequest and transfers XML nodes into the HTML document, the case will be kept “as is”.
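Because of this, code that may receive nodes from both HTML and XML documents is safer when it compares names case-insensitively. A minimal sketch, where isTag is a hypothetical helper name rather than a built-in:

```javascript
// Hypothetical helper: compares a node's name with a tag name, ignoring case.
function isTag(node, name) {
  return node.nodeName.toLowerCase() == name.toLowerCase()
}

// In a browser, assuming the script runs inside <body>
if (typeof document != 'undefined') {
  alert( isTag(document.body, 'body') ) // true
}
```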
For element nodes, nodeName and tagName are the same.
But nodeName property also exists on non-element nodes. It has special values on such nodes, like in the example below:
alert(document.nodeName) // #document
The tagName property is undefined on most node types and equals '!' for comment nodes in IE.
So, generally tagName is less informative than nodeName. But it is one character shorter. So, if you are working with element nodes only, feel free to prefer it over nodeName.
innerHTML
The innerHTML property is part of the HTML5 standard.
It gives access to the contents of a node in text form. The example below outputs all the contents of document.body and then replaces them with new content.
<body>
<p>The paragraph</p>
<div>And a div</div>
<script>
alert( document.body.innerHTML ) // read current contents
document.body.innerHTML = 'Yaaahooo!' // replace contents
</script>
</body>
The value assigned to innerHTML should be valid HTML. But usually the browser can parse malformed HTML as well.
The innerHTML property works for any element node. It’s very, very useful.
innerHTML pitfalls
innerHTML is not as simple as it may seem. There are a number of pitfalls awaiting a newbie, and sometimes even an experienced programmer.
In Internet Explorer, innerHTML is read-only for COL, COLGROUP, FRAMESET, HEAD, HTML, STYLE, TABLE, TBODY, TFOOT, THEAD, TITLE, TR.
In particular, in IE innerHTML is read-only for all table tags except TD.
Syntactically, it is possible to append to innerHTML with elem.innerHTML += "New text", like below:
chatDiv.innerHTML += "<div>Hi <img src='smile.gif'/> !</div>"
chatDiv.innerHTML += "How you doing?"
But here is what actually happens:
- The old content is wiped.
- The new innerHTML value is parsed and inserted.
The content is not appended, it is re-created. So, all images and other resources will be reloaded after +=, including the smile.gif in the example above.
Fortunately, there are other ways to update content, which make no use of innerHTML and don’t have that issue.
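One such way is to create the new nodes with DOM methods and attach them with appendChild, which leaves the existing children untouched. The sketch below assumes the message is plain text rather than HTML markup. The name appendMessage and the id "chat" are hypothetical, chosen only for this illustration.

```javascript
// Hypothetical helper: appends a text message without re-creating existing children.
// doc is the document used to create nodes, container is the element to append to.
function appendMessage(doc, container, text) {
  var div = doc.createElement('div')
  div.appendChild(doc.createTextNode(text)) // the text is not parsed as HTML
  container.appendChild(div)
  return div
}

// In a browser, assuming an element with id="chat" exists (an assumption of this sketch)
if (typeof document != 'undefined') {
  var chatDiv = document.getElementById('chat')
  if (chatDiv) appendMessage(document, chatDiv, 'How you doing?')
}
```

Since nothing already inside the container is wiped and parsed again, an image such as smile.gif is not reloaded.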
nodeValue
The innerHTML works only for element nodes.
For other types of nodes, there is a nodeValue property, which keeps the content.
The example below demonstrates how it works for text nodes and comments:
<body>
The text
<!-- A comment -->
<script>
for(var i=0; i<document.body.childNodes.length; i++) {
alert(document.body.childNodes[i].nodeValue)
}
</script>
</body>
In the example above, a few alerts are empty because of whitespace text nodes. Also note that nodeValue === null for SCRIPT. That’s because SCRIPT is an element node. For element nodes, use innerHTML.
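Choosing between the two properties can be done by checking nodeType first. A minimal sketch, where getContent is a hypothetical helper name:

```javascript
// Hypothetical helper: returns the content of a node, picking the right property.
function getContent(node) {
  // Element nodes (type 1) keep their content in innerHTML
  if (node.nodeType == 1) return node.innerHTML
  // Text nodes, comments and other node types use nodeValue
  return node.nodeValue
}

// In a browser: show the content of every child of <body>
if (typeof document != 'undefined') {
  var nodes = document.body.childNodes
  for (var i = 0; i < nodes.length; i++) alert( getContent(nodes[i]) )
}
```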
Summary
- nodeType: Type of the node. Most notable types are 1 for elements and 3 for text nodes. Read-only.
- nodeName/tagName: Tag name in upper case. nodeName also has special values for non-element nodes. Read-only.
- innerHTML: Content of an element node. Writeable.
- nodeValue: Content of a text node. Writeable.
DOM nodes also have other properties, depending on the tag. For example, an INPUT element has value and checked properties, an A element has href, and so on.