The DOM tree classes in this module wrap the DOM tree returned by the built-in
XML parser and provide routines for:
querying nodes: get the children, grandchildren, attributes, or text of a node
manipulating nodes: modify the attributes or text of a node
restructuring the tree: add or remove child nodes of a node
The DOM tree always has a current context node relative to which all
queries/modifications are executed (gets initialized to the document root).
All matching and building procedures leave the context node
unchanged. Most DOM tree methods accept a contextNode keyword
argument that specifies the target node for a single call.
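
The context node behaviour described above can be sketched on top of minidom. This is a minimal illustration, not the module's implementation: the ContextTree class and its getNodeName method are hypothetical, and the sketch assumes that "document root" means the document element.

```python
from xml.dom import minidom


class ContextTree:
    """Hypothetical stand-in for a DOM tree wrapper with a context node."""

    def __init__(self, xml_text):
        self.document = minidom.parseString(xml_text)
        # Assumption: the context node starts out at the document element.
        self._context = self.document.documentElement

    def getContextNode(self):
        return self._context

    def setContextNode(self, node):
        self._context = node

    def getNodeName(self, contextNode=None):
        # A per-call contextNode overrides the stored context node
        # without changing it.
        node = self._context if contextNode is None else contextNode
        return node.tagName


tree = ContextTree("<root><a/></root>")
root = tree.getContextNode()
child = root.firstChild                      # the <a> element

name_default = tree.getNodeName()            # uses the stored context node
name_override = tree.getNodeName(contextNode=child)
context_unchanged = tree.getContextNode() is root
```

Passing contextNode affects only that one call; the stored context node changes only through setContextNode.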
The PyObjectDomTree class has support for storing arbitrary Python objects
in DOM nodes, which are pickled and unpickled as needed.
Notes:
no validation is performed when parsing XML documents
by default, minidom is used to build the DOM tree. Support for the much
faster cDomlette from Fourthought is not yet fully functional
provides a DOM data object that has methods for
matching/manipulating/building nodes
Detail:
provides a persistent pointer into the
tree (the "context node") which is accessed with the
.setContextNode and .getContextNode methods.
Has three groups of methods:
node manipulation methods
node matching methods
node removal/creation methods
Every node in the DOM tree that has an "id" attribute
is indexed for fast retrieval with the .matchById or
.__getitem__ methods.
Node modifications (e.g., by changing an attribute with the
.setNodeAttribute method) are registered and can be queried
with the .nodeWasModified method. This can be used to check
if a node is still in sync with the source for the DOM tree.
Note that the term "children" (e.g., in the .matchChildren
method) always refers to element child nodes only
(i.e., attribute/text/cdata DOM child nodes are ignored).
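
The meaning of "children" used here can be illustrated with plain minidom. The element_children helper below is a hypothetical name introduced for this sketch; it only shows the filtering idea, not how .matchChildren is implemented.

```python
from xml.dom import minidom


def element_children(node):
    """Return only the element child nodes of node."""
    return [c for c in node.childNodes if c.nodeType == c.ELEMENT_NODE]


doc = minidom.parseString("<root>text<a/><![CDATA[raw]]><b/></root>")
root = doc.documentElement

all_count = len(root.childNodes)        # includes text and CDATA nodes
names = [c.tagName for c in element_children(root)]
```

Here names contains only the tag names of the a and b elements, while root.childNodes also holds the text and CDATA nodes.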
builds a new child node of parentNode (or the context node, if
parentNode is None).
The new child node will be populated as specified in the nodeName
(mandatory) and text and attributes (optional) parameters. See
.buildChildFromNode for insertBefore and index parameters.
Parameters:
nodeName: name string for the new node
text: text string for the new node
attributes: mapping of attribute names to values for the new node
parentNode: the DOM node to serve as the parent for the new node
appends the node node as a child of parentNode. If insertBefore
is true, the node is pre-pended rather than appended to the list
of child nodes of parentNode. If insertBefore is a node, node
will be inserted just before this node in the list of child nodes of
parentNode. If index is false, indexing of the tree starting
with the added child node is suppressed.
Parameters:
node: source DOM node
parentNode: DOM node to serve as a parent for the new node
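
The three insertBefore cases can be sketched with minidom as follows. The add_child function is a hypothetical helper written for this illustration; it leaves out the index parameter and the ID indexing step.

```python
from xml.dom import minidom


def add_child(parent, node, insertBefore=None):
    """Attach node to parent following the three insertBefore cases."""
    if insertBefore is True:
        # Prepend: insert ahead of the current first child.
        # (minidom appends when the reference child is None.)
        parent.insertBefore(node, parent.firstChild)
    elif isinstance(insertBefore, minidom.Node):
        # A node was given: insert just before that sibling.
        parent.insertBefore(node, insertBefore)
    else:
        parent.appendChild(node)
    return node


doc = minidom.parseString("<root><a/><b/></root>")
root = doc.documentElement
b = root.getElementsByTagName("b")[0]

add_child(root, doc.createElement("last"))
add_child(root, doc.createElement("first"), insertBefore=True)
add_child(root, doc.createElement("mid"), insertBefore=b)

order = [c.tagName for c in root.childNodes]
# order == ['first', 'a', 'mid', 'b', 'last']
```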
appends the children of the document element in the XML stream
xmlStream to the parent node parentNode, which defaults to the
context node. Further options are passed on as parse options to
from_xml_stream.
Parameters:
xmlStream: an XML stream object (see pdk.XmlStream module)
returns the content of all text nodes that are children of node
(joined with separator). Strips the content of the individual text
nodes if strip is set.
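
A minimal sketch of this behaviour with minidom is shown below. The function name child_text and its default argument values are assumptions made for the example.

```python
from xml.dom import minidom


def child_text(node, separator="", strip=False):
    """Join the content of the direct text children of node."""
    parts = [c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE]
    if strip:
        # Strip each text node individually before joining.
        parts = [p.strip() for p in parts]
    return separator.join(parts)


doc = minidom.parseString("<p> one <b>bold</b> two </p>")
p = doc.documentElement

raw = child_text(p)
joined = child_text(p, separator="|", strip=True)
# joined == 'one|two'
```

Text inside the nested b element is not included, because only direct text children of the node are considered.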
provides efficient access to the DOM tree node that has the (string) ID
idString. The index is kept current using the .indexNode and
.unindexNode methods.
The empty string passed in as idString matches the document node.
Raises a DomTreeError if no node with the given ID was found.
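
The ID index can be pictured as a dictionary that is kept current by a pair of index/unindex routines. The sketch below is built on minidom under stated assumptions: IdIndex and IdLookupError are hypothetical names (the latter stands in for DomTreeError), and the empty ID string is assumed to return the minidom Document object.

```python
from xml.dom import minidom


class IdLookupError(KeyError):
    """Hypothetical stand-in for the DomTreeError named in the text."""


class IdIndex:
    def __init__(self, document):
        self.document = document
        self._by_id = {}
        self.indexNode(document.documentElement)

    def indexNode(self, node):
        # Register node and all element descendants carrying an "id".
        if node.nodeType != node.ELEMENT_NODE:
            return
        if node.hasAttribute("id"):
            self._by_id[node.getAttribute("id")] = node
        for child in node.childNodes:
            self.indexNode(child)

    def unindexNode(self, node):
        if node.nodeType != node.ELEMENT_NODE:
            return
        # pop() with a default fails silently for nodes never indexed.
        self._by_id.pop(node.getAttribute("id"), None)
        for child in node.childNodes:
            self.unindexNode(child)

    def matchById(self, idString):
        if idString == "":
            return self.document      # assumption: the Document object
        try:
            return self._by_id[idString]
        except KeyError:
            raise IdLookupError(idString) from None

    __getitem__ = matchById


doc = minidom.parseString('<root id="r"><item id="i1"/><item/></root>')
index = IdIndex(doc)
item = index["i1"]          # same as index.matchById("i1")
index.unindexNode(item)     # "i1" can no longer be looked up
```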
attributes: mapping of node attribute names to values
strictNames: boolean
strictValues: boolean
Value:
boolean
Rules for matching:
if nodeName is None, node matches (provided the attribute
check passes; see below);
if attributes is empty, node matches irrespective of its
actual attributes;
if strictNames is set, node matches only if it has
exactly the same attribute names as the keys of the dictionary
attributes.
If strictValues is also set, the corresponding values are also
checked.
If only strictValues is set, only the values given in
attributes are checked.
Note that this is the only matching function that operates directly on
a node.
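
The matching rules can be written down as a small predicate over a minidom element. This is a sketch of one reading of the rules, not the module's code: node_matches is a hypothetical name, and the case where neither strictNames nor strictValues is set is not described in the text, so the sketch assumes that only the presence of the given attribute names is required.

```python
from xml.dom import minidom


def node_matches(node, nodeName=None, attributes=None,
                 strictNames=False, strictValues=False):
    # A nodeName of None matches any element name.
    if nodeName is not None and node.tagName != nodeName:
        return False
    # An empty mapping matches irrespective of the actual attributes.
    if not attributes:
        return True
    actual = dict(node.attributes.items())
    # strictNames: the attribute names must be exactly the given keys.
    if strictNames and set(actual) != set(attributes):
        return False
    # strictValues: the values given in attributes must agree.
    if strictValues:
        return all(actual.get(k) == v for k, v in attributes.items())
    # Assumption (not stated in the text): without strictValues only
    # the presence of the given names is required.
    return all(k in actual for k in attributes)


node = minidom.parseString('<item id="1" kind="x"/>').documentElement

any_name = node_matches(node)
subset_values = node_matches(node, "item", {"id": "1"}, strictValues=True)
exact_names = node_matches(node, "item", {"id": "1"}, strictNames=True)
wrong_value = node_matches(node, "item", {"id": "2", "kind": "x"},
                           strictNames=True, strictValues=True)
```

In this example exact_names is false because the element also carries a kind attribute, while subset_values is true because only the listed value is compared.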
removes the child(ren) of parentNode (the context node if
parentNode is None) specified by nodeName, text, and
attributes using the .matchChildren method. If unique is true,
an error is raised if more than one child matches; otherwise, all
matching children are removed.
Parameters:
nodeName: name string of the node(s) to remove
text: text content string of the node(s) to remove
attributes: attribute name to value mapping of the node(s) to
remove
parentNode: parent DOM node of the child node(s) to be removed
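
The effect of the unique flag can be sketched as follows. The remove_children helper is hypothetical, matches by node name only (text and attribute matching are left out), and raises ValueError as a stand-in for the module's own error type.

```python
from xml.dom import minidom


def remove_children(parent, nodeName, unique=False):
    """Remove the element children of parent named nodeName."""
    matches = [c for c in parent.childNodes
               if c.nodeType == c.ELEMENT_NODE and c.tagName == nodeName]
    if unique and len(matches) > 1:
        # Stand-in for the module's error; nothing is removed in this case.
        raise ValueError("child to be removed not specified uniquely "
                         "in remove operation")
    for child in matches:
        parent.removeChild(child)
    return len(matches)


root = minidom.parseString("<root><a/><a/><b/></root>").documentElement

try:
    remove_children(root, "a", unique=True)
    raised = False
except ValueError:
    raised = True

removed = remove_children(root, "a")          # removes both <a> children
remaining = [c.tagName for c in root.childNodes]
```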
Note that this, contrary to the 4DOM implementation, removes any
existing attribute named attributeName prior to setting it to the new
value attributeValue (4DOM allows two attribute nodes with the same
name; we don't).
removes the node node (and all its children) from the ID index.
Fails silently if the node id cannot be found (i.e., the node has not
been indexed before). If a walker instance is supplied, it should
adhere to the interface of ElementWalker.
updates only the existing attributes of node. Fails silently if a key
in attributeD is not found in the attributes of node unless
strict is set, in which case an AttributeError is raised.
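
A minimal sketch of this update rule with minidom is given below. The function name update_attributes is an assumption; the parameter names follow the text.

```python
from xml.dom import minidom


def update_attributes(node, attributeD, strict=False):
    """Update only those attributes that node already has."""
    for name, value in attributeD.items():
        if node.hasAttribute(name):
            node.setAttribute(name, value)
        elif strict:
            raise AttributeError("node has no attribute %r" % name)
        # Otherwise the unknown key is skipped silently.


node = minidom.parseString('<item a="1"/>').documentElement
update_attributes(node, {"a": "2", "b": "3"})   # "b" is ignored

try:
    update_attributes(node, {"b": "3"}, strict=True)
    strict_raised = False
except AttributeError:
    strict_raised = True
```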
CODES = {
    'empty_tree': ('A node could not be found because the DOM tree has no indexed nodes', ''),
    'id_not_found': ('No node matching the given id string was found.', ''),
    'key_not_found': ('No node matching the DOM tree key was found.', ''),
    'nonunique_remove': ('child to be removed not specified uniquely in remove operation', ''),
}
a DOM tree supporting automatic pickling of Python objects to XML
Detail:
Python objects can be stored either as an attribute of or as
text in an element node. Either the node receiving the Python
object or one of its ancestors needs to provide a (unique) "id"
attribute; internally, the node will then be referenced as
follows:
attribute referenced:
<node id>[_<child element name>]*[_attribute name]
value referenced:
<node id>[_<child element name>]*"_value"
Hence it is required that the string built from concatenating
the node id and the names of all child nodes along the path to
the current node uniquely identifies the attribute/value.
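
The reference scheme above can be illustrated by building such a key from a minidom element. The reference_key function is a hypothetical helper for this sketch; it assumes that the nearest element carrying an "id" attribute (the node itself or an ancestor) anchors the key.

```python
from xml.dom import minidom


def reference_key(node, attributeName=None):
    """Build '<node id>_<child names...>_<attribute name or "value">'."""
    names = []
    current = node
    # Walk up until an element carrying an "id" attribute is found.
    while current is not None and current.nodeType == current.ELEMENT_NODE:
        if current.hasAttribute("id"):
            parts = [current.getAttribute("id")] + names[::-1]
            parts.append(attributeName if attributeName else "value")
            return "_".join(parts)
        names.append(current.tagName)
        current = current.parentNode
    raise ValueError("no enclosing element provides an 'id' attribute")


doc = minidom.parseString('<cfg id="main"><db><port/></db></cfg>')
port = doc.getElementsByTagName("port")[0]

attr_key = reference_key(port, "timeout")
value_key = reference_key(port)
# attr_key == 'main_db_port_timeout', value_key == 'main_db_port_value'
```

Two sibling elements with the same name under the same ID would produce the same key, which is why the concatenated path has to be unique.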
Note that, unlike in the base class, any access to a node
attribute or value that had been assigned a Python object will
result in an attempt to restore this object via a call to eval()
in the namespaces returned by the .getLocalNamespace and
.getGlobalNamespace methods. Also, strings representing "atomic"
data types (bool, int, float) are automatically
converted to the corresponding Python objects (this implies that
double quotes are needed to specify the string "1" in an XML
source!).
By convention, a string starting and ending with a "@"
character assigned as a node attribute or value is also
interpreted as a Python object upon read access. This allows
references to Python objects in the runtime namespace to be made
in any XML source (e.g., a file).
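
The read-access conventions above can be sketched as a single conversion function. This is an illustration under assumptions, not the class's actual logic: the restore function and its arguments are hypothetical, atomic values are converted with explicit int/float/bool checks, and only the "@" form is passed to eval(). Note that eval() executes whatever expression the XML source contains.

```python
def restore(raw, globalNamespace=None, localNamespace=None):
    """Turn a raw attribute/text string into a Python object."""
    globalNamespace = {} if globalNamespace is None else globalNamespace
    localNamespace = {} if localNamespace is None else localNamespace
    if len(raw) > 2 and raw.startswith("@") and raw.endswith("@"):
        # "@expr@" refers to a Python object in the runtime namespaces.
        return eval(raw[1:-1], globalNamespace, localNamespace)
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        # Explicitly quoted: keep it a string, e.g. '"1"' gives '1'.
        return raw[1:-1]
    for convert in (int, float):
        try:
            return convert(raw)
        except ValueError:
            pass
    if raw in ("True", "False"):
        return raw == "True"
    return raw


limits = {"max_size": 10}
restored = [
    restore("1"),                                        # int
    restore('"1"'),                                      # quoted string
    restore("2.5"),                                      # float
    restore("True"),                                     # bool
    restore("@limits['max_size']@", {"limits": limits}), # namespace lookup
]
```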
Internally, Python objects are converted to strings with the
one-liner