Selectors

Modified on Tue, 06 Sep 2022 at 02:31 AM

Part of the core of the Kodexa platform is the ability to use selectors to find content within our document structure.

A selector works in a similar way to a CSS selector or an XPath.  It allows you to build a query that can be executed on an instance of a document.


For example if you wanted find all content nodes with the value “Name” you can do:

document.select('//*[contentRegex("Name")]')

This would return an iterator of the content nodes (see the What is a Kodexa Document?)


Selectors are a powerful way to work and offer a number of clever features, the syntax is broken into a few parts:


Axis & Node Type 

The axis is used to define how we are going to navigate the tree structure to identify the node types we want to match. Some basic examples are:


//Root node and all children
/Root node
.Current Node  (or root if from the document)
./line/.All nodes of type line under the current node
parent::lineAny node in the parent structure of this node that is of node type line



Predicate 

The predicate for the nodes selected with the axis and node type is in square brackets.  It can be made up a functions, attributes and operators to allow you to further filter the nodes that you want to select.


Functions can be used in the predicate, these are:


contentRegexReturn true if a regular expression on the content of the node matches

regularExpression: The regular expression to use
includeChildren: True if you want to include the children of this node for the expression (optional, default false)
typeRegex Return true if a regular expression on the node type name matches

regularExpression: The regular expression to use
tagRegexReturn true a regular expression on the tag name matches

regularExpression: The regular expression to use
hasTagReturn true if the tag has the given name

tagName: The name of the tag
hasFeatureReturn true if a feature on the node has a value matching the provided type and name

featureType: The feature type
featureName: The name of the feature
hasFeatureValueReturn true if a feature on the node has a value matching the provided type, name and value

featureType: The feature type
featureName: The name of the feature
featureValue: The value of the feature
contentReturns the content of the node
uuidReturns the UUID of the node
node_typeReturns the node type of the node
indexReturns the index of the node 


Also we support operators to allow you to combine functions, these are: 


|Union the results of two sides
=Test that two sides are equal
!=Test that two sides are not equal
intersectReturn the intersection of the two sides
andBoolean AND operation on the two sides
orBoolean OR operation on the two sides


Pipelines 


Another concept that is available in selectors is a “pipeline”.


This allows use to chain together a list of selectors, where the results of the first selector are passed into the next selector. This can be a useful way to build more complex queries against document structures.


For example:

//word stream *[hasTag("ORG")] stream *[hasTag("PERSON")]

In the example above we stream all nodes of type word and then filter those that have the tag ORG, then filter those that have the tag PERSON.


Examples


Below are some examples of how you can use selectors:


*[typeRegex("te.*")]
hasTag() = false() 
*[contentRegex("Hello.*")]
*[content()="Hello World"]
*[contentRegex("Cheese.*",true)] 
//*[typeRegex("te.*") and contentRegex("H.*D")] 
//*[hasTag($entityName)]
//p stream *[hasTag("ORG")] stream *[hasTag("PERSON")]
//p intersect //*[hasTag("ORG")] 
//*[hasFeature()]

 

Was this article helpful?

That’s Great!

Thank you for your feedback

Sorry! We couldn't be helpful

Thank you for your feedback

Let us know how can we improve this article!

Select atleast one of the reasons

Feedback sent

We appreciate your effort and will try to fix the article