xpath

Version: v1.3.4 | Published: Apr 5, 2025 | License: MIT | Imports: 12 | Imported by: 183

Repository

github.com/antchfx/xpath

README ¶

XPath

XPath is a Go package for selecting nodes from XML, HTML, or other documents using XPath expressions.

Implementation

htmlquery - an XPath query package for HTML documents.
xmlquery - an XPath query package for XML documents.
jsonquery - an XPath query package for JSON documents.

Supported Features

The basic XPath patterns.

The basic XPath patterns cover 90% of the cases that most stylesheets will need.

node : Selects all child elements with nodeName of node.
* : Selects all child elements.
@attr : Selects the attribute attr.
@* : Selects all attributes.
node() : Matches any node.
text() : Matches a text node.
comment() : Matches a comment.
. : Selects the current node.
.. : Selects the parent of current node.
/ : Selects the document node.
a[expr] : Selects only those nodes matching a that also satisfy the expression expr.
a[n] : Selects the nth node matching a. When a filter's expression is a number, XPath selects based on position.
a/b : For each node matching a, add the nodes matching b to the result.
a//b : For each node matching a, add the descendant nodes matching b to the result.
//b : Returns elements in the entire document matching b.
a|b : All nodes matching a or b; a union operation (not boolean or).
(a, b, c) : Evaluates each of its operands and concatenates the resulting sequences, in order, into a single result sequence.
(a/b) : Selects all matching nodes as a grouped set.

Node Axes

child::* : The child axis selects children of the current node.
- child::node(): Selects all the children of the context node.
- child::text(): Selects all text node children of the context node.
descendant::* : The descendant axis selects descendants of the current node. It is similar to '//', which is short for '/descendant-or-self::node()/'.
descendant-or-self::* : Selects descendants including the current node.
attribute::* : Selects attributes of the current element. It is equivalent to @*
following-sibling::* : Selects sibling nodes after the current node.
preceding-sibling::* : Selects sibling nodes before the current node.
following::* : Selects all matching nodes that follow the current node in document order, excluding descendants.
preceding::* : Selects all matching nodes that precede the current node in document order, excluding ancestors.
parent::* : Selects the parent if it matches. The '..' pattern from the core is equivalent to 'parent::node()'.
ancestor::* : Selects matching ancestors.
ancestor-or-self::* : Selects ancestors including the current node.
self::* : Selects the current node. '.' is equivalent to 'self::node()'.

Expressions

The package supports three value types: number, boolean, and string.

path : Selects nodes based on the path.
a = b : Standard comparisons.
- a = b : True if a equals b.
- a != b : True if a is not equal to b.
- a < b : True if a is less than b.
- a <= b : True if a is less than or equal to b.
- a > b : True if a is greater than b.
- a >= b : True if a is greater than or equal to b.
a + b : Arithmetic expressions.
- -a : Unary minus
- a + b : Addition
- a - b : Subtraction
- a * b : Multiplication
- a div b : Division
- a mod b : Modulus (division remainder)
a or b : Boolean or operation.
a and b : Boolean and operation.
(expr) : Parenthesized expressions.
fun(arg1, ..., argn) : Function calls:

Function	Supported
`boolean()`	✓
`ceiling()`	✓
`choose()`	✗
`concat()`	✓
`contains()`	✓
`count()`	✓
`current()`	✗
`document()`	✗
`element-available()`	✗
`ends-with()`	✓
`false()`	✓
`floor()`	✓
`format-number()`	✗
`function-available()`	✗
`generate-id()`	✗
`id()`	✗
`key()`	✗
`lang()`	✗
`last()`	✓
`local-name()`	✓
`lower-case()`[^1]	✓
`matches()`	✓
`name()`	✓
`namespace-uri()`	✓
`normalize-space()`	✓
`not()`	✓
`number()`	✓
`position()`	✓
`replace()`	✓
`reverse()`	✓
`round()`	✓
`starts-with()`	✓
`string()`	✓
`string-join()`[^1]	✓
`string-length()`	✓
`substring()`	✓
`substring-after()`	✓
`substring-before()`	✓
`sum()`	✓
`system-property()`	✗
`translate()`	✓
`true()`	✓
`unparsed-entity-uri()`	✗

[^1]: XPath-2.0 expression

Documentation ¶

Overview ¶

Example ¶

XPath package example. For packages that implement XPath queries on top of this package, see: https://github.com/antchfx/htmlquery https://github.com/antchfx/xmlquery https://github.com/antchfx/jsonquery

package main

import (
	"bytes"
	"fmt"

	"github.com/antchfx/xpath"
)

type NodeType uint

const (
	DocumentNode NodeType = iota
	ElementNode
	TextNode
)

type Node struct {
	Parent, FirstChild, LastChild, PrevSibling, NextSibling *Node

	Type NodeType
	Data string
}

func (n *Node) Value() string {
	if n.Type == TextNode {
		return n.Data
	}

	var buff bytes.Buffer
	var output func(*Node)
	output = func(node *Node) {
		if node.Type == TextNode {
			buff.WriteString(node.Data)
		}
		for child := node.FirstChild; child != nil; child = child.NextSibling {
			output(child)
		}
	}
	output(n)
	return buff.String()
}

func (parent *Node) AddChild(n *Node) {
	n.Parent = parent
	n.NextSibling = nil
	if parent.FirstChild == nil {
		parent.FirstChild = n
		n.PrevSibling = nil
	} else {
		parent.LastChild.NextSibling = n
		n.PrevSibling = parent.LastChild
	}

	parent.LastChild = n
}

type NodeNavigator struct {
	curr, root *Node
}

func (n *NodeNavigator) NodeType() xpath.NodeType {
	switch n.curr.Type {
	case TextNode:
		return xpath.TextNode
	case DocumentNode:
		return xpath.RootNode
	}
	return xpath.ElementNode
}

func (n *NodeNavigator) LocalName() string {
	return n.curr.Data
}

func (n *NodeNavigator) Prefix() string {
	return ""
}

func (n *NodeNavigator) NamespaceURL() string {
	return ""
}

func (n *NodeNavigator) Value() string {
	switch n.curr.Type {
	case ElementNode:
		return n.curr.Value()
	case TextNode:
		return n.curr.Data
	}
	return ""
}

func (n *NodeNavigator) Copy() xpath.NodeNavigator {
	n2 := *n
	return &n2
}

func (n *NodeNavigator) MoveToRoot() {
	n.curr = n.root
}

func (n *NodeNavigator) MoveToParent() bool {
	if node := n.curr.Parent; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveToNextAttribute() bool {
	// Nodes in this example carry no attributes, so there is never one to move to.
	return false
}

func (n *NodeNavigator) MoveToChild() bool {
	if node := n.curr.FirstChild; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveToFirst() bool {
	for {
		node := n.curr.PrevSibling
		if node == nil {
			break
		}
		n.curr = node
	}
	return true
}

func (n *NodeNavigator) String() string {
	return n.Value()
}

func (n *NodeNavigator) MoveToNext() bool {
	if node := n.curr.NextSibling; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveToPrevious() bool {
	if node := n.curr.PrevSibling; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveTo(other xpath.NodeNavigator) bool {
	node, ok := other.(*NodeNavigator)
	if !ok || node.root != n.root {
		return false
	}

	n.curr = node.curr
	return true
}

// XPath package example. For packages that implement XPath queries on top of this package, see:
// https://github.com/antchfx/htmlquery
// https://github.com/antchfx/xmlquery
// https://github.com/antchfx/jsonquery
func main() {
	/***
	<?xml version="1.0" encoding="UTF-8"?>
	<bookstore>
		<book>
			<title>Everyday Italian</title>
			<author>Giada De Laurentiis</author>
			<year>2005</year>
			<price>30.00</price>
		</book>
		<book>
			<title>Harry Potter</title>
			<author>J K. Rowling</author>
			<year>2005</year>
			<price>29.99</price>
		</book>
	</bookstore>
	**/

	// Build the in-memory document tree that the queries below run against.
	books := []struct {
		title  string
		author string
		year   int
		price  float64
	}{
		{title: "Everyday Italian", author: "Giada De Laurentiis", year: 2005, price: 30.00},
		{title: "Harry Potter", author: "J K. Rowling", year: 2005, price: 29.99},
	}
	bookstore := &Node{Data: "bookstore", Type: ElementNode}
	for _, v := range books {
		book := &Node{Data: "book", Type: ElementNode}
		title := &Node{Data: "title", Type: ElementNode}
		title.AddChild(&Node{Data: v.title, Type: TextNode})
		book.AddChild(title)
		author := &Node{Data: "author", Type: ElementNode}
		author.AddChild(&Node{Data: v.author, Type: TextNode})
		book.AddChild(author)
		year := &Node{Data: "year", Type: ElementNode}
		year.AddChild(&Node{Data: fmt.Sprintf("%d", v.year), Type: TextNode})
		book.AddChild(year)
		price := &Node{Data: "price", Type: ElementNode}
		price.AddChild(&Node{Data: fmt.Sprintf("%f", v.price), Type: TextNode})
		book.AddChild(price)
		bookstore.AddChild(book)
	}
	var doc = &Node{}
	doc.AddChild(bookstore)
	var root xpath.NodeNavigator = &NodeNavigator{curr: doc, root: doc}
	// using Compile() and the Evaluate() method
	expr, err := xpath.Compile("count(//book)")
	if err != nil {
		panic(err)
	}
	val := expr.Evaluate(root) // count() yields a float64
	fmt.Println(val.(float64))

	// using Evaluate() method
	expr = xpath.MustCompile("sum(//price)")
	val = expr.Evaluate(root) // output total price
	fmt.Println(val.(float64))

	// using Select() method
	expr = xpath.MustCompile("//book")
	iter := expr.Select(root) // it always returns NodeIterator object.
	for iter.MoveNext() {
		fmt.Println(iter.Current().Value())
	}
}

Index ¶

Variables
func NewLoadingCache(load loadFunc, capacity int) *loadingCache
type Expr
type NodeIterator
- func Select(root NodeNavigator, expr string) *NodeIterator
- func (t *NodeIterator) Current() NodeNavigator
- func (t *NodeIterator) MoveNext() bool
type NodeNavigator
type NodeType

Constants ¶

This section is empty.

Variables ¶

var (
	// RegexpCache is a loading cache for string -> *regexp.Regexp mapping. It is exported so that in rare cases
	// client can customize load func and/or capacity.
	RegexpCache = defaultRegexpCache()
)

Functions ¶

func NewLoadingCache ¶ added in v1.1.11

func NewLoadingCache(load loadFunc, capacity int) *loadingCache

NewLoadingCache creates a new instance of a loading cache with capacity. Capacity must be >= 0, or it will panic. Capacity == 0 means the cache growth is unbounded.

Types ¶

type Expr ¶

type Expr struct {
	// contains filtered or unexported fields
}

Expr is an XPath expression for query.

func Compile ¶

func Compile(expr string) (*Expr, error)

Compile compiles an XPath expression string.

func CompileWithNS ¶ added in v1.2.4

func CompileWithNS(expr string, namespaces map[string]string) (*Expr, error)

CompileWithNS compiles an XPath expression string, using given namespaces map.

func MustCompile ¶

func MustCompile(expr string) *Expr

MustCompile compiles an XPath expression string and ignores any compile error.

func (*Expr) Evaluate ¶

func (expr *Expr) Evaluate(root NodeNavigator) interface{}

Evaluate returns the result of the expression. The result type is one of the following: bool, float64, string, or NodeIterator.
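Because Evaluate returns interface{}, an unchecked assertion such as val.(float64) panics when the expression yields a different type. A type switch is a safer pattern. The sketch below is standalone and uses stand-in values rather than calling the package; `describe` is a hypothetical helper, and the node-set case is left to the default branch so that the example needs only the standard library.

```go
package main

import "fmt"

// describe formats a value of the kinds Evaluate is documented to return.
func describe(v interface{}) string {
	switch r := v.(type) {
	case bool:
		return fmt.Sprintf("boolean %t", r)
	case float64:
		return fmt.Sprintf("number %g", r)
	case string:
		return fmt.Sprintf("string %q", r)
	default:
		// A node-set result (NodeIterator) would be handled in its own case
		// in real code; it is omitted here to stay standard-library only.
		return fmt.Sprintf("other %T", r)
	}
}

func main() {
	// Stand-in values; in real use each would come from expr.Evaluate(root).
	for _, v := range []interface{}{true, 2.0, "Harry Potter"} {
		fmt.Println(describe(v))
	}
}
```

In real code, expressions such as count(//book) would land in the float64 case and expressions such as string(//title) in the string case.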

func (*Expr) Select ¶

func (expr *Expr) Select(root NodeNavigator) *NodeIterator

Select selects a node set using the specified XPath expression.

func (*Expr) String ¶

func (expr *Expr) String() string

String returns XPath expression string.

type NodeIterator ¶

type NodeIterator struct {
	// contains filtered or unexported fields
}

NodeIterator holds all matched nodes.

func Select ¶

func Select(root NodeNavigator, expr string) *NodeIterator

Select selects a node set using the specified XPath expression. This function is deprecated; use the Expr.Select() method instead.

func (*NodeIterator) Current ¶

func (t *NodeIterator) Current() NodeNavigator

Current returns the current matched node.

func (*NodeIterator) MoveNext ¶

func (t *NodeIterator) MoveNext() bool

MoveNext moves the navigator to the next matching node.

type NodeNavigator ¶

type NodeNavigator interface {
	// NodeType returns the XPathNodeType of the current node.
	NodeType() NodeType

	// LocalName gets the Name of the current node.
	LocalName() string

	// Prefix returns namespace prefix associated with the current node.
	Prefix() string

	// Value gets the value of current node.
	Value() string

	// Copy does a deep copy of the NodeNavigator and all its components.
	Copy() NodeNavigator

	// MoveToRoot moves the NodeNavigator to the root node of the current node.
	MoveToRoot()

	// MoveToParent moves the NodeNavigator to the parent node of the current node.
	MoveToParent() bool

	// MoveToNextAttribute moves the NodeNavigator to the next attribute on current node.
	MoveToNextAttribute() bool

	// MoveToChild moves the NodeNavigator to the first child node of the current node.
	MoveToChild() bool

	// MoveToFirst moves the NodeNavigator to the first sibling node of the current node.
	MoveToFirst() bool

	// MoveToNext moves the NodeNavigator to the next sibling node of the current node.
	MoveToNext() bool

	// MoveToPrevious moves the NodeNavigator to the previous sibling node of the current node.
	MoveToPrevious() bool

	// MoveTo moves the NodeNavigator to the same position as the specified NodeNavigator.
	MoveTo(NodeNavigator) bool
}

NodeNavigator provides a cursor model for navigating XML data.

type NodeType ¶

type NodeType int

NodeType represents a type of XPath node.

const (
	// RootNode is a root node of the XML document or node tree.
	RootNode NodeType = iota

	// ElementNode is an element, such as <element>.
	ElementNode

	// AttributeNode is an attribute, such as id='123'.
	AttributeNode

	// TextNode is the text content of a node.
	TextNode

	// CommentNode is a comment node, such as <!-- my comment -->
	CommentNode
)
