xpath

v1.3.4
Published: Apr 5, 2025 License: MIT Imports: 12 Imported by: 183

README

XPath

XPath is a Go package that selects nodes from XML, HTML, or other documents using XPath expressions.

Implementation

  • htmlquery - an XPath query package for HTML documents.

  • xmlquery - an XPath query package for XML documents.

  • jsonquery - an XPath query package for JSON documents.

Supported Features

The basic XPath patterns.

The basic XPath patterns cover 90% of the cases that most stylesheets will need.

  • node : Selects all child elements with nodeName of node.

  • * : Selects all child elements.

  • @attr : Selects the attribute attr.

  • @* : Selects all attributes.

  • node() : Matches any node.

  • text() : Matches a text node.

  • comment() : Matches a comment.

  • . : Selects the current node.

  • .. : Selects the parent of the current node.

  • / : Selects the document node.

  • a[expr] : Selects only those nodes matching a which also satisfy the expression expr.

  • a[n] : Selects the nth node matching a. When a filter's expression is a number, XPath selects based on position.

  • a/b : For each node matching a, add the nodes matching b to the result.

  • a//b : For each node matching a, add the descendant nodes matching b to the result.

  • //b : Returns elements in the entire document matching b.

  • a|b : All nodes matching a or b; a union operation (not a boolean or).

  • (a, b, c) : Evaluates each of its operands and concatenates the resulting sequences, in order, into a single result sequence.

  • (a/b) : Selects all matching nodes as a grouping set.
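
The positional filter a[n] counts from 1, and it counts only the nodes that match a. The sketch below illustrates that rule on a plain Go slice. The element type and the nth helper are hypothetical names chosen for illustration; they are not part of this package, and this is not how the package evaluates predicates internally.

```go
package main

import "fmt"

// element is a hypothetical stand-in for a child node.
type element struct {
	name, text string
}

// nth mimics the pattern name[n]: it returns the n-th child (1-based)
// among those whose name matches, or false when there is no such child.
func nth(children []element, name string, n int) (element, bool) {
	pos := 0
	for _, c := range children {
		if c.name != name {
			continue
		}
		pos++ // positions count matching nodes only
		if pos == n {
			return c, true
		}
	}
	return element{}, false
}

func main() {
	children := []element{
		{"a", "first"},
		{"b", "skipped"},
		{"a", "second"},
	}
	if e, ok := nth(children, "a", 2); ok {
		fmt.Println(e.text) // prints "second"
	}
}
```

Note that a[0] never matches anything under this rule, because the first matching node has position 1.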

Node Axes
  • child::* : The child axis selects children of the current node.

    • child::node(): Selects all the children of the context node.
    • child::text(): Selects all text node children of the context node.
  • descendant::* : The descendant axis selects descendants of the current node. It is equivalent to '//'.

  • descendant-or-self::* : Selects descendants including the current node.

  • attribute::* : Selects attributes of the current element. It is equivalent to @*

  • following-sibling::* : Selects nodes after the current node.

  • preceding-sibling::* : Selects nodes before the current node.

  • following::* : Selects all nodes that follow the current node in document order, excluding descendants.

  • preceding::* : Selects all nodes that precede the current node in document order, excluding ancestors.

  • parent::* : Selects the parent if it matches. The '..' pattern from the core is equivalent to 'parent::node()'.

  • ancestor::* : Selects matching ancestors.

  • ancestor-or-self::* : Selects ancestors including the current node.

  • self::* : Selects the current node. '.' is equivalent to 'self::node()'.
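
To make the axes above more concrete, here is a minimal sketch of three of them (following-sibling, ancestor, and descendant) written as plain traversals over a linked tree. The node type and helper functions are hypothetical and exist only for this illustration; the package itself works through the NodeNavigator interface instead.

```go
package main

import "fmt"

// node is a hypothetical minimal tree node with parent and sibling links.
type node struct {
	name                string
	parent, first, next *node
}

// add appends a child named name to n and returns the new child.
func (n *node) add(name string) *node {
	c := &node{name: name, parent: n}
	if n.first == nil {
		n.first = c
		return c
	}
	last := n.first
	for last.next != nil {
		last = last.next
	}
	last.next = c
	return c
}

// followingSiblings mimics following-sibling::*.
func followingSiblings(n *node) []string {
	var out []string
	for s := n.next; s != nil; s = s.next {
		out = append(out, s.name)
	}
	return out
}

// ancestors mimics ancestor::*, nearest ancestor first.
func ancestors(n *node) []string {
	var out []string
	for p := n.parent; p != nil; p = p.parent {
		out = append(out, p.name)
	}
	return out
}

// descendants mimics descendant::* in document order.
func descendants(n *node) []string {
	var out []string
	for c := n.first; c != nil; c = c.next {
		out = append(out, c.name)
		out = append(out, descendants(c)...)
	}
	return out
}

func main() {
	root := &node{name: "bookstore"}
	book := root.add("book")
	title := book.add("title")
	book.add("author")
	book.add("year")

	fmt.Println(followingSiblings(title)) // [author year]
	fmt.Println(ancestors(title))         // [book bookstore]
	fmt.Println(descendants(root))        // [book title author year]
}
```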

Expressions

This package supports three value types: number, boolean, and string.

  • path : Selects nodes based on the path.

  • a = b : Standard comparisons.

    • a = b : True if a equals b.
    • a != b : True if a is not equal to b.
    • a < b : True if a is less than b.
    • a <= b : True if a is less than or equal to b.
    • a > b : True if a is greater than b.
    • a >= b : True if a is greater than or equal to b.
  • a + b : Arithmetic expressions.

    • - a : Unary minus
    • a + b : Addition
    • a - b : Subtraction
    • a * b : Multiplication
    • a div b : Division
    • a mod b : Modulus (division remainder)
  • a or b : Boolean or operation.

  • a and b : Boolean and operation.

  • (expr) : Parenthesized expressions.

  • fun(arg1, ..., argn) : Function calls:

Function Supported
boolean()
ceiling()
choose()
concat()
contains()
count()
current()
document()
element-available()
ends-with()
false()
floor()
format-number()
function-available()
generate-id()
id()
key()
lang()
last()
local-name()
lower-case()[^1]
matches()
name()
namespace-uri()
normalize-space()
not()
number()
position()
replace()
reverse()
round()
starts-with()
string()
string-join()[^1]
string-length()
substring()
substring-after()
substring-before()
sum()
system-property()
translate()
true()
unparsed-entity-url()

[^1]: XPath-2.0 expression
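
For readers less familiar with the XPath string functions listed above, the following sketch shows rough Go equivalents of substring-before(), substring-after(), normalize-space(), and translate() as defined by XPath 1.0. It uses only the standard library; the function names are hypothetical, and this is an illustration of the semantics rather than the package's own implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// substringBefore mimics substring-before(s, sep): the part of s before the
// first occurrence of sep, or "" when sep does not occur in s.
func substringBefore(s, sep string) string {
	before, _, found := strings.Cut(s, sep)
	if !found {
		return ""
	}
	return before
}

// substringAfter mimics substring-after(s, sep): the part of s after the
// first occurrence of sep, or "" when sep does not occur in s.
func substringAfter(s, sep string) string {
	_, after, found := strings.Cut(s, sep)
	if !found {
		return ""
	}
	return after
}

// normalizeSpace mimics normalize-space(s). strings.Fields splits on any
// Unicode white space, which is slightly broader than XPath's definition.
func normalizeSpace(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

// translate mimics translate(s, from, to): each rune of s found in from is
// replaced by the rune at the same position in to, or removed when to is
// too short to have a rune at that position.
func translate(s, from, to string) string {
	toRunes := []rune(to)
	mapping := make(map[rune]rune) // rune -> replacement
	remove := make(map[rune]bool)  // runes with no counterpart in to
	for i, r := range []rune(from) {
		if _, ok := mapping[r]; ok || remove[r] {
			continue // the first occurrence in from wins
		}
		if i < len(toRunes) {
			mapping[r] = toRunes[i]
		} else {
			remove[r] = true
		}
	}
	var b strings.Builder
	for _, r := range s {
		if remove[r] {
			continue
		}
		if t, ok := mapping[r]; ok {
			r = t
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(substringBefore("1999/04/01", "/")) // 1999
	fmt.Println(substringAfter("1999/04/01", "/"))  // 04/01
	fmt.Println(normalizeSpace("  Harry   Potter ")) // Harry Potter
	fmt.Println(translate("bar", "abc", "ABC"))     // BAr
}
```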

Documentation

Overview

Example

XPath package example. For packages that implement XPath queries on top of this one, see: https://github.com/antchfx/htmlquery https://github.com/antchfx/xmlquery https://github.com/antchfx/jsonquery

package main

import (
	"bytes"
	"fmt"

	"github.com/antchfx/xpath"
)

type NodeType uint

const (
	DocumentNode NodeType = iota
	ElementNode
	TextNode
)

type Node struct {
	Parent, FirstChild, LastChild, PrevSibling, NextSibling *Node

	Type NodeType
	Data string
}

func (n *Node) Value() string {
	if n.Type == TextNode {
		return n.Data
	}

	var buff bytes.Buffer
	var output func(*Node)
	output = func(node *Node) {
		if node.Type == TextNode {
			buff.WriteString(node.Data)
		}
		for child := node.FirstChild; child != nil; child = child.NextSibling {
			output(child)
		}
	}
	output(n)
	return buff.String()
}

func (parent *Node) AddChild(n *Node) {
	n.Parent = parent
	n.NextSibling = nil
	if parent.FirstChild == nil {
		parent.FirstChild = n
		n.PrevSibling = nil
	} else {
		parent.LastChild.NextSibling = n
		n.PrevSibling = parent.LastChild
	}

	parent.LastChild = n
}

type NodeNavigator struct {
	curr, root *Node
}

func (n *NodeNavigator) NodeType() xpath.NodeType {
	switch n.curr.Type {
	case TextNode:
		return xpath.TextNode
	case DocumentNode:
		return xpath.RootNode
	}
	return xpath.ElementNode
}

func (n *NodeNavigator) LocalName() string {
	return n.curr.Data
}

func (n *NodeNavigator) Prefix() string {
	return ""
}

func (n *NodeNavigator) NamespaceURL() string {
	return ""
}

func (n *NodeNavigator) Value() string {
	switch n.curr.Type {
	case ElementNode:
		return n.curr.Value()
	case TextNode:
		return n.curr.Data
	}
	return ""
}

func (n *NodeNavigator) Copy() xpath.NodeNavigator {
	n2 := *n
	return &n2
}

func (n *NodeNavigator) MoveToRoot() {
	n.curr = n.root
}

func (n *NodeNavigator) MoveToParent() bool {
	if node := n.curr.Parent; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveToNextAttribute() bool {
	// This example tree has no attributes, so there is never a next one.
	return false
}

func (n *NodeNavigator) MoveToChild() bool {
	if node := n.curr.FirstChild; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveToFirst() bool {
	for {
		node := n.curr.PrevSibling
		if node == nil {
			break
		}
		n.curr = node
	}
	return true
}

func (n *NodeNavigator) String() string {
	return n.Value()
}

func (n *NodeNavigator) MoveToNext() bool {
	if node := n.curr.NextSibling; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveToPrevious() bool {
	if node := n.curr.PrevSibling; node != nil {
		n.curr = node
		return true
	}
	return false
}

func (n *NodeNavigator) MoveTo(other xpath.NodeNavigator) bool {
	node, ok := other.(*NodeNavigator)
	if !ok || node.root != n.root {
		return false
	}

	n.curr = node.curr
	return true
}

// XPath package example. For packages that implement XPath queries on top of this one, see:
// https://github.com/antchfx/htmlquery
// https://github.com/antchfx/xmlquery
// https://github.com/antchfx/jsonquery
func main() {
	/***
	<?xml version="1.0" encoding="UTF-8"?>
	<bookstore>
		<book>
			<title>Everyday Italian</title>
			<author>Giada De Laurentiis</author>
			<year>2005</year>
			<price>30.00</price>
		</book>
		<book>
			<title>Harry Potter</title>
			<author>J K. Rowling</author>
			<year>2005</year>
			<price>29.99</price>
		</book>
	</bookstore>
	**/

	// Build the document tree shown above before running any queries.
	books := []struct {
		title  string
		author string
		year   int
		price  float64
	}{
		{title: "Everyday Italian", author: "Giada De Laurentiis", year: 2005, price: 30.00},
		{title: "Harry Potter", author: "J K. Rowling", year: 2005, price: 29.99},
	}
	bookstore := &Node{Data: "bookstore", Type: ElementNode}
	for _, v := range books {
		book := &Node{Data: "book", Type: ElementNode}
		title := &Node{Data: "title", Type: ElementNode}
		title.AddChild(&Node{Data: v.title, Type: TextNode})
		book.AddChild(title)
		author := &Node{Data: "author", Type: ElementNode}
		author.AddChild(&Node{Data: v.author, Type: TextNode})
		book.AddChild(author)
		year := &Node{Data: "year", Type: ElementNode}
		year.AddChild(&Node{Data: fmt.Sprintf("%d", v.year), Type: TextNode})
		book.AddChild(year)
		price := &Node{Data: "price", Type: ElementNode}
		price.AddChild(&Node{Data: fmt.Sprintf("%f", v.price), Type: TextNode})
		book.AddChild(price)
		bookstore.AddChild(book)
	}
	var doc = &Node{}
	doc.AddChild(bookstore)
	var root xpath.NodeNavigator = &NodeNavigator{curr: doc, root: doc}
	// Using the Evaluate() method with an explicit error check.
	expr, err := xpath.Compile("count(//book)")
	if err != nil {
		panic(err)
	}
	val := expr.Evaluate(root) // count() yields a float64
	fmt.Println(val.(float64))

	// Using the Evaluate() method to output the total price.
	expr = xpath.MustCompile("sum(//price)")
	val = expr.Evaluate(root)
	fmt.Println(val.(float64))

	// Using the Select() method.
	expr = xpath.MustCompile("//book")
	iter := expr.Select(root) // Select always returns a *NodeIterator.
	for iter.MoveNext() {
		fmt.Println(iter.Current().Value())
	}
}

Variables

var (
	// RegexpCache is a loading cache for the string -> *regexp.Regexp mapping. It is exported so that, in rare cases,
	// clients can customize the load function and/or the capacity.
	RegexpCache = defaultRegexpCache()
)

Functions

func NewLoadingCache added in v1.1.11

func NewLoadingCache(load loadFunc, capacity int) *loadingCache

NewLoadingCache creates a new instance of a loading cache with capacity. Capacity must be >= 0, or it will panic. Capacity == 0 means the cache growth is unbounded.
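
The capacity rules described above can be sketched as follows. This is a hypothetical, standalone loading cache written for illustration: the cache type, the newCache and get names, the defaultLoad variable, and the eviction policy (dropping one arbitrary entry when full) are all assumptions, not the package's actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// loadFunc computes the value for a key that is not cached yet.
type loadFunc func(pattern string) (*regexp.Regexp, error)

// defaultLoad compiles the pattern with the standard regexp package.
var defaultLoad loadFunc = regexp.Compile

// cache is a hypothetical loading cache keyed by pattern string.
type cache struct {
	mu       sync.Mutex
	load     loadFunc
	capacity int // 0 means unbounded
	items    map[string]*regexp.Regexp
}

// newCache panics on a negative capacity, mirroring the documented rule.
func newCache(load loadFunc, capacity int) *cache {
	if capacity < 0 {
		panic("capacity must be >= 0")
	}
	return &cache{load: load, capacity: capacity, items: make(map[string]*regexp.Regexp)}
}

// get returns the cached value for pattern, loading and storing it on a miss.
func (c *cache) get(pattern string) (*regexp.Regexp, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if re, ok := c.items[pattern]; ok {
		return re, nil
	}
	re, err := c.load(pattern)
	if err != nil {
		return nil, err
	}
	if c.capacity > 0 && len(c.items) >= c.capacity {
		// Assumption: evict one arbitrary entry to make room.
		for k := range c.items {
			delete(c.items, k)
			break
		}
	}
	c.items[pattern] = re
	return re, nil
}

func main() {
	loads := 0
	c := newCache(func(p string) (*regexp.Regexp, error) {
		loads++
		return regexp.Compile(p)
	}, 2)
	for _, p := range []string{`^a+$`, `^a+$`, `\d+`} {
		if _, err := c.get(p); err != nil {
			panic(err)
		}
	}
	fmt.Println("loads:", loads) // loads: 2
}
```

The second lookup of the same pattern is served from the map, which is the point of a loading cache: the load function runs only on a miss.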

Types

type Expr

type Expr struct {
	// contains filtered or unexported fields
}

Expr is a compiled XPath expression used for queries.

func Compile

func Compile(expr string) (*Expr, error)

Compile compiles an XPath expression string.

func CompileWithNS added in v1.2.4

func CompileWithNS(expr string, namespaces map[string]string) (*Expr, error)

CompileWithNS compiles an XPath expression string, using the given namespace map.

func MustCompile

func MustCompile(expr string) *Expr

MustCompile compiles an XPath expression string and ignores any compilation error.

func (*Expr) Evaluate

func (expr *Expr) Evaluate(root NodeNavigator) interface{}

Evaluate returns the result of the expression. The result type is one of the following: bool, float64, string, or *NodeIterator.
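
Because Evaluate returns an interface{}, a bare type assertion such as val.(float64) panics when the expression yields a different type. A type switch is a safer way to handle the result. The sketch below shows the pattern without importing the package: nodeIterator is a hypothetical stand-in for the package's node iterator, and describe is an illustrative helper name.

```go
package main

import "fmt"

// nodeIterator is a hypothetical stand-in for the package's node iterator.
type nodeIterator struct {
	names []string
}

// describe reports which kind of value an Evaluate-style result holds.
func describe(v interface{}) string {
	switch x := v.(type) {
	case bool:
		return fmt.Sprintf("boolean %t", x)
	case float64:
		return fmt.Sprintf("number %g", x)
	case string:
		return fmt.Sprintf("string %q", x)
	case *nodeIterator:
		return fmt.Sprintf("node set with %d nodes", len(x.names))
	default:
		return fmt.Sprintf("unexpected %T", x)
	}
}

func main() {
	fmt.Println(describe(2.0))            // number 2
	fmt.Println(describe(true))           // boolean true
	fmt.Println(describe("Harry Potter")) // string "Harry Potter"
	fmt.Println(describe(&nodeIterator{names: []string{"book", "book"}}))
}
```

In real code the last case would name the package's own iterator type instead of the stand-in.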

func (*Expr) Select

func (expr *Expr) Select(root NodeNavigator) *NodeIterator

Select selects a node set using the specified XPath expression.

func (*Expr) String

func (expr *Expr) String() string

String returns XPath expression string.

type NodeIterator

type NodeIterator struct {
	// contains filtered or unexported fields
}

NodeIterator holds all matched nodes.

func Select

func Select(root NodeNavigator, expr string) *NodeIterator

Select selects a node set using the specified XPath expression. This function is deprecated; use the Expr.Select() method instead.

func (*NodeIterator) Current

func (t *NodeIterator) Current() NodeNavigator

Current returns the current matched node.

func (*NodeIterator) MoveNext

func (t *NodeIterator) MoveNext() bool

MoveNext moves the iterator to the next matching node.

type NodeNavigator

type NodeNavigator interface {
	// NodeType returns the XPathNodeType of the current node.
	NodeType() NodeType

	// LocalName gets the Name of the current node.
	LocalName() string

	// Prefix returns namespace prefix associated with the current node.
	Prefix() string

	// Value gets the value of current node.
	Value() string

	// Copy does a deep copy of the NodeNavigator and all its components.
	Copy() NodeNavigator

	// MoveToRoot moves the NodeNavigator to the root node of the current node.
	MoveToRoot()

	// MoveToParent moves the NodeNavigator to the parent node of the current node.
	MoveToParent() bool

	// MoveToNextAttribute moves the NodeNavigator to the next attribute on current node.
	MoveToNextAttribute() bool

	// MoveToChild moves the NodeNavigator to the first child node of the current node.
	MoveToChild() bool

	// MoveToFirst moves the NodeNavigator to the first sibling node of the current node.
	MoveToFirst() bool

	// MoveToNext moves the NodeNavigator to the next sibling node of the current node.
	MoveToNext() bool

	// MoveToPrevious moves the NodeNavigator to the previous sibling node of the current node.
	MoveToPrevious() bool

	// MoveTo moves the NodeNavigator to the same position as the specified NodeNavigator.
	MoveTo(NodeNavigator) bool
}

NodeNavigator provides a cursor model for navigating XML data.
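
In a cursor model, a single navigator object is moved around the tree instead of handing out node pointers. As a rough sketch of what that looks like in practice, the code below walks a tree depth-first using only MoveToChild, MoveToNext, and MoveToParent, and leaves the cursor where it started. The cursor interface is a subset of the methods listed above; treeNode, treeCursor, newChild, and walk are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// cursor is the subset of navigation methods this sketch needs.
type cursor interface {
	LocalName() string
	MoveToChild() bool
	MoveToNext() bool
	MoveToParent() bool
}

// treeNode is a hypothetical node with parent and sibling links.
type treeNode struct {
	name                string
	parent, first, next *treeNode
}

// newChild appends a child named name to parent and returns it.
func newChild(parent *treeNode, name string) *treeNode {
	c := &treeNode{name: name, parent: parent}
	if parent.first == nil {
		parent.first = c
		return c
	}
	last := parent.first
	for last.next != nil {
		last = last.next
	}
	last.next = c
	return c
}

// treeCursor is a hypothetical cursor over treeNode values.
type treeCursor struct {
	curr *treeNode
}

func (c *treeCursor) LocalName() string { return c.curr.name }

func (c *treeCursor) MoveToChild() bool {
	if c.curr.first == nil {
		return false
	}
	c.curr = c.curr.first
	return true
}

func (c *treeCursor) MoveToNext() bool {
	if c.curr.next == nil {
		return false
	}
	c.curr = c.curr.next
	return true
}

func (c *treeCursor) MoveToParent() bool {
	if c.curr.parent == nil {
		return false
	}
	c.curr = c.curr.parent
	return true
}

// walk visits the current node and its descendants in document order,
// then returns the cursor to the node it started on.
func walk(c cursor, depth int, visit func(name string, depth int)) {
	visit(c.LocalName(), depth)
	if !c.MoveToChild() {
		return
	}
	for {
		walk(c, depth+1, visit)
		if !c.MoveToNext() {
			break
		}
	}
	c.MoveToParent()
}

func main() {
	root := &treeNode{name: "bookstore"}
	book := newChild(root, "book")
	newChild(book, "title")
	newChild(book, "price")

	c := &treeCursor{curr: root}
	walk(c, 0, func(name string, depth int) {
		fmt.Printf("%d %s\n", depth, name)
	})
	fmt.Println(c.LocalName()) // bookstore
}
```

Code that must not disturb a caller's position would typically work on a copy of the cursor, which is the role Copy plays in the interface above.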

type NodeType

type NodeType int

NodeType represents a type of XPath node.

const (
	// RootNode is a root node of the XML document or node tree.
	RootNode NodeType = iota

	// ElementNode is an element, such as <element>.
	ElementNode

	// AttributeNode is an attribute, such as id='123'.
	AttributeNode

	// TextNode is the text content of a node.
	TextNode

	// CommentNode is a comment node, such as <!-- my comment -->
	CommentNode
)
