psv

v0.3.2
Published: Mar 13, 2025 License: MIT Imports: 0 Imported by: 0

README

PSV - Pipe Separated Values

Introduction

PSV (Pipe Separated Values) is a go module and command line program for reading and writing tables of data as plain text.

The PSV format is similar in concept to Comma-Separated Values (CSV), Tab-Separated Values (TSV) or Delimiter-Separated Values (DSV), but with the distinction that additional space is added so that:

  • all rows have the same number of columns
  • and all columns align vertically

PSV tables are deliberately human readable, while still being machine readable.

e.g.:

Some data, as CSV:

name,score
Alexander,3
Tim,5
Johannes,17

... and the same data as PSV:

| name      | score |
| --------- | ----: |
| Alexander |     3 |
| Tim       |     5 |
| Johannes  |    17 |

PSV tables are also used by Markdown, with some minor differences.
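The CSV-to-PSV comparison above can be sketched with the standard library alone. This is a minimal illustration, not the psv package's implementation: csvToPSV is a hypothetical helper, it left-aligns every column, and it does not emit a ruler.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
	"unicode/utf8"
)

// csvToPSV is a hypothetical helper (not part of the psv package): it reads
// CSV text and renders it as a left-aligned, pipe-separated table.
func csvToPSV(input string) (string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.FieldsPerRecord = -1 // tolerate rows of differing lengths
	rows, err := r.ReadAll()
	if err != nil {
		return "", err
	}

	// find the widest cell (counted in runes) of each column
	var widths []int
	for _, row := range rows {
		for c, cell := range row {
			if c >= len(widths) {
				widths = append(widths, 0)
			}
			if n := utf8.RuneCountInString(cell); n > widths[c] {
				widths[c] = n
			}
		}
	}

	// write every row with the same number of padded columns
	var b strings.Builder
	for _, row := range rows {
		for c, width := range widths {
			cell := ""
			if c < len(row) {
				cell = row[c]
			}
			pad := width - utf8.RuneCountInString(cell)
			b.WriteString("| " + cell + strings.Repeat(" ", pad) + " ")
		}
		b.WriteString("|\n")
	}
	return b.String(), nil
}

func main() {
	out, err := csvToPSV("name,score\nAlexander,3\nTim,5\nJohannes,17\n")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Counting runes rather than bytes keeps columns aligned for most non-ASCII text, although wide or combining characters would still need extra care.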

Intended Use Cases

psv was created to help with three specific scenarios:

  1. Reading data from PSV tables into code
  2. Generating PSV tables from code
  3. Round-trip re-formatting of existing PSV tables via the psv command
Reading data from PSV tables into code
input := `
	| name  | score |
	| ----- | ----- |
	| Alice | 3     |
	| Bob   | 2     |
	`

// read the input
doc := &psv.Document{}
doc.UnmarshalText([]byte(input))

// the input may contain any number of tables
// loop through each table looking for 'interesting' data
for _, table := range doc.Tables() {

	// ColumnByNameFunc provides a convenient name => column mapping
	value := table.ColumnByNameFunc()

	// DataRows() returns all rows except the first row (assumed to be a header)
	// AllRows() returns all rows including the first row.
	for _, row := range table.DataRows() {
		fmt.Printf("%s points were awarded to %q\n",
			value(row,"score"),
			value(row,"name"),
		)
	}
}

Output

3 points were awarded to "Alice"
2 points were awarded to "Bob"
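
The name => column lookup used in the example above can be pictured roughly as follows. This is a self-contained sketch of the idea only: columnByName is a hypothetical stand-in written for this illustration, and the psv package's own function may behave differently (for example with duplicate or missing column names).

```go
package main

import "fmt"

// columnByName is a hypothetical stand-in for the kind of lookup used above.
// It builds a name => index map from the header row and returns a function
// that fetches a cell from any row by column name.
func columnByName(header []string) func(row []string, name string) string {
	index := make(map[string]int, len(header))
	for i, name := range header {
		if _, seen := index[name]; !seen { // assumption: first occurrence wins
			index[name] = i
		}
	}
	return func(row []string, name string) string {
		i, ok := index[name]
		if !ok || i >= len(row) {
			return "" // unknown column or short row
		}
		return row[i]
	}
}

func main() {
	rows := [][]string{
		{"name", "score"},
		{"Alice", "3"},
		{"Bob", "2"},
	}
	value := columnByName(rows[0])
	for _, row := range rows[1:] {
		fmt.Printf("%s points were awarded to %q\n", value(row, "score"), value(row, "name"))
	}
}
```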
Generating PSV tables from code
doc := &psv.Document{}
doc.AppendRow([]string{"name","score"})
doc.AppendRow([]string{"Alice","3"})
doc.AppendRow([]string{"Bob","2"})
output, _ := doc.MarshalText()
fmt.Println(string(output))

Output

| name  | score |
| Alice | 3     |
| Bob   | 2     |
Round-Trip re-formatting of existing PSV tables via the psv command
% cat input.txt
| name | score
| ---
    | Alice |       3
|Bob|2
% psv < input.txt > output.txt
% cat output.txt
| name  | score |
| ----- | ----- |
| Alice | 3     |
| Bob   | 2     |

Goals

  • all data is stored as utf-8 text
    • no numerical formats - numbers are just text
    • no special 'date' or 'time' formats
    • the transformation of text to internal formats is left to client applications
  • full "round trip" support
    • all generated PSV tables should be parsable without loss of information
    • non-PSV data should be left un-changed
  • it should be possible to generate documents with multiple tables
  • it should be possible to parse incorrectly formatted tables
    • inconsistent row lengths are harmonized to the length of the longest row in the table
    • adjacent column separators
    • missing column headers
    • multiple horizontal ruler lines
    • empty rows
  • it should be possible to sort the rows in a table
  • it should be possible to remove empty columns from a table
  • it should be possible to add columns to a table
  • it should be possible to re-format or generate tables with a known row-prefix (comment lead-in, e.g. // or #)
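
As an illustration of the "harmonized row lengths" goal, the sketch below pads every row with empty cells up to the length of the longest row. The function name harmonize is an assumption made for this example and is not taken from the psv package.

```go
package main

import "fmt"

// harmonize pads every row with empty cells up to the length of the
// longest row, so that all rows end up with the same number of columns.
func harmonize(rows [][]string) [][]string {
	width := 0
	for _, row := range rows {
		if len(row) > width {
			width = len(row)
		}
	}
	out := make([][]string, len(rows))
	for i, row := range rows {
		out[i] = make([]string, width) // the zero value "" fills the gaps
		copy(out[i], row)
	}
	return out
}

func main() {
	rows := harmonize([][]string{
		{"name", "score"},
		{"Alice"},
		{"Bob", "2", "late entry"},
	})
	for _, row := range rows {
		fmt.Printf("%d cells: %q\n", len(row), row)
	}
}
```

The input rows are copied rather than modified, which keeps the original data available for comparison.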

Progress

doc/progress.svg

doc/progress.gv

User Journey Event Maps

doc/use_case_events.svg

doc/use_case_events.gv

PSV Document Structure

  • parsing text always returns a Document
  • all tables in a document may be aligned with each other by enabling the align_all option
  • a ruler after the first row of data in a table is special; it can
    • specify left, right, center or numeric data alignment per column
    • specify that a column should be sorted before joining (actually, before encoding!)
  • all other rulers are for decoration purposes only, and any additional markers within them will be ignored

Basic Formatting Rules

  • PSV is encoded in UTF-8
  • data rows
    • data rows must begin with a | (ASCII 0x7c, Unicode U+007c)
    • new columns are introduced by further | characters (one | per column)
      • a trailing | at the end of a data row is optional
      • empty columns at the end of a line are always truncated
    • empty columns inside a table may be removed by enabling the squash-empty option
    • UTF-8 whitespace surrounding |s is ignored
    • any other UTF-8 characters are considered data
      • whitespace within data is retained verbatim
      • whitespace and | can be included as data by preceding them with a \ (ASCII 0x5c, Unicode U+005c)
    • \n (ASCII 0x0a, Unicode U+000a) separates data rows
      • \r (ASCII 0x0d, Unicode U+000d) is treated as whitespace, and is thus ignored
      • a trailing \n at the end of a file is not required
  • any text lines which do not begin with a | are retained verbatim, but are not part of a PSV table

These rules are enough to produce simple PSV tables. Horizontal rulers are also available, however, they are "somewhat more complicated" and are thus explained in ruler formatting or psv_format.md
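
The data-row rules above can be sketched as a small line splitter. This is a simplified illustration under stated assumptions, not the psv package's parser: splitRow is a hypothetical name, leading indentation before the first | is tolerated, rulers are not recognised, and a dangling backslash at the end of a line is silently dropped.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitRow is a simplified sketch, not the psv package's parser. It returns
// the cells of one line, or ok == false if the line is not a data row.
func splitRow(line string) (cells []string, ok bool) {
	line = strings.TrimLeftFunc(line, unicode.IsSpace)
	if !strings.HasPrefix(line, "|") {
		return nil, false // not part of a table: the caller keeps it verbatim
	}

	var cell strings.Builder
	keep := 0 // byte length of the cell up to its last non-space or escaped rune
	flush := func() {
		cells = append(cells, cell.String()[:keep]) // drops trailing whitespace
		cell.Reset()
		keep = 0
	}

	escaped := false
	for _, r := range line[1:] {
		switch {
		case escaped: // the rune after a backslash is always data
			cell.WriteRune(r)
			keep = cell.Len()
			escaped = false
		case r == '\\':
			escaped = true
		case r == '|':
			flush()
		case unicode.IsSpace(r):
			if cell.Len() > 0 { // skip leading whitespace, retain inner whitespace
				cell.WriteRune(r)
			}
		default:
			cell.WriteRune(r)
			keep = cell.Len()
		}
	}
	flush() // the text after the last | (empty if the row ended with |)

	// empty columns at the end of a line are truncated
	for len(cells) > 0 && cells[len(cells)-1] == "" {
		cells = cells[:len(cells)-1]
	}
	return cells, true
}

func main() {
	for _, line := range []string{
		"|Bob|2",
		"  | false | true   | false ||||||",
		`| a \| b | c |`,
		"not a table row",
	} {
		cells, ok := splitRow(line)
		fmt.Printf("%v %q\n", ok, cells)
	}
}
```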

Introductory examples

Creating PSV Tables Manually

To write a PSV table, simply start a line with a | and some text. Don't worry about spacing or indentation, the psv tool will fix that in a minute. For example, the following, deliberately sloppily entered table:

    |A| B     |     A   anb B
| -
  | false | false | false
|false| true        | false ||||||
  |true       |       false | false
    |true   | true  | true    | yay

will be turned into this:

    | A     | B     | A   anb B |     |
    | ----- | ----- | --------- | --- |
    | false | false | false     |     |
    | false | true  | false     |     |
    | true  | false | false     |     |
    | true  | true  | true      | yay |

with a single call to psv (in this case, the vim [^1] command: vip!psv [^2]).

Some things of note:

  • all table rows are indented to align with the first row
  • all rows have been trimmed to the same number of columns
  • all columns are vertically aligned
  • a trailing | is always included on every data row
  • the horizontal ruler has been resized to match the width of each column
  • the contents of the table have not changed
    • e.g. the extra spacing between A and B was retained

(see ruler formatting)

[^1]: You don't have to use vim! psv can be used from any editor or shell script that lets you pipe text through shell commands.

[^2]: which translates to:

  • v start a visual selection ...
  • i select everything in ...
  • p the current paragraph
  • !psv and replace the current selection with whatever psv makes of it

Using psv Tables Programmatically

psv tables can also help improve the readability of test data.

Here is an example of an actual test suite (containing 15 individual unit tests) from psv's own unit testing code (sort_test.go):

func TestSingleSectionSorting(t *testing.T) {

    testTable := psv.TableFromString(`
        | 0 | b | 3  | partial
        | 1 | D
        | 2 | E | 5
        | 3 | a | 4  | unequal
        | 4 | c | 20
        | 5 | C | 10 | row | lengths
        | 6 | e | 5
        | 7 | d | 7
        `)

    testCases := sortingTestCasesFromTable(`
	| name                         | sort  | columns | exp-col | exp-rows        |
	| ---------------------------- | ----- | ------- | ------- | --------------- |
	| no sort                      | false |         |         | 0 1 2 3 4 5 6 7 |
	| default sort                 |       |         |         | 0 1 2 3 4 5 6 7 |
	| sort only when asked to      | false | 2       |         | 0 1 2 3 4 5 6 7 |
	| reverse default sort         |       | ~       |         | 7 6 5 4 3 2 1 0 |
	| reverse reverse default sort |       | ~~      |         | 0 1 2 3 4 5 6 7 |
	| indexed column sort          |       | 2       |         | 3 0 4 5 7 1 6 2 |
	| indexed column sort          |       | 2       | 2       | a b c C d D e E |
	| reverse column sort          |       | ~2      |         | 2 6 1 7 5 4 0 3 |
	| third column sort            |       | 3       |         | 1 5 4 0 3 2 6 7 |
	| numeric sort                 |       | #3      |         | 1 0 3 2 6 7 5 4 |
	| reverse numeric sort         |       | ~#3     |         | 4 5 7 6 2 3 0 1 |
	| numeric reverse sort         |       | #~3     |         | 4 5 7 6 2 3 0 1 |
	| reverse reverse column sort  |       | ~ #~3   |         | 1 0 3 2 6 7 5 4 |
	| partial column sort          |       | 4 2     |         | 4 7 1 6 2 0 5 3 |
	| non-existent column sort     |       | 9       |         | 0 1 2 3 4 5 6 7 |
	`)

    runSortingTestCases(t, testTable.AllRows(), testCases.DataRows())
}

In the example above, two tables are defined:

  • testTable is the reference table to be tested

    • it simply contains a few rows of data, in various forms suitable for testing some features of psv
    • testTable.AllRows() is used to get a [][]string containing all of the rows in the table.
  • testCases then defines a series of individual unit tests to be run on testTable

    • the first row (|name|...) is used as a header for the table
      • psv always refers to columns by the value in their first row
        • but the first row is treated the same as all other rows
      • testCases.DataRows() is used to get all of the rows except the first row
      • the second row in the table is a ruler
        • rulers are decorative in nature and may be used to influence column alignment and sorting preferences, but they do not appear in the [][]string array of data!

Detailed Description

psv reads, formats and writes simple tables of data in text files.

In doing so, psv focuses on human readability and ease of use, rather than trying to provide a loss-less, ubiquitous, machine-readable data transfer format.

The same could be said of markdown, and indeed, psv can be used to generate github-style markdown tables that look nice in their markdown source code, and not just after they have been converted to HTML by the markdown renderer.

Another intended use case is data tables in Gherkin files, which are a central component of Behaviour Driven Development (BDD).

However, the real reason for creating psv was to be able to use text tables as the source of data for running automated tests. Hence the go package.

Main Features

  • normalisation of rows and columns, so that every row has the same number of cells
  • automatic table indentation and column alignment
  • the ability to automatically draw horizontal separation lines, called rulers
  • the ability to re-format existing tables, while leaving lines which "do not look like table rows" unchanged
  • a simple way to read data from tables into go programs via the psv go package
  • the (limited) ability to sort table data
    • without interfering with the rest of the table's formatting
  • and more ...

Specification

A semi-formal specification is available as a separate, RFC-like document.

What about RFC 4180?

psv deliberately does not fulfill RFC 4180, as the two specifications have very different goals.

  • psv is intended for presenting tabular data in a human-friendly form
  • RFC 4180 is intended for encoding and transmitting data between programs
Comparison of PSV with RFC 4180
| Feature                  | psv      | RFC 4180 | Description
| ------------------------ | -------- | -------- | -----------
| new lines                | native   | CRLF     | psv does not prefer any particular line separator
| final new line           | yes      | optional | psv will accept a final row without a new line, but will always terminate the final row with a new line
| header row               | optional | optional | psv and RFC 4180 both allow an optional header row that provides column names
| horizontal rulers        | yes      | no       | psv allows the use of decorative horizontal rulers to separate logical groups within a table
| separator                | pipe     | comma    | vertical lines are more visually distinctive
| trailing separator       | yes      | no       | psv prioritises visual clarity over document size
| padded values            | yes      | no       | psv aligns columns visually for human consumption, which requires the addition/removal of leading/trailing spaces
| embedded spaces          | yes      | yes      | psv preserves space within a value
| all rows have same width | yes      | yes      | psv and RFC 4180 both recommend rows have the same number of columns
| quoted values            | no       | yes      | psv tables reduce cognitive load by only using backslashes for escaping
| multi-line values        | no       | yes      | psv is not intended to be a lossless format for any data, but a visual representation for humans
Not Supported

psv is not intended to replace spreadsheets etc 😄

Among a myriad of other non-features, the following are definitely not supported by psv:

  • the inclusion of | characters in a cell's data
  • multi-line cell data
  • any kind of cell merging or splitting
  • sorting of complex data formats, including:
    • date and/or timestamps (unless they are in ISO-8601 format, which sorts nicely)
    • signed numbers (+ and - signs confuse go's collators 😦)
    • floating point numbers
    • scientific notation
    • hexadecimal notation
  • ...
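
The remark about ISO-8601 dates and signed numbers can be illustrated with plain string sorting. Note the assumption: this sketch uses byte-wise comparison via sort.Strings purely for illustration; it is not psv's sorting code, which, according to the text above, relies on go's collators. sortedCopy is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedCopy returns a sorted copy of values using plain byte-wise string
// comparison, leaving the input slice untouched.
func sortedCopy(values []string) []string {
	out := append([]string(nil), values...)
	sort.Strings(out)
	return out
}

func main() {
	// ISO-8601 dates sort chronologically, because the most significant
	// fields come first and every field has a fixed width.
	fmt.Println(sortedCopy([]string{"2025-03-13", "2022-11-01", "2024-02-29"}))
	// prints [2022-11-01 2024-02-29 2025-03-13]

	// Signed numbers do not: text comparison knows nothing about their value.
	fmt.Println(sortedCopy([]string{"10", "9", "-3", "+2"}))
	// prints [+2 -3 10 9]
}
```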
Design Principles
  • self contained
    • psv is a single go binary with no external dependencies
    • the psv go package is a single package, also with no external dependencies other than go's standard packages
      • exception: I do include another package of mine to provide simplified testing with meaningful success and error messages.
    • all psv actions occur locally (no network access required)
  • non-destructive
    • if psv doesn't know how to interpret a line of text, the text remains unchanged
      • only data rows (lines beginning with a |) and rulers are re-formatted, all other lines remain unchanged
  • idempotent
    • any table generated by psv can also be read by psv
    • running a formatted table through psv again must not change the table in any way
  • ease of use
    • normal use should not require any configuration or additional parameters
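
The non-destructive and idempotent principles lend themselves to a simple property check: formatting twice must give the same result as formatting once, and non-table lines must pass through untouched. The sketch below uses a deliberately small, hypothetical reformat function as a stand-in for the psv command; it ignores escapes, rulers and indentation, and treats all table rows in the text as one table.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// reformat is a small stand-in for the psv command: lines that start with |
// are re-aligned as one table, everything else is copied through unchanged.
func reformat(text string) string {
	lines := strings.Split(text, "\n")
	rows := make(map[int][]string) // line number => trimmed cells
	var widths []int

	for i, line := range lines {
		trimmed := strings.TrimSpace(line)
		if !strings.HasPrefix(trimmed, "|") {
			continue // not a data row
		}
		trimmed = strings.TrimSuffix(strings.TrimPrefix(trimmed, "|"), "|")
		cells := strings.Split(trimmed, "|")
		for c := range cells {
			cells[c] = strings.TrimSpace(cells[c])
			if c >= len(widths) {
				widths = append(widths, 0)
			}
			if n := utf8.RuneCountInString(cells[c]); n > widths[c] {
				widths[c] = n
			}
		}
		rows[i] = cells
	}

	// rewrite only the data rows, padding each cell to its column width
	for i, cells := range rows {
		var b strings.Builder
		for c, width := range widths {
			cell := ""
			if c < len(cells) {
				cell = cells[c]
			}
			pad := width - utf8.RuneCountInString(cell)
			b.WriteString("| " + cell + strings.Repeat(" ", pad) + " ")
		}
		lines[i] = b.String() + "|"
	}
	return strings.Join(lines, "\n")
}

func main() {
	input := "scores:\n| name | score\n    | Alice |       3\n|Bob|2\n"
	once := reformat(input)
	twice := reformat(once)
	fmt.Print(once)
	fmt.Println("idempotent:", once == twice)
}
```

A check of this shape can be run against any formatter, which makes it a convenient regression test for the two principles.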

Markdown Support

Markdown's table format is a subset of the formatting options provided by psv.

Specifically:

  • Markdown tables MUST begin with a header row of column names
  • Markdown tables MUST have exactly one ruler as their second line
  • Markdown rulers MAY contain the alignment hints :- (left-aligned), -: (right-aligned) or :-: (centered)
  • Markdown tables MUST NOT have embedded rulers anywhere else
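
The three Markdown alignment hints listed above can be recognised with a few string checks. This is a sketch that covers only those three hints; psv's own ruler syntax is described separately and may accept more. The function name alignment is an assumption made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// alignment parses one ruler cell such as ":--", "--:" or ":-:".
// It returns "left", "right", "center", or "" when no hint is present.
func alignment(rulerCell string) string {
	cell := strings.TrimSpace(rulerCell)
	if strings.Trim(cell, ":-") != "" || !strings.Contains(cell, "-") {
		return "" // not a ruler cell
	}
	left := strings.HasPrefix(cell, ":")
	right := strings.HasSuffix(cell, ":")
	switch {
	case left && right:
		return "center"
	case left:
		return "left"
	case right:
		return "right"
	}
	return "" // a plain ruler without hints
}

func main() {
	for _, cell := range []string{":-", "-:", ":-:", "-----", "score"} {
		fmt.Printf("%-7q => %q\n", cell, alignment(cell))
	}
}
```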
TODO's
  • add ability to configure the scanner

    • allow auto-indent detection
      • -I detect indent by capturing the indent before the first | encountered
    • explicitly specify ruler characters (for cli)
      • default autodetect
      • explicit rulers
        • turns off autodetection
        • allows the use of + and - as data
        • options:
          • -rh '-' horizontal ruler
          • -ro '|' outer ruler
          • -ri ':' inner ruler
          • -rc '+' corners
          • -rp 'ophi'
            • o outer vertical ruler
            • p padding character
            • h horizontal ruler (default: same as padding character)
            • i inner vertical ruler (default: same as outer ruler)
  • Replace table.Data with table.DataRows

Installation

psv consists of two components: the psv command and the psv go package.

To use the psv command, you only need the psv binary in your PATH, e.g. ~/bin/psv (see binary installation below).

If you don't want to install "a binary, downloaded from the 'net", you can download the source, (inspect it 😄), and build your own version.

Installation From Source
Prerequisites
  • go 1.18 or later
  • make (optional, but recommended)
Build Steps

Clone the psv git repository and use make to build, test and install psv in your $GOBIN directory (typically $GOPATH/bin or ~/Go/bin)

git clone -o codeberg https://codeberg.org/japh/psv
cd psv
make install
psv -v
Binary Installation

Note: currently only available for darwin amd64 (64-bit Intel Macs)

  • download the latest psv.gz from https://codeberg.org/japh/psv/releases
  • verify psv.gz with gpg --verify psv.gz.asc
  • compare psv.gz's checksums against those provided with shasum -c psv.gz.sha256
  • unpack psv.gz with gunzip psv.gz
  • copy psv to any directory in your $PATH, or use it directly via ./psv
  • don't forget to check that it is executable, e.g. chmod +x psv

Now you can use the psv command...

Using The psv Package In Go Projects
Prerequisites
  • go 1.18 or later

To use psv in your go project, simply import codeberg.org/japh/psv and go mod tidy will download it, build it and make it available for your project.

See the psv package documentation for the API and code examples.

Alternatives

  • csv, tsv and delimiter-separated-values tables (Wikipedia)

    • generally, psv tables are just a single type of delimiter-separated values format
  • ASCII Table Writer

    • go package for creating tables of almost any form
    • more traditional table.SetHeader, table.SetFooter() interface
    • more features (incl. colors)
    • does not read tables
      • no good for defining test cases etc in code
  • psv-spec (unrelated project!)

    • an attempt to standardize a CSV replacement using pipes as the delimiter
    • focuses on electronic data transfers
    • does not provide a tabular layout
    • escaping just |, \, \n and \r is nice
      • but does not allow for whitespace quoting
      • future: | " " | could be used by psv to represent a space

References

Copyright 2022-2025 Stephen Riehm japh-codeberg@opensauce.de

Documentation

Overview

Package psv provides methods for handling tables of Pipe-Separated-Values (PSV)

Three basic use cases are supported:

1. Generating PSV tables from code:

doc := psv.NewDocument()
doc.AppendRow([]string{"name","score"})
doc.AppendRow([]string{"Alice","3"})
doc.AppendRow([]string{"Bob","2"})
output, _ := doc.MarshalText()
fmt.Println(string(output))

| name  | score |
| Alice | 3     |
| Bob   | 2     |

2. Reading data from PSV tables into code:

input := `
	| name  | score |
	| Alice | 3     |
	| Bob   | 2     |
	`

// read the input
doc := psv.NewDocument()
doc.UnmarshalText([]byte(input))

// the input may contain any number of tables
// loop through each table looking for 'interesting' data
for _, table := range doc.Tables() {

	// ColumnByNameFunc provides a convenient name => column mapping
	value := table.ColumnByNameFunc()

	// DataRows() returns all rows except the first row (assumed to be a header)
	// AllRows() returns all rows including the first row.
	for _, row := range table.DataRows() {
		fmt.Printf("%s points were awarded to %q\n",
			value(row,"score"),
			value(row,"name"),
		)
	}
}

// Output:
// 3 points were awarded to "Alice"
// 2 points were awarded to "Bob"

3. Re-formatting existing PSV tables via the `psv` command:

% cat input.txt
| name | score
| Alice | 3
| Bob | 2
% psv < input.txt > output.txt
% cat output.txt
| name  | score |
| Alice | 3     |
| Bob   | 2     |

Usage

[Document] is the main aggregate for building or accessing PSV data. The [Document] type fulfills the encoding.TextMarshaler and encoding.TextUnmarshaler interfaces for conversion to and from the document's text form.

Documents are built incrementally via Append methods and may be read as a slice of rows. The ability to edit data or randomly access data is not provided.

Internally, a [Document] may contain any number of [Table] objects which can be accessed via the [Document.Tables] method.

Each table then has its own set of column names, prefix etc.

[Ruler] objects may be used to add separation lines to a table and may be placed anywhere within a table.

The [Markdown] formatter, however, will ignore all but the ruler that appears directly after the first row of data in the table, thus conforming to markdown's requirements.

e.g.

+---------+-------+
| name    | score |
| ------- | ----- |
| Alice   | 25    |
| Bob     | 17    |
| Charlie | 10    |
| ------- - ----- |
| Dave    | 9     |
+---------+-------+

When re-formatted for Markdown, this would become:

| name    | score |
| ------- | ----- |
| Alice   | 25    |
| Bob     | 17    |
| Charlie | 10    |
| Dave    | 9     |

Directories

Path Synopsis
cmd
psv
psv command
encoding
row
Data Encoding and Decoding entails the protection of data characters from corruption after being rendered as a PSV table, and restoring the original data from a PSV table.
fsm provides a finite state machine parser for parsing e.g.
