psv

Version: v0.3.2 (latest) Published: Mar 13, 2025 License: MIT

Repository

codeberg.org/japh/psv

README ¶

PSV - Pipe Separated Values

Index

Introduction
Intended Use Cases
1. Reading data from PSV tables into code
2. Generating PSV tables from code
3. Round-trip re-formatting of existing PSV tables via the psv command
Goals
Progress
User Journey Event Maps
PSV Document Structure
Basic Formatting Rules
Introductory Examples
Creating PSV Tables Manually
Using PSV Tables Programmatically
Detailed Description
Main Features
Specification
What About RFC 4180?
Not Supported
Design Principles
Markdown Support
TODO's
Documentation Links
Installation
Alternatives
References
Copyright

Introduction

PSV (Pipe Separated Values) is a go module and command line program for reading and writing tables of data as plain text.

The PSV format is similar in concept to Comma-Separated Values (CSV), Tab-Separated Values (TSV) or Delimiter-Separated Values (DSV), with the distinction that additional space is added so that:

all rows have the same number of columns
and all columns align vertically

PSV tables are deliberately human readable, while still being machine readable.

e.g.:

Some data, as CSV:

name,score
Alexander,3
Tim,5
Johannes,17

... and the same data as PSV:

| name      | score |
| --------- | ----: |
| Alexander |     3 |
| Tim       |     5 |
| Johannes  |    17 |

PSV tables are also used by Markdown, with some minor differences (see Markdown Support).

Intended Use Cases

psv was created to help with three specific scenarios:

Reading data from PSV tables into code
Generating PSV tables from code
Round-trip re-formatting of existing PSV tables via the psv command

Reading data from PSV tables into code

input := `
	| name  | score |
	| ----- | ----- |
	| Alice | 3     |
	| Bob   | 2     |
	`

// read the input
doc := &psv.Document{}
doc.UnmarshalText([]byte(input))

// the input may contain any number of tables
// loop through each table looking for 'interesting' data
for _, table := range doc.Tables() {

	// ColumnByNameFunc provides a convenient name => column mapping
	value := table.ColumnByNameFunc()

	// DataRows() returns all rows except the first row (assumed to be a header)
	// AllRows() returns all rows including the first row.
	for _, row := range table.DataRows() {
		fmt.Printf("%s points were awarded to %q\n",
			value(row,"score"),
			value(row,"name"),
		)
	}
}

Output

3 points were awarded to "Alice"
2 points were awarded to "Bob"

Generating PSV tables from code

doc := &psv.Document{}
doc.AppendRow([]string{"name","score"})
doc.AppendRow([]string{"Alice","3"})
doc.AppendRow([]string{"Bob","2"})
output, _ := doc.MarshalText()
fmt.Print(string(output)) // MarshalText returns []byte

| name  | score |
| Alice | 3     |
| Bob   | 2     |

Round-Trip re-formatting of existing PSV tables via the `psv` command

% cat input.txt
| name | score
| ---
    | Alice |       3
|Bob|2
% psv < input.txt > output.txt
% cat output.txt
| name  | score |
| ----- | ----- |
| Alice | 3     |
| Bob   | 2     |

Goals

all data is stored as utf-8 text
- no numerical formats - numbers are just text
- no special 'date' or 'time' formats
- the transformation of text to internal formats is left to client applications
full "round trip" support
- all generated PSV tables should be parsable without loss of information
- non-PSV data should be left un-changed
it should be possible to generate documents with multiple tables
it should be possible to parse incorrectly formatted tables
- inconsistent row lengths are harmonized to the length of the longest row in the table
- adjacent column separators
- missing column headers
- multiple horizontal ruler lines
- empty rows
it should be possible to sort the rows in a table
it should be possible to remove empty columns from a table
it should be possible to add columns to a table
it should be possible to re-format or generate tables with a known row-prefix (comment lead-in, e.g. // or #)
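
The goal of harmonizing inconsistent row lengths can be pictured with a few lines of plain Go. This is only a minimal sketch using the standard library; `padRows` is a hypothetical helper written for this illustration and is not part of the psv API.

```go
package main

import "fmt"

// padRows returns a copy of rows in which every row has as many cells as
// the longest input row; missing cells are filled with empty strings.
// (hypothetical helper, not part of the psv package)
func padRows(rows [][]string) [][]string {
	width := 0
	for _, row := range rows {
		if len(row) > width {
			width = len(row)
		}
	}
	out := make([][]string, len(rows))
	for i, row := range rows {
		padded := make([]string, width)
		copy(padded, row)
		out[i] = padded
	}
	return out
}

func main() {
	rows := padRows([][]string{
		{"name", "score"},
		{"Alice"},
		{"Bob", "2", "late entry"},
	})
	for _, row := range rows {
		fmt.Printf("%d cells: %q\n", len(row), row)
	}
}
```

Copying into a fresh slice keeps the caller's rows untouched, which fits the non-destructive spirit of the goals above.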

Progress

doc/progress.svg

doc/progress.gv

User Journey Event Maps

doc/use_case_events.svg

doc/use_case_events.gv

PSV Document Structure

parsing text always returns a Document
all tables in a document may be aligned with each other by enabling the align_all option
a ruler after the first row of data in a table is special. It can
- specify left, right, center or numeric data alignment per column
- specify that a column should be sorted before joining (actually, before encoding!)
all other rulers are for decoration purposes only, and any additional markers within them will be ignored

Basic Formatting Rules

PSV is encoded in UTF-8
data rows
- data rows must begin with a | (ASCII 0x7c, Unicode U+007c)
- new columns are introduced by further | characters (one | per column)
  - a trailing | at the end of a data row is optional
  - empty columns at the end of a line are always truncated
- empty columns inside a table may be removed by enabling the squash-empty option
- UTF-8 whitespace surrounding |s is ignored
- any other UTF-8 characters are considered data
  - whitespace within data is retained verbatim
  - whitespace and | can be included as data by preceding them with a \ (ASCII 0x5c, Unicode U+005c)
- \n (ASCII 0x0a, Unicode U+000a) separates data rows
  - \r (ASCII 0x0d, Unicode U+000d) is included as whitespace, and is thus ignored
  - a trailing \n at the end of a file is not required
any text lines which do not begin with a | are retained verbatim, but are not part of a PSV table

These rules are enough to produce simple PSV tables. Horizontal rulers are also available, however, they are "somewhat more complicated" and are thus explained in ruler formatting or psv_format.md
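
As a rough illustration of the data row rules, the sketch below splits a single line into cells using only the standard library. `splitRow` is a hypothetical name, not psv's scanner, and it simplifies two points: leading indentation is always accepted, and escaped whitespace at the edge of a cell is not preserved.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRow splits one line into cells following a simplified reading of the
// rules above. It reports false when the line is not a data row.
// (hypothetical helper, not part of the psv package)
func splitRow(line string) ([]string, bool) {
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, "|") {
		return nil, false // not part of a table: the caller keeps it verbatim
	}

	var cells []string
	var cell strings.Builder
	escaped := false
	for _, r := range line[1:] {
		switch {
		case escaped:
			cell.WriteRune(r) // the character after a backslash is always data
			escaped = false
		case r == '\\':
			escaped = true
		case r == '|':
			cells = append(cells, strings.TrimSpace(cell.String()))
			cell.Reset()
		default:
			cell.WriteRune(r)
		}
	}
	// text after the last | (the trailing | is optional)
	cells = append(cells, strings.TrimSpace(cell.String()))

	// empty columns at the end of a line are truncated
	for len(cells) > 0 && cells[len(cells)-1] == "" {
		cells = cells[:len(cells)-1]
	}
	return cells, true
}

func main() {
	for _, line := range []string{
		"|false| true        | false ||||||",
		`  | a \| b | c`,
		"not a table row",
	} {
		cells, ok := splitRow(line)
		fmt.Printf("%v %q\n", ok, cells)
	}
}
```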

Introductory examples

Creating PSV Tables Manually

To write a PSV table, simply start a line with a | and some text. Don't worry about spacing or indentation, the psv tool will fix that in a minute. For example, the following, deliberately sloppily entered table:

    |A| B     |     A   anb B
| -
  | false | false | false
|false| true        | false ||||||
  |true       |       false | false
    |true   | true  | true    | yay

will be turned into this:

    | A     | B     | A   anb B |     |
    | ----- | ----- | --------- | --- |
    | false | false | false     |     |
    | false | true  | false     |     |
    | true  | false | false     |     |
    | true  | true  | true      | yay |

with a single call to psv (in this case, the vim [^1] command: vip!psv [^2]).

Some things of note:

all table rows are indented to align with the first row
all rows have been trimmed to the same number of columns
all columns are vertically aligned
a trailing | is always included on every data row
the horizontal ruler has been resized to match the width of each column
the contents of the table have not changed
- e.g. the extra spacing between A and B was retained

(see ruler formatting)
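
The alignment step described above can be approximated as follows. This is a sketch under stated assumptions, not psv's formatter: `formatRows` is a hypothetical function, it only left-aligns, it ignores rulers and indentation, and it measures width in runes, which is not the same as display width for wide characters.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// formatRows renders rows as left-aligned, pipe-separated lines. Short rows
// are padded with empty cells and every line gets a trailing |.
// (hypothetical helper, not part of the psv package)
func formatRows(rows [][]string) []string {
	// find the widest cell in each column
	var widths []int
	for _, row := range rows {
		for c, cell := range row {
			if c == len(widths) {
				widths = append(widths, 0)
			}
			if n := utf8.RuneCountInString(cell); n > widths[c] {
				widths[c] = n
			}
		}
	}

	lines := make([]string, 0, len(rows))
	for _, row := range rows {
		var b strings.Builder
		b.WriteString("|")
		for c, width := range widths {
			cell := ""
			if c < len(row) {
				cell = row[c]
			}
			pad := width - utf8.RuneCountInString(cell)
			b.WriteString(" " + cell + strings.Repeat(" ", pad) + " |")
		}
		lines = append(lines, b.String())
	}
	return lines
}

func main() {
	for _, line := range formatRows([][]string{
		{"name", "score"},
		{"Alice", "3"},
		{"Bob"},
	}) {
		fmt.Println(line)
	}
}
```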

[^1]: You don't have to use vim! psv can be used from any editor or shell script that lets you pipe text through shell commands.

[^2]: which translates to:
- v start a visual selection ...
- i select everything in ...
- p the current paragraph
- !psv and replace the current selection with whatever psv makes of it

Using psv Tables Programmatically

psv Tables can also help improve the readability of test data.

Here is an example of an actual test suite (containing 14 individual unit tests) from psv's own unit testing code (sort_test.go):

func TestSingleSectionSorting(t *testing.T) {

    testTable := psv.TableFromString(`
        | 0 | b | 3  | partial
        | 1 | D
        | 2 | E | 5
        | 3 | a | 4  | unequal
        | 4 | c | 20
        | 5 | C | 10 | row | lengths
        | 6 | e | 5
        | 7 | d | 7
        `)

    testCases := sortingTestCasesFromTable(`
	| name                         | sort  | columns | exp-col | exp-rows        |
	| ---------------------------- | ----- | ------- | ------- | --------------- |
	| no sort                      | false |         |         | 0 1 2 3 4 5 6 7 |
	| default sort                 |       |         |         | 0 1 2 3 4 5 6 7 |
	| sort only when asked to      | false | 2       |         | 0 1 2 3 4 5 6 7 |
	| reverse default sort         |       | ~       |         | 7 6 5 4 3 2 1 0 |
	| reverse reverse default sort |       | ~~      |         | 0 1 2 3 4 5 6 7 |
	| indexed column sort          |       | 2       |         | 3 0 4 5 7 1 6 2 |
	| indexed column sort          |       | 2       | 2       | a b c C d D e E |
	| reverse column sort          |       | ~2      |         | 2 6 1 7 5 4 0 3 |
	| third column sort            |       | 3       |         | 1 5 4 0 3 2 6 7 |
	| numeric sort                 |       | #3      |         | 1 0 3 2 6 7 5 4 |
	| reverse numeric sort         |       | ~#3     |         | 4 5 7 6 2 3 0 1 |
	| numeric reverse sort         |       | #~3     |         | 4 5 7 6 2 3 0 1 |
	| reverse reverse column sort  |       | ~ #~3   |         | 1 0 3 2 6 7 5 4 |
	| partial column sort          |       | 4 2     |         | 4 7 1 6 2 0 5 3 |
	| non-existent column sort     |       | 9       |         | 0 1 2 3 4 5 6 7 |
	`)

    runSortingTestCases(t, testTable.AllRows(), testCases.DataRows())
}

In the example above, two tables are defined:

testTable is the reference table to be tested
- it simply contains a few rows of data, in various forms suitable for testing some features of psv
- testTable.AllRows() is used to get a [][]string containing all of the rows in the table.
testCases then defines a series of individual unit tests to be run on testTable
- the first row (|name|...) is used as a header for the table
  - psv always refers to columns by the value in their first row
    - but the first row is treated the same as all other rows
  - testCases.DataRows() is used to get all of the rows except the first row
  - the second row in the table is a ruler
    - rulers are decorative in nature and may be used to influence column alignment and sorting preferences, but they do not appear in the [][]string array of data!
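
Referring to columns by the value in their first row can be sketched as a small closure over the header. `columnLookup` below is a hypothetical stand-in written for this illustration; it assumes rows are plain `[]string` slices and makes no claim about how psv implements its own name-to-column mapping.

```go
package main

import "fmt"

// columnLookup returns a function that fetches a cell from a row by the
// name found in the header row. Unknown names and short rows yield "".
// (hypothetical helper, not part of the psv package)
func columnLookup(header []string) func(row []string, name string) string {
	index := make(map[string]int, len(header))
	for i, name := range header {
		if _, seen := index[name]; !seen {
			index[name] = i // the first column wins when names repeat
		}
	}
	return func(row []string, name string) string {
		i, ok := index[name]
		if !ok || i >= len(row) {
			return ""
		}
		return row[i]
	}
}

func main() {
	rows := [][]string{
		{"name", "sort", "columns", "exp-rows"},
		{"no sort", "false", "", "0 1 2 3 4 5 6 7"},
		{"reverse default sort", "", "~", "7 6 5 4 3 2 1 0"},
	}
	value := columnLookup(rows[0])
	for _, row := range rows[1:] {
		fmt.Printf("%s: columns=%q expect=%s\n",
			value(row, "name"), value(row, "columns"), value(row, "exp-rows"))
	}
}
```

Test code written this way keeps working when columns are reordered or new ones are added to the table.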

Detailed Description

psv reads, formats and writes simple tables of data in text files.

In doing so, psv focuses on human readability and ease of use, rather than trying to provide a lossless, ubiquitous, machine-readable data transfer format.

The same could be said of markdown, and indeed, psv can be used to generate github-style markdown tables that look nice in their markdown source code, and not just after they have been converted to HTML by the markdown renderer.

Another intended use case is data tables in Gherkin files, which are a central component of Behaviour Driven Development (BDD).

However, the real reason for creating psv was to be able to use text tables as the source of data for running automated tests. Hence the go package.

Main Features

normalisation of rows and columns, so that every row has the same number of cells
automatic table indentation and column alignment
the ability to automatically draw horizontal separation lines, called rulers
the ability to re-format existing tables, while leaving lines which "do not look like table rows" unchanged
a simple way to read data from tables into go programs via the psv go package
the (limited) ability to sort table data
- without interfering with the rest of the table's formatting
and more ...

Specification

A semi-formal specification is available as a separate, RFC-like document.

What about RFC 4180?

psv deliberately does not fulfill RFC 4180, as the two specifications have very different goals.

psv is intended for presenting tabular data in a human-friendly form
RFC 4180 is intended for encoding and transmitting data between programs

Comparison of PSV with RFC 4180

| Feature                  | `psv`    | RFC 4180 | Description |
| ------------------------ | -------- | -------- | ----------- |
| new lines                | native   | CRLF     | `psv` does not prefer any particular line separator |
| final new line           | yes      | optional | `psv` will accept a final row without a new line, but will always terminate the final row with a new line |
| header row               | optional | optional | `psv` and `RFC 4180` both allow an optional header row that provides column names |
| horizontal rulers        | yes      | no       | `psv` allows the use of decorative horizontal rulers to separate logical groups within a table |
| separator                | pipe     | comma    | vertical lines are more visually distinctive |
| trailing separator       | yes      | no       | `psv` prioritises visual clarity over document size |
| padded values            | yes      | no       | `psv` aligns columns visually for human consumption, which requires the addition/removal of leading/trailing spaces |
| embedded spaces          | yes      | yes      | `psv` preserves space within a value |
| all rows have same width | yes      | yes      | `psv` and `RFC 4180` both recommend rows have the same number of columns |
| quoted values            | no       | yes      | `psv` tables reduce cognitive load by only using backslashes for escaping |
| multi-line values        | no       | yes      | `psv` is not intended to be a lossless format for any data, but a visual representation for humans |

Not Supported

psv is not intended to replace spreadsheets etc 😄

Among a myriad of other non-features, the following are definitely not supported by psv:

the inclusion of | characters in a cell's data
multi-line cell data
any kind of cell merging or splitting
sorting of complex data formats, including:
- date and/or timestamps (unless they are in ISO-8601 format, which sorts nicely)
- signed numbers (+ and - signs confuse go's collators 😦)
- floating point numbers
- scientific notation
- hexadecimal notation
...
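
To see why ISO-8601 dates are the exception while signed numbers are not, the sketch below sorts both as plain text. It assumes simple byte-wise string comparison via the standard `sort.Strings`, which is not necessarily the collation psv uses; `sortedCopy` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedCopy returns a byte-wise sorted copy of values, leaving the
// original slice untouched. (hypothetical helper for this illustration)
func sortedCopy(values []string) []string {
	out := append([]string(nil), values...)
	sort.Strings(out)
	return out
}

func main() {
	// ISO-8601 dates: text order and chronological order agree
	fmt.Println(sortedCopy([]string{"2025-03-13", "2024-12-01", "2025-01-07"}))
	// prints [2024-12-01 2025-01-07 2025-03-13]

	// signed numbers: text order is not numeric order
	fmt.Println(sortedCopy([]string{"10", "-5", "+3", "2"}))
	// prints [+3 -5 10 2]
}
```

Fixed-width, most-significant-first formats sort correctly as text; anything with signs, exponents or variable width needs to be converted by the client application first.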

Design Principles

self contained
- psv is a single go binary with no external dependencies
- the psv go package is a single package, also with no external dependencies other than go's standard packages
  - exception: I do include another package of mine to provide simplified testing with meaningful success and error messages.
- all psv actions occur locally (no network access required)
non-destructive
- if psv doesn't know how to interpret a line of text, the text remains unchanged
  - only data rows (lines beginning with a |) and rulers are re-formatted, all other lines remain unchanged
idempotent
- any table generated by psv can also be read by psv
- running a formatted table through psv again must not change the table in any way
ease of use
- normal use should not require any configuration or additional parameters
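
The non-destructive and idempotent principles translate directly into properties that can be checked in code. The sketch below uses a deliberately tiny, hypothetical formatter named `tidy` (it only re-spaces cells, performs no column alignment and handles no escapes) to show the shape of such a check; it does not reproduce psv's behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// tidy re-spaces lines that begin with | and returns every other line
// unchanged. (hypothetical toy formatter, not part of the psv package)
func tidy(text string) string {
	lines := strings.Split(text, "\n")
	for i, line := range lines {
		trimmed := strings.TrimSpace(line)
		if !strings.HasPrefix(trimmed, "|") {
			continue // non-destructive: unknown lines stay as they are
		}
		// drop the leading | and one optional trailing |
		inner := strings.TrimSuffix(strings.TrimPrefix(trimmed, "|"), "|")
		cells := strings.Split(inner, "|")
		for c := range cells {
			cells[c] = strings.TrimSpace(cells[c])
		}
		lines[i] = "| " + strings.Join(cells, " | ") + " |"
	}
	return strings.Join(lines, "\n")
}

func main() {
	input := "Scores so far:\n|Bob|2\n  | Alice |   3 |\n"
	once := tidy(input)
	twice := tidy(once)
	fmt.Print(once)
	fmt.Println("idempotent:", once == twice)
}
```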

Markdown Support

Markdown's table format is a subset of the formatting options provided by psv.

Specifically:

Markdown tables MUST begin with a header row of column names
Markdown tables MUST have exactly one ruler as their second line
Markdown rulers MAY contain the alignment hints :- (left-aligned), -: (right-aligned) or :-: (centered)
Markdown tables MUST NOT have embedded rulers anywhere else
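
The alignment hints listed above are easy to recognise mechanically. The following sketch classifies a single ruler cell using only the standard library; `alignment` is a hypothetical function for illustration and does not cover psv's additional ruler markers (numeric alignment, sorting).

```go
package main

import (
	"fmt"
	"strings"
)

// alignment classifies one ruler cell such as "---", ":--", "--:" or ":-:".
// It returns "left", "right", "center", or "" when no hint is present.
// (hypothetical helper, not part of the psv package)
func alignment(cell string) string {
	cell = strings.TrimSpace(cell)
	left := strings.HasPrefix(cell, ":")
	right := strings.HasSuffix(cell, ":")
	dashes := strings.Trim(cell, ":")
	if dashes == "" || strings.Trim(dashes, "-") != "" {
		return "" // not a ruler cell
	}
	switch {
	case left && right:
		return "center"
	case left:
		return "left"
	case right:
		return "right"
	}
	return ""
}

func main() {
	for _, cell := range []string{"---------", "----:", ":-", ":-:", "score"} {
		fmt.Printf("%-10q => %q\n", cell, alignment(cell))
	}
}
```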

TODO's

add ability to configure the scanner
- allow auto-indent detection
  - -I detect indent by capturing the indent before the first | encountered
- explicitly specify ruler characters (for cli)
  - default autodetect
  - explicit rulers
    - turns off autodetection
    - allows the use of + and - as data
    - options:
      - -rh '-' horizontal ruler
      - -ro '|' outer ruler
      - -ri ':' inner ruler
      - -rc '+' corners
      - -rp 'ophi'
        - o outer vertical ruler
        - p padding character
        - h horizontal ruler (default: same as padding character)
        - i inner vertical ruler (default: same as outer ruler)
Replace table.Data with table.DataRows

Documentation Links

Installation

psv consists of two components: the psv command and the psv go package.

To use the psv command, you only need the psv binary in your PATH, e.g. ~/bin/psv (see binary installation below).

If you don't want to install "a binary, downloaded from the 'net", you can download the source, (inspect it 😄), and build your own version.

Installation From Source

Prerequisites

go 1.18 or later
make (optional, but recommended)

Build Steps

Clone the psv git repository and use make to build, test and install psv in your $GOBIN directory (typically $GOPATH/bin or ~/Go/bin)

git clone -o codeberg https://codeberg.org/japh/psv
cd psv
make install
psv -v

Binary Installation

Note: currently only available for darwin amd64 (64-bit Intel Macs)

download the latest psv.gz from https://codeberg.org/japh/psv/releases
verify psv.gz with gpg --verify psv.gz.asc
compare psv.gz's checksums against those provided with shasum -c psv.gz.sha256
unpack psv.gz with gunzip psv.gz
copy psv to any directory in your $PATH, or use it directly via ./psv
don't forget to check that it is executable, e.g. chmod +x psv

Now you can use the psv command...

Using The `psv` Package In Go Projects

Prerequisites

go 1.18 or later

To use psv in your go project, simply import codeberg.org/japh/psv and go mod tidy will download it, build it and make it available for your project.

See the psv package documentation for the API and code examples.

Alternatives

csv, tsv and delimiter-separated-values tables (Wikipedia)
- generally, psv tables are just a single type of delimiter-separated values format
ASCII Table Writer
- go package for creating tables of almost any form
- more traditional table.SetHeader, table.SetFooter() interface
- more features (incl. colors)
- does not read tables
  - no good for defining test cases etc in code
psv-spec (unrelated project!)
- an attempt to standardize a CSV replacement using pipes as the delimiter
- focuses on electronic data transfers
- does not provide a tabular layout
- escaping just |, \, \n and \r is nice
  - but does not allow for whitespace quoting
  - future: | " " | could be used by psv to represent a space

References

Copyright

Documentation ¶

Overview ¶

Package psv provides methods for handling tables of Pipe-Separated-Values (PSV)

Three basic use cases are supported:

1. Generating PSV tables from code:

doc := psv.NewDocument()
doc.AppendRow([]string{"name","score"})
doc.AppendRow([]string{"Alice","3"})
doc.AppendRow([]string{"Bob","2"})
output, _ := doc.MarshalText()
fmt.Print(string(output)) // MarshalText returns []byte

| name  | score |
| Alice | 3     |
| Bob   | 2     |

2. Reading data from PSV tables into code:

input := `
	| name  | score |
	| Alice | 3     |
	| Bob   | 2     |
	`

// read the input
doc := psv.NewDocument()
doc.UnmarshalText([]byte(input))

// the input may contain any number of tables
// loop through each table looking for 'interesting' data
for _, table := range doc.Tables() {

	// ColumnByNameFunc provides a convenient name => column mapping
	value := table.ColumnByNameFunc()

	// DataRows() returns all rows except the first row (assumed to be a header)
	// AllRows() returns all rows including the first row.
	for _, row := range table.DataRows() {
		fmt.Printf("%s points were awarded to %q\n",
			value(row,"score"),
			value(row,"name"),
		)
	}
}

// Output:
// 3 points were awarded to "Alice"
// 2 points were awarded to "Bob"

3. Re-formatting existing PSV tables via the `psv` command:

% cat input.txt
| name | score
| Alice | 3
| Bob | 2
% psv < input.txt > output.txt
% cat output.txt
| name  | score |
| Alice | 3     |
| Bob   | 2     |

Usage ¶

[Document] is the main aggregate for building or accessing PSV data. The [Document] type fulfills the encoding.TextMarshaler and encoding.TextUnmarshaler interfaces for conversion to and from the document's text form.

Documents are built incrementally via Append methods and may be read as a slice of rows. The ability to edit data or randomly access data is not provided.
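
For readers unfamiliar with the two standard interfaces mentioned above, here is a minimal sketch of a type that satisfies both. The `rows` type is a hypothetical stand-in, not psv's Document: it writes unaligned rows, skips non-table lines when reading, and handles no escapes or rulers.

```go
package main

import (
	"encoding"
	"fmt"
	"strings"
)

// rows is a hypothetical stand-in for a document: a list of rows of cells.
type rows [][]string

// compile-time checks that *rows satisfies both standard interfaces
var (
	_ encoding.TextMarshaler   = (*rows)(nil)
	_ encoding.TextUnmarshaler = (*rows)(nil)
)

// MarshalText renders each row as "| a | b |" on its own line (unaligned).
func (r *rows) MarshalText() ([]byte, error) {
	var b strings.Builder
	for _, row := range *r {
		b.WriteString("| " + strings.Join(row, " | ") + " |\n")
	}
	return []byte(b.String()), nil
}

// UnmarshalText replaces the contents of r with the rows found in text.
// Lines that do not begin with | are skipped.
func (r *rows) UnmarshalText(text []byte) error {
	*r = nil
	for _, line := range strings.Split(string(text), "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "|") {
			continue
		}
		line = strings.TrimSuffix(strings.TrimPrefix(line, "|"), "|")
		cells := strings.Split(line, "|")
		for i := range cells {
			cells[i] = strings.TrimSpace(cells[i])
		}
		*r = append(*r, cells)
	}
	return nil
}

func main() {
	var doc rows
	if err := doc.UnmarshalText([]byte("|name|score\n|Alice|3\n")); err != nil {
		panic(err)
	}
	out, _ := doc.MarshalText()
	fmt.Print(string(out)) // MarshalText returns []byte, so convert before printing
}
```

Because the interfaces work on `[]byte`, callers need a `string(...)` conversion before printing the marshalled text.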

Internally, a [Document] may contain any number of [Table] objects which can be accessed via the [Document.Tables] method.

Each table then has its own set of column names, prefix etc.

[Ruler] objects may be used to add separation lines to a table and may be placed anywhere within a table.

The [Markdown] formatter, however, will ignore all but the ruler that appears directly after the first row of data in the table, thus conforming to markdown's requirements.

e.g.

+---------+-------+
| name    | score |
| ------- | ----- |
| Alice   | 25    |
| Bob     | 17    |
| Charlie | 10    |
| ------- - ----- |
| Dave    | 9     |
+---------+-------+

When re-formatted for Markdown, this would become:

| name    | score |
| ------- | ----- |
| Alice   | 25    |
| Bob     | 17    |
| Charlie | 10    |
| Dave    | 9     |

Source Files ¶

doc.go

Directories ¶

| Path     | Synopsis |
| -------- | -------- |
| cmd      |          |
| psv      | psv command |
| encoding |          |
| document |          |
| prefix   |          |
| row      | Data Encoding and Decoding entails the protection of data characters from corruption after being rendered as a PSV table, and restoring the original data from a PSV table. |
| ruler    |          |
| table    |          |
| text     |          |
| fsm      | fsm provides a finite state machine parser for parsing e.g. |
| model    |          |
| sort     |          |
| test     |          |
