yambs (Yet Another MusicBrainz Seeder)

yambs is a command-line program for seeding edits to the MusicBrainz music
database.
It can simplify adding multiple standalone recordings: given a CSV or TSV
file describing recordings, yambs can open the Add standalone recording page
for each with various fields pre-filled. The Add artist, Add event,
Add label, Add place, Add release group, Add series, and Add work
pages can be seeded in a similar manner.
yambs can also read key=value lines from text files to seed the Add
release page, and it can use Bandcamp, Qobuz, and Tidal album pages,
Metal Archives artist and band pages, and local MP3 files or RSS feeds to seed
edits too.
There's also a web frontend at yambs.erat.org that can be used to add releases
from online sources or entities from text files.
Installation
To compile and install the yambs executable, install Go and run the
following command:
go install ./cmd/yambs
If the account that you will use to run the yambs executable does not have
access to the Google Cloud Translation API (used to detect releases'
languages and scripts), you can supply the nogcp build tag to avoid attempting
to connect to the service:
go install -tags nogcp ./cmd/yambs
Prebuilt executables are also available.
Usage
Usage: yambs [flag]... <FILE/URL>
Seeds MusicBrainz edits.

  -action value
        Action to perform with seed URLs (open, print, serve, write) (default open)
  -addr string
        Address to listen on for -action=serve (default "localhost:8999")
  -charset string
        Charset for text input if not UTF-8 (IANA or MIME, e.g. "ISO-8859-1" or "latin1")
  -country string
        Country code for querying Tidal API (ISO 3166, e.g. "US" or "DE"; "XW" for all)
  -edit-release-recordings string
        Release MBID or URL whose recordings will be edited
  -fields string
        Comma-separated fields for CSV/TSV columns (e.g. "artist,name,length")
  -format value
        Format for text input (csv, keyval, tsv) (default tsv)
  -list-fields
        Print available fields for -type and exit
  -merge-ranges string
        Colon-separated list of comma-separated custom ranges for merging (e.g. "1-4,6-10:1-9")
  -merge-release-recordings string
        Comma-separated release MBIDs or URLs whose recordings will be merged
  -modify-data value
        Comma-separated modifications for online data (extract-artists, punctuation, remove-parens, split-artists)
  -server string
        MusicBrainz server hostname (default "musicbrainz.org")
  -set value
        Set a field for all entities (e.g. "edit_note=from https://www.example.org")
  -timeout duration
        Timeout for generating edits (e.g. "30s" or "2m")
  -type value
        Entity type for text or MP3 input (artist, event, label, place, recording, release, release-group, series, work)
  -verbose
        Enable verbose logging
  -version
        Print the version and exit
yambs reads the supplied file or URL (or stdin if no positional argument is
supplied) and performs the action specified by the -action flag:
- open: Open edits in a browser using a temporary file.
- print: Write edit links to stdout (only possible for recordings).
- serve: Open edits in a browser using a short-lived webserver launched at
  -addr (useful if you're running yambs in a container).
- write: Write a webpage containing the edits to stdout.
If you supply a URL, yambs will fetch and parse it.
If you supply a filename, you should also pass the -type, -format, -fields,
and -set flags to tell yambs how to interpret the file.
Examples
To add multiple non-album recordings for a single artist, you can run a command
like the following:
yambs \
-type recording \
-format tsv \
-fields name,length,edit_note \
-set artist=7e84f845-ac16-41fe-9ff8-df12eb32af55 \
-set url0_url=https://www.example.org/ \
-set url0_type=255 \
<recordings.tsv
with a recordings.tsv file like the following (with tab characters between the
fields):
Song #1 4:35 info from https://example.org/song1.html
Song #2 53234.35 info from https://example.org/song2.html
The recordings' names, lengths, and edit notes will be read from the TSV file,
and the -set artist=... flag sets all recordings' artist field to the
specified artist.
Likewise, the -set url0_... flags add a URL relationship to each recording.
seed/enums.go enumerates the different link types that can be specified
between entities; 255 corresponds to LinkType_DownloadForFree_Recording_URL.
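Since the input must contain real tab characters, which are easy to lose when
copying text from a webpage, one way to create the sample recordings.tsv file
is with printf. This is only a sketch that reproduces the two rows shown above;
it writes recordings.tsv to the current directory.

```shell
# Write the two sample rows with literal tab characters between the fields.
# printf expands each \t in the format string to a tab and reuses the format
# for every group of three arguments.
printf '%s\t%s\t%s\n' \
  'Song #1' '4:35' 'info from https://example.org/song1.html' \
  'Song #2' '53234.35' 'info from https://example.org/song2.html' \
  >recordings.tsv
```

The resulting file can then be passed to the yambs command above via
<recordings.tsv.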
To edit existing recordings, specify their MBIDs via the mbid field:
yambs \
-type recording \
-format csv \
-fields mbid,name \
<recordings.csv
recordings.csv:
c55e74ff-bd7d-40ff-a591-c6993c59bda8,Sgt. Pepper’s Lonely Hearts Club Band
...
Note that this example uses the csv format rather than tsv.
As a convenience, you can generate edits for all of the recordings associated
with a release by passing the release’s MBID via the -edit-release-recordings
flag:
yambs \
-edit-release-recordings cf731f61-d8ba-438a-b346-644456fd27e2 \
-set 'disambiguation=live, 2008-12-07: Rose Garden, Portland, OR, USA' \
-set 'edit_note=update for https://musicbrainz.org/doc/Style/Recording#Live_recordings'
yambs \
-edit-release-recordings 6d71e03a-a7d6-4c4d-a645-b279b8a07b77 \
-set 'artist0_mbid=2fcffd9a-f02a-4a2c-8085-6257f918949d' \
-set 'artist0_credited=Pau Casals' \
-set 'edit_note=[Style/Classical/Recording_Artist]'
This is equivalent to passing -fields mbid and supplying the recording MBIDs
as input.
More-complicated artist credits can also be assigned:
yambs \
-type recording \
-format tsv \
-fields ... \
-set artist0_mbid=1a054dd8-c5fa-40b6-9397-61c26b0185d4 \
-set artist0_credited=virt \
-set 'artist0_join= & ' \
-set artist1_name=Rush \
...
(Note that repeated fields are 0-indexed.)
The keyval format can be used to seed a single entity across multiple lines:
yambs -type release -format keyval <release.txt
release.txt:
title=Some Album
artist0_name=Some Artist
types=Album,Soundtrack
status=Official
packaging=Jewel Case
language=eng
script=Latn
event0_date=2021-05-15
event0_country=XW
medium0_format=CD
medium0_track0_title=First Track
medium0_track0_length=3:45.04
medium0_track1_title=Second Track
medium1_format=CD
medium1_track0_title=First Track on Second Disc
url0_url=https://www.example.org/
url0_type=75
edit_note=https://www.example.org
seed/enums.go shows that the url0_type=75 line corresponds to
LinkType_DownloadForFree_Release_URL.
If you'd like to bulk-add LinkType_Published_Label_Release (ID 362)
relationships between the existing label 02442aba and releases 43bcfb95 and
a9d8b538, you can set the mbid field to edit the label and seed the new
relationships:
yambs -type label -format keyval <label.txt
label.txt:
mbid=02442aba-cf00-445c-877e-f0eaa504d8c2
rel0_target=43bcfb95-f26c-4f8d-84f8-7b2ac5b8ab72
rel0_type=362
rel1_target=a9d8b538-c20a-4025-aea1-5530d616a20a
rel1_type=362
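When there are many releases, writing the numbered rel fields by hand gets
tedious. The following sketch generates the same label.txt from a list of
release MBIDs. The label and releases variable names are made up for this
example, and the MBIDs are the ones used above.

```shell
# Example values: the label and release MBIDs from the label.txt shown above.
label=02442aba-cf00-445c-877e-f0eaa504d8c2
releases='43bcfb95-f26c-4f8d-84f8-7b2ac5b8ab72 a9d8b538-c20a-4025-aea1-5530d616a20a'

{
  printf 'mbid=%s\n' "$label"
  i=0
  for release in $releases; do
    # Repeated fields are 0-indexed: rel0_..., rel1_..., and so on.
    printf 'rel%d_target=%s\n' "$i" "$release"
    printf 'rel%d_type=362\n' "$i"
    i=$((i + 1))
  done
} >label.txt
```

The generated file can be passed to yambs -type label -format keyval as before.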
Pass the -list-fields flag to list all available fields for a given entity
type:
yambs -type artist -list-fields
yambs -type event -list-fields
yambs -type label -list-fields
yambs -type place -list-fields
yambs -type recording -list-fields
yambs -type release -list-fields
yambs -type release-group -list-fields
yambs -type series -list-fields
yambs -type work -list-fields
Acceptable values for various fields are listed in seed/enums.go, which is
automatically generated from t/sql/initial.sql in the musicbrainz-server
repository.
A column in an input file can be assigned to multiple fields by supplying
slash-separated field names. For example, -fields name,url0_url/edit_note,length
maps the first column to field name, the second column to fields url0_url and
edit_note, and the third column to field length.
A column can be skipped by passing an empty field name. For example,
-fields name,,length maps the first column to field name, skips the second
column, and maps the third column to length.
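To make the two -fields rules concrete, here is a small awk-based sketch that
applies a -fields value to tab-separated rows and prints the resulting
field=value pairs. The map_fields function is hypothetical and written only for
illustration; it is not how yambs itself parses its input.

```shell
# map_fields SPEC: read tab-separated rows from stdin and print one
# "field=value" line for every field that SPEC assigns to a column.
map_fields() {
  awk -F '\t' -v spec="$1" '
    BEGIN { n = split(spec, cols, ",") }
    {
      for (i = 1; i <= n; i++) {
        if (cols[i] == "") continue       # empty name: skip this column
        m = split(cols[i], names, "/")    # slash: several fields share a column
        for (j = 1; j <= m; j++) print names[j] "=" $i
      }
    }'
}

printf 'Song #1\thttps://example.org/song1.html\t4:35\n' |
  map_fields 'name,url0_url/edit_note,length'
```

With the first spec, the URL in the second column is printed for both url0_url
and edit_note. Running the same row through map_fields 'name,,length' prints
only the name and length lines.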
You can pass Bandcamp, Qobuz, or Tidal album URLs to seed release edits:
yambs https://austinwintory.bandcamp.com/album/journey
yambs https://www.qobuz.com/us-en/album/the-dark-side-of-the-moon-pink-floyd/xggxq5w5dmljb
yambs https://tidal.com/browse/album/55391786
The page that is opened will include a link to the album's highest-resolution
cover art to make it easier to add in a followup edit.
If you pass a Bandcamp track URL that isn't part of an album, an edit to add it
as a single will be created:
yambs https://caribouband.bandcamp.com/track/tin
You can pass the path to a local MP3 file to use it to seed a (single) release
or standalone recording edit:
yambs \
-type recording \
-set artist=7e84f845-ac16-41fe-9ff8-df12eb32af55 \
-set edit_note='from artist-provided MP3 at https://www.example.org/song.mp3' \
/path/to/a/song.mp3
If the MP3 file contains embedded images, they will be extracted to temporary
files so they can be added as cover art.
You can also pass the path to a local RSS feed (e.g. for a podcast) and use it
to seed either a separate release (per the broadcast programs guidelines) or a
standalone recording for each item:
yambs \
-type release \
-set artist0_name='Podcast Host' \
-set edit_note='from https://www.example.org/feed.xml' \
/path/to/a/feed.xml
You can generate edits to merge all of the recordings between two or more
releases by passing a comma-separated list of release MBIDs or URLs via the
-merge-release-recordings flag:
yambs -merge-release-recordings 7cce69ab-08b2-48ba-93f0-0ba458d98adc,a7c74fca-9392-4fc8-9aac-a8e122dbbfa8
The recording merge edits must be opened and submitted one at a time (due to the
MusicBrainz website maintaining its own internal queue of entities to merge).
By default, the releases must have the same number of tracks. You can
additionally pass the -merge-ranges flag to specify a colon-separated list of
per-release comma-separated track ranges. For example,
-merge-ranges 1-4,7-9,5:1-8 will merge:
- tracks 1-4 from the first release with tracks 1-4 from the second
- track 7 from the first release with track 5 from the second
- track 8 from the first release with track 6 from the second
- track 9 from the first release with track 7 from the second
- track 5 from the first release with track 8 from the second
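The pairing listed above can be reproduced with a short shell sketch: expand
each release's ranges into a flat list of track numbers, then line the two
lists up position by position. The expand_ranges and pair_tracks functions are
hypothetical helpers written for this illustration (they assume the seq, paste,
and mktemp utilities are available) and are not taken from yambs.

```shell
# expand_ranges LIST: print one track number per line for a comma-separated
# list of ranges such as "1-4,7-9,5".
expand_ranges() {
  for range in $(printf '%s' "$1" | tr ',' ' '); do
    case $range in
      *-*) seq "${range%-*}" "${range#*-}" ;;
      *) printf '%s\n' "$range" ;;
    esac
  done
}

# pair_tracks RANGES: take a two-release -merge-ranges value and print
# "first-release-track second-release-track" pairs in order.
pair_tracks() {
  tmp=$(mktemp)
  expand_ranges "${1%%:*}" >"$tmp"
  expand_ranges "${1#*:}" | paste -d ' ' "$tmp" -
  rm -f "$tmp"
}

pair_tracks 1-4,7-9,5:1-8
```

For the example value, the fifth line of output is "7 5" and the last is "5 8",
matching the list above.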
There's also a yambsd executable that exposes most of the same functionality
through a webpage (with some limits to avoid abuse).
Why?
There are a bunch of MusicBrainz userscripts that run in the browser with the
help of an extension like Tampermonkey to seed edits. They're well-tested, so
why not just use them instead of writing a new thing?
Well, at first I was adding a bunch of standalone recordings that I'd downloaded
from random musicians' homepages. I couldn't find any userscripts to help with
that, since the main focus seems to be seeding releases from major websites. I
ended up hacking together a shell script to generate URLs that would seed my
edits, but I figured it'd be nice to have something more robust and convenient
to use next time.
I had also been using the bandcamp_importer.user.js userscript to import
releases from Bandcamp, but I'm nervous about using extensions like Tampermonkey
that require permission to modify data on all sites. I'm not so worried about
malice on the part of extension or userscript developers, but I have no idea
about their security practices and I'm fearful of attackers compromising their
computers and uploading malicious versions of their code.
I created a separate browser profile that I could use to run Tampermonkey
without exposing any of my (non-MusicBrainz) credentials, but using it was a
pain, so I decided to add Bandcamp support to this codebase as well since
that's where I get most of my music.
Further reading