text

package
v0.20.1
Published: Oct 17, 2025 License: AGPL-3.0 Imports: 33 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func Demojify

func Demojify(text string) string

Demojify replaces emoji shortcodes like `:example:` in the given text fragment with empty strings, essentially stripping them from the text. This is useful for text used in OG Meta headers.
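The stripping behaviour can be approximated with a regular expression. The following is a minimal sketch, not the package's implementation: the function name `stripShortcodes` and the shortcode pattern are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// demojiShortcodeRe is an assumed shortcode pattern: a colon, one or more
// ASCII letters, digits, or underscores, then a closing colon.
var demojiShortcodeRe = regexp.MustCompile(`:[A-Za-z0-9_]+:`)

// stripShortcodes is a hypothetical stand-in for Demojify. It deletes every
// match of demojiShortcodeRe and leaves the rest of the text untouched.
func stripShortcodes(text string) string {
	return demojiShortcodeRe.ReplaceAllString(text, "")
}

func main() {
	fmt.Printf("%q\n", stripShortcodes("Release day :tada: at last"))
}
```

Note that the whitespace around a removed shortcode is kept, so the example yields two consecutive spaces. This naive pattern also matches fragments such as `:30:` inside a timestamp; a stricter pattern would be needed to avoid that.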

func EmojifyRSS

func EmojifyRSS(emojis []apimodel.Emoji, text string) string

EmojifyRSS replaces emoji shortcodes like `:example:` in the given text fragment with `<img>` tags suitable for rendering as RSS content.

func EmojifyWeb

func EmojifyWeb(emojis []apimodel.Emoji, html template.HTML) template.HTML

EmojifyWeb replaces emoji shortcodes like `:example:` in the given HTML fragment with `<img>` tags suitable for rendering on the web frontend.
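The substitution performed by EmojifyRSS and EmojifyWeb can be pictured as a lookup from shortcode to image URL. The sketch below is an assumption-laden illustration, not the package's code: the `emoji` struct is a trimmed-down stand-in for apimodel.Emoji, and the shortcode pattern and the exact `<img>` attributes are made up for the example.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

// emoji is a hypothetical, trimmed-down stand-in for apimodel.Emoji.
type emoji struct {
	Shortcode string
	URL       string
}

// emojiShortcodeRe is an assumed shortcode pattern.
var emojiShortcodeRe = regexp.MustCompile(`:[A-Za-z0-9_]+:`)

// emojify replaces each known :shortcode: in text with an <img> tag and
// leaves unknown shortcodes as they are.
func emojify(emojis []emoji, text string) string {
	byCode := make(map[string]emoji, len(emojis))
	for _, e := range emojis {
		byCode[e.Shortcode] = e
	}
	return emojiShortcodeRe.ReplaceAllStringFunc(text, func(m string) string {
		// m includes the surrounding colons; strip them for the lookup.
		e, ok := byCode[m[1:len(m)-1]]
		if !ok {
			return m
		}
		return fmt.Sprintf(`<img src="%s" alt="%s" title="%s"/>`,
			html.EscapeString(e.URL), html.EscapeString(m), html.EscapeString(m))
	})
}

func main() {
	emojis := []emoji{{Shortcode: "blob", URL: "https://example.org/blob.png"}}
	fmt.Println(emojify(emojis, "hi :blob: :nope:"))
}
```

Only shortcodes present in the supplied emoji slice are replaced, which is why both functions take the slice as an argument rather than looking emojis up themselves.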

func FirstNBytesByWords

func FirstNBytesByWords(s string, n int) string

FirstNBytesByWords produces a prefix substring of up to n bytes from a given string, respecting Unicode grapheme and word boundaries. The substring may be empty, and may include leading or trailing whitespace.
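To make the contract concrete, here is a simplified sketch of byte-limited truncation at word boundaries. It is an approximation under stated assumptions: it treats transitions between whitespace and non-whitespace runes as word boundaries and works on runes, not on Unicode grapheme clusters or full word segmentation as the real function does. The name `firstNBytesAtSpace` is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// firstNBytesAtSpace returns the longest prefix of s that is at most n bytes
// long and ends at a boundary between whitespace and non-whitespace runes.
// If s already fits in n bytes it is returned whole.
func firstNBytesAtSpace(s string, n int) string {
	if len(s) <= n {
		return s
	}
	cut := 0 // byte offset of the last boundary seen that is <= n
	prevSpace := false
	for i, r := range s {
		if i > n {
			break
		}
		space := unicode.IsSpace(r)
		if i > 0 && space != prevSpace {
			cut = i
		}
		prevSpace = space
	}
	return s[:cut]
}

func main() {
	fmt.Printf("%q\n", firstNBytesAtSpace("hello wide world", 8))
	fmt.Printf("%q\n", firstNBytesAtSpace("hello wide world", 3))
}
```

With a limit of 8 the sketch returns `"hello "`, keeping the trailing space, and with a limit of 3 it returns an empty string because no boundary fits. Both outcomes mirror the documented possibility of empty results and trailing whitespace.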

func MinifyHTML

func MinifyHTML(in string) string

MinifyHTML minifies the given string under the assumption that it's HTML.

If the input is not HTML, this function will try to minify it anyway, but this may produce unexpected results.

If an error occurs during minification, it will be logged and the original string returned unmodified.
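The log-and-fall-back behaviour described above can be sketched as a small wrapper. This is an illustration only: `minifyOrOriginal`, `collapseSpaces`, and `alwaysFails` are hypothetical names, and the toy whitespace collapser stands in for whatever minifier the package actually uses.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"strings"
)

// minifyOrOriginal applies minify to in. If minify reports an error, the
// error is logged and in is returned unchanged.
func minifyOrOriginal(in string, minify func(string) (string, error)) string {
	out, err := minify(in)
	if err != nil {
		log.Printf("error minifying HTML: %v", err)
		return in
	}
	return out
}

// collapseSpaces is a toy "minifier" that collapses runs of whitespace
// into single spaces. It exists only to exercise the wrapper.
func collapseSpaces(s string) (string, error) {
	return strings.Join(strings.Fields(s), " "), nil
}

// alwaysFails simulates a minifier that cannot process its input.
func alwaysFails(string) (string, error) {
	return "", errors.New("simulated failure")
}

func main() {
	fmt.Printf("%q\n", minifyOrOriginal("<p>a   b</p>", collapseSpaces))
	fmt.Printf("%q\n", minifyOrOriginal("<p>a   b</p>", alwaysFails))
}
```

Because the function returns a plain string and no error, callers cannot tell a successful minification from a fallback except through the log.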

func NormalizeHashtag

func NormalizeHashtag(text string) (string, bool)

NormalizeHashtag normalizes the given hashtag text by removing the initial '#' symbol, and then decomposing and canonically recomposing characters and combining diacritics in the text into single Unicode characters, following Normalization Form C (https://unicode.org/reports/tr15/).

Finally, it checks the normalized string to ensure that it's below maximumHashtagLength characters, and contains only letters, numbers, and underscores (and not *JUST* underscores).

If all this passes, returned bool will be true.
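The validation half of this process can be sketched with the standard library alone. The sketch below deliberately omits the Normalization Form C step, which needs a Unicode normalization library, and it makes several assumptions: the name `validateHashtag`, the value of `maxTagLen` (the real maximumHashtagLength is unexported and not shown here), whether the limit is inclusive, and the empty string returned on failure.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// maxTagLen is an assumed limit chosen for this example only.
const maxTagLen = 100

// validateHashtag is a hypothetical, partial stand-in for NormalizeHashtag.
// It strips a leading '#', then checks length and allowed characters.
// It does NOT perform NFC normalization.
func validateHashtag(text string) (string, bool) {
	tag := strings.TrimPrefix(text, "#")

	// Reading "below" literally: reject tags of maxTagLen runes or more.
	if utf8.RuneCountInString(tag) >= maxTagLen {
		return "", false
	}

	onlyUnderscores := true
	for _, r := range tag {
		switch {
		case r == '_':
			// Allowed, but does not count as real content.
		case unicode.IsLetter(r) || unicode.IsNumber(r):
			onlyUnderscores = false
		default:
			return "", false
		}
	}
	if onlyUnderscores {
		// Covers both the empty tag and tags like "___".
		return "", false
	}
	return tag, true
}

func main() {
	for _, in := range []string{"#GoLang_2025", "#___", "#no-dash"} {
		tag, ok := validateHashtag(in)
		fmt.Printf("%q -> %q, %v\n", in, tag, ok)
	}
}
```

The sketch also hints at why the normalization step matters: a combining diacritic such as U+0301 is a mark, not a letter, so a decomposed "e" plus accent would fail the character check unless it is first recomposed into a single letter.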

func ParseHTMLToPlain

func ParseHTMLToPlain(html string) string

ParseHTMLToPlain parses the given HTML string, then outputs it to equivalent plaintext while trying to keep as much of the semantic intent of the input HTML as possible, i.e., titles are placed on separate lines, `<br>`s are converted to newlines, text inside `<strong>` and `<em>` tags is retained, but without emphasis, `<a>` links are unnested and the URL they link to is placed in angle brackets next to them, lists are replaced with newline-separated indented items, etc.

This function is useful when you need to filter on HTML and want to avoid catching tags in the filter, or when you want to serve something in a plaintext format that may contain HTML tags (e.g., CWs).

func SanitizeHTML

func SanitizeHTML(html string) string

SanitizeHTML removes only risky HTML elements from the given string, allowing safe ones through.

It returns an HTML string.

func StripHTMLFromText

func StripHTMLFromText(text string) string

StripHTMLFromText runs text through strict sanitization to completely remove any HTML from the input without trying to preserve the semantic intent of any HTML tags.

This is useful in cases where the input was not allowed to contain HTML at all, and the output isn't either.

Types

type FormatFunc

type FormatFunc func(
	ctx context.Context,
	parseMention gtsmodel.ParseMentionFunc,
	authorID string,
	statusID string,
	text string,
) *FormatResult

FormatFunc is fulfilled by the Formatter methods FromPlain, FromPlainNoParagraph, FromPlainBasic, FromMarkdown, and FromMarkdownBasic.
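Because FormatFunc is a function type, a method with a matching parameter and result list can be passed as a method value wherever a FormatFunc is expected. The sketch below shows that pattern with simplified stand-ins: `formatFunc`, `formatResult`, `formatter`, and its methods are hypothetical, drop the parseMention parameter, and only escape text, so they are not the package's types or behaviour.

```go
package main

import (
	"context"
	"fmt"
	"html"
)

// formatResult and formatFunc are simplified stand-ins for FormatResult
// and FormatFunc, reduced to what the example needs.
type formatResult struct{ HTML string }

type formatFunc func(ctx context.Context, authorID, statusID, text string) *formatResult

// formatter is a stand-in for Formatter with two toy methods.
type formatter struct{}

// fromPlain escapes text and wraps it in <p> tags.
func (f *formatter) fromPlain(ctx context.Context, authorID, statusID, text string) *formatResult {
	return &formatResult{HTML: "<p>" + html.EscapeString(text) + "</p>"}
}

// fromPlainNoParagraph escapes text without wrapping it.
func (f *formatter) fromPlainNoParagraph(ctx context.Context, authorID, statusID, text string) *formatResult {
	return &formatResult{HTML: html.EscapeString(text)}
}

// pickFormat returns one of the methods as a formatFunc value.
func pickFormat(f *formatter, wrap bool) formatFunc {
	if wrap {
		return f.fromPlain
	}
	return f.fromPlainNoParagraph
}

// render selects a formatFunc and calls it with placeholder IDs.
func render(wrap bool, text string) string {
	format := pickFormat(&formatter{}, wrap)
	return format(context.Background(), "author", "status", text).HTML
}

func main() {
	fmt.Println(render(true, "1 < 2"))
	fmt.Println(render(false, "1 < 2"))
}
```

Selecting the function once and calling it through the shared type keeps the calling code identical regardless of which formatting variant is in use.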

type FormatResult

type FormatResult struct {
	HTML     string
	Mentions []*gtsmodel.Mention
	Tags     []*gtsmodel.Tag
	Emojis   []*gtsmodel.Emoji
}

type Formatter

type Formatter struct {
	// contains filtered or unexported fields
}

Formatter wraps logic and functions for parsing statuses and other text input into HTML.

func NewFormatter

func NewFormatter(db db.DB) *Formatter

NewFormatter returns a new Formatter.

func (*Formatter) FromMarkdown

func (f *Formatter) FromMarkdown(
	ctx context.Context,
	parseMention gtsmodel.ParseMentionFunc,
	authorID string,
	statusID string,
	input string,
) *FormatResult

FromMarkdown fulfils FormatFunc by parsing the given markdown input into a FormatResult.

Inline (aka unsafe) HTML elements are allowed, as they should be sanitized afterwards anyway.

func (*Formatter) FromMarkdownBasic

func (f *Formatter) FromMarkdownBasic(
	ctx context.Context,
	parseMention gtsmodel.ParseMentionFunc,
	authorID string,
	statusID string,
	input string,
) *FormatResult

FromMarkdownBasic fulfils FormatFunc by parsing the given markdown input into a FormatResult.

Unlike FromMarkdown, it will only parse emojis with the custom renderer, leaving aside mentions and tags.

Inline (aka unsafe) HTML elements are not allowed.

If the result is a single paragraph, it will not be wrapped in <p> tags.

func (*Formatter) FromPlain

func (f *Formatter) FromPlain(
	ctx context.Context,
	parseMention gtsmodel.ParseMentionFunc,
	authorID string,
	statusID string,
	input string,
) *FormatResult

FromPlain fulfils FormatFunc by parsing the given plaintext input into a FormatResult.

func (*Formatter) FromPlainBasic

func (f *Formatter) FromPlainBasic(
	ctx context.Context,
	parseMention gtsmodel.ParseMentionFunc,
	authorID string,
	statusID string,
	input string,
) *FormatResult

FromPlainBasic fulfils FormatFunc by parsing the given plaintext input into a FormatResult.

Unlike FromPlain, it will only parse emojis with the custom renderer, leaving aside mentions and tags.

Resulting HTML will also NOT be wrapped in <p> tags.

func (*Formatter) FromPlainNoParagraph

func (f *Formatter) FromPlainNoParagraph(
	ctx context.Context,
	parseMention gtsmodel.ParseMentionFunc,
	authorID string,
	statusID string,
	input string,
) *FormatResult

FromPlainNoParagraph fulfils FormatFunc by parsing the given plaintext input into a FormatResult.

Unlike FromPlain, it will not wrap the resulting HTML in <p> tags, making it useful for parsing short fragments of text that oughtn't be formally wrapped as a paragraph.
