detection

package
v0.0.0-...-28d3146
Warning

This package is not in the latest version of its module.

Published: Feb 24, 2026 License: MIT Imports: 11 Imported by: 0

Documentation

Overview

Package detection provides multi-method code duplication detection.

This package coordinates multiple detection algorithms and combines their results to provide comprehensive duplicate reporting.

Detection Methods Supported:

  - DetectionMethodArtDupl: Suffix tree algorithm on AST tokens
  - DetectionMethodHash: Rolling hash on file content
  - DetectionMethodTodos: Finds TODO comments
  - DetectionMethodLegacy: Finds legacy code patterns
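To make the "rolling hash on file content" idea concrete, here is a minimal sketch of a polynomial rolling hash that reports repeated fixed-size windows. It is not the package's implementation: the function name findRepeats, the base constant, and the byte-window granularity are all assumptions chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// base is an arbitrary multiplier for the polynomial hash (assumption).
const base = 257

// findRepeats is a hypothetical helper. It returns the start offset of every
// window of length w whose bytes already occurred at an earlier offset.
// Hash arithmetic wraps modulo 2^64, so candidates are verified byte by byte.
func findRepeats(data []byte, w int) []int {
	if w <= 0 || len(data) < w {
		return nil
	}
	// pow = base^(w-1), the weight of the byte that leaves the window.
	var pow uint64 = 1
	for i := 1; i < w; i++ {
		pow *= base
	}
	// Hash of the first window.
	var h uint64
	for i := 0; i < w; i++ {
		h = h*base + uint64(data[i])
	}
	seen := map[uint64][]int{h: {0}}
	var repeats []int
	for i := 1; i+w <= len(data); i++ {
		// Slide by one byte: drop data[i-1], append data[i+w-1].
		h = (h-uint64(data[i-1])*pow)*base + uint64(data[i+w-1])
		for _, j := range seen[h] {
			if bytes.Equal(data[j:j+w], data[i:i+w]) {
				repeats = append(repeats, i)
				break
			}
		}
		seen[h] = append(seen[h], i)
	}
	return repeats
}

func main() {
	fmt.Println(findRepeats([]byte("abcXabc"), 3)) // prints [4]
}
```

Each slide updates the hash in constant time, which is what makes the approach attractive for scanning whole files; the explicit byte comparison guards against hash collisions.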

Core Type:

  - MultiDetector: Coordinates multiple detection methods

Usage:

// Create multi-detector with configuration
md := detection.NewMultiDetector(cfg, data, tree, verbose)

// Run all configured detection methods; matches is a <-chan syntax.Match to range over
matches := md.FindDuplOver(threshold)

Design:

  - Methods are configured via config.DetectionMethods
  - Results are combined and deduplicated
  - Verbose logging available for debugging
  - Respects config.IsDefault() for optimization
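The "combined and deduplicated" step can be pictured with a small sketch. The page does not show how the package identifies equal matches, so the Match struct below is a hypothetical stand-in for syntax.Match, and the choice of (File, Start, End) as the identity key is an assumption.

```go
package main

import "fmt"

// Match is a hypothetical stand-in for syntax.Match; the real type's
// fields are not shown in this documentation.
type Match struct {
	File       string
	Start, End int
}

// dedupe keeps the first occurrence of each distinct Match and preserves
// the original order. Match is comparable, so it can be used as a map key.
func dedupe(matches []Match) []Match {
	seen := make(map[Match]struct{}, len(matches))
	out := make([]Match, 0, len(matches))
	for _, m := range matches {
		if _, ok := seen[m]; ok {
			continue
		}
		seen[m] = struct{}{}
		out = append(out, m)
	}
	return out
}

func main() {
	combined := []Match{
		{"a.go", 10, 20}, // reported by one method
		{"b.go", 5, 9},
		{"a.go", 10, 20}, // same span reported by another method
	}
	fmt.Println(len(dedupe(combined))) // prints 2
}
```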

Performance:

  - Runs selected methods in parallel (goroutines)
  - Channels used for non-blocking result delivery
  - Each method runs independently, results combined at output
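The parallel pattern described above resembles a standard Go fan-in. The sketch below shows that general technique under stated assumptions: Match is a stand-in for syntax.Match, and merge, emit, and collect are hypothetical helpers, not functions exported by this package.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Match is a hypothetical stand-in for syntax.Match.
type Match struct {
	Method string
	Line   int
}

// merge fans several result channels into one and closes the output
// once every input channel has been drained.
func merge(inputs ...<-chan Match) <-chan Match {
	out := make(chan Match)
	var wg sync.WaitGroup
	wg.Add(len(inputs))
	for _, in := range inputs {
		go func(c <-chan Match) {
			defer wg.Done()
			for m := range c {
				out <- m
			}
		}(in)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// emit simulates one detection method streaming its results.
func emit(method string, lines ...int) <-chan Match {
	ch := make(chan Match)
	go func() {
		defer close(ch)
		for _, l := range lines {
			ch <- Match{Method: method, Line: l}
		}
	}()
	return ch
}

// collect drains a channel and sorts by line, because the arrival order
// of matches from concurrent methods is not deterministic.
func collect(ch <-chan Match) []Match {
	var all []Match
	for m := range ch {
		all = append(all, m)
	}
	sort.Slice(all, func(i, j int) bool { return all[i].Line < all[j].Line })
	return all
}

func main() {
	all := collect(merge(emit("hash", 3, 9), emit("todos", 5)))
	fmt.Println(all) // prints [{hash 3} {todos 5} {hash 9}]
}
```

Closing the merged channel only after the WaitGroup completes is what lets a caller range over the results without knowing how many methods were enabled.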

Types

type LegacyDetector

type LegacyDetector struct {
	// contains filtered or unexported fields
}

LegacyDetector finds legacy code patterns.

func NewLegacyDetector

func NewLegacyDetector() *LegacyDetector

NewLegacyDetector creates a new legacy detector with default patterns.

func (*LegacyDetector) FindLegacy

func (ld *LegacyDetector) FindLegacy(data []*syntax.Node) <-chan syntax.Match

FindLegacy finds all legacy patterns in the provided nodes and streams them over the returned channel.

type LegacyIssue

type LegacyIssue struct {
	Filename domain.Filepath      `json:"filename"`
	Line     domain.LineNumber    `json:"line"`
	Type     string               `json:"type"` // deprecated function, old pattern, etc.
	Message  string               `json:"message"`
	Severity domain.CloneSeverity `json:"severity"` // low, medium, high
}

LegacyIssue represents a legacy code pattern.

func (LegacyIssue) GetLine

func (l LegacyIssue) GetLine() domain.LineNumber

GetLine returns the line number for this issue (implements LineExtractor interface).

type LegacyPattern

type LegacyPattern struct {
	Type      string   `json:"type"`
	Message   string   `json:"message"`
	Severity  string   `json:"severity"`
	Functions []string `json:"functions,omitempty"` // Deprecated function names
	Patterns  []string `json:"patterns,omitempty"`  // Regex patterns for code
	Imports   []string `json:"imports,omitempty"`   // Deprecated import paths
}

LegacyPattern represents a pattern to detect legacy code.
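As an illustration of how the Patterns field might be applied, the sketch below compiles each regular expression and scans source text line by line. The LegacyPattern struct is copied from the declaration above; the finding type and the scan function are hypothetical, and the detector's real matching logic (which operates on syntax nodes) is not shown on this page.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// LegacyPattern mirrors the documented struct.
type LegacyPattern struct {
	Type      string   `json:"type"`
	Message   string   `json:"message"`
	Severity  string   `json:"severity"`
	Functions []string `json:"functions,omitempty"`
	Patterns  []string `json:"patterns,omitempty"`
	Imports   []string `json:"imports,omitempty"`
}

// finding is a hypothetical, simplified result type.
type finding struct {
	Line    int
	Type    string
	Message string
}

// scan is a hypothetical helper: it reports every source line that matches
// one of the regex Patterns. Line numbers are 1-based.
func scan(src string, patterns []LegacyPattern) ([]finding, error) {
	lines := strings.Split(src, "\n")
	var out []finding
	for _, p := range patterns {
		for _, expr := range p.Patterns {
			re, err := regexp.Compile(expr)
			if err != nil {
				return nil, fmt.Errorf("pattern %q: %w", expr, err)
			}
			for i, line := range lines {
				if re.MatchString(line) {
					out = append(out, finding{Line: i + 1, Type: p.Type, Message: p.Message})
				}
			}
		}
	}
	return out, nil
}

func main() {
	patterns := []LegacyPattern{{
		Type:     "deprecated-function",
		Message:  "ioutil.ReadAll is deprecated; use io.ReadAll",
		Severity: "medium",
		Patterns: []string{`\bioutil\.ReadAll\(`},
	}}
	src := "package p\n\nfunc f() {\n\tb, _ := ioutil.ReadAll(r)\n\t_ = b\n}\n"
	found, err := scan(src, patterns)
	fmt.Println(len(found), err) // prints 1 <nil>
}
```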

type LineExtractor

type LineExtractor interface {
	GetLine() domain.LineNumber
}

LineExtractor is an interface for issue types that have a line number. This allows findIssuesGeneric to work with any issue type without needing a separate line extraction function.
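The benefit of a shared GetLine method can be shown with a generic helper. findIssuesGeneric is unexported and its body is not shown here, so the sketch below uses a hypothetical sortByLine function instead; LineNumber, todoIssue, and legacyIssue are simplified stand-ins for domain.LineNumber, TodoIssue, and LegacyIssue. It assumes Go 1.18 or later for type parameters.

```go
package main

import (
	"fmt"
	"sort"
)

// LineNumber is a stand-in for domain.LineNumber (assumed integer-based).
type LineNumber int

// LineExtractor mirrors the documented interface.
type LineExtractor interface {
	GetLine() LineNumber
}

// todoIssue and legacyIssue are simplified stand-ins for the package's
// issue types.
type todoIssue struct {
	Line LineNumber
	Text string
}

func (t todoIssue) GetLine() LineNumber { return t.Line }

type legacyIssue struct {
	Line    LineNumber
	Message string
}

func (l legacyIssue) GetLine() LineNumber { return l.Line }

// sortByLine is a hypothetical helper. The LineExtractor constraint lets
// one function order any issue type without a per-type accessor.
func sortByLine[T LineExtractor](issues []T) {
	sort.SliceStable(issues, func(i, j int) bool {
		return issues[i].GetLine() < issues[j].GetLine()
	})
}

func main() {
	todos := []todoIssue{{30, "c"}, {4, "a"}, {12, "b"}}
	sortByLine(todos)
	fmt.Println(todos) // prints [{4 a} {12 b} {30 c}]

	legacy := []legacyIssue{{9, "old API"}, {2, "deprecated import"}}
	sortByLine(legacy)
	fmt.Println(legacy[0].Line) // prints 2
}
```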

type MultiDetector

type MultiDetector struct {
	// contains filtered or unexported fields
}

MultiDetector runs multiple detection methods and combines results.

func NewMultiDetector

func NewMultiDetector(cfg *config.Config, data []*syntax.Node, tree *suffixtree.STree, verbose bool) *MultiDetector

NewMultiDetector creates a new multi-method detector.

func (*MultiDetector) FindDuplOver

func (md *MultiDetector) FindDuplOver(threshold int) <-chan syntax.Match

FindDuplOver runs all configured detection methods and streams their combined matches over the returned channel.

type SimpleDetector

type SimpleDetector interface {
	// FindDuplOver finds all clones/sequences with size >= threshold.
	//
	// Parameters:
	// - threshold: Minimum size in tokens to consider as duplicate
	//
	// Returns:
	// - <-chan syntax.Match: Channel for streaming results
	//
	// Behavior:
	// - Channel is closed when all matches are sent
	// - Results are sent as they are found (non-blocking)
	// - Caller should range over channel to receive all matches
	FindDuplOver(threshold int) <-chan syntax.Match
}

SimpleDetector is a minimal interface for clone detection.

This interface matches the existing signature used by MultiDetector and HashDetector, allowing them to be used interchangeably without requiring changes to existing code.

Methods:

  • FindDuplOver(threshold int) <-chan syntax.Match
    Finds all clones/sequences with size >= threshold.
    Returns a channel for streaming results.

Usage:

// Can use any detector implementing this interface
var detector SimpleDetector
if useMultiDetector {
    detector = NewMultiDetector(...)
} else if useHashDetector {
    detector = NewHashDetector(...)
}

matches := detector.FindDuplOver(threshold)
for match := range matches {
    // Process matches
}
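For readers who want to see the channel contract end to end, here is a self-contained sketch of a type that satisfies an interface of the same shape. Match stands in for syntax.Match, and sliceDetector and countOver are hypothetical names that do not exist in this package.

```go
package main

import "fmt"

// Match is a hypothetical stand-in for syntax.Match.
type Match struct {
	Size int
}

// SimpleDetector has the same shape as the documented interface, with the
// stand-in Match type.
type SimpleDetector interface {
	FindDuplOver(threshold int) <-chan Match
}

// sliceDetector is a hypothetical detector that serves precomputed matches.
type sliceDetector struct {
	matches []Match
}

// FindDuplOver streams matches with Size >= threshold and closes the
// channel when done, so callers can range over it safely.
func (d sliceDetector) FindDuplOver(threshold int) <-chan Match {
	out := make(chan Match)
	go func() {
		defer close(out)
		for _, m := range d.matches {
			if m.Size >= threshold {
				out <- m
			}
		}
	}()
	return out
}

// countOver depends only on the interface, so any detector can be passed.
func countOver(d SimpleDetector, threshold int) int {
	n := 0
	for range d.FindDuplOver(threshold) {
		n++
	}
	return n
}

func main() {
	d := sliceDetector{matches: []Match{{10}, {25}, {40}}}
	fmt.Println(countOver(d, 20)) // prints 2
}
```

Because the producer closes the channel, the consumer must drain it fully (or the producer goroutine stays blocked on send); a real implementation that supports early exit would typically also accept a cancellation signal.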

type TodoDetector

type TodoDetector struct {
	// contains filtered or unexported fields
}

TodoDetector finds TODO comments in Go source code.

func NewTodoDetector

func NewTodoDetector() *TodoDetector

NewTodoDetector creates a new TODO detector.

func (*TodoDetector) FindTodos

func (td *TodoDetector) FindTodos(data []*syntax.Node) <-chan syntax.Match

FindTodos finds all TODO-style comments in the provided nodes.

type TodoIssue

type TodoIssue struct {
	Filename domain.Filepath   `json:"filename"`
	Line     domain.LineNumber `json:"line"`
	Text     string            `json:"text"`
	Type     string            `json:"type"`           //nolint:godox // TODO, FIXME, XXX, etc.
	Tags     []string          `json:"tags,omitempty"` // @username, date, etc.
}

TodoIssue represents a TODO comment found in code.
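The Type and Tags fields suggest that each comment is split into a marker keyword and optional @-style tags. The sketch below shows one way this could be done with regular expressions; the expressions, the parseTodo function, and the simplified TodoIssue struct (plain int line, no Filename) are assumptions, not the detector's actual rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// TodoIssue is a simplified stand-in for the documented struct.
type TodoIssue struct {
	Line int
	Text string
	Type string
	Tags []string
}

var (
	// todoRe captures the marker keyword and the remaining comment text.
	todoRe = regexp.MustCompile(`^//\s*(TODO|FIXME|XXX)\b[:\s]*(.*)$`)
	// tagRe captures @username-style tags anywhere in the comment.
	tagRe = regexp.MustCompile(`@\w+`)
)

// parseTodo is a hypothetical helper. It reports whether a single-line
// comment is a TODO-style comment and, if so, returns its parsed form.
func parseTodo(line int, comment string) (TodoIssue, bool) {
	m := todoRe.FindStringSubmatch(comment)
	if m == nil {
		return TodoIssue{}, false
	}
	return TodoIssue{
		Line: line,
		Type: m[1],
		Text: m[2],
		Tags: tagRe.FindAllString(comment, -1),
	}, true
}

func main() {
	issue, ok := parseTodo(12, "// FIXME: handle timeout @bob")
	fmt.Println(ok, issue.Type, issue.Text, issue.Tags)
	// prints true FIXME handle timeout @bob [@bob]

	_, ok = parseTodo(13, "// plain comment")
	fmt.Println(ok) // prints false
}
```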

func (TodoIssue) GetLine

func (t TodoIssue) GetLine() domain.LineNumber

GetLine returns the line number for this issue (implements LineExtractor interface).
