Documentation ¶
Constants ¶
const MaxScanTokenSize = 64 * 1024 * 1024
MaxScanTokenSize is the maximum size used to buffer a token. The actual maximum token size may be smaller as the buffer may need to include, for instance, a newline.
Variables ¶
var (
	ErrTooLong         = errors.New("mbox.Scanner: token too long")
	ErrNegativeAdvance = errors.New("mbox.Scanner: SplitFunc returns negative advance count")
	ErrAdvanceTooFar   = errors.New("mbox.Scanner: SplitFunc returns advance count beyond input")
)
Errors returned by Scanner.
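The error names above mirror those of the standard library's bufio.Scanner. The following is a minimal sketch of how a "token too long" condition can be detected, written against bufio only and assuming this package's Scanner reports ErrTooLong in an analogous way. The helper name scanWithLimit is hypothetical and not part of this package.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanWithLimit (hypothetical helper) scans lines from input with the token
// buffer capped at max bytes. It returns the tokens read before the scan
// stopped, plus any scan error.
func scanWithLimit(input string, max int) ([]string, error) {
	s := bufio.NewScanner(strings.NewReader(input))
	// Start with a small buffer so the cap is what limits token size.
	s.Buffer(make([]byte, 0, 16), max)
	var tokens []string
	for s.Scan() {
		tokens = append(tokens, s.Text())
	}
	// Err returns nil when the scan simply reached the end of the input.
	return tokens, s.Err()
}

func main() {
	input := "short\n" + strings.Repeat("x", 100) + "\n"
	tokens, err := scanWithLimit(input, 32)
	fmt.Println("tokens read before stopping:", len(tokens))
	if errors.Is(err, bufio.ErrTooLong) {
		fmt.Println("token exceeded the buffer limit:", err)
	}
}
```

With this package, the equivalent comparison would presumably be against mbox.ErrTooLong, with the limit raised through the Scanner's MaxTokenSize field.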
var ErrorUnexpectedEOF = fmt.Errorf("Expected separator line, Got EOF")
ErrorUnexpectedEOF signals that EOF was reached before an expected separator line.
Functions ¶
func FindSeparator ¶
FindSeparator returns the start index and length of the first separator line that complies with the RFC 4155 "default" format: `From <RFC 2822 "addr-spec"> <timestamp in UNIX ctime format><EOL marker>`. The returned idx is negative when no separator line is found.
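To illustrate the "start index and length" convention, here is a deliberately simplified sketch. It is not this package's implementation: the function findFromLine is hypothetical, it only looks for a newline-terminated line beginning with "From ", and it does not validate the address or the ctime timestamp as an RFC 4155 check would.

```go
package main

import (
	"bytes"
	"fmt"
)

// findFromLine is a hypothetical, simplified stand-in for a separator search.
// It returns the start index and the length (including the trailing '\n') of
// the first line that begins with "From ". idx is -1 when no such complete
// line exists. The address and timestamp are NOT validated.
func findFromLine(data []byte) (idx, size int) {
	prefix := []byte("From ")
	offset := 0
	for offset < len(data) {
		// Locate the end of the current line.
		nl := bytes.IndexByte(data[offset:], '\n')
		if nl < 0 {
			return -1, 0 // final line is unterminated: no complete separator
		}
		line := data[offset : offset+nl+1]
		if bytes.HasPrefix(line, prefix) {
			return offset, len(line)
		}
		offset += nl + 1
	}
	return -1, 0
}

func main() {
	data := []byte("junk\nFrom alice@example.com Thu Jan  1 00:00:00 1970\nSubject: hi\n")
	idx, size := findFromLine(data)
	if idx < 0 {
		fmt.Println("no separator line found")
		return
	}
	fmt.Printf("idx=%d size=%d line=%q\n", idx, size, data[idx:idx+size])
}
```

Returning both values lets the caller slice out the separator itself (data[idx:idx+size]) as well as everything before or after it.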
Types ¶
type Scanner ¶
type Scanner struct {
	MaxTokenSize int // Maximum size of a token; modified by tests.
	// contains filtered or unexported fields
}
Scanner provides an interface similar to bufio.Scanner, with the default SplitFunc set to SplitMessage.
Example ¶
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/mail"
	"os"

	"github.com/korylprince/mbox"
)

func main() {
	f, err := os.Open("/path/to/mbox")
	if err != nil {
		// do something with err
	}
	defer f.Close()

	s := mbox.NewScanner(f)
	s.MaxTokenSize = 1024 * 1024 * 1024 // 1 GiB max size, or whatever you want
	for s.Scan() {
		b := s.Bytes()

		// copy bytes to buffer, otherwise they will be overwritten
		buf := make([]byte, len(b))
		copy(buf, b)

		// read from the copy, not from the scanner's internal buffer
		r := bufio.NewReader(bytes.NewReader(buf))
		_, _, err = r.ReadLine() // read in mbox separator line
		if err != nil {
			// do something with err
		}

		msg, err := mail.ReadMessage(r)
		if err != nil {
			// do something with err
		}

		// do something with msg
		fmt.Println(msg.Header)
	}
	// check for a scan error once the loop has finished
	if err := s.Err(); err != nil {
		// do something with err
	}
}
func NewScanner ¶
NewScanner returns a new Scanner to read from r. The split function defaults to SplitMessage.
func (*Scanner) Bytes ¶
Bytes returns the most recent token generated by a call to Scan. The underlying array may point to data that will be overwritten by a subsequent call to Scan. It does no allocation.
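Because the returned slice may be overwritten by the next call to Scan, any token kept beyond the current iteration should be copied first. The sketch below shows that pattern with the standard library's bufio.Scanner, whose Bytes method carries the same caveat. The helper name copyTokens is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// copyTokens (hypothetical helper) scans input line by line and returns an
// independent copy of every token, so later calls to Scan cannot overwrite
// the retained data.
func copyTokens(input string) ([][]byte, error) {
	s := bufio.NewScanner(strings.NewReader(input))
	var out [][]byte
	for s.Scan() {
		// append to a nil slice allocates fresh storage and copies the bytes.
		tok := append([]byte(nil), s.Bytes()...)
		out = append(out, tok)
	}
	return out, s.Err()
}

func main() {
	toks, err := copyTokens("one\ntwo\nthree\n")
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	for _, t := range toks {
		fmt.Printf("%s\n", t)
	}
}
```

If a token is fully processed before the next call to Scan, the copy can be skipped, which avoids one allocation per token.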
func (*Scanner) Scan ¶
Scan advances the Scanner to the next token, which will then be available through the Bytes or Text method. It returns false when the scan stops, either by reaching the end of the input or an error. After Scan returns false, the Err method will return any error that occurred during scanning, except that if it was io.EOF, Err will return nil. Scan panics if the split function returns 100 empty tokens without advancing the input. This is a common error mode for scanners.
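The interaction between Scan, the split function, and Err can be sketched with bufio.Scanner and a custom split function. This is an illustration only, not this package's SplitMessage: the names splitOnFromLine and scanMessages are hypothetical, and the split rule is naive (any line starting with "From " begins a new token, with no RFC 4155 validation).

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// splitOnFromLine is a hypothetical bufio.SplitFunc. Each token runs from the
// current position up to, but not including, the next line that begins with
// "From ".
func splitOnFromLine(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if len(data) == 0 {
		return 0, nil, nil
	}
	if i := bytes.Index(data, []byte("\nFrom ")); i >= 0 {
		end := i + 1 // keep the '\n' that closes the current message
		return end, data[:end], nil
	}
	if atEOF {
		return len(data), data, nil // whatever remains is the final message
	}
	return 0, nil, nil // no boundary yet: ask the Scanner for more data
}

// scanMessages (hypothetical helper) returns every token produced for input.
func scanMessages(input string) ([]string, error) {
	s := bufio.NewScanner(strings.NewReader(input))
	s.Split(splitOnFromLine)
	var msgs []string
	for s.Scan() {
		msgs = append(msgs, s.Text())
	}
	// Err is checked once, after Scan has returned false.
	return msgs, s.Err()
}

func main() {
	msgs, err := scanMessages("From a\nbody\nFrom b\nbody2\n")
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	for i, m := range msgs {
		fmt.Printf("message %d: %q\n", i, m)
	}
}
```

Returning (0, nil, nil) asks for more input rather than emitting an empty token, so this split function does not run into the empty-token panic described above.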