Full Code of shixzie/nlp for AI

master 39fec05b9991 cached
12 files
57.2 KB
18.0k tokens
116 symbols
1 requests
Download .txt
Repository: shixzie/nlp
Branch: master
Commit: 39fec05b9991
Files: 12
Total size: 57.2 KB

Directory structure:
gitextract_2oze1ahc/

├── .gitignore
├── .travis.yml
├── Gopkg.toml
├── LICENSE
├── Makefile
├── README.md
├── benchmark_test.go
├── nlp.go
├── nlp_test.go
└── parser/
    ├── nlp.peg
    ├── parser.go
    └── parser_test.go

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# dependencies
vendor

================================================
FILE: .travis.yml
================================================
language: go

go:
  - 1.8.x
  - 1.9.x
  - tip

before_install:
  - go get -u github.com/golang/dep/cmd/dep
  - dep ensure

script:
  - go test -v -race -coverprofile=coverage.txt -covermode=atomic

after_success:
  - bash <(curl -s https://codecov.io/bash)

================================================
FILE: Gopkg.toml
================================================

required = [
    "github.com/mna/pigeon"
]

================================================
FILE: LICENSE
================================================
The MIT License (MIT)

Copyright (c) 2017 Juan Álvarez / @Shixzie

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


================================================
FILE: Makefile
================================================
help:
	@echo "deps   -> Get all dependencies"
	@echo "parser -> Generates the sample parser"
	@echo "tests  -> Run all tests"

deps:
	@go get -u github.com/golang/dep/cmd/dep
	@dep ensure

parser:
	@pigeon -o "./parser/parser.go" "./parser/nlp.peg"

tests:
	@go test -v -race ./...

================================================
FILE: README.md
================================================
[![GoDoc](https://godoc.org/github.com/shixzie/nlp?status.svg)](https://godoc.org/github.com/shixzie/nlp) 
[![Go Report Card](https://goreportcard.com/badge/github.com/shixzie/nlp)](https://goreportcard.com/report/github.com/shixzie/nlp)
[![Build Status](https://travis-ci.org/shixzie/nlp.svg?branch=master)](https://travis-ci.org/shixzie/nlp)
[![codecov](https://codecov.io/gh/shixzie/nlp/branch/master/graph/badge.svg)](https://codecov.io/gh/shixzie/nlp)


# nlp

> `nlp` is a general purpose any-lang Natural Language Processor that parses the data inside a text and returns a filled model

## Supported types
```go
int  int8  int16  int32  int64
uint uint8 uint16 uint32 uint64
float32 float64
string
time.Time
time.Duration
```

## Installation
```
// go1.8+ is required
go get -u github.com/shixzie/nlp
```


**Feel free to create PR's and open Issues :)**

## How it works

You will always begin by creating a NL type calling nlp.New(), the NL type is a 
Natural Language Processor that owns 3 funcs, RegisterModel(), Learn() and P().

### RegisterModel(i interface{}, samples []string, ops ...ModelOption) error

RegisterModel takes 3 parameters, an empty struct, a set of samples and some options for the model.

The empty struct lets nlp know all possible values inside the text, for example:
```go
type Song struct {
	Name        string // fields must be exported
	Artist      string
	ReleasedAt  time.Time
}
err := nl.RegisterModel(Song{}, someSamples, nlp.WithTimeFormat("2006"))
if err != nil {
	panic(err)
}
// ...
```

tells nlp that inside the text may be a Song.Name, a Song.Artist and a Song.ReleasedAt.

The samples are the key part about nlp, not just because they set the *limits*
between *keywords* but also because they will be used to choose which model 
use to handle an expression.

Samples must have a special syntax to set those *limits* and *keywords*.
```go
songSamples := []string{
	"play {Name} by {Artist}",
	"play {Name} from {Artist}",
	"play {Name}",
	"from {Artist} play {Name}",
	"play something from {ReleasedAt}",
}
```

In the example below, you can see we're reffering to the Name and Artist fields
of the `Song` type declared above, both `{Name}` and `{Artist}` are our *keywords* 
and yes! you guessed it! Everything between `play` and `by` will be treated as a
`{Name}`, and everything that's after `by` will be treated as an `{Artist}` meaning 
that `play` and `by` are our *limits*.
```
     limits
 ┌─────┴─────┐
┌┴─┐        ┌┴┐
play {Name} by  {Artist}
     └─┬──┘     └───┬──┘
       └──────┬─────┘
           keywords
```

Any character can be a *limit*, a `,` for example can be used as a limit.

*keywords* as well as *limits* are `CaseSensitive` so be sure to type them right.

**Note that putting 2 *keywords* together will cause that only 1 or none of them will be detected**

> *limits are important* - Me :3


### Learn() error

Learn maps all models samples to their respective models using the NaiveBayes 
algorithm based on those samples. `Learn()` also trains all registered models
so they're able to fit expressions in the future.

```go
// must call after all models are registrated and before calling nl.P()
err := nl.Learn() 
if err != nil {
	panic(err)
}
// ...
```

Once the algorithm has finished learning, we're now ready to start Processing 
those texts.

**Note that you must call NL.Learn() after all models are registrated and before calling NL.P()**

### P(expr string) interface{}

P first asks the trained algorithm which model should be used, once we get
the right *and already trained* model, we just make it fit the expression.

**Note that everything in the expression must be separated by a _space_ or _tab_**

When processing an expression, nlp searches for the *limits* inside that 
expression and evaluates which sample fits better the expression, it doesn't
matter if the text has `trash`. In this example:
```
     limits
 ┌─────┴─────┐
┌┴─┐        ┌┴┐
play {Name} by  {Artist}
     └─┬──┘     └───┬──┘
       └──────┬─────┘
           keywords
```

we have 2 *limits*, `play` and `by`, it doesn't matter if we had an expression 
*hello sir can you pleeeeeease play King by Lauren Aquilina*, since:
```
                                  limits
            trash              ┌────┴────┐
┌─────────────┴─────────────┐ ┌┴─┐      ┌┴┐
hello sir can you pleeeeeease play King by  Lauren Aquilina
                                   └┬─┘     └─────┬───────┘
                                 {Name}       {Artist}
                                 └─┬──┘       └───┬──┘
                                   └──────┬───────┘
                                       keywords
```

`{Name}` would be replaced with `King`, 
`{Artist}` would be replaced with `Lauren Aquilina`, 
`trash` would be ignored as well as the *limits* `play` and `by`, 
and then **a pointer to a filled struct with the type used to register the model** (`Song`) 
( `Song.Name` being `{Name}` and `Song.Artist` beign `{Artist}` ) 
**will be returned**.

## Usage

```go
type Song struct {
	Name       string
	Artist     string
	ReleasedAt time.Time
}

songSamples := []string{
	"play {Name} by {Artist}",
	"play {Name} from {Artist}",
	"play {Name}",
	"from {Artist} play {Name}",
	"play something from {ReleasedAt}",
}

nl := nlp.New()
err := nl.RegisterModel(Song{}, songSamples, nlp.WithTimeFormat("2006"))
if err != nil {
	panic(err)
}

err = nl.Learn() // you must call Learn after all models are registered and before calling P
if err != nil {
	panic(err)
}

// after learning you can call P the times you want
s := nl.P("hello sir can you pleeeeeease play King by Lauren Aquilina") 
if song, ok := s.(*Song); ok {
	fmt.Println("Success")
	fmt.Printf("%#v\n", song)
} else {
	fmt.Println("Failed")
}

// Prints
//
// Success
// &main.Song{Name: "King", Artist: "Lauren Aquilina"}
```


================================================
FILE: benchmark_test.go
================================================
package nlp

import (
	"testing"
	"time"
)

func BenchmarkNL_P(b *testing.B) {
	type T struct {
		String string
		Int    int
		Uint   uint
		Float  float32
		Time   time.Time
		Dur    time.Duration
	}

	tSamples := []string{
		"string {String}",
		"int {Int}",
		"uint {Uint}",
		"float {Float}",
		"time {Time}",
		"dur {Dur}",
		"string {String} int {Int}",
		"string {String} time {Time}",
	}

	nl := New()

	nl.RegisterModel(T{}, tSamples)

	err := nl.RegisterModel(T{}, tSamples)
	if err != nil {
		b.Error(err)
	}

	err = nl.Learn()
	if err != nil {
		b.Error(err)
	}

	tim, err := time.ParseInLocation("01-02-2006_3:04pm", "05-18-1999_6:42pm", time.Local)
	if err != nil {
		b.Error(err)
	}

	dur, err := time.ParseDuration("4h2m")
	if err != nil {
		b.Error(err)
	}

	cases := []struct {
		name       string
		expression string
		want       interface{}
	}{
		{
			"string",
			"string Hello World",
			"Hello World",
		},
		{
			"int",
			"int 42",
			int(42),
		},
		{
			"uint",
			"uint 43",
			uint(43),
		},
		{
			"float",
			"float 44",
			float32(44),
		},
		{
			"time",
			"time 05-18-1999_6:42pm",
			tim,
		},
		{
			"duration",
			"dur 4h2m",
			dur,
		},
		{
			"string int",
			"string Lmao int 42",
			&T{
				String: "Lmao",
				Int:    42,
			},
		},
		{
			"string time",
			"string What's Up Boy time 05-18-1999_6:42pm",
			&T{
				String: "What's Up Boy",
				Time:   tim,
			},
		},
	}

	for _, c := range cases {
		b.Run(c.name, func(b *testing.B) {
			nl.P(c.expression)
		})
	}
}


================================================
FILE: nlp.go
================================================
// Package nlp provides general purpose Natural Language Processing.
package nlp

import (
	"bytes"
	"errors"
	"fmt"
	"reflect"
	"strconv"
	"time"
	"unicode"

	"github.com/cdipaolo/goml/base"
	"github.com/cdipaolo/goml/text"
	"github.com/shixzie/nlp/parser"
)

// NL is a Natural Language Processor
type NL struct {
	models []*model
	naive  *text.NaiveBayes
	// Output contains the training output for the
	// NaiveBayes algorithm
	Output *bytes.Buffer
}

// New returns a *NL
func New() *NL { return &NL{Output: bytes.NewBufferString("")} }

// P proccesses the expr and returns one of
// the types passed as the i parameter to the RegistryModel
// func filled with the data inside expr
func (nl *NL) P(expr string) interface{} { return nl.models[nl.naive.Predict(expr)].fit(expr) }

// Learn maps the models samples to the models themselves and
// returns an error if something occurred while learning
func (nl *NL) Learn() error {
	if len(nl.models) > 0 {
		stream := make(chan base.TextDatapoint)
		errors := make(chan error)
		nl.naive = text.NewNaiveBayes(stream, uint8(len(nl.models)), base.OnlyWordsAndNumbers)
		nl.naive.Output = nl.Output
		go nl.naive.OnlineLearn(errors)
		for i := range nl.models {
			err := nl.models[i].learn()
			if err != nil {
				return fmt.Errorf("model#%d %v", i, err)
			}
			for _, s := range nl.models[i].samples {
				stream <- base.TextDatapoint{
					X: string(s),
					Y: uint8(i),
				}
			}
		}
		close(stream)
		for {
			err := <-errors
			if err != nil {
				return fmt.Errorf("error occurred while learning: %s", err)
			}
			// training is done!
			break
		}
		return nil
	}
	return fmt.Errorf("register at least one model before learning")
}

type model struct {
	tpy          reflect.Type
	fields       []field
	expected     [][]item
	samples      [][]byte
	timeFormat   string
	timeLocation *time.Location
}

type item struct {
	limit bool
	value []byte
	field field
}

type field struct {
	index int
	name  string
	kind  interface{}
}

// ModelOption is an option for a specific model
type ModelOption func(*model) error

// WithTimeFormat sets the format used in time.Parse(format, val),
// note that format can't contain any spaces, the default is 01-02-2006_3:04pm
func WithTimeFormat(format string) ModelOption {
	return func(m *model) error {
		for _, v := range format {
			if unicode.IsSpace(v) {
				return errors.New("time format can't contain any spaces")
			}
		}
		m.timeFormat = format
		return nil
	}
}

// WithTimeLocation sets the location used in time.ParseInLocation(format, value, loc),
// the default is time.Local
func WithTimeLocation(loc *time.Location) ModelOption {
	return func(m *model) error {
		if loc == nil {
			return errors.New("time location can't be nil")
		}
		m.timeLocation = loc
		return nil
	}
}

// RegisterModel registers a model i and creates possible patterns
// from samples, the default layout when parsing time is 01-02-2006_3:04pm
// and the default location is time.Local.
// Samples must have special formatting:
//
//	"play {Name} by {Artist}"
func (nl *NL) RegisterModel(i interface{}, samples []string, ops ...ModelOption) error {
	if i == nil {
		return fmt.Errorf("can't create model from nil value")
	}
	if len(samples) == 0 {
		return fmt.Errorf("samples can't be nil or empty")
	}
	tpy, val := reflect.TypeOf(i), reflect.ValueOf(i)
	if tpy.Kind() == reflect.Struct {
		mod := &model{
			tpy:          tpy,
			expected:     make([][]item, len(samples)),
			timeFormat:   "01-02-2006_3:04pm",
			timeLocation: time.Local,
		}
		mod.setSamples(samples)
		for _, op := range ops {
			err := op(mod)
			if err != nil {
				return err
			}
		}
	NextField:
		for i := 0; i < tpy.NumField(); i++ {
			if tpy.Field(i).Anonymous || tpy.Field(i).PkgPath != "" {
				continue NextField
			}
			if v, ok := val.Field(i).Interface().(time.Time); ok {
				mod.fields = append(mod.fields, field{i, tpy.Field(i).Name, v})
				continue NextField
			} else if v, ok := val.Field(i).Interface().(time.Duration); ok {
				mod.fields = append(mod.fields, field{i, tpy.Field(i).Name, v})
				continue NextField
			}
			switch val.Field(i).Kind() {
			case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64, reflect.Float32, reflect.Float64, reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.String:
				mod.fields = append(mod.fields, field{i, tpy.Field(i).Name, val.Field(i).Kind()})
			}
		}
		nl.models = append(nl.models, mod)
		return nil
	}
	return fmt.Errorf("can't create model from non-struct type")
}

func (m *model) learn() error {
	for sid, s := range m.samples {
		tokens, err := parser.ParseSample(sid, s)
		if err != nil {
			return err
		}
		var exps []item
		var hasAtLeastOneKey bool
		l := len(tokens)
		for i, tk := range tokens {
			if tk.Kw {
				hasAtLeastOneKey = true
				mistypedField := true
				for _, f := range m.fields {
					if string(tk.Val) == f.name {
						mistypedField = false
						exps = append(exps, item{field: f, value: tk.Val})
					}
				}
				if mistypedField {
					return fmt.Errorf("sample#%d: mistyped field %q", sid, tk.Val)
				}
			} else {
				if i+1 < l {
					if tokens[i+1].Kw {
						exps = append(exps, item{limit: true, value: tk.Val})
						continue
					}
				}
			}
		}
		if !hasAtLeastOneKey {
			return fmt.Errorf("sample#%d: need at least one keyword", sid)
		}
		m.expected[sid] = exps
	}
	return nil
}

func (m *model) selectBestSample(expr []byte) []item {
	// slice [sample_id]score
	scores := make([]int, len(m.samples))

	tokens, _ := parser.ParseSample(0, expr)

	mapping := make([][]item, len(m.samples))
	limitsOrder := make([][][]byte, len(m.samples)+1)

	for sid, exps := range m.expected {
		var currentVal [][]byte
		var reading bool
		var lastToken int
	expecteds:
		for _, e := range exps {
			// fmt.Printf("expecting: %s - limit: %v\n", e.value, e.limit)
			if e.limit {
				reading = false
				limitsOrder[sid+1] = append(limitsOrder[sid+1], e.value)
			} else {
				reading = true
			}
			// fmt.Printf("reading: %v\n", reading)
			for i := lastToken; i < len(tokens); i++ {
				t := tokens[i]
				// fmt.Printf("token: %s - isLimit: %v\n", t.Val, m.isLimit(t.Val, sid))
				if m.isLimit(t.Val, sid) {
					if sid == 0 {
						limitsOrder[0] = append(limitsOrder[0], t.Val)
					}
					scores[sid]++
					if len(currentVal) > 0 {
						// fmt.Printf("appending: %s {%v}\n", bytes.Join(currentVal, []byte{' '}), e.field.name)
						mapping[sid] = append(mapping[sid], item{field: e.field, value: bytes.Join(currentVal, []byte{' '})})
						currentVal = currentVal[:0]
						lastToken = i
						continue expecteds
					}
					lastToken = i + 1
					continue expecteds
				} else {
					if reading {
						// fmt.Printf("adding: %s\n", t.Val)
						currentVal = append(currentVal, t.Val)
					}
				}
			}
			if len(currentVal) > 0 {
				// fmt.Printf("appending: %s {%v}\n", bytes.Join(currentVal, []byte{' '}), e.field.name)
				mapping[sid] = append(mapping[sid], item{field: e.field, value: bytes.Join(currentVal, []byte{' '})})
			}
		}
		// fmt.Printf("\n\n")
	}
order:
	for i := 1; i < len(limitsOrder); i++ {
		if len(limitsOrder[0]) < len(limitsOrder[i]) {
			continue order
		}
		for j := range limitsOrder[i] {
			if !bytes.Equal(limitsOrder[i][j], limitsOrder[0][j]) {
				continue order
			}
		}
		scores[i-1]++
	}

	// fmt.Printf("orders: %s\n\n", limitsOrder)
	// fmt.Printf("scores: %v\n", scores)

	bestMapping := selectBestMapping(scores)
	if bestMapping == -1 {
		return nil
	}
	return mapping[bestMapping]
}

func selectBestMapping(scores []int) int {
	bestScore, bestMapping := -1, -1
	for id, score := range scores {
		if score > bestScore {
			bestScore = score
			bestMapping = id
		}
	}
	return bestMapping
}

func (m *model) fit(expr string) interface{} {
	val := reflect.New(m.tpy)
	if len(expr) == 0 {
		return val.Interface()
	}
	exps := m.selectBestSample([]byte(expr))
	if len(exps) > 0 {
		for _, e := range exps {
			switch t := e.field.kind.(type) {
			case reflect.Kind:
				switch t {
				case reflect.String:
					val.Elem().Field(e.field.index).SetString(string(e.value))
				case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
					v, _ := strconv.ParseUint(string(e.value), 10, 0)
					val.Elem().Field(e.field.index).SetUint(v)
				case reflect.Float32, reflect.Float64:
					v, _ := strconv.ParseFloat(string(e.value), 64)
					val.Elem().Field(e.field.index).SetFloat(v)
				case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
					v, _ := strconv.ParseInt(string(e.value), 10, 0)
					val.Elem().Field(e.field.index).SetInt(v)
				}
			case time.Time:
				v, _ := time.ParseInLocation(m.timeFormat, string(e.value), m.timeLocation)
				val.Elem().Field(e.field.index).Set(reflect.ValueOf(v))
			case time.Duration:
				v, _ := time.ParseDuration(string(e.value))
				val.Elem().Field(e.field.index).Set(reflect.ValueOf(v))
			}
		}
	}
	return val.Interface()
}

// isLimit returns true if s is a limit on expected[id]
func (m *model) isLimit(s []byte, id int) bool {
	for _, e := range m.expected[id] {
		if bytes.Equal(e.value, s) {
			return true
		}
	}
	return false
}

// setSample converts the []string samples to [][]byte
func (m *model) setSamples(samples []string) {
	for _, s := range samples {
		m.samples = append(m.samples, []byte(s))
	}
}


================================================
FILE: nlp_test.go
================================================
package nlp

import (
	"bytes"
	"reflect"
	"testing"
	"time"

	"github.com/cdipaolo/goml/text"
)

func failTest(t *testing.T, err error) {
	if err != nil {
		t.Error(err)
	}
}

func TestNL_P(t *testing.T) {
	type T struct {
		String string
		Int    int
		Uint   uint
		Float  float32
		Time   time.Time
		Dur    time.Duration
	}

	tSamples := []string{
		"string {String}",
		"int {Int}",
		"uint {Uint}",
		"float {Float}",
		"time {Time}",
		"dur {Dur}",
		"string {String} int {Int}",
		"string {String} time {Time}",
		"need {String} since {Time}",
	}

	nl := New()

	err := nl.RegisterModel(T{}, tSamples)
	failTest(t, err)

	err = nl.Learn()
	failTest(t, err)

	tim, err := time.ParseInLocation("01-02-2006_3:04pm", "05-18-1999_6:42pm", time.Local)
	failTest(t, err)

	dur, err := time.ParseDuration("4h2m")
	failTest(t, err)

	cases := []struct {
		name       string
		expression string
		want       *T
	}{
		0: {
			"string",
			"string Hello World",
			&T{String: "Hello World"},
		},
		1: {
			"int",
			"int 42",
			&T{Int: 42},
		},
		2: {
			"uint",
			"uint 43",
			&T{Uint: 43},
		},
		3: {
			"float",
			"float 44",
			&T{Float: 44},
		},
		4: {
			"time",
			"time 05-18-1999_6:42pm",
			&T{Time: tim},
		},
		5: {
			"duration",
			"dur 4h2m",
			&T{Dur: dur},
		},
		6: {
			"string int",
			"string Lmao int 42",
			&T{
				String: "Lmao",
				Int:    42,
			},
		},
		7: {
			"string time",
			"string What's Up Boy time 05-18-1999_6:42pm",
			&T{
				String: "What's Up Boy",
				Time:   tim,
			},
		},
		8: {
			"word string time",
			"Hi, I am Patrice, I need Issue#4 since 05-18-1999_6:42pm",
			&T{
				String: "Issue#4",
				Time:   tim,
			},
		},
	}
	for i, tt := range cases {
		t.Run(tt.name, func(t *testing.T) {
			if res := nl.P(tt.expression); !reflect.DeepEqual(res, tt.want) {
				t.Errorf("test#%d: got %v want %v", i, res, tt.want)
			}
		})
	}
}

func TestNL_RegisterModel(t *testing.T) {
	type fields struct {
		models []*model
		naive  *text.NaiveBayes
		Output *bytes.Buffer
	}
	type args struct {
		i       interface{}
		samples []string
		ops     []ModelOption
	}
	type T struct {
		unexported int
		Time       time.Time
	}
	tests := []struct {
		name    string
		fields  fields
		args    args
		wantErr bool
	}{
		{
			"nil struct",
			fields{},
			args{nil, nil, nil},
			true,
		},
		{
			"nil samples",
			fields{},
			args{args{}, nil, nil},
			true,
		},
		{
			"non-struct",
			fields{},
			args{[]int{}, []string{""}, nil},
			true,
		},
		{
			"unexported & time.Time",
			fields{},
			args{T{}, []string{""}, nil},
			false,
		},
		{
			"options",
			fields{},
			args{T{}, []string{""}, []ModelOption{
				WithTimeFormat("02-01-2006"),
				WithTimeLocation(time.Local),
			}},
			false,
		},
	}
	for i, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			nl := &NL{
				models: tt.fields.models,
				naive:  tt.fields.naive,
				Output: tt.fields.Output,
			}
			if err := nl.RegisterModel(tt.args.i, tt.args.samples, tt.args.ops...); (err != nil) != tt.wantErr {
				t.Errorf("[%d] NL.RegisterModel() error = %v, wantErr %v", i, err, tt.wantErr)
			}
		})
	}
}

func TestNL_Learn(t *testing.T) {
	type fields struct {
		models []*model
		naive  *text.NaiveBayes
		Output *bytes.Buffer
	}
	type T struct {
		Name string
	}
	tests := []struct {
		name    string
		fields  fields
		wantErr bool
	}{
		{
			"no models",
			fields{},
			true,
		},
		{
			"empty model sample",
			fields{
				models: []*model{
					{
						samples: [][]byte{{}},
					},
				},
				Output: bytes.NewBufferString(""),
			},
			true,
		},
		{
			"mistyped field",
			fields{
				models: []*model{
					{
						samples: [][]byte{[]byte("Hello {Namee}")},
					},
				},
				Output: bytes.NewBufferString(""),
			},
			true,
		},
		{
			"sample with no keys",
			fields{
				models: []*model{
					{
						samples: [][]byte{[]byte("Hello")},
					},
				},
				Output: bytes.NewBufferString(""),
			},
			true,
		},
	}
	for i, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			nl := &NL{
				models: tt.fields.models,
				naive:  tt.fields.naive,
				Output: tt.fields.Output,
			}
			if err := nl.Learn(); (err != nil) != tt.wantErr {
				t.Errorf("[%d] NL.Learn() error = %v, wantErr %v", i, err, tt.wantErr)
			}
		})
	}
}

func TestWithTimeFormat(t *testing.T) {
	type args struct {
		format string
		m      *model
	}
	tests := []struct {
		name    string
		args    args
		wantErr bool
	}{
		{
			"invalid format",
			args{"2006 01 02", &model{}},
			true,
		},
		{
			"valid format",
			args{"2006", &model{}},
			false,
		},
	}
	for i, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			op := WithTimeFormat(tt.args.format)
			if err := op(tt.args.m); (err != nil) != tt.wantErr {
				t.Errorf("[%d] WithTimeFormat() error = %v, wantErr %v", i, err, tt.wantErr)
			}
		})
	}
}

func TestWithTimeLocation(t *testing.T) {
	type args struct {
		loc *time.Location
		m   *model
	}
	tests := []struct {
		name    string
		args    args
		wantErr bool
	}{
		{
			"invalid location",
			args{nil, &model{}},
			true,
		},
		{
			"valid format",
			args{time.Local, &model{}},
			false,
		},
	}
	for i, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			op := WithTimeLocation(tt.args.loc)
			if err := op(tt.args.m); (err != nil) != tt.wantErr {
				t.Errorf("[%d] WithTimeFormat() error = %v, wantErr %v", i, err, tt.wantErr)
			}
		})
	}
}


================================================
FILE: parser/nlp.peg
================================================
{
// Package parser contains the sample parser for nlp
package parser

import "fmt"
import "errors"

// Token is a sample token
type Token struct {
    Kw bool
    Val []byte
}

// ParseSample will return the tokens within the sample
func ParseSample(sampleID int, sample []byte) ([]Token, error) {
    samplename := fmt.Sprintf("sample#%d", sampleID)
    tokens, err := Parse(samplename, sample)
    var errs errList
    if err != nil {
        list := err.(errList)
        for _, err := range list {
            pe := err.(*parserError)
            errs.add(fmt.Errorf("%s: %v", samplename, pe.Inner))
        }
        return nil, errs
    }
    return tokens.([]Token), nil
}

}

Sample "sample"
= vs:(Identifier / Keyword / Spacing)* {
    if len(vs.([]interface{})) == 0 {
        return nil, errors.New("empty sample")
    }
    var tokens []Token
    for _, v := range vs.([]interface{}) {
        switch tk := v.(type) {
        case Token:
            tokens = append(tokens, tk)
        default:
        }
    }
    return tokens, nil
}

Keyword "keyword"
= '{' Spacing+ v:Identifier '}' {
    return Token{Kw: true, Val: v.(Token).Val}, nil
}
/ '{' v:Identifier Spacing+ '}' {
    return Token{Kw: true, Val: v.(Token).Val}, nil
}
/ '{' Spacing+ v:Identifier Spacing+ '}' {
    return Token{Kw: true, Val: v.(Token).Val}, nil
}
/ '{' v:Identifier '}' {
    return Token{Kw: true, Val: v.(Token).Val}, nil
}


Punct "punct"
= [^a-zA-Z0-9{} ]+ {
    return Token{Val: c.text}, nil
}


Identifier "identifier"
= Punct / [^{} \t\r\n]+ {
    return Token{Val: c.text}, nil
}

Spacing "spacing"
= Space+ / _+

Space "Space"
= ' '

_ "whitespace"
= [\t\r\n]


================================================
FILE: parser/parser.go
================================================
// Package parser contains the sample parser for nlp
package parser

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"io/ioutil"
	"math"
	"os"
	"sort"
	"strings"
	"unicode"
	"unicode/utf8"
)

// Token is a sample token
type Token struct {
	Kw  bool
	Val []byte
}

// ParseSample will return the tokens within the sample
func ParseSample(sampleID int, sample []byte) ([]Token, error) {
	samplename := fmt.Sprintf("sample#%d", sampleID)
	tokens, err := Parse(samplename, sample)
	var errs errList
	if err != nil {
		list := err.(errList)
		for _, err := range list {
			pe := err.(*parserError)
			errs.add(fmt.Errorf("%s: %v", samplename, pe.Inner))
		}
		return nil, errs
	}
	return tokens.([]Token), nil
}

var g = &grammar{
	rules: []*rule{
		{
			name:        "Sample",
			displayName: "\"sample\"",
			pos:         position{line: 32, col: 1, offset: 685},
			expr: &actionExpr{
				pos: position{line: 33, col: 3, offset: 703},
				run: (*parser).callonSample1,
				expr: &labeledExpr{
					pos:   position{line: 33, col: 3, offset: 703},
					label: "vs",
					expr: &zeroOrMoreExpr{
						pos: position{line: 33, col: 6, offset: 706},
						expr: &choiceExpr{
							pos: position{line: 33, col: 7, offset: 707},
							alternatives: []interface{}{
								&ruleRefExpr{
									pos:  position{line: 33, col: 7, offset: 707},
									name: "Identifier",
								},
								&ruleRefExpr{
									pos:  position{line: 33, col: 20, offset: 720},
									name: "Keyword",
								},
								&ruleRefExpr{
									pos:  position{line: 33, col: 30, offset: 730},
									name: "Spacing",
								},
							},
						},
					},
				},
			},
		},
		{
			name:        "Keyword",
			displayName: "\"keyword\"",
			pos:         position{line: 48, col: 1, offset: 1050},
			expr: &choiceExpr{
				pos: position{line: 49, col: 3, offset: 1070},
				alternatives: []interface{}{
					&actionExpr{
						pos: position{line: 49, col: 3, offset: 1070},
						run: (*parser).callonKeyword2,
						expr: &seqExpr{
							pos: position{line: 49, col: 3, offset: 1070},
							exprs: []interface{}{
								&litMatcher{
									pos:        position{line: 49, col: 3, offset: 1070},
									val:        "{",
									ignoreCase: false,
								},
								&oneOrMoreExpr{
									pos: position{line: 49, col: 7, offset: 1074},
									expr: &ruleRefExpr{
										pos:  position{line: 49, col: 7, offset: 1074},
										name: "Spacing",
									},
								},
								&labeledExpr{
									pos:   position{line: 49, col: 16, offset: 1083},
									label: "v",
									expr: &ruleRefExpr{
										pos:  position{line: 49, col: 18, offset: 1085},
										name: "Identifier",
									},
								},
								&litMatcher{
									pos:        position{line: 49, col: 29, offset: 1096},
									val:        "}",
									ignoreCase: false,
								},
							},
						},
					},
					&actionExpr{
						pos: position{line: 52, col: 3, offset: 1158},
						run: (*parser).callonKeyword10,
						expr: &seqExpr{
							pos: position{line: 52, col: 3, offset: 1158},
							exprs: []interface{}{
								&litMatcher{
									pos:        position{line: 52, col: 3, offset: 1158},
									val:        "{",
									ignoreCase: false,
								},
								&labeledExpr{
									pos:   position{line: 52, col: 7, offset: 1162},
									label: "v",
									expr: &ruleRefExpr{
										pos:  position{line: 52, col: 9, offset: 1164},
										name: "Identifier",
									},
								},
								&oneOrMoreExpr{
									pos: position{line: 52, col: 20, offset: 1175},
									expr: &ruleRefExpr{
										pos:  position{line: 52, col: 20, offset: 1175},
										name: "Spacing",
									},
								},
								&litMatcher{
									pos:        position{line: 52, col: 29, offset: 1184},
									val:        "}",
									ignoreCase: false,
								},
							},
						},
					},
					&actionExpr{
						pos: position{line: 55, col: 3, offset: 1246},
						run: (*parser).callonKeyword18,
						expr: &seqExpr{
							pos: position{line: 55, col: 3, offset: 1246},
							exprs: []interface{}{
								&litMatcher{
									pos:        position{line: 55, col: 3, offset: 1246},
									val:        "{",
									ignoreCase: false,
								},
								&oneOrMoreExpr{
									pos: position{line: 55, col: 7, offset: 1250},
									expr: &ruleRefExpr{
										pos:  position{line: 55, col: 7, offset: 1250},
										name: "Spacing",
									},
								},
								&labeledExpr{
									pos:   position{line: 55, col: 16, offset: 1259},
									label: "v",
									expr: &ruleRefExpr{
										pos:  position{line: 55, col: 18, offset: 1261},
										name: "Identifier",
									},
								},
								&oneOrMoreExpr{
									pos: position{line: 55, col: 29, offset: 1272},
									expr: &ruleRefExpr{
										pos:  position{line: 55, col: 29, offset: 1272},
										name: "Spacing",
									},
								},
								&litMatcher{
									pos:        position{line: 55, col: 38, offset: 1281},
									val:        "}",
									ignoreCase: false,
								},
							},
						},
					},
					&actionExpr{
						pos: position{line: 58, col: 3, offset: 1343},
						run: (*parser).callonKeyword28,
						expr: &seqExpr{
							pos: position{line: 58, col: 3, offset: 1343},
							exprs: []interface{}{
								&litMatcher{
									pos:        position{line: 58, col: 3, offset: 1343},
									val:        "{",
									ignoreCase: false,
								},
								&labeledExpr{
									pos:   position{line: 58, col: 7, offset: 1347},
									label: "v",
									expr: &ruleRefExpr{
										pos:  position{line: 58, col: 9, offset: 1349},
										name: "Identifier",
									},
								},
								&litMatcher{
									pos:        position{line: 58, col: 20, offset: 1360},
									val:        "}",
									ignoreCase: false,
								},
							},
						},
					},
				},
			},
		},
		{
			name:        "Punct",
			displayName: "\"punct\"",
			pos:         position{line: 63, col: 1, offset: 1422},
			expr: &actionExpr{
				pos: position{line: 64, col: 3, offset: 1438},
				run: (*parser).callonPunct1,
				expr: &oneOrMoreExpr{
					pos: position{line: 64, col: 3, offset: 1438},
					expr: &charClassMatcher{
						pos:        position{line: 64, col: 3, offset: 1438},
						val:        "[^a-zA-Z0-9{} ]",
						chars:      []rune{'{', '}', ' '},
						ranges:     []rune{'a', 'z', 'A', 'Z', '0', '9'},
						ignoreCase: false,
						inverted:   true,
					},
				},
			},
		},
		{
			name:        "Identifier",
			displayName: "\"identifier\"",
			pos:         position{line: 69, col: 1, offset: 1496},
			expr: &choiceExpr{
				pos: position{line: 70, col: 3, offset: 1522},
				alternatives: []interface{}{
					&ruleRefExpr{
						pos:  position{line: 70, col: 3, offset: 1522},
						name: "Punct",
					},
					&actionExpr{
						pos: position{line: 70, col: 11, offset: 1530},
						run: (*parser).callonIdentifier3,
						expr: &oneOrMoreExpr{
							pos: position{line: 70, col: 11, offset: 1530},
							expr: &charClassMatcher{
								pos:        position{line: 70, col: 11, offset: 1530},
								val:        "[^{} \\t\\r\\n]",
								chars:      []rune{'{', '}', ' ', '\t', '\r', '\n'},
								ignoreCase: false,
								inverted:   true,
							},
						},
					},
				},
			},
		},
		{
			name:        "Spacing",
			displayName: "\"spacing\"",
			pos:         position{line: 74, col: 1, offset: 1584},
			expr: &choiceExpr{
				pos: position{line: 75, col: 3, offset: 1604},
				alternatives: []interface{}{
					&oneOrMoreExpr{
						pos: position{line: 75, col: 3, offset: 1604},
						expr: &ruleRefExpr{
							pos:  position{line: 75, col: 3, offset: 1604},
							name: "Space",
						},
					},
					&oneOrMoreExpr{
						pos: position{line: 75, col: 12, offset: 1613},
						expr: &ruleRefExpr{
							pos:  position{line: 75, col: 12, offset: 1613},
							name: "_",
						},
					},
				},
			},
		},
		{
			name:        "Space",
			displayName: "\"Space\"",
			pos:         position{line: 77, col: 1, offset: 1617},
			expr: &litMatcher{
				pos:        position{line: 78, col: 3, offset: 1633},
				val:        " ",
				ignoreCase: false,
			},
		},
		{
			name:        "_",
			displayName: "\"whitespace\"",
			pos:         position{line: 80, col: 1, offset: 1638},
			expr: &charClassMatcher{
				pos:        position{line: 81, col: 3, offset: 1655},
				val:        "[\\t\\r\\n]",
				chars:      []rune{'\t', '\r', '\n'},
				ignoreCase: false,
				inverted:   false,
			},
		},
	},
}

func (c *current) onSample1(vs interface{}) (interface{}, error) {
	if len(vs.([]interface{})) == 0 {
		return nil, errors.New("empty sample")
	}
	var tokens []Token
	for _, v := range vs.([]interface{}) {
		switch tk := v.(type) {
		case Token:
			tokens = append(tokens, tk)
		default:
		}
	}
	return tokens, nil
}

func (p *parser) callonSample1() (interface{}, error) {
	stack := p.vstack[len(p.vstack)-1]
	_ = stack
	return p.cur.onSample1(stack["vs"])
}

func (c *current) onKeyword2(v interface{}) (interface{}, error) {
	return Token{Kw: true, Val: v.(Token).Val}, nil
}

func (p *parser) callonKeyword2() (interface{}, error) {
	stack := p.vstack[len(p.vstack)-1]
	_ = stack
	return p.cur.onKeyword2(stack["v"])
}

func (c *current) onKeyword10(v interface{}) (interface{}, error) {
	return Token{Kw: true, Val: v.(Token).Val}, nil
}

func (p *parser) callonKeyword10() (interface{}, error) {
	stack := p.vstack[len(p.vstack)-1]
	_ = stack
	return p.cur.onKeyword10(stack["v"])
}

func (c *current) onKeyword18(v interface{}) (interface{}, error) {
	return Token{Kw: true, Val: v.(Token).Val}, nil
}

func (p *parser) callonKeyword18() (interface{}, error) {
	stack := p.vstack[len(p.vstack)-1]
	_ = stack
	return p.cur.onKeyword18(stack["v"])
}

func (c *current) onKeyword28(v interface{}) (interface{}, error) {
	return Token{Kw: true, Val: v.(Token).Val}, nil
}

func (p *parser) callonKeyword28() (interface{}, error) {
	stack := p.vstack[len(p.vstack)-1]
	_ = stack
	return p.cur.onKeyword28(stack["v"])
}

func (c *current) onPunct1() (interface{}, error) {
	return Token{Val: c.text}, nil
}

func (p *parser) callonPunct1() (interface{}, error) {
	stack := p.vstack[len(p.vstack)-1]
	_ = stack
	return p.cur.onPunct1()
}

func (c *current) onIdentifier3() (interface{}, error) {
	return Token{Val: c.text}, nil
}

func (p *parser) callonIdentifier3() (interface{}, error) {
	stack := p.vstack[len(p.vstack)-1]
	_ = stack
	return p.cur.onIdentifier3()
}

var (
	// errNoRule is returned when the grammar to parse has no rule.
	errNoRule = errors.New("grammar has no rule")

	// errInvalidEncoding is returned when the source is not properly
	// utf8-encoded.
	errInvalidEncoding = errors.New("invalid encoding")

	// errMaxExprCnt is used to signal that the maximum number of
	// expressions have been parsed.
	errMaxExprCnt = errors.New("max number of expresssions parsed")
)

// Option is a function that can set an option on the parser. It returns
// the previous setting as an Option.
type Option func(*parser) Option

// MaxExpressions creates an Option to stop parsing after the provided
// number of expressions have been parsed, if the value is 0 then the parser will
// parse for as many steps as needed (possibly an infinite number).
//
// The default for maxExprCnt is 0.
func MaxExpressions(maxExprCnt uint64) Option {
	return func(p *parser) Option {
		oldMaxExprCnt := p.maxExprCnt
		p.maxExprCnt = maxExprCnt
		return MaxExpressions(oldMaxExprCnt)
	}
}

// Debug creates an Option to set the debug flag to b. When set to true,
// debugging information is printed to stdout while parsing.
//
// The default is false.
func Debug(b bool) Option {
	return func(p *parser) Option {
		old := p.debug
		p.debug = b
		return Debug(old)
	}
}

// Memoize creates an Option to set the memoize flag to b. When set to true,
// the parser will cache all results so each expression is evaluated only
// once. This guarantees linear parsing time even for pathological cases,
// at the expense of more memory and slower times for typical cases.
//
// The default is false.
func Memoize(b bool) Option {
	return func(p *parser) Option {
		old := p.memoize
		p.memoize = b
		return Memoize(old)
	}
}

// Recover creates an Option to set the recover flag to b. When set to
// true, this causes the parser to recover from panics and convert it
// to an error. Setting it to false can be useful while debugging to
// access the full stack trace.
//
// The default is true.
func Recover(b bool) Option {
	return func(p *parser) Option {
		old := p.recover
		p.recover = b
		return Recover(old)
	}
}

// GlobalStore creates an Option to set a key to a certain value in
// the globalStore.
func GlobalStore(key string, value interface{}) Option {
	return func(p *parser) Option {
		old := p.cur.globalStore[key]
		p.cur.globalStore[key] = value
		return GlobalStore(key, old)
	}
}

// ParseFile parses the file identified by filename.
func ParseFile(filename string, opts ...Option) (i interface{}, err error) {
	f, err := os.Open(filename)
	if err != nil {
		return nil, err
	}
	defer func() {
		if closeErr := f.Close(); closeErr != nil {
			err = closeErr
		}
	}()
	return ParseReader(filename, f, opts...)
}

// ParseReader parses the data from r using filename as information in the
// error messages.
func ParseReader(filename string, r io.Reader, opts ...Option) (interface{}, error) {
	b, err := ioutil.ReadAll(r)
	if err != nil {
		return nil, err
	}

	return Parse(filename, b, opts...)
}

// Parse parses the data from b using filename as information in the
// error messages.
func Parse(filename string, b []byte, opts ...Option) (interface{}, error) {
	return newParser(filename, b, opts...).parse(g)
}

// position records a position in the text.
type position struct {
	line, col, offset int
}

func (p position) String() string {
	return fmt.Sprintf("%d:%d [%d]", p.line, p.col, p.offset)
}

// savepoint stores all state required to go back to this point in the
// parser.
type savepoint struct {
	position
	rn rune
	w  int
}

type current struct {
	pos  position // start position of the match
	text []byte   // raw text of the match

	// the globalStore allows the parser to store arbitrary values
	globalStore map[string]interface{}
}

// the AST types...

type grammar struct {
	pos   position
	rules []*rule
}

type rule struct {
	pos         position
	name        string
	displayName string
	expr        interface{}
}

type choiceExpr struct {
	pos          position
	alternatives []interface{}
}

type actionExpr struct {
	pos  position
	expr interface{}
	run  func(*parser) (interface{}, error)
}

type seqExpr struct {
	pos   position
	exprs []interface{}
}

type labeledExpr struct {
	pos   position
	label string
	expr  interface{}
}

type expr struct {
	pos  position
	expr interface{}
}

type andExpr expr
type notExpr expr
type zeroOrOneExpr expr
type zeroOrMoreExpr expr
type oneOrMoreExpr expr

type ruleRefExpr struct {
	pos  position
	name string
}

type andCodeExpr struct {
	pos position
	run func(*parser) (bool, error)
}

type notCodeExpr struct {
	pos position
	run func(*parser) (bool, error)
}

type litMatcher struct {
	pos        position
	val        string
	ignoreCase bool
}

type charClassMatcher struct {
	pos             position
	val             string
	basicLatinChars [128]bool
	chars           []rune
	ranges          []rune
	classes         []*unicode.RangeTable
	ignoreCase      bool
	inverted        bool
}

type anyMatcher position

// errList cumulates the errors found by the parser.
type errList []error

func (e *errList) add(err error) {
	*e = append(*e, err)
}

func (e errList) err() error {
	if len(e) == 0 {
		return nil
	}
	e.dedupe()
	return e
}

func (e *errList) dedupe() {
	var cleaned []error
	set := make(map[string]bool)
	for _, err := range *e {
		if msg := err.Error(); !set[msg] {
			set[msg] = true
			cleaned = append(cleaned, err)
		}
	}
	*e = cleaned
}

func (e errList) Error() string {
	switch len(e) {
	case 0:
		return ""
	case 1:
		return e[0].Error()
	default:
		var buf bytes.Buffer

		for i, err := range e {
			if i > 0 {
				buf.WriteRune('\n')
			}
			buf.WriteString(err.Error())
		}
		return buf.String()
	}
}

// parserError wraps an error with a prefix indicating the rule in which
// the error occurred. The original error is stored in the Inner field.
type parserError struct {
	Inner    error
	pos      position
	prefix   string
	expected []string
}

// Error returns the error message.
func (p *parserError) Error() string {
	return p.prefix + ": " + p.Inner.Error()
}

// newParser creates a parser with the specified input source and options.
func newParser(filename string, b []byte, opts ...Option) *parser {
	p := &parser{
		filename: filename,
		errs:     new(errList),
		data:     b,
		pt:       savepoint{position: position{line: 1}},
		recover:  true,
		cur: current{
			globalStore: make(map[string]interface{}),
		},
		maxFailPos:      position{col: 1, line: 1},
		maxFailExpected: make([]string, 0, 20),
	}
	p.setOptions(opts)

	if p.maxExprCnt == 0 {
		p.maxExprCnt = math.MaxUint64
	}

	return p
}

// setOptions applies the options to the parser.
func (p *parser) setOptions(opts []Option) {
	for _, opt := range opts {
		opt(p)
	}
}

type resultTuple struct {
	v   interface{}
	b   bool
	end savepoint
}

type parser struct {
	filename string
	pt       savepoint
	cur      current

	data []byte
	errs *errList

	depth   int
	recover bool
	debug   bool

	memoize bool
	// memoization table for the packrat algorithm:
	// map[offset in source] map[expression or rule] {value, match}
	memo map[int]map[interface{}]resultTuple

	// rules table, maps the rule identifier to the rule node
	rules map[string]*rule
	// variables stack, map of label to value
	vstack []map[string]interface{}
	// rule stack, allows identification of the current rule in errors
	rstack []*rule

	// parse fail
	maxFailPos            position
	maxFailExpected       []string
	maxFailInvertExpected bool

	// stats and used for stopping the parser
	// after a maximum number of expressions are parsed
	exprCnt uint64

	// max number of expressions to be parsed
	maxExprCnt uint64
}

// push a variable set on the vstack.
func (p *parser) pushV() {
	if cap(p.vstack) == len(p.vstack) {
		// create new empty slot in the stack
		p.vstack = append(p.vstack, nil)
	} else {
		// slice to 1 more
		p.vstack = p.vstack[:len(p.vstack)+1]
	}

	// get the last args set
	m := p.vstack[len(p.vstack)-1]
	if m != nil && len(m) == 0 {
		// empty map, all good
		return
	}

	m = make(map[string]interface{})
	p.vstack[len(p.vstack)-1] = m
}

// pop a variable set from the vstack.
func (p *parser) popV() {
	// if the map is not empty, clear it
	m := p.vstack[len(p.vstack)-1]
	if len(m) > 0 {
		// GC that map
		p.vstack[len(p.vstack)-1] = nil
	}
	p.vstack = p.vstack[:len(p.vstack)-1]
}

func (p *parser) print(prefix, s string) string {
	if !p.debug {
		return s
	}

	fmt.Printf("%s %d:%d:%d: %s [%#U]\n",
		prefix, p.pt.line, p.pt.col, p.pt.offset, s, p.pt.rn)
	return s
}

func (p *parser) in(s string) string {
	p.depth++
	return p.print(strings.Repeat(" ", p.depth)+">", s)
}

func (p *parser) out(s string) string {
	p.depth--
	return p.print(strings.Repeat(" ", p.depth)+"<", s)
}

func (p *parser) addErr(err error) {
	p.addErrAt(err, p.pt.position, []string{})
}

func (p *parser) addErrAt(err error, pos position, expected []string) {
	var buf bytes.Buffer
	if p.filename != "" {
		buf.WriteString(p.filename)
	}
	if buf.Len() > 0 {
		buf.WriteString(":")
	}
	buf.WriteString(fmt.Sprintf("%d:%d (%d)", pos.line, pos.col, pos.offset))
	if len(p.rstack) > 0 {
		if buf.Len() > 0 {
			buf.WriteString(": ")
		}
		rule := p.rstack[len(p.rstack)-1]
		if rule.displayName != "" {
			buf.WriteString("rule " + rule.displayName)
		} else {
			buf.WriteString("rule " + rule.name)
		}
	}
	pe := &parserError{Inner: err, pos: pos, prefix: buf.String(), expected: expected}
	p.errs.add(pe)
}

func (p *parser) failAt(fail bool, pos position, want string) {
	// process fail if parsing fails and not inverted or parsing succeeds and invert is set
	if fail == p.maxFailInvertExpected {
		if pos.offset < p.maxFailPos.offset {
			return
		}

		if pos.offset > p.maxFailPos.offset {
			p.maxFailPos = pos
			p.maxFailExpected = p.maxFailExpected[:0]
		}

		if p.maxFailInvertExpected {
			want = "!" + want
		}
		p.maxFailExpected = append(p.maxFailExpected, want)
	}
}

// read advances the parser to the next rune.
func (p *parser) read() {
	p.pt.offset += p.pt.w
	rn, n := utf8.DecodeRune(p.data[p.pt.offset:])
	p.pt.rn = rn
	p.pt.w = n
	p.pt.col++
	if rn == '\n' {
		p.pt.line++
		p.pt.col = 0
	}

	if rn == utf8.RuneError {
		if n == 1 {
			p.addErr(errInvalidEncoding)
		}
	}
}

// restore parser position to the savepoint pt.
func (p *parser) restore(pt savepoint) {
	if p.debug {
		defer p.out(p.in("restore"))
	}
	if pt.offset == p.pt.offset {
		return
	}
	p.pt = pt
}

// get the slice of bytes from the savepoint start to the current position.
func (p *parser) sliceFrom(start savepoint) []byte {
	return p.data[start.position.offset:p.pt.position.offset]
}

func (p *parser) getMemoized(node interface{}) (resultTuple, bool) {
	if len(p.memo) == 0 {
		return resultTuple{}, false
	}
	m := p.memo[p.pt.offset]
	if len(m) == 0 {
		return resultTuple{}, false
	}
	res, ok := m[node]
	return res, ok
}

func (p *parser) setMemoized(pt savepoint, node interface{}, tuple resultTuple) {
	if p.memo == nil {
		p.memo = make(map[int]map[interface{}]resultTuple)
	}
	m := p.memo[pt.offset]
	if m == nil {
		m = make(map[interface{}]resultTuple)
		p.memo[pt.offset] = m
	}
	m[node] = tuple
}

func (p *parser) buildRulesTable(g *grammar) {
	p.rules = make(map[string]*rule, len(g.rules))
	for _, r := range g.rules {
		p.rules[r.name] = r
	}
}

func (p *parser) parse(g *grammar) (val interface{}, err error) {
	if len(g.rules) == 0 {
		p.addErr(errNoRule)
		return nil, p.errs.err()
	}

	// TODO : not super critical but this could be generated
	p.buildRulesTable(g)

	if p.recover {
		// panic can be used in action code to stop parsing immediately
		// and return the panic as an error.
		defer func() {
			if e := recover(); e != nil {
				if p.debug {
					defer p.out(p.in("panic handler"))
				}
				val = nil
				switch e := e.(type) {
				case error:
					p.addErr(e)
				default:
					p.addErr(fmt.Errorf("%v", e))
				}
				err = p.errs.err()
			}
		}()
	}

	// start rule is rule [0]
	p.read() // advance to first rune
	val, ok := p.parseRule(g.rules[0])
	if !ok {
		if len(*p.errs) == 0 {
			// If parsing fails, but no errors have been recorded, the expected values
			// for the farthest parser position are returned as error.
			maxFailExpectedMap := make(map[string]struct{}, len(p.maxFailExpected))
			for _, v := range p.maxFailExpected {
				maxFailExpectedMap[v] = struct{}{}
			}
			expected := make([]string, 0, len(maxFailExpectedMap))
			eof := false
			if _, ok := maxFailExpectedMap["!."]; ok {
				delete(maxFailExpectedMap, "!.")
				eof = true
			}
			for k := range maxFailExpectedMap {
				expected = append(expected, k)
			}
			sort.Strings(expected)
			if eof {
				expected = append(expected, "EOF")
			}
			p.addErrAt(errors.New("no match found, expected: "+listJoin(expected, ", ", "or")), p.maxFailPos, expected)
		}

		return nil, p.errs.err()
	}
	return val, p.errs.err()
}

func listJoin(list []string, sep string, lastSep string) string {
	switch len(list) {
	case 0:
		return ""
	case 1:
		return list[0]
	default:
		return fmt.Sprintf("%s %s %s", strings.Join(list[:len(list)-1], sep), lastSep, list[len(list)-1])
	}
}

func (p *parser) parseRule(rule *rule) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseRule " + rule.name))
	}

	if p.memoize {
		res, ok := p.getMemoized(rule)
		if ok {
			p.restore(res.end)
			return res.v, res.b
		}
	}

	start := p.pt
	p.rstack = append(p.rstack, rule)
	p.pushV()
	val, ok := p.parseExpr(rule.expr)
	p.popV()
	p.rstack = p.rstack[:len(p.rstack)-1]
	if ok && p.debug {
		p.print(strings.Repeat(" ", p.depth)+"MATCH", string(p.sliceFrom(start)))
	}

	if p.memoize {
		p.setMemoized(start, rule, resultTuple{val, ok, p.pt})
	}
	return val, ok
}

func (p *parser) parseExpr(expr interface{}) (interface{}, bool) {
	var pt savepoint

	if p.memoize {
		res, ok := p.getMemoized(expr)
		if ok {
			p.restore(res.end)
			return res.v, res.b
		}
		pt = p.pt
	}

	p.exprCnt++
	if p.exprCnt > p.maxExprCnt {
		panic(errMaxExprCnt)
	}

	var val interface{}
	var ok bool
	switch expr := expr.(type) {
	case *actionExpr:
		val, ok = p.parseActionExpr(expr)
	case *andCodeExpr:
		val, ok = p.parseAndCodeExpr(expr)
	case *andExpr:
		val, ok = p.parseAndExpr(expr)
	case *anyMatcher:
		val, ok = p.parseAnyMatcher(expr)
	case *charClassMatcher:
		val, ok = p.parseCharClassMatcher(expr)
	case *choiceExpr:
		val, ok = p.parseChoiceExpr(expr)
	case *labeledExpr:
		val, ok = p.parseLabeledExpr(expr)
	case *litMatcher:
		val, ok = p.parseLitMatcher(expr)
	case *notCodeExpr:
		val, ok = p.parseNotCodeExpr(expr)
	case *notExpr:
		val, ok = p.parseNotExpr(expr)
	case *oneOrMoreExpr:
		val, ok = p.parseOneOrMoreExpr(expr)
	case *ruleRefExpr:
		val, ok = p.parseRuleRefExpr(expr)
	case *seqExpr:
		val, ok = p.parseSeqExpr(expr)
	case *zeroOrMoreExpr:
		val, ok = p.parseZeroOrMoreExpr(expr)
	case *zeroOrOneExpr:
		val, ok = p.parseZeroOrOneExpr(expr)
	default:
		panic(fmt.Sprintf("unknown expression type %T", expr))
	}
	if p.memoize {
		p.setMemoized(pt, expr, resultTuple{val, ok, p.pt})
	}
	return val, ok
}

func (p *parser) parseActionExpr(act *actionExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseActionExpr"))
	}

	start := p.pt
	val, ok := p.parseExpr(act.expr)
	if ok {
		p.cur.pos = start.position
		p.cur.text = p.sliceFrom(start)
		actVal, err := act.run(p)
		if err != nil {
			p.addErrAt(err, start.position, []string{})
		}
		val = actVal
	}
	if ok && p.debug {
		p.print(strings.Repeat(" ", p.depth)+"MATCH", string(p.sliceFrom(start)))
	}
	return val, ok
}

func (p *parser) parseAndCodeExpr(and *andCodeExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseAndCodeExpr"))
	}

	ok, err := and.run(p)
	if err != nil {
		p.addErr(err)
	}
	return nil, ok
}

func (p *parser) parseAndExpr(and *andExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseAndExpr"))
	}

	pt := p.pt
	p.pushV()
	_, ok := p.parseExpr(and.expr)
	p.popV()
	p.restore(pt)
	return nil, ok
}

func (p *parser) parseAnyMatcher(any *anyMatcher) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseAnyMatcher"))
	}

	if p.pt.rn != utf8.RuneError {
		start := p.pt
		p.read()
		p.failAt(true, start.position, ".")
		return p.sliceFrom(start), true
	}
	p.failAt(false, p.pt.position, ".")
	return nil, false
}

func (p *parser) parseCharClassMatcher(chr *charClassMatcher) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseCharClassMatcher"))
	}

	cur := p.pt.rn
	start := p.pt

	// can't match EOF
	if cur == utf8.RuneError {
		p.failAt(false, start.position, chr.val)
		return nil, false
	}

	if chr.ignoreCase {
		cur = unicode.ToLower(cur)
	}

	// try to match in the list of available chars
	for _, rn := range chr.chars {
		if rn == cur {
			if chr.inverted {
				p.failAt(false, start.position, chr.val)
				return nil, false
			}
			p.read()
			p.failAt(true, start.position, chr.val)
			return p.sliceFrom(start), true
		}
	}

	// try to match in the list of ranges
	for i := 0; i < len(chr.ranges); i += 2 {
		if cur >= chr.ranges[i] && cur <= chr.ranges[i+1] {
			if chr.inverted {
				p.failAt(false, start.position, chr.val)
				return nil, false
			}
			p.read()
			p.failAt(true, start.position, chr.val)
			return p.sliceFrom(start), true
		}
	}

	// try to match in the list of Unicode classes
	for _, cl := range chr.classes {
		if unicode.Is(cl, cur) {
			if chr.inverted {
				p.failAt(false, start.position, chr.val)
				return nil, false
			}
			p.read()
			p.failAt(true, start.position, chr.val)
			return p.sliceFrom(start), true
		}
	}

	if chr.inverted {
		p.read()
		p.failAt(true, start.position, chr.val)
		return p.sliceFrom(start), true
	}
	p.failAt(false, start.position, chr.val)
	return nil, false
}

func (p *parser) parseChoiceExpr(ch *choiceExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseChoiceExpr"))
	}

	for _, alt := range ch.alternatives {
		p.pushV()
		val, ok := p.parseExpr(alt)
		p.popV()
		if ok {
			return val, ok
		}
	}
	return nil, false
}

func (p *parser) parseLabeledExpr(lab *labeledExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseLabeledExpr"))
	}

	p.pushV()
	val, ok := p.parseExpr(lab.expr)
	p.popV()
	if ok && lab.label != "" {
		m := p.vstack[len(p.vstack)-1]
		m[lab.label] = val
	}
	return val, ok
}

func (p *parser) parseLitMatcher(lit *litMatcher) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseLitMatcher"))
	}

	ignoreCase := ""
	if lit.ignoreCase {
		ignoreCase = "i"
	}
	val := fmt.Sprintf("%q%s", lit.val, ignoreCase)
	start := p.pt
	for _, want := range lit.val {
		cur := p.pt.rn
		if lit.ignoreCase {
			cur = unicode.ToLower(cur)
		}
		if cur != want {
			p.failAt(false, start.position, val)
			p.restore(start)
			return nil, false
		}
		p.read()
	}
	p.failAt(true, start.position, val)
	return p.sliceFrom(start), true
}

func (p *parser) parseNotCodeExpr(not *notCodeExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseNotCodeExpr"))
	}

	ok, err := not.run(p)
	if err != nil {
		p.addErr(err)
	}
	return nil, !ok
}

func (p *parser) parseNotExpr(not *notExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseNotExpr"))
	}

	pt := p.pt
	p.pushV()
	p.maxFailInvertExpected = !p.maxFailInvertExpected
	_, ok := p.parseExpr(not.expr)
	p.maxFailInvertExpected = !p.maxFailInvertExpected
	p.popV()
	p.restore(pt)
	return nil, !ok
}

func (p *parser) parseOneOrMoreExpr(expr *oneOrMoreExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseOneOrMoreExpr"))
	}

	var vals []interface{}

	for {
		p.pushV()
		val, ok := p.parseExpr(expr.expr)
		p.popV()
		if !ok {
			if len(vals) == 0 {
				// did not match once, no match
				return nil, false
			}
			return vals, true
		}
		vals = append(vals, val)
	}
}

func (p *parser) parseRuleRefExpr(ref *ruleRefExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseRuleRefExpr " + ref.name))
	}

	if ref.name == "" {
		panic(fmt.Sprintf("%s: invalid rule: missing name", ref.pos))
	}

	rule := p.rules[ref.name]
	if rule == nil {
		p.addErr(fmt.Errorf("undefined rule: %s", ref.name))
		return nil, false
	}
	return p.parseRule(rule)
}

func (p *parser) parseSeqExpr(seq *seqExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseSeqExpr"))
	}

	vals := make([]interface{}, 0, len(seq.exprs))

	pt := p.pt
	for _, expr := range seq.exprs {
		val, ok := p.parseExpr(expr)
		if !ok {
			p.restore(pt)
			return nil, false
		}
		vals = append(vals, val)
	}
	return vals, true
}

func (p *parser) parseZeroOrMoreExpr(expr *zeroOrMoreExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseZeroOrMoreExpr"))
	}

	var vals []interface{}

	for {
		p.pushV()
		val, ok := p.parseExpr(expr.expr)
		p.popV()
		if !ok {
			return vals, true
		}
		vals = append(vals, val)
	}
}

func (p *parser) parseZeroOrOneExpr(expr *zeroOrOneExpr) (interface{}, bool) {
	if p.debug {
		defer p.out(p.in("parseZeroOrOneExpr"))
	}

	p.pushV()
	val, _ := p.parseExpr(expr.expr)
	p.popV()
	// whether it matched or not, consider it a match
	return val, true
}


================================================
FILE: parser/parser_test.go
================================================
package parser

import (
	"reflect"
	"testing"
)

func TestParseSample(t *testing.T) {
	type args struct {
		sampleID int
		sample   []byte
	}
	tests := []struct {
		name    string
		args    args
		want    []Token
		wantErr bool
	}{
		0: {
			"err: empty sample",
			args{0, nil},
			nil,
			true,
		},
		1: {
			"normal sample",
			args{1, []byte("play {Name} from {Artist}")},
			[]Token{
				{Val: []byte("play")},
				{Kw: true, Val: []byte("Name")},
				{Val: []byte("from")},
				{Kw: true, Val: []byte("Artist")},
			},
			false,
		},
		2: {
			"spacing inside keys",
			args{1, []byte("play { 	Name} from {	Artist		}")},
			[]Token{
				{Val: []byte("play")},
				{Kw: true, Val: []byte("Name")},
				{Val: []byte("from")},
				{Kw: true, Val: []byte("Artist")},
			},
			false,
		},
		3: {
			"multi word",
			args{1, []byte("I need {Name} since {Since}")},
			[]Token{
				{Val: []byte("I")},
				{Val: []byte("need")},
				{Kw: true, Val: []byte("Name")},
				{Val: []byte("since")},
				{Kw: true, Val: []byte("Since")},
			},
			false,
		},
	}
	for i, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := ParseSample(tt.args.sampleID, tt.args.sample)
			if (err != nil) != tt.wantErr {
				t.Errorf("Test#%d: ParseSample() error = %v, wantErr %v", i, err, tt.wantErr)
				return
			}
			if !reflect.DeepEqual(got, tt.want) {
				t.Errorf("Test#%d: ParseSample() = %v, want %v", i, got, tt.want)
			}
		})
	}
}
Download .txt
gitextract_2oze1ahc/

├── .gitignore
├── .travis.yml
├── Gopkg.toml
├── LICENSE
├── Makefile
├── README.md
├── benchmark_test.go
├── nlp.go
├── nlp_test.go
└── parser/
    ├── nlp.peg
    ├── parser.go
    └── parser_test.go
Download .txt
SYMBOL INDEX (116 symbols across 5 files)

FILE: benchmark_test.go
  function BenchmarkNL_P (line 8) | func BenchmarkNL_P(b *testing.B) {

FILE: nlp.go
  type NL (line 19) | type NL struct
    method P (line 33) | func (nl *NL) P(expr string) interface{} { return nl.models[nl.naive.P...
    method Learn (line 37) | func (nl *NL) Learn() error {
    method RegisterModel (line 126) | func (nl *NL) RegisterModel(i interface{}, samples []string, ops ...Mo...
  function New (line 28) | func New() *NL { return &NL{Output: bytes.NewBufferString("")} }
  type model (line 70) | type model struct
    method learn (line 171) | func (m *model) learn() error {
    method selectBestSample (line 210) | func (m *model) selectBestSample(expr []byte) []item {
    method fit (line 298) | func (m *model) fit(expr string) interface{} {
    method isLimit (line 334) | func (m *model) isLimit(s []byte, id int) bool {
    method setSamples (line 344) | func (m *model) setSamples(samples []string) {
  type item (line 79) | type item struct
  type field (line 85) | type field struct
  type ModelOption (line 92) | type ModelOption
  function WithTimeFormat (line 96) | func WithTimeFormat(format string) ModelOption {
  function WithTimeLocation (line 110) | func WithTimeLocation(loc *time.Location) ModelOption {
  function selectBestMapping (line 287) | func selectBestMapping(scores []int) int {

FILE: nlp_test.go
  function failTest (line 12) | func failTest(t *testing.T, err error) {
  function TestNL_P (line 18) | func TestNL_P(t *testing.T) {
  function TestNL_RegisterModel (line 123) | func TestNL_RegisterModel(t *testing.T) {
  function TestNL_Learn (line 192) | func TestNL_Learn(t *testing.T) {
  function TestWithTimeFormat (line 262) | func TestWithTimeFormat(t *testing.T) {
  function TestWithTimeLocation (line 293) | func TestWithTimeLocation(t *testing.T) {

FILE: parser/parser.go
  type Token (line 19) | type Token struct
  function ParseSample (line 25) | func ParseSample(sampleID int, sample []byte) ([]Token, error) {
  type Option (line 414) | type Option
  function MaxExpressions (line 421) | func MaxExpressions(maxExprCnt uint64) Option {
  function Debug (line 433) | func Debug(b bool) Option {
  function Memoize (line 447) | func Memoize(b bool) Option {
  function Recover (line 461) | func Recover(b bool) Option {
  function GlobalStore (line 471) | func GlobalStore(key string, value interface{}) Option {
  function ParseFile (line 480) | func ParseFile(filename string, opts ...Option) (i interface{}, err erro...
  function ParseReader (line 495) | func ParseReader(filename string, r io.Reader, opts ...Option) (interfac...
  function Parse (line 506) | func Parse(filename string, b []byte, opts ...Option) (interface{}, erro...
  type position (line 511) | type position struct
    method String (line 515) | func (p position) String() string {
  type savepoint (line 521) | type savepoint struct
  type current (line 527) | type current struct
    method onSample1 (line 318) | func (c *current) onSample1(vs interface{}) (interface{}, error) {
    method onKeyword2 (line 339) | func (c *current) onKeyword2(v interface{}) (interface{}, error) {
    method onKeyword10 (line 349) | func (c *current) onKeyword10(v interface{}) (interface{}, error) {
    method onKeyword18 (line 359) | func (c *current) onKeyword18(v interface{}) (interface{}, error) {
    method onKeyword28 (line 369) | func (c *current) onKeyword28(v interface{}) (interface{}, error) {
    method onPunct1 (line 379) | func (c *current) onPunct1() (interface{}, error) {
    method onIdentifier3 (line 389) | func (c *current) onIdentifier3() (interface{}, error) {
  type grammar (line 537) | type grammar struct
  type rule (line 542) | type rule struct
  type choiceExpr (line 549) | type choiceExpr struct
  type actionExpr (line 554) | type actionExpr struct
  type seqExpr (line 560) | type seqExpr struct
  type labeledExpr (line 565) | type labeledExpr struct
  type expr (line 571) | type expr struct
  type andExpr (line 576) | type andExpr
  type notExpr (line 577) | type notExpr
  type zeroOrOneExpr (line 578) | type zeroOrOneExpr
  type zeroOrMoreExpr (line 579) | type zeroOrMoreExpr
  type oneOrMoreExpr (line 580) | type oneOrMoreExpr
  type ruleRefExpr (line 582) | type ruleRefExpr struct
  type andCodeExpr (line 587) | type andCodeExpr struct
  type notCodeExpr (line 592) | type notCodeExpr struct
  type litMatcher (line 597) | type litMatcher struct
  type charClassMatcher (line 603) | type charClassMatcher struct
  type anyMatcher (line 614) | type anyMatcher
  type errList (line 617) | type errList
    method add (line 619) | func (e *errList) add(err error) {
    method err (line 623) | func (e errList) err() error {
    method dedupe (line 631) | func (e *errList) dedupe() {
    method Error (line 643) | func (e errList) Error() string {
  type parserError (line 664) | type parserError struct
    method Error (line 672) | func (p *parserError) Error() string {
  function newParser (line 677) | func newParser(filename string, b []byte, opts ...Option) *parser {
  type resultTuple (line 706) | type resultTuple struct
  type parser (line 712) | type parser struct
    method callonSample1 (line 333) | func (p *parser) callonSample1() (interface{}, error) {
    method callonKeyword2 (line 343) | func (p *parser) callonKeyword2() (interface{}, error) {
    method callonKeyword10 (line 353) | func (p *parser) callonKeyword10() (interface{}, error) {
    method callonKeyword18 (line 363) | func (p *parser) callonKeyword18() (interface{}, error) {
    method callonKeyword28 (line 373) | func (p *parser) callonKeyword28() (interface{}, error) {
    method callonPunct1 (line 383) | func (p *parser) callonPunct1() (interface{}, error) {
    method callonIdentifier3 (line 393) | func (p *parser) callonIdentifier3() (interface{}, error) {
    method setOptions (line 700) | func (p *parser) setOptions(opts []Option) {
    method pushV (line 750) | func (p *parser) pushV() {
    method popV (line 771) | func (p *parser) popV() {
    method print (line 781) | func (p *parser) print(prefix, s string) string {
    method in (line 791) | func (p *parser) in(s string) string {
    method out (line 796) | func (p *parser) out(s string) string {
    method addErr (line 801) | func (p *parser) addErr(err error) {
    method addErrAt (line 805) | func (p *parser) addErrAt(err error, pos position, expected []string) {
    method failAt (line 829) | func (p *parser) failAt(fail bool, pos position, want string) {
    method read (line 849) | func (p *parser) read() {
    method restore (line 868) | func (p *parser) restore(pt savepoint) {
    method sliceFrom (line 879) | func (p *parser) sliceFrom(start savepoint) []byte {
    method getMemoized (line 883) | func (p *parser) getMemoized(node interface{}) (resultTuple, bool) {
    method setMemoized (line 895) | func (p *parser) setMemoized(pt savepoint, node interface{}, tuple res...
    method buildRulesTable (line 907) | func (p *parser) buildRulesTable(g *grammar) {
    method parse (line 914) | func (p *parser) parse(g *grammar) (val interface{}, err error) {
    method parseRule (line 986) | func (p *parser) parseRule(rule *rule) (interface{}, bool) {
    method parseExpr (line 1015) | func (p *parser) parseExpr(expr interface{}) (interface{}, bool) {
    method parseActionExpr (line 1074) | func (p *parser) parseActionExpr(act *actionExpr) (interface{}, bool) {
    method parseAndCodeExpr (line 1096) | func (p *parser) parseAndCodeExpr(and *andCodeExpr) (interface{}, bool) {
    method parseAndExpr (line 1108) | func (p *parser) parseAndExpr(and *andExpr) (interface{}, bool) {
    method parseAnyMatcher (line 1121) | func (p *parser) parseAnyMatcher(any *anyMatcher) (interface{}, bool) {
    method parseCharClassMatcher (line 1136) | func (p *parser) parseCharClassMatcher(chr *charClassMatcher) (interfa...
    method parseChoiceExpr (line 1202) | func (p *parser) parseChoiceExpr(ch *choiceExpr) (interface{}, bool) {
    method parseLabeledExpr (line 1218) | func (p *parser) parseLabeledExpr(lab *labeledExpr) (interface{}, bool) {
    method parseLitMatcher (line 1233) | func (p *parser) parseLitMatcher(lit *litMatcher) (interface{}, bool) {
    method parseNotCodeExpr (line 1260) | func (p *parser) parseNotCodeExpr(not *notCodeExpr) (interface{}, bool) {
    method parseNotExpr (line 1272) | func (p *parser) parseNotExpr(not *notExpr) (interface{}, bool) {
    method parseOneOrMoreExpr (line 1287) | func (p *parser) parseOneOrMoreExpr(expr *oneOrMoreExpr) (interface{},...
    method parseRuleRefExpr (line 1309) | func (p *parser) parseRuleRefExpr(ref *ruleRefExpr) (interface{}, bool) {
    method parseSeqExpr (line 1326) | func (p *parser) parseSeqExpr(seq *seqExpr) (interface{}, bool) {
    method parseZeroOrMoreExpr (line 1345) | func (p *parser) parseZeroOrMoreExpr(expr *zeroOrMoreExpr) (interface{...
    method parseZeroOrOneExpr (line 1363) | func (p *parser) parseZeroOrOneExpr(expr *zeroOrOneExpr) (interface{},...
  function listJoin (line 975) | func listJoin(list []string, sep string, lastSep string) string {

FILE: parser/parser_test.go
  function TestParseSample (line 8) | func TestParseSample(t *testing.T) {
Condensed preview — 12 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (67K chars).
[
  {
    "path": ".gitignore",
    "chars": 21,
    "preview": "# dependencies\nvendor"
  },
  {
    "path": ".travis.yml",
    "chars": 256,
    "preview": "language: go\n\ngo:\n  - 1.8.x\n  - 1.9.x\n  - tip\n\nbefore_install:\n  - go get -u github.com/golang/dep/cmd/dep\n  - dep ensur"
  },
  {
    "path": "Gopkg.toml",
    "chars": 43,
    "preview": "\nrequired = [\n    \"github.com/mna/pigeon\"\n]"
  },
  {
    "path": "LICENSE",
    "chars": 1090,
    "preview": "The MIT License (MIT)\n\nCopyright (c) 2017 Juan Álvarez / @Shixzie\n\nPermission is hereby granted, free of charge, to any "
  },
  {
    "path": "Makefile",
    "chars": 281,
    "preview": "help:\n\t@echo \"deps   -> Get all dependencies\"\n\t@echo \"parser -> Generates the sample parser\"\n\t@echo \"tests  -> Run all t"
  },
  {
    "path": "README.md",
    "chars": 5835,
    "preview": "[![GoDoc](https://godoc.org/github.com/shixzie/nlp?status.svg)](https://godoc.org/github.com/shixzie/nlp) \n[![Go Report "
  },
  {
    "path": "benchmark_test.go",
    "chars": 1513,
    "preview": "package nlp\n\nimport (\n\t\"testing\"\n\t\"time\"\n)\n\nfunc BenchmarkNL_P(b *testing.B) {\n\ttype T struct {\n\t\tString string\n\t\tInt   "
  },
  {
    "path": "nlp.go",
    "chars": 9380,
    "preview": "// Package nlp provides general purpose Natural Language Processing.\npackage nlp\n\nimport (\n\t\"bytes\"\n\t\"errors\"\n\t\"fmt\"\n\t\"r"
  },
  {
    "path": "nlp_test.go",
    "chars": 5413,
    "preview": "package nlp\n\nimport (\n\t\"bytes\"\n\t\"reflect\"\n\t\"testing\"\n\t\"time\"\n\n\t\"github.com/cdipaolo/goml/text\"\n)\n\nfunc failTest(t *testi"
  },
  {
    "path": "parser/nlp.peg",
    "chars": 1664,
    "preview": "{\n// Package parser contains the sample parser for nlp\npackage parser\n\nimport \"fmt\"\nimport \"errors\"\n\n// Token is a sampl"
  },
  {
    "path": "parser/parser.go",
    "chars": 31614,
    "preview": "// Package parser contains the sample parser for nlp\npackage parser\n\nimport (\n\t\"bytes\"\n\t\"errors\"\n\t\"fmt\"\n\t\"io\"\n\t\"io/iouti"
  },
  {
    "path": "parser/parser_test.go",
    "chars": 1443,
    "preview": "package parser\n\nimport (\n\t\"reflect\"\n\t\"testing\"\n)\n\nfunc TestParseSample(t *testing.T) {\n\ttype args struct {\n\t\tsampleID in"
  }
]

About this extraction

This page contains the full source code of the shixzie/nlp GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 12 files (57.2 KB), approximately 18.0k tokens, and a symbol index with 116 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Copied to clipboard!