an infinite loop [token == 0 "" L (1, n)-(1, n-1)] #22

xaionaro · 2018-02-26T06:18:47Z

This test case causes an infinite loop. Of course, the code is incorrect anyway (the character ":" should cause an error). But lexmachine probably shouldn't enter an infinite loop; it should report an error instead.

package main

import (
        "fmt"
        "github.com/timtadh/lexmachine"
        lexmachines "github.com/timtadh/lexmachine/machines"
)


func newLexer() *lexmachine.Lexer {
        tokens := []string{
                "VALUE",
        }
        tokenIds := map[string]int{}
        for i, tok := range tokens {
                tokenIds[tok] = i
        }
        lex := lexmachine.NewLexer()

        skip := func(*lexmachine.Scanner, *lexmachines.Match) (interface{}, error) {
                return nil, nil
        }
        token := func(name string) lexmachine.Action {
                return func(s *lexmachine.Scanner, m *lexmachines.Match) (interface{}, error) {
                        return s.Token(tokenIds[name], string(m.Bytes), m), nil
                }
        }

        lex.Add([]byte(`([a-z]|[A-Z]|[0-9]|_|\-|\.|=)*`), token("VALUE"))
        lex.Add([]byte("[\n \t]"), skip)

        err := lex.Compile()
        if err != nil {
                panic(err)
        }

        return lex
}

func main() {
        lex := newLexer()
        scanner, err := lex.Scanner([]byte("line1:\nline2"))
        if err != nil {
                panic(err)
        }

        for tok, err, eof := scanner.Next(); !eof; tok, err, eof = scanner.Next() {
                fmt.Println("token ==", tok)
                fmt.Println("err ==", err)
        }
}

token == 0 "line1" 0 (1, 1)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
...

ty2 · 2018-02-26T08:55:07Z

I ran your code. It should be fixed by #21 (cd38be4).

xaionaro · 2018-02-26T10:01:46Z

Yep, there's no infinite loop now. However, the behavior still seems wrong:

$ ./test
token == 0 "line1" 0 (1, 1)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "line2" 7 (2, 1)-(2, 5)
err == <nil>

There should be an error, IMHO.

Expected behavior:

token == 0 "line1" 0 (1, 1)-(1, 5)
err == <nil>
token == <nil>
err == Lexer error: could not match text starting at 1:6 failing at 2:0.
        unmatched text: ":"

ty2 · 2018-02-26T11:06:34Z

It should work if you replace * with +, which matches at least one character in the group.

Use this regex:
([a-z]|[A-Z]|[0-9]|_|\-|\.|=)+
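For illustration only (this is not from the thread's participants): the difference between the two patterns is that the `*` form accepts the empty string while the `+` form does not. The sketch below checks this with Go's standard `regexp` package, which is an assumption made for convenience; it is not lexmachine's engine, although this particular pattern is valid in both syntaxes. The helper name `matchesEmpty` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesEmpty reports whether pattern accepts the empty string.
// The pattern is wrapped in a non-capturing group and anchored so that
// only a match of the whole (empty) input counts.
// Hypothetical helper; uses the standard library, not lexmachine.
func matchesEmpty(pattern string) (bool, error) {
	re, err := regexp.Compile(`^(?:` + pattern + `)$`)
	if err != nil {
		return false, err
	}
	return re.MatchString(""), nil
}

func main() {
	patterns := []string{
		`([a-z]|[A-Z]|[0-9]|_|\-|\.|=)*`, // original pattern from the report
		`([a-z]|[A-Z]|[0-9]|_|\-|\.|=)+`, // suggested replacement
	}
	for _, p := range patterns {
		ok, err := matchesEmpty(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s matches empty string: %v\n", p, ok)
	}
}
```

With the `*` form the check reports true, which is why a zero-length VALUE token can be emitted in front of ":"; with the `+` form it reports false.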

xaionaro · 2018-02-26T11:34:35Z

Yes, it works if I replace the character. However, IMHO it should report an error (unmatched text ":") with "*" too. There's no expression for ":", so how was this character skipped?

token == 0 "line1" 0 (1, 1)-(1, 5)
err == <nil>
token == 0 "" 5 (1, 6)-(1, 5)
err == <nil>
token == 0 "line2" 7 (2, 1)-(2, 5)
err == <nil>

ty2 · 2018-02-26T14:29:32Z

@xaionaro you are right.

Also, if the input string starts with an unmatched character, it panics.

func main() {
        lex := newLexer()
        scanner, err := lex.Scanner([]byte("@line1:\nline2"))
        if err != nil {
                panic(err)
        }

        for tok, err, eof := scanner.Next(); !eof; tok, err, eof = scanner.Next() {
                fmt.Println("token ==", tok)
                fmt.Println("err ==", err)
        }
}

panic: runtime error: index out of range

goroutine 1 [running]:
github.com/ty2-exp/gdlparser/vendor/github.com/timtadh/lexmachine/machines.DFALexerEngine.func1(0x0, 0x4, 0x4, 0xc4200e2000, 0x4, 0x4)
        /Users/terry/go/src/github.com/ty2-exp/gdlparser/vendor/github.com/timtadh/lexmachine/machines/dfa_machine.go:65 +0x955
github.com/ty2-exp/gdlparser/vendor/github.com/timtadh/lexmachine.(*Scanner).Next(0xc420093320, 0xc42003df48, 0x4, 0x20, 0xc420093320, 0x0)
        /Users/terry/go/src/github.com/ty2-exp/gdlparser/vendor/github.com/timtadh/lexmachine/lexer.go:146 +0x50
github.com/ty2-exp/gdlparser.(*Lexer).Scan(0xc4200dd120, 0xc42003df48, 0x4, 0x20, 0x4, 0x20, 0x118e0e0, 0xc420076300, 0xc42003df50)
        /Users/terry/go/src/github.com/ty2-exp/gdlparser/lexer.go:252 +0x90
main.main()
        /Users/terry/go/src/github.com/ty2-exp/gdlparser/example/main.go:7 +0x7a
exit status 2

timtadh · 2018-02-26T14:41:17Z

@xaionaro ok I am going to take a look at this. I am still concerned about the correct behavior when matching the empty string. If you have an unmatchable character (like : in your example) but you also have a token which matches the empty string, then no progress can be made. The empty string will be matched over and over again. On the other hand, I can ensure progress by always incrementing the tc (text counter) after a match. This prevents an infinite loop at the cost of potentially skipping characters for which it should have returned an error.

A third option is to disallow at compilation matching the empty string. What do you think?
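As a rough illustration of the trade-off described above (a toy scanner written for this note, not lexmachine's code), the sketch below hard-codes the VALUE character class and the whitespace skip rule from the example. `scanForcedProgress` always advances after an empty match, and `scanStrict` treats an empty match as a failure. All function names are hypothetical.

```go
package main

import "fmt"

// isValueChar reports whether b is in the VALUE character class from the
// example: letters, digits, '_', '-', '.', '='.
func isValueChar(b byte) bool {
	switch {
	case b >= 'a' && b <= 'z', b >= 'A' && b <= 'Z', b >= '0' && b <= '9':
		return true
	}
	return b == '_' || b == '-' || b == '.' || b == '='
}

// isSpace mirrors the skip rule "[\n \t]".
func isSpace(b byte) bool { return b == '\n' || b == ' ' || b == '\t' }

// matchValue returns the length of the longest VALUE match at pos.
// Like the `*` pattern, it may return 0 (an empty match).
func matchValue(text []byte, pos int) int {
	n := 0
	for pos+n < len(text) && isValueChar(text[pos+n]) {
		n++
	}
	return n
}

// scanForcedProgress always advances by at least one byte after a match.
// It cannot loop forever, but an unmatched byte is silently stepped over.
func scanForcedProgress(text []byte) []string {
	var toks []string
	for pos := 0; pos < len(text); {
		if isSpace(text[pos]) {
			pos++
			continue
		}
		n := matchValue(text, pos)
		toks = append(toks, string(text[pos:pos+n]))
		if n == 0 {
			n = 1 // forced progress past the unmatched byte
		}
		pos += n
	}
	return toks
}

// scanStrict treats an empty match as "no match" and reports an error,
// which is the behavior you get once empty matches are ruled out.
func scanStrict(text []byte) ([]string, error) {
	var toks []string
	for pos := 0; pos < len(text); {
		if isSpace(text[pos]) {
			pos++
			continue
		}
		n := matchValue(text, pos)
		if n == 0 {
			return toks, fmt.Errorf("could not match text at offset %d: %q", pos, text[pos])
		}
		toks = append(toks, string(text[pos:pos+n]))
		pos += n
	}
	return toks, nil
}

func main() {
	input := []byte("line1:\nline2")
	fmt.Printf("%q\n", scanForcedProgress(input))
	toks, err := scanStrict(input)
	fmt.Printf("%q %v\n", toks, err)
}
```

On the input "line1:\nline2", the forced-progress variant yields "line1", an empty token, and "line2", while the strict variant stops after "line1" with an error at the ":".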

xaionaro · 2018-02-26T14:43:08Z

As a user of this library, I would be happier if matching empty strings were forbidden (the third option). It would help me write better code on my side.

Sorry for my English :)

timtadh · 2018-02-26T14:52:15Z

@xaionaro that seems reasonable. I will make a separate PR for that (and change the current one to not preserve progress).

timtadh · 2018-02-27T14:48:34Z

fixed by #23

xaionaro · 2019-07-14T08:52:48Z

Thank you :)

xaionaro mentioned this issue Feb 26, 2018

panic: runtime error: index out of range #18

Closed

xaionaro changed the title ~~an infinite loop~~ an infinite loop [token == 0 "" L (1, n)-(1, n-1)] Feb 26, 2018

timtadh mentioned this issue Feb 26, 2018

Statically prevent matching the empty string #23

Merged

timtadh closed this as completed Feb 27, 2018
