Commit
Merge remote-tracking branch 'upstream/develop' into develop
Showing 354 changed files with 580 additions and 840 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: source Makefile | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Steven Bird <[email protected]> | ||
# Edward Loper <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
3.8.1a | ||
3.8.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit (NLTK) | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Authors: Steven Bird <[email protected]> | ||
# Edward Loper <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
|
@@ -42,7 +42,7 @@ | |
|
||
# Copyright notice | ||
__copyright__ = """\ | ||
Copyright (C) 2001-2022 NLTK Project. | ||
Copyright (C) 2001-2023 NLTK Project. | ||
Distributed and Licensed under the Apache License, Version 2.0, | ||
which is included by reference. | ||
|
@@ -52,7 +52,7 @@ | |
# Description of the toolkit, keywords, and the project's primary URL. | ||
__longdescr__ = """\ | ||
The Natural Language Toolkit (NLTK) is a Python package for | ||
natural language processing. NLTK requires Python 3.7, 3.8, 3.9 or 3.10.""" | ||
natural language processing. NLTK requires Python 3.7, 3.8, 3.9, 3.10 or 3.11.""" | ||
__keywords__ = [ | ||
"NLP", | ||
"CL", | ||
|
@@ -88,6 +88,7 @@ | |
"Programming Language :: Python :: 3.8", | ||
"Programming Language :: Python :: 3.9", | ||
"Programming Language :: Python :: 3.10", | ||
"Programming Language :: Python :: 3.11", | ||
"Topic :: Scientific/Engineering", | ||
"Topic :: Scientific/Engineering :: Artificial Intelligence", | ||
"Topic :: Scientific/Engineering :: Human Machine Interfaces", | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Applications package | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Edward Loper <[email protected]> | ||
# Steven Bird <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Chart Parser Application | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Edward Loper <[email protected]> | ||
# Jean Mark Gawron <[email protected]> | ||
# Steven Bird <[email protected]> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Regexp Chunk Parser Application | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Edward Loper <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Collocations Application | ||
# Much of the GUI code is imported from concordance.py; We intend to merge these tools together | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Sumukh Ghodke <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Concordance Application | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Sumukh Ghodke <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Recursive Descent Parser Application | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Edward Loper <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Shift-Reduce Parser Application | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Edward Loper <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Wordfreq Application | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Sumukh Ghodke <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: WordNet Browser Application | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Jussi Salmela <[email protected]> | ||
# Paul Bone <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
|
@@ -47,11 +47,10 @@ | |
|
||
import base64 | ||
import copy | ||
import datetime | ||
import getopt | ||
import io | ||
import os | ||
import pickle | ||
import re | ||
import sys | ||
import threading | ||
import time | ||
|
@@ -60,17 +59,12 @@ | |
from http.server import BaseHTTPRequestHandler, HTTPServer | ||
|
||
# Allow this program to run inside the NLTK source tree. | ||
from sys import argv, path | ||
from sys import argv | ||
from urllib.parse import unquote_plus | ||
|
||
from nltk.corpus import wordnet as wn | ||
from nltk.corpus.reader.wordnet import Lemma, Synset | ||
|
||
# now included in local file | ||
# from util import html_header, html_trailer, \ | ||
# get_static_index_page, get_static_page_by_path, \ | ||
# page_from_word, page_from_href | ||
|
||
firstClient = True | ||
|
||
# True if we're not also running a web browser. The value f server_mode | ||
|
@@ -127,7 +121,12 @@ def do_GET(self): | |
else: | ||
# Handle files here. | ||
word = sp | ||
page = get_static_page_by_path(usp) | ||
try: | ||
page = get_static_page_by_path(usp) | ||
except FileNotFoundError: | ||
page = "Internal error: Path for static page '%s' is unknown" % usp | ||
# Set type to plain to prevent XSS by printing the path as HTML | ||
type = "text/plain" | ||
elif sp.startswith("search"): | ||
# This doesn't seem to work with MWEs. | ||
type = "text/html" | ||
|
@@ -654,6 +653,16 @@ def make_synset_html(db_name, disp_name, rels): | |
return html | ||
|
||
|
||
class RestrictedUnpickler(pickle.Unpickler): | ||
""" | ||
Unpickler that prevents any class or function from being used during loading. | ||
""" | ||
|
||
def find_class(self, module, name): | ||
# Forbid every function | ||
raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden") | ||
|
||
|
||
class Reference: | ||
""" | ||
A reference to a page that may be generated by page_word | ||
|
@@ -689,7 +698,7 @@ def decode(string): | |
Decode a reference encoded with Reference.encode | ||
""" | ||
string = base64.urlsafe_b64decode(string.encode()) | ||
word, synset_relations = pickle.loads(string) | ||
word, synset_relations = RestrictedUnpickler(io.BytesIO(string)).load() | ||
return Reference(word, synset_relations) | ||
|
||
def toggle_synset_relation(self, synset, relation): | ||
|
@@ -789,7 +798,7 @@ def page_from_reference(href): | |
except KeyError: | ||
pass | ||
if not body: | ||
body = "The word or words '%s' where not found in the dictionary." % word | ||
body = "The word or words '%s' were not found in the dictionary." % word | ||
return body, word | ||
|
||
|
||
|
@@ -816,8 +825,7 @@ def get_static_page_by_path(path): | |
return get_static_web_help_page() | ||
elif path == "wx_help.html": | ||
return get_static_wx_help_page() | ||
else: | ||
return "Internal error: Path for static page '%s' is unknown" % path | ||
raise FileNotFoundError() | ||
|
||
|
||
def get_static_web_help_page(): | ||
|
@@ -828,7 +836,7 @@ def get_static_web_help_page(): | |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> | ||
<html> | ||
<!-- Natural Language Toolkit: Wordnet Interface: Graphical Wordnet Browser | ||
Copyright (C) 2001-2022 NLTK Project | ||
Copyright (C) 2001-2023 NLTK Project | ||
Author: Jussi Salmela <[email protected]> | ||
URL: <https://www.nltk.org/> | ||
For license information, see LICENSE.TXT --> | ||
|
@@ -898,7 +906,7 @@ def get_static_index_page(with_shutdown): | |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd"> | ||
<HTML> | ||
<!-- Natural Language Toolkit: Wordnet Interface: Graphical Wordnet Browser | ||
Copyright (C) 2001-2022 NLTK Project | ||
Copyright (C) 2001-2023 NLTK Project | ||
Author: Jussi Salmela <[email protected]> | ||
URL: <https://www.nltk.org/> | ||
For license information, see LICENSE.TXT --> | ||
|
@@ -931,7 +939,7 @@ def get_static_upper_page(with_shutdown): | |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> | ||
<html> | ||
<!-- Natural Language Toolkit: Wordnet Interface: Graphical Wordnet Browser | ||
Copyright (C) 2001-2022 NLTK Project | ||
Copyright (C) 2001-2023 NLTK Project | ||
Author: Jussi Salmela <[email protected]> | ||
URL: <https://www.nltk.org/> | ||
For license information, see LICENSE.TXT --> | ||
|
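Review note on the `Reference.decode` change above: the sketch below illustrates why swapping `pickle.loads` for a restricted unpickler matters for data that arrives in a URL. The `RestrictedUnpickler` class mirrors the one added in this diff; the helper names `encode_reference` and `decode_reference`, the `Exploit` class, and the sample data are hypothetical and exist only for illustration. It assumes the payload consists of plain built-in containers and strings, which unpickle without any global lookup.

```python
import base64
import io
import pickle


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        # Called whenever the pickle stream references a global; reject all.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def encode_reference(word, synset_relations):
    """Hypothetical helper: pickle plain data and wrap it in URL-safe base64."""
    payload = pickle.dumps((word, synset_relations))
    return base64.urlsafe_b64encode(payload).decode()


def decode_reference(string):
    """Hypothetical helper: decode with the restricted unpickler."""
    raw = base64.urlsafe_b64decode(string.encode())
    return RestrictedUnpickler(io.BytesIO(raw)).load()


class Exploit:
    """Stand-in for a malicious payload: asks the unpickler to call a function."""

    def __reduce__(self):
        return (print, ("code ran during unpickling",))


# Plain strings, dicts, and lists round-trip without any global lookup.
token = encode_reference("dog", {"dog.n.01": ["hypernym"]})
word, relations = decode_reference(token)

# A payload that references a callable is rejected before anything runs.
evil_token = base64.urlsafe_b64encode(pickle.dumps(Exploit())).decode()
try:
    decode_reference(evil_token)
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

With plain `pickle.loads`, the second payload would invoke the referenced callable while loading; with the restricted unpickler, `find_class` raises before the callable is ever resolved.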
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Some texts for exploration in chapter 1 of the book | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Steven Bird <[email protected]> | ||
# | ||
# URL: <https://www.nltk.org/> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Combinatory Categorial Grammar | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Graeme Gange <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: CCG Categories | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Graeme Gange <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Combinatory Categorial Grammar | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Graeme Gange <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Combinatory Categorial Grammar | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Graeme Gange <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Combinatory Categorial Grammar | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Graeme Gange <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Chatbots | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Authors: Steven Bird <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Eliza | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Authors: Steven Bird <[email protected]> | ||
# Edward Loper <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Teen Chatbot | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Selina Dennis <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Rude Chatbot | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Peter Spiller <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Chatbot Utilities | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Authors: Steven Bird <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Natural Language Toolkit: Zen Chatbot | ||
# | ||
# Copyright (C) 2001-2022 NLTK Project | ||
# Copyright (C) 2001-2023 NLTK Project | ||
# Author: Amy Holland <[email protected]> | ||
# URL: <https://www.nltk.org/> | ||
# For license information, see LICENSE.TXT | ||
|