nlp-ner

This program takes input as either a picture or text, determines which of the two it is, extracts the text, and counts the errors in the document. It also outputs the company name and the location of the address found in the picture or text.
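The first step, deciding whether the input is a picture or text, could be sketched as below. This is only an illustration that guesses from the file extension using the standard library; the function name detect_input_kind is hypothetical and is not taken from nlp.py, which may decide differently.

```python
import mimetypes


def detect_input_kind(path):
    """Guess whether a file is a picture or text from its extension.

    Hypothetical helper for illustration; not the logic used in nlp.py.
    Returns "image", "text", or "unknown".
    """
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None:
        # No recognizable extension, so no guess can be made.
        return "unknown"
    if mime.startswith("image/"):
        return "image"
    if mime.startswith("text/"):
        return "text"
    return "unknown"


if __name__ == "__main__":
    for name in ("invoice.png", "letter.txt", "archive"):
        print(name, detect_input_kind(name))
```

A picture would then go through OCR before the text checks, while a text file could be read directly.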

Dependencies:

nltk (stopwords, punkt tokenizer). To install the sub-dependencies, run: python -m nltk.downloader punkt, then python -m nltk.downloader stopwords

gingerit

pillow

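spacy. To install the sub-dependency, run: python -m spacy download en (the en shortcut works on spaCy 2.x; on spaCy 3 and later, use python -m spacy download en_core_web_sm instead)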

pytesseract (Python wrapper for Tesseract)

Tesseract OCR for Windows (tesseract-ocr.exe), downloaded from https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.0.0-alpha.20191010.exe

pytesseract.pytesseract.tesseract_cmd = full path to the installed tesseract executable
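A minimal sketch of pointing pytesseract at the Tesseract executable follows. The default Windows path and the helper names find_tesseract and configure_pytesseract are assumptions made for illustration; substitute the folder your installer actually used.

```python
import shutil
from pathlib import Path

# Assumed install location; change this to wherever Tesseract was installed.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(candidate=DEFAULT_WINDOWS_PATH):
    """Return a path to the tesseract executable, or None if not found."""
    # Prefer an executable that is already on PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit candidate path.
    if Path(candidate).is_file():
        return candidate
    return None


def configure_pytesseract(candidate=DEFAULT_WINDOWS_PATH):
    """Set pytesseract.pytesseract.tesseract_cmd if both pieces are present.

    Returns the path that was set, or None if Tesseract or pytesseract
    is missing.
    """
    exe = find_tesseract(candidate)
    if exe is None:
        return None
    try:
        import pytesseract
    except ImportError:
        return None
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe


if __name__ == "__main__":
    print("Tesseract configured at:", configure_pytesseract())
```

If Tesseract is already on PATH, pytesseract can usually find it without this step; setting tesseract_cmd matters mainly when it is installed in a folder that is not on PATH.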
