text-data
Here are 59 public repositories matching this topic...
Old book pages (with ground truth), formerly used for OCR studies. The set comes in several versions that differ in resolution and binarization. Noised and denoised sets (produced by several methods) will eventually be uploaded.
Updated Aug 25, 2017 - HTML
For reading from and writing to parallel data files in Python
Updated Sep 7, 2017 - Python
Updated Jan 7, 2019
Directional Co-clustering with a Conscience (DCC)
Updated Aug 15, 2019 - R
A dataset containing 30k+ so-called "self-help" tweets from 100+ authors.
Updated Oct 12, 2019 - Jupyter Notebook
Cleans Reddit Text Data 📜 🧹
Updated Apr 14, 2020 - Python
Applying NLP techniques to WhatsApp text to gain insights.
Updated Jun 2, 2020 - Jupyter Notebook
This repository contains examples of the stages in an NLP pipeline.
Updated Jun 26, 2020 - Jupyter Notebook
An algorithm that generates word clouds for time-varying text data while minimizing the relative movement of each word over time.
Updated Aug 1, 2020 - Mathematica
Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation
Updated Aug 31, 2020 - Python
IMDb-Scraping retrieves user-generated text reviews of movies, along with relevant movie characteristics, from imdb.com.
Updated Sep 6, 2020 - Python
A simple graphical representation of Zipf's Law using term frequencies, calculated for three different text datasets.
Updated Sep 25, 2020 - Python
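As a rough illustration of the idea behind the entry above (not code taken from that repository), the sketch below counts term frequencies and ranks them. Zipf's Law predicts that frequency is roughly inversely proportional to rank, so the product of rank and count should stay roughly constant on a large corpus. The function name `rank_frequency`, the tokenizing pattern, and the sample sentence are all assumptions made for this example, and a sentence this short will not show the law convincingly.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper for illustration only. Ties in count are
    broken alphabetically so the ordering is deterministic.
    """
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


sample = "the cat and the dog and the bird saw the cat"
table = rank_frequency(sample)
for rank, word, count in table:
    # Under Zipf's Law, rank * count is roughly constant for large corpora.
    print(rank, word, count, rank * count)
```

Plotting count against rank on log-log axes is the usual way to inspect the relationship visually; an approximately straight line with slope near -1 is what the law predicts.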
Question classification for the CogComp QC dataset (http://cogcomp.org/Data/QA/QC/).
Updated Nov 10, 2020 - Python
Updated Dec 19, 2020 - HTML
While built-in string methods and regular expressions have limitations, they can be leveraged in creative ways to implement scalable workflows that process and analyze text data. This article explores these tools and introduces a few useful peripheral techniques within the context of a use case involving a large text data corpus.
Updated Dec 28, 2020 - Jupyter Notebook
Rank 3/85 MachineHack
Updated Jan 6, 2021 - Jupyter Notebook
Rank 16/98 MachineHack
Updated Jan 6, 2021 - Jupyter Notebook
A machine learning model that predicts tags for a given question and body.
Updated Aug 9, 2021 - Jupyter Notebook