  • Fairfax, VA


An English lexical database from the Big 🍎, let's go Mets baby love da Mets

Python 2 Updated Sep 21, 2024

FAIR Sequence Modeling Toolkit 2

Python 678 78 Updated Sep 24, 2024

A library for data streaming and augmentation

Python 20 2 Updated Mar 18, 2024

Code for SaGe subword tokenizer (EACL 2023)

Python 21 3 Updated Sep 17, 2024

This repository contains an extension of fairseq for pixel / visual representations for machine translation.

Python 33 5 Updated Feb 2, 2024

A toolkit to create, launch, and monitor SLURM jobs over existing Python scripts.

Python 11 2 Updated May 13, 2024

Foundation Architecture for (M)LLMs

Python 3,002 202 Updated Apr 11, 2024

remote pbcopy over ssh

Go 17 1 Updated Mar 24, 2024

A tool for holistic analysis of language generation systems

Python 465 58 Updated Mar 22, 2022

MAFAND-MT

Jupyter Notebook 52 25 Updated Jul 9, 2024

Open information and community for machine translation

HTML 71 56 Updated Aug 12, 2024

A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

Python 1,118 149 Updated Sep 24, 2024

Code and data for the IWSLT 2022 shared task on Formality Control for SLT

Ruby 21 6 Updated May 24, 2023

Cross language information retrieval pipeline

Python 18 6 Updated Jun 9, 2023

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,391 595 Updated Sep 20, 2024

Learned string similarity for entity names using optimal transport.

Python 34 3 Updated Nov 17, 2020

A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

Python 1,194 143 Updated Jan 16, 2024

State-of-the-Art Text Embeddings

Python 14,904 2,439 Updated Sep 19, 2024

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Python 3,621 423 Updated Aug 29, 2024

This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"

Python 1,485 427 Updated Aug 27, 2021

🔍 AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your da…

Python 16,872 1,846 Updated Sep 24, 2024

Models, data loaders and abstractions for language processing, powered by PyTorch

Python 3,498 814 Updated Sep 24, 2024

A data augmentations library for audio, image, text, and video.

Python 4,946 299 Updated Sep 20, 2024

OPUS-CAT is a collection of software that makes it possible to use OPUS-MT neural machine translation models in professional translation. OPUS-CAT includes a local offline MT engine and a collection of…

C# 70 11 Updated Aug 27, 2024

skweak: A software toolkit for weak supervision applied to NLP tasks

Python 918 71 Updated Sep 2, 2024

Document Layout Analysis

Python 337 28 Updated Sep 24, 2024

document image degradation

Jupyter Notebook 1 Updated May 18, 2020

A Unified Toolkit for Deep Learning Based Document Image Analysis

Python 4,800 462 Updated Aug 15, 2024

A PyTorch-based Speech Toolkit

Python 8,605 1,367 Updated Sep 24, 2024

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 6,429 657 Updated Aug 29, 2024