felix-wang numb3r3

@jina-ai working on Artificial Intelligence, Deep Learning and Natural Language Processing. Past @HUYA-AI, @Tencent-AI

151 followers · 1k following

@jina-ai
Shenzhen, China
@felix1987_

Lists (2)

🪨 ANN

whisper

Starred repositories

AIDC-AI / Ovis

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Python · 217 stars · 7 forks · Updated Sep 19, 2024

InfiMM / mllm-hd

Official code for infimm-hd

Python · 15 stars · Updated Sep 4, 2024

kyutai-labs / moshi

Python · 3,266 stars · 203 forks · Updated Sep 20, 2024

lepture / mistune

A fast yet powerful Python Markdown parser with renderers and plugins.

Python · 2,550 stars · 250 forks · Updated Aug 15, 2024

dleemiller / WordLlama

Things you can do with the token embeddings of an LLM

Python · 1,052 stars · 30 forks · Updated Sep 20, 2024

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python · 3,718 stars · 282 forks · Updated Sep 19, 2024

brandonstarxel / chunking_evaluation

This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation. It allows users to compare different chunking methods and i…

Python · 96 stars · 12 forks · Updated Jul 14, 2024

relari-ai / continuous-eval

Data-Driven Evaluation for LLM-Powered Applications

Python · 433 stars · 27 forks · Updated Sep 2, 2024

linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training

Python · 2,971 stars · 152 forks · Updated Sep 20, 2024

gpt-omni / mini-omni

open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python · 2,532 stars · 244 forks · Updated Sep 14, 2024

Cinnamon / kotaemon

An open-source RAG-based tool for chatting with your documents.

Python · 12,083 stars · 900 forks · Updated Sep 18, 2024

apple / ml-superposition-prompting

Python · 129 stars · 14 forks · Updated Jul 19, 2024

kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python · 663 stars · 31 forks · Updated Sep 19, 2024

euirim / goodwiki

Package and scripts used to build a dataset of Wikipedia articles in Markdown.

Jupyter Notebook · 17 stars · 1 fork · Updated Sep 11, 2023

highlight / highlight

highlight.io: The open source, full-stack monitoring platform. Error monitoring, session replay, logging, distributed tracing, and more.

TypeScript · 7,466 stars · 350 forks · Updated Sep 20, 2024

evilsocket / cake

Distributed LLM and StableDiffusion inference for mobile, desktop and server.

Rust · 2,470 stars · 131 forks · Updated Aug 30, 2024

VikParuchuri / surya

OCR, layout analysis, reading order, line detection in 90+ languages

Python · 9,871 stars · 644 forks · Updated Sep 20, 2024

matthewwithanm / python-markdownify

Convert HTML to Markdown

Python · 1,015 stars · 135 forks · Updated Jul 14, 2024

mrmps / SMRY

A tool to get summaries and get past paywalls

TypeScript · 345 stars · 22 forks · Updated Jun 11, 2024

tianyi-lab / Superfiltering

[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning

Python · 105 stars · 8 forks · Updated Sep 6, 2024

leptonai / gpud

Go · 159 stars · 9 forks · Updated Sep 20, 2024

mlabonne / llm-datasets

High-quality datasets, tools, and concepts for LLM fine-tuning.

1,682 stars · 157 forks · Updated Aug 18, 2024

koaning / icepickle

It's a cooler way to store simple linear models.

Python · 28 stars · 1 fork · Updated Jul 15, 2024

chroma-core / chroma

the AI-native open-source embedding database

Rust · 14,619 stars · 1,219 forks · Updated Sep 20, 2024

thu-coai / SelfCont

Code for the paper "Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation"

Python · 3 stars · Updated Jul 6, 2023

Jxu-Thu / DITTO

The code of paper "Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation" published at NeurIPS 2022

Python · 39 stars · 6 forks · Updated Oct 9, 2022

opendatalab / Miner-PDF-Benchmark

MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.

Python · 11 stars · 5 forks · Updated Aug 2, 2024

SafeAILab / EAGLE

Official Implementation of EAGLE-1 and EAGLE-2

Python · 756 stars · 74 forks · Updated Aug 28, 2024

gmftbyGMFTBY / science-llm

A large-scale language model for scientific domain, trained on redpajama arXiv split

Python · 120 stars · 14 forks · Updated Mar 1, 2024

reka-ai / reka-vibe-eval

Multimodal language model benchmark, featuring challenging examples

Python · 144 stars · 6 forks · Updated Aug 13, 2024

Starred topics

infini-attention

large-language-models

Rust

3D

document-similarity

vosk

speech-recognition

adversarial-networks

Neural Network

Machine learning
