Stars
Label Studio is a multi-type data labeling and annotation tool with a standardized output format
A relatively powerful web mind map.
pke_zh: Python keyphrase extraction for Chinese (zh). A Chinese keyword and key-sentence extraction tool that implements KeyBERT, PositionRank, TopicRank, TextRank, and other algorithms, and works out of the box.
List of Dirty, Naughty, Obscene, and Otherwise Bad Words
Content Farm Terminator (「終結內容農場」) browser extension
An opinionated list of awesome Python frameworks, libraries, software and resources.
A summary of existing representative text datasets for LLMs.
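Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to deeply understanding, discussing, and implementing the techniques, principles, and applications related to large models.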
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models.
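MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.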
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
The most usable IPTV channel list for Beijing Unicom and Beijing Mobile. https://bjiptv.gq/
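✯ A directly accessible TV/radio station icon library and related tools project ✯ 🔕 Permanently free, directly accessible, fully open source, continuously improved station logos, supports IPv4/IPv6 dual-stack access 🔕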
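Configuration files for the FongMi video app (FongMi影视) and tvbox. If you like them, please fork for your own use. Read the repository instructions carefully before use; once you use them, you are deemed to have understood the instructions.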
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…
Even 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Unified, efficient fine-tuning of RAG retrieval, including Embedding, ColBERT, and Cross Encoder
Slides and assignments for Hung-yi Lee's Spring 2021/2022/2023 machine learning courses