Starred repositories
Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Robust recipes to align language models with human and AI preferences
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
Extensible, parallel implementations of t-SNE
Integrated Image-based Deep Learning and Language Models for Primary Diabetes Care
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
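MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, including even "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents to ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…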
An LLM project as a part of CS-5660
Deep Generative Modelling of Patient Timelines using Electronic Health Records
An open-source chatbot built with ExpertPrompting which achieves 96% of ChatGPT's capability.
Alpaca dataset from Stanford, cleaned and curated
Home of StarCoder: fine-tuning & inference!
Repo for BenTsao [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.
Distributed web crawler admin platform for managing spiders, regardless of language or framework.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Training and serving large-scale neural networks with auto parallelization.
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
ClusterFuzzLite - Simple continuous fuzzing that runs in CI.
Secure Software Development Fundamentals courses (from the OpenSSF Best Practices WG)