Stars
A topic-centric list of HQ open datasets.
A native PyTorch Library for large model training
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Unsupervised text tokenizer for Neural Network-based text generation.
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
[:warning:WIP] A JavaScript implementation of the Parkrun API gathered from reverse-engineering the official app.
Yes, it's another chat over documents implementation... but this one is entirely local!
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Turns Data and AI algorithms into production-ready web applications in no time.
Neural Networks: Zero to Hero
🎥🎬 get public diary data for letterboxd users
A React widget for displaying a user's public bookshelf
Models and examples built with TensorFlow
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement …
A collection of modern JavaScript interview code challenges for beginners to experts
All Algorithms implemented in Python
Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Best Practices on Recommendation Systems
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
😎 Awesome lists about all kinds of interesting topics
Comp10550 (Software Engineering) - Year 1 - Assignment 3