
A topic-centric list of HQ open datasets.

60,301 9,867 Updated Sep 6, 2024

A native PyTorch Library for large model training

Python 2,256 165 Updated Sep 29, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,698 451 Updated May 3, 2024

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 10,145 1,003 Updated Sep 27, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,071 839 Updated Jul 1, 2024

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 10,124 1,166 Updated Sep 1, 2024

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 11,943 812 Updated Aug 15, 2024

[:warning:WIP] A JavaScript implementation of the Parkrun API gathered from reverse-engineering the official app.

TypeScript 15 6 Updated Feb 27, 2023

Yes, it's another chat over documents implementation... but this one is entirely local!

TypeScript 1,630 295 Updated Sep 23, 2024

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Python 18,306 1,855 Updated Sep 30, 2024

Turns Data and AI algorithms into production-ready web applications in no time.

Python 12,109 851 Updated Sep 30, 2024

Neural Networks: Zero to Hero

Jupyter Notebook 11,609 1,452 Updated Aug 18, 2024

🎥🎬 get public diary data for letterboxd users

JavaScript 48 9 Updated Sep 27, 2024

A React widget for displaying a user's public bookshelf

TypeScript 19 9 Updated Jun 18, 2024

Models and examples built with TensorFlow

Python 76,984 45,785 Updated Sep 30, 2024

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement …

C# 16,979 4,144 Updated Sep 30, 2024

A collection of JavaScript modern interview code challenges for beginners to experts

MDX 4,226 850 Updated Sep 30, 2024

All Algorithms implemented in Python

Python 184,649 44,369 Updated Sep 30, 2024

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings

C 6,834 1,512 Updated Sep 19, 2023

TensorFlow documentation

Jupyter Notebook 6,108 5,288 Updated Sep 26, 2024

Best Practices on Recommendation Systems

Python 18,887 3,075 Updated Sep 29, 2024

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

Jupyter Notebook 10,029 6,748 Updated Sep 27, 2024

😎 Awesome lists about all kinds of interesting topics

326,837 27,754 Updated Sep 9, 2024

Comp10550 (Software Engineering) - Year 1 - Assignment 3

C 1 Updated Apr 17, 2017