- Frankfurt, Germany
- https://ineil77.github.io/about/
- https://orcid.org/0000-0001-8215-4764
- @androneil54
Stars
Source Code Data Augmentation for Deep Learning: A Survey.
Code to create buggy Python scripts for OpenAssistant training, maintained by https://twitter.com/Cyndesama
A simple code complexity analyser that ignores C/C++ header files and Java imports and supports most popular languages.
BigCodeBench: Benchmarking Code Generation Towards AGI
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Recipes to train reward models for RLHF.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering
Document aligner that uses neural technologies to find matches across bilingual documents
Data creation, training and eval scripts for the IRCoder paper
pyan is a Python module that performs static analysis of Python code to determine a call dependency graph between functions and methods. This is different from running the code and seeing which fun…
Run code inference-only benchmarks quickly using vLLM
Repository for "SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques" published in MSR4P&S'22.
[TMLR] A curated list of language modeling research for code and related datasets.
Large Language Models Meet NL2Code: A Survey
⚒️ Custom Tree-sitter toolkit for extracting functions and classes from raw source files
Machine Learning Engineering Open Book
😎 Curated list of awesome things regarding WebAssembly (wasm) ecosystem.
Control the quality of your labeled data with the Python tools you already know.
Collection of important articles to be treated as a textbook
☠️ Ground-truth dataset for vulnerability prediction (includes known research datasets and data sources such as NVD, CVE Details and OSV); tools to automatically update the data are provided.
Collected solutions from the Google Code Jam programming competition (2008-2020).
Problem statements on System Design and Software Architecture as part of Arpit's System Design Masterclass