Highlights

  • Pro

Organizations

@PerceiveIO

An open-source RAG-based tool for chatting with your documents.

Python 12,288 921 Updated Sep 25, 2024

A QoS-based scheduling system brings optimal layout and status to workloads such as microservices, web services, big data jobs, AI jobs, etc.

Go 1,311 326 Updated Sep 25, 2024

A Cloud Native Batch System (Project under CNCF)

Go 4,108 949 Updated Sep 25, 2024

Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena

Assembly 573 83 Updated Jun 15, 2024

Unofficial Python client library for Semantic Scholar APIs.

Python 294 38 Updated Jul 13, 2024

LLM based autonomous agent that does online comprehensive research on any given topic

Python 14,152 1,839 Updated Sep 25, 2024

The AI-native open-source embedding database

Rust 14,682 1,220 Updated Sep 25, 2024

Biological foundation modeling from molecular to genome scale

Jupyter Notebook 930 111 Updated Sep 5, 2024

A streaming SQL engine, a fast and lightweight alternative to ksqlDB and Apache Flink, 🚀 powered by ClickHouse.

C++ 1,503 66 Updated Sep 24, 2024

Python wrapper for the arXiv API

Python 1,078 120 Updated Jun 26, 2024

The official repository of "ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory".

Python 523 46 Updated Jun 19, 2023

Distributed DataFrame for Python designed for the cloud, powered by Rust

Rust 2,096 142 Updated Sep 25, 2024

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Python 5,967 518 Updated Sep 6, 2024

KubeGene - A turn-key Genome Sequencing workflow management framework

Go 193 53 Updated Nov 3, 2020

A high-performance, zero-overhead, extensible Python compiler using LLVM

C++ 14,963 510 Updated Sep 12, 2024

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

Python 651 129 Updated Dec 11, 2022

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 36,366 5,702 Updated Aug 19, 2024

Ankh: Optimized Protein Language Model

Python 206 19 Updated Dec 26, 2023

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 10,108 1,165 Updated Sep 1, 2024

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

Python 574 156 Updated Mar 9, 2024

Implementation of Neural Distance Embeddings for Biological Sequences (NeuroSEED) in PyTorch (NeurIPS 2021)

Python 70 17 Updated Oct 14, 2023

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Python 19,043 2,635 Updated Sep 25, 2024

numsca is NumPy for Scala

Jupyter Notebook 184 18 Updated Jul 14, 2024

S3 Filesystem

Python 869 271 Updated Sep 4, 2024

Official code repository for GATK versions 4 and up

Java 1,682 588 Updated Sep 25, 2024

Orchestrate Spark Jobs from Kubeflow Pipelines and poll for the status.

Python 50 12 Updated May 26, 2022

An open-source toolkit for large-scale genomic analysis

Scala 265 111 Updated Sep 22, 2024

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.

Scala 997 309 Updated Aug 26, 2024