
Pinned

  1. Intro_to_ML_Safety Public

    60 20

  2. trojan-dc-2023 Public

    JavaScript 1

Repositories

Showing 10 of 18 repositories
  • cerberus-cluster Public

    HPC cluster code and configurations for running on OCI

    Python 4 UPL-1.0 0 73 2 Updated Sep 20, 2024
  • forecasting Public

    Forecasting.

    TypeScript 18 6 1 0 Updated Sep 11, 2024
  • cluster-docs
    CSS 0 MIT 2 4 1 Updated Sep 9, 2024
  • safetywashing Public

    Measuring correlations between safety benchmarks and general AI capabilities benchmarks.

    Python 0 MIT 0 0 0 Updated Aug 30, 2024
  • AISES Public
    CSS 0 1 0 0 Updated Aug 20, 2024
  • HarmBench Public

    HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

    Jupyter Notebook 276 MIT 47 19 4 Updated Aug 16, 2024
  • course.mlsafety.org
    HTML 3 MIT 0 0 0 Updated Jul 7, 2024
  • tdc2023-starter-kit Public

    This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.

    Python 77 MIT 26 0 0 Updated May 19, 2024
  • wmdp Public

    WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning method that reduces LLM performance on WMDP while retaining general capabilities.

    Jupyter Notebook 72 MIT 21 5 1 Updated Apr 27, 2024
  • safety_challenge
    HTML 0 MIT 0 0 0 Updated Mar 28, 2024

