PatricioDieck/CheSS

 
 

Chemical Semantic Search (CheSS)

From: Prompt Engineering for Transformer-Based Chemical Similarity Search Identifies Structurally Distinct Functional Analogues (Under review)

Authors: Clayton W. Kosonocky, Aaron L. Feller, Claus O. Wilke, Andrew D. Ellington

Link to arXiv: https://arxiv.org/abs/2305.16330

This repository contains the scripts described in the paper: the Chemical Semantic Search (CheSS) itself, the alternative-canonicalization prompt engineering method, and the analyses of the search results.
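To give a feel for what an embedding-based similarity search does, here is a minimal sketch. It is not the code in this repository: `toy_embed`, `cosine`, and `search` are hypothetical names, and the character-bigram "embedding" is only a stand-in for a transformer embedding so that the example runs with the standard library alone. The `top_k` default of 1,000 is likewise an illustrative choice.

```python
import math
from collections import Counter


def toy_embed(smiles: str, n: int = 2) -> Counter:
    """Hypothetical stand-in for a transformer embedding: character n-gram counts."""
    return Counter(smiles[i:i + n] for i in range(len(smiles) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(value * b[key] for key, value in a.items())  # missing keys count as 0
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, database: list[str], top_k: int = 1000) -> list[tuple[float, str]]:
    """Rank every database string by similarity to the query and keep the top_k."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(s)), s) for s in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Illustrative SMILES strings only; not taken from the PubChem database file.
    for score, smiles in search("CCO", ["CCO", "c1ccccc1", "CCN", "OCC"], top_k=3):
        print(f"{score:.3f}  {smiles}")
```

Note that `CCO` and `OCC` are two valid SMILES strings for the same molecule (ethanol), yet a string-based embedding will generally score them differently. In this sketch that is the intuition for why the way a query is written can change the search results.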

Due to GitHub file size limits:

  • The database file "pubchem-9999810-512-token-compatible.txt" is not available
  • The search results only include the top 1,000 results, rather than the entire database

If you wish to obtain copies of these files, please contact [email protected]. We may host these files on a separate storage platform, in which case we will post a link here.

Citation

@misc{kosonocky2023prompt,
      title={Prompt Engineering for Transformer-based Chemical Similarity Search Identifies Structurally Distinct Functional Analogues}, 
      author={Clayton W. Kosonocky and Aaron L. Feller and Claus O. Wilke and Andrew D. Ellington},
      year={2023},
      eprint={2305.16330},
      archivePrefix={arXiv},
      primaryClass={physics.chem-ph}
}

Languages

  • Jupyter Notebook 91.6%
  • Python 8.4%