
An open source desktop application for extracting chemical structures from scientific articles in PDF format.


iwwwish/OSRAChem


OSRAChem

OSRAChem is a desktop application that provides a semi-automated workflow for extracting chemical structures from images in full-text scientific articles. It relies on the open-source OSRA utility (many thanks to Igor Filippov). The extracted structures are displayed as 2D depictions. The application also accepts image and text inputs. Any image file format supported by GraphicsMagick is valid.

This work was part of an internship project under the supervision of Dr. Christoph Steinbeck at the European Bioinformatics Institute.


Packages/Dependencies:
  1. OSRA: Optical Structure Recognition Application
  2. CDK: Chemistry Development Kit
  3. OPSIN: Open Parser for Systematic IUPAC Nomenclature
  4. Apache PDFBox: a Java PDF library
  5. JPedal: an open-source library with a fully featured PDF viewer

Requirements:
  1. Operating system: Mac OS X or Linux (Ubuntu 12.04 or later)
  2. Java 6 or later

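The requirements above can be checked quickly from a terminal. The following is a minimal sketch, not part of OSRAChem: the function name os_supported is hypothetical, and the check only looks at the kernel name reported by uname, so it does not verify the Ubuntu release or the exact Java version.

```shell
#!/bin/sh
# Hypothetical helper: succeeds for the kernel names that correspond to
# Mac OS X (Darwin) and Linux, the two platforms listed in the requirements.
os_supported() {
    case "$1" in
        Darwin|Linux) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the current platform is one of the listed ones.
if os_supported "$(uname -s)"; then
    echo "OS: listed as supported"
else
    echo "OS: not listed as supported"
fi

# Report the installed Java version, if any (java -version writes to stderr).
if command -v java >/dev/null 2>&1; then
    java -version 2>&1 | head -n 1
else
    echo "java not found on PATH"
fi
```

Compare the reported Java version against the "Java 6 or later" requirement by eye; the script does not parse it.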
OSRA installation steps (Mac OS):
  1. Install Homebrew (if not previously installed): see instructions
  2. Tap the cheminformatics repository (thanks to Matt Swain): brew tap mcs07/cheminformatics
  3. Install OSRA: brew install osra
  4. Type osra in a terminal. If the command is found and prints its output, installation is complete.
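The final verification step can also be scripted. This is a sketch under stated assumptions: the helper names have_cmd and check_osra are hypothetical, and the script only tests whether an osra executable is on the PATH. If it is missing, it prints the two Homebrew commands from the steps above rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Confirm that osra is installed, or remind the user of the install commands.
check_osra() {
    if have_cmd osra; then
        echo "osra found at: $(command -v osra)"
    else
        echo "osra not found; on Mac OS, run:"
        echo "  brew tap mcs07/cheminformatics"
        echo "  brew install osra"
    fi
}

check_osra
```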


OSRA installation steps (Linux):

On Linux, OSRA must be compiled from source. Find detailed instructions here.
