sinamps/acp-extraction

acp-extraction

NGAC.py

Extracts Association, Prohibition and Assignment relations in NGAC format from a given text document.

The text is first run through AllenNLP's coreference resolution model (demo found here - https://demo.allennlp.org/coreference-resolution).
Each sentence is then run through the SRL (semantic role labeling) model (demo found here - https://demo.allennlp.org/semantic-role-labeling). Sentences with "B-ARG0" and "B-ARG1" labels are used to create Association relations. Sentences with a "B-ARGM-NEG" label are used to identify Prohibition relations. Sentences with verbs such as ['contains', 'includes', 'include', 'contain', 'is'] are used to create Assignment relations.
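The labeling rules above can be sketched in plain Python. This is a minimal illustration, not the code in NGAC.py: it assumes each SRL frame arrives as a list of words, a parallel list of BIO tags, and the frame's verb (the shape the AllenNLP SRL demo displays). The helper names `collect_span` and `classify_frame`, and the order in which the rules are checked, are assumptions made for the example.

```python
# Verbs that signal an Assignment relation (list taken from the README).
ASSIGNMENT_VERBS = {"contains", "includes", "include", "contain", "is"}


def collect_span(words, tags, role):
    """Join the words tagged B-<role> or I-<role> into one phrase."""
    wanted = (f"B-{role}", f"I-{role}")
    return " ".join(w for w, t in zip(words, tags) if t in wanted)


def classify_frame(words, tags, verb):
    """Return (kind, arg0, verb, arg1) for one SRL frame, or None.

    Hypothetical rule order: assignment verbs first, then negation,
    then plain association. NGAC.py may order these differently.
    """
    arg0 = collect_span(words, tags, "ARG0")
    arg1 = collect_span(words, tags, "ARG1")
    if verb.lower() in ASSIGNMENT_VERBS:
        return ("assignment", arg0, verb, arg1)
    if not (arg0 and arg1):
        # Association and Prohibition both need an ARG0 and an ARG1.
        return None
    kind = "prohibition" if "B-ARGM-NEG" in tags else "association"
    return (kind, arg0, verb, arg1)


# Hand-written tags for illustration, not real model output.
words = ["Nurses", "can", "not", "edit", "patient", "records"]
tags = ["B-ARG0", "B-ARGM-MOD", "B-ARGM-NEG", "B-V", "B-ARG1", "I-ARG1"]
print(classify_frame(words, tags, "edit"))
```

Keeping the classification separate from the model call makes the rules easy to check against hand-tagged sentences before running the full pipeline.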

To run - python NGAC.py <DOCUMENT.docx>

Requirements

pip install allennlp==2.1.0 allennlp-models==2.1.0

fastModel.py

Implements the fastText model. A tutorial can be found here - https://fasttext.cc/docs/en/supervised-tutorial.html.
The data used in the model is a compilation of 4 ACP-labeled datasets found here - https://sites.google.com/site/accesscontrolruleextraction/labelled-data-sets
The dataset is preprocessed into the format fastText expects, then split so that 70% is used for training and 30% (ACP.valid) for validation.
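As a rough illustration of the preprocessing step, the sketch below writes each sentence in fastText's supervised format (a `__label__` prefix followed by the text) and performs a 70/30 split. The label names, the helper names, and the fixed shuffle seed are assumptions for the example and are not taken from fastModel.py.

```python
import random


def to_fasttext_line(label, sentence):
    """Format one example as '__label__<label> <text>' on a single line."""
    text = " ".join(sentence.split())  # collapse newlines and repeated spaces
    return f"__label__{label} {text}"


def split_lines(lines, train_percent=70, seed=0):
    """Shuffle and split the lines into (train, valid) lists."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Hypothetical examples and labels, for illustration only.
examples = [("ACP", f"A nurse can read record {i}") for i in range(5)]
examples += [("NON_ACP", f"The system was released in version {i}") for i in range(5)]

lines = [to_fasttext_line(label, text) for label, text in examples]
train, valid = split_lines(lines)
print(len(train), len(valid))
```

Each list can then be written to its own file, one example per line, which is the input layout the fastText supervised tutorial describes.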
Results -> Precision - 98% ; Recall - 98%
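For context on how such figures are typically computed: fastText's test routine reports precision and recall at k = 1 by default. The sketch below reimplements that metric in plain Python to show the arithmetic. It is an illustration under the assumption that each example carries exactly one gold label, in which case the two numbers coincide; whether that holds for the datasets used here is not stated in this README.

```python
def precision_recall_at_1(gold_labels, predicted):
    """Compute (precision@1, recall@1).

    gold_labels: list of sets of true labels, one set per example.
    predicted:   list with the single top predicted label per example.
    """
    hits = sum(1 for gold, pred in zip(gold_labels, predicted) if pred in gold)
    n_predictions = len(predicted)                      # one prediction per example
    n_gold = sum(len(gold) for gold in gold_labels)     # total true labels
    return hits / n_predictions, hits / n_gold


# Hypothetical labels, for illustration only.
gold = [{"ACP"}, {"NON_ACP"}, {"ACP"}, {"ACP"}]
pred = ["ACP", "ACP", "ACP", "NON_ACP"]
precision, recall = precision_recall_at_1(gold, pred)
print(precision, recall)
```

With one gold label per example the two denominators are equal, so precision and recall at 1 both reduce to plain accuracy.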

To run - python fastModel.py

Requirements

wget https://github.com/facebookresearch/fastText/archive/v0.9.2.zip
unzip v0.9.2.zip
cd fastText-0.9.2

For the command-line tool:

make

For the Python bindings:

pip install .
