
Training Result #20

Open · YanLiang1102 opened this issue Aug 1, 2018 · 4 comments

Comments

YanLiang1102 (Collaborator) commented Aug 1, 2018

@khaledJabr @ahalterman
1. With pretrained pruned vectors and the spaCy-trained NER model, updated only with Prodigy-labeled data (~800 tokens); NER classes not merged yet:
image

2. With no pretrained model, everything else the same as case 1; NER classes not merged yet (so yes, the pretrained model does help):
image

3. Trained on only the LDC data with Prodigy, with the pretrained spaCy NER model; otherwise like case 1:
image

4. Prodigy data + 23 times the Prodigy-size rehearsal data; otherwise like case 3.
Since we have 18,670 OntoNotes tokens and 801 Prodigy-labeled tokens, we use 23 as the multiplier (18670 / 801 ≈ 23) so that all of the data gets used.
Of the 18,670 training samples, 4,122 with empty spans were removed.
image
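The rehearsal mix in case 4 can be sketched as follows. This is a minimal illustration, not the actual training script; `build_training_mix` and the example lists are hypothetical names, and the idea is simply to oversample the small Prodigy set so one pass sees it about as often as the OntoNotes rehearsal data:

```python
# Sketch of the case-4 rehearsal mix: oversample the small Prodigy set
# by an integer multiplier so its token count roughly matches OntoNotes.
onto_count = 18670    # OntoNotes (LDC) tokens available for rehearsal
prodigy_count = 801   # Prodigy-labeled tokens

multiplier = onto_count // prodigy_count  # 18670 // 801 = 23

def build_training_mix(prodigy_examples, onto_examples, multiplier):
    """Repeat the Prodigy examples `multiplier` times and append the
    rehearsal (OntoNotes) examples, so old classes are not forgotten."""
    return prodigy_examples * multiplier + onto_examples
```

Mixing in rehearsal data this way is the usual guard against catastrophic forgetting when updating a pretrained NER model on a small new dataset.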

5. With merged NER classes; other conditions like case 4:
image

6. With Khaled's cleaned data; other conditions like case 5:
image

YanLiang1102 (Collaborator, Author) commented Aug 1, 2018

Before merging the NER labels, this is the label distribution:

{'CARDINAL': 336,
 'CARDINAL" E_OFF="1': 2,
 'DATE': 1066,
 'DATE" E_OFF="1': 16,
 'DATE" E_OFF="5': 2,
 'DATE" S_OFF="1': 2,
 'EVENT': 224,
 'FAC': 244,
 'GPE': 1806,
 'GPE" S_OFF="1': 2,
 'LANGUAGE': 18,
 'LAW': 112,
 'LOC': 172,
 'MONEY': 106,
 'NORP': 2164,
 'ORDINAL': 598,
 'ORDINAL" E_OFF="1': 2,
 'ORDINAL" E_OFF="3': 2,
 'ORG': 3424,
 'ORG" E_OFF="1': 8,
 'ORG" S_OFF="1': 16,
 'PERCENT': 74,
 'PERSON': 3586,
 'PERSON" S_OFF="1': 30,
 'PRODUCT': 36,
 'PRODUCT" S_OFF="1': 2,
 'QUANTITY': 198,
 'QUANTITY" E_OFF="1': 2,
 'TIME': 170,
 'TIME" E_OFF="1': 2,
 'WORK_OF_ART': 124,
 'WORK_OF_ART" E_OFF="1': 2}
// Prodigy-labeled distribution
{'GPE': 439, 'ORG': 133, 'PERSON': 229}
// after updating the LDC tags we have:
{'GPE': 2224, 'MISC': 5260, 'ORG': 3448, 'PERSON': 3616}
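The merge step above can be sketched as follows. This is a hypothetical helper, not the project's actual code: it strips the leaked LDC offset attributes visible in the first distribution (e.g. `'ORG" E_OFF="1'`) back to their base tag, then collapses every label outside GPE/ORG/PERSON into MISC:

```python
from collections import Counter

KEEP = {"GPE", "ORG", "PERSON"}

def merge_label(label):
    """Strip leaked LDC offset attributes ('ORG" E_OFF="1' -> 'ORG')
    and collapse everything outside KEEP into MISC."""
    base = label.split('"')[0].strip()
    return base if base in KEEP else "MISC"

def merged_distribution(label_counts):
    """Re-count a {label: count} dict under the merged label scheme."""
    merged = Counter()
    for label, count in label_counts.items():
        merged[merge_label(label)] += count
    return dict(merged)

print(merged_distribution({"GPE": 1806, 'GPE" S_OFF="1': 2, "NORP": 2164}))
# {'GPE': 1808, 'MISC': 2164}
```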

ahalterman (Collaborator) commented
Good work! So it looks like around 70% is where we're going to be for now. Can you get per-class accuracy, too? We don't really care so much about MISC and it could be that that one is harder than the rest.
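Per-class accuracy can be computed by scoring predicted spans against gold spans separately for each label. A minimal sketch, independent of any particular library (spans here are hypothetical `(start, end, label)` tuples; exact-match scoring):

```python
from collections import defaultdict

def per_class_f1(gold_spans, pred_spans):
    """Exact-match precision/recall/F1 per entity label.
    Spans are (start, end, label) tuples over the eval set."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    gold, pred = set(gold_spans), set(pred_spans)
    for span in pred:
        if span in gold:
            tp[span[2]] += 1   # predicted span matches gold exactly
        else:
            fp[span[2]] += 1   # spurious prediction
    for span in gold - pred:
        fn[span[2]] += 1       # missed gold span
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        p = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        r = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"p": p, "r": r, "f": f}
    return scores
```

This makes it easy to check whether MISC alone is dragging down the overall number.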

YanLiang1102 (Collaborator, Author) commented

@ahalterman Training and eval without MISC give a similar result:
image

YanLiang1102 (Collaborator, Author) commented Aug 14, 2018

1. augmented_for_training_1 (MISC filtered out; trained on PERSON, ORG, and GPE), evaluated only on GPE:
Eval data set count:
image

Training result:
image

2. augmented_for_training_2 (MISC filtered out; trained on PERSON, ORG, and GPE), evaluated only on PERSON:
Eval data set count:
image

Training result:
image

3. augmented_for_training_3 (MISC filtered out; trained on PERSON, ORG, and GPE), evaluated only on ORG:
Eval data set count:
image

Training result:
image

This is the overall accuracy including all the classes:
image

@ahalterman Hey Andy, check these training results: all models were trained on data without MISC and evaluated on an individual tag class. The accuracy is fairly even across classes, and you can see how many records each was evaluated on in the first picture for each training.
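The single-class evaluations above amount to filtering the gold annotations down to one label before scoring. A sketch, assuming the spaCy v2-style `(text, {"entities": [(start, end, label), ...]})` training format; the helper name is hypothetical:

```python
def filter_to_label(annotated, label):
    """Keep only entities of `label` in (text, {"entities": [...]}) pairs,
    dropping examples left with no entities at all."""
    filtered = []
    for text, ann in annotated:
        ents = [e for e in ann["entities"] if e[2] == label]
        if ents:
            filtered.append((text, {"entities": ents}))
    return filtered
```

Running the same trained model against each filtered eval set then gives the per-class numbers shown in the screenshots.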
