History Pack 1 Black Label Data Entry #118

Open
luceleaftea opened this issue Sep 12, 2022 · 17 comments
Labels
data entry (Involves data entry into CSVs), hacktoberfest (Help out while participating in Hacktoberfest!), help wanted (Extra attention is needed)

Comments

@luceleaftea
Contributor

This issue will serve as the staging ground for the History Pack 1 Black Label data entry effort.

Please leave a comment here if you have interest in assisting with transcribing data, and which languages you are interested in transcribing for. When I have the repo ready for this effort, I will update this comment with more instructions and contact those who are interested!

@luceleaftea added the help wanted and data entry labels Sep 12, 2022
@luceleaftea added this to the History Pack 1 Data Entry milestone Sep 12, 2022
@manwaring
Contributor

manwaring commented Sep 13, 2022

Here's the breakdown of work I've been thinking of - is this how you're thinking about it, too?

  • Confirm card.csv fields that are language-specific or specific to the language packs
    • Identifiers
    • Set Identifiers
    • Name
    • Type
    • Card Keywords
    • Abilities and Effects
    • Ability and Effect Keywords
    • Granted Keywords
    • Functional Text
    • Flavor Text
    • Type Text
    • Variations
    • Image URLs
  • Confirm other files that need new fields added
    • Editions (add editions)
    • Keywords (translated keywords - new language-specific keywords.csv file?)
    • Sets (add translated sets)
    • Types (add translated types - new language-specific types.csv file?)
  • Use Nano ID (or similar) to create identifier for each existing card, saving to card.csv
  • Create new card.csv and card.ods per additional language, e.g. french-card.csv (in a different folder?)
  • Update image downloading tool to also download from all language-specific csvs
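
The identifier step in the list above could be sketched roughly as follows. This is only an illustration, not the repo's actual script: the `Unique ID` column name is hypothetical, the input is assumed to be comma-delimited, and the IDs imitate Nano ID's default shape (21 characters from a 64-character URL-safe alphabet) using Python's standard `secrets` module instead of the `nanoid` package.

```python
import csv
import io
import secrets
import string

# Nano ID's default alphabet: the 64 URL-safe characters A-Z, a-z, 0-9, "_" and "-".
ALPHABET = string.ascii_letters + string.digits + "_-"


def new_id(size: int = 21) -> str:
    """Return a random Nano ID-style identifier (stdlib stand-in for nanoid)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(size))


def add_ids(rows, id_field="Unique ID"):
    """Give every row that lacks an identifier a new one, never reusing an ID."""
    seen = {row[id_field] for row in rows if row.get(id_field)}
    for row in rows:
        if not row.get(id_field):
            candidate = new_id()
            while candidate in seen:  # collisions are extremely unlikely, but cheap to rule out
                candidate = new_id()
            row[id_field] = candidate
            seen.add(candidate)
    return rows


def add_ids_to_csv(text: str, id_field: str = "Unique ID") -> str:
    """Read CSV text, fill in the identifier column where missing, return new CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    fieldnames = list(reader.fieldnames or [])
    if id_field not in fieldnames:
        fieldnames.insert(0, id_field)  # hypothetical choice: IDs go in the first column
    add_ids(rows, id_field)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Because rows that already have an ID are left alone, running this again on its own output changes nothing, so IDs stay stable once the language-specific CSVs start referring to them.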

@luceleaftea
Contributor Author

I think you covered it all, yup!

@manwaring
Contributor

It's a lot! 😅

@luceleaftea
Contributor Author

It is 😅 Now that I think about it, I will also need to update the SQL server script. So it will take a few days, but I should hopefully have the repo ready soon! That will give us time to recruit volunteers.

@luceleaftea
Contributor Author

Started setting stuff up over on this branch.

I think the remaining things to do are adding .ods and .csv files for the files you have listed, and finishing the script updates. I'll work more on the scripts either tonight or tomorrow, but if you have time I would appreciate help setting up the rest of the files!

@luceleaftea added the hacktoberfest label Oct 1, 2022
@kirkbushell

kirkbushell commented Oct 3, 2022

I'm curious whether this is something you could partly automate using OCR, telling it the language? All the text sits within certain coordinates, and if the image sizes are identical to the rest (as most are), this should be relatively straightforward to transcribe automatically, with some gaps?

I mention this because before I came across this repo, I was working on exactly that as the LSS data is just too erroneous (as we all know - lol)

@luceleaftea
Contributor Author

I have not messed around with OCR enough to want to dedicate time to making an OCR script for the repo right now - for now I will leave the data to be hand entered and double checked, but if you have an OCR script you'd like to use or add to the repo to input the data, I have no issues with that being an available resource for people!

@kirkbushell

Yeah that's fair. It's pretty easy to scan images using something like tesseract, but I hear you.

The best it could do would be the initial population of data; based on text coordinates, it would know whether a region is a title, copyright info, card text, etc.

It would still need manual input for things like card stats, such as pitch and attack (although this could certainly be done using machine learning tools, that would start to cost money, so...).
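
The coordinate idea above could look something like this. It is a rough sketch under stated assumptions, not a working pipeline: the fractional region boxes are made-up placeholders that would need measuring on real card images, and the OCR step relies on Pillow and `pytesseract` (plus the Tesseract language data, e.g. `fra` for French) being installed separately.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # left, upper, right, lower in pixels

# Hypothetical layout: each region as fractions (left, top, right, bottom) of the
# card image. These numbers are placeholders, not measurements from real cards.
REGIONS: Dict[str, Tuple[float, float, float, float]] = {
    "name": (0.20, 0.04, 0.80, 0.10),
    "text": (0.10, 0.62, 0.90, 0.88),
    "footer": (0.10, 0.94, 0.90, 0.98),
}


def pixel_boxes(width: int, height: int) -> Dict[str, Box]:
    """Scale the fractional regions to pixel boxes for an image of the given size."""
    return {
        field: (round(l * width), round(t * height), round(r * width), round(b * height))
        for field, (l, t, r, b) in REGIONS.items()
    }


def ocr_card(path: str, lang: str = "fra") -> Dict[str, str]:
    """Crop each region and OCR it using the given Tesseract language code.

    The imports are local so the geometry above works without the OCR dependencies.
    """
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        boxes = pixel_boxes(*image.size)  # image.size is (width, height)
        return {
            field: pytesseract.image_to_string(image.crop(box), lang=lang).strip()
            for field, box in boxes.items()
        }
```

Keeping the regions as fractions rather than pixels means the same table could apply to scans of different resolutions, as long as the card layout is the same. The output would still need a human pass before going into the CSVs.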

@luceleaftea
Contributor Author

Honestly the text is the hardest part for me personally (at least if you're counting the number of text bugfixes in the repo history...), so something to get the initial text in would be pretty rad in the long term.

@kirkbushell

It's something I began working on for FaB DB. I'll see if it'll work if we set the language and share :)

@mstraa

mstraa commented Oct 4, 2022

Hi! I can look at the OCR script, but as I see it, the only things that need OCR are Name and "Inner text". All the stats can be looked up from the card's original reference (EN: 1HP204: https://storage.googleapis.com/fabmaster/media/images/1HP204.width-450.png - FR: https://storage.googleapis.com/fabmaster/cardfaces/2022-1HP/FR/FR_1HP204.png).

I'll try to do something when I have an evening to spare! :D
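
A minimal sketch of that lookup, with loud assumptions: the two URL shapes are copied from the single 1HP204 example above and may not hold for other cards, sets, or languages, and the `Identifier`, `Image URL`, and `Pitch` column names are hypothetical (only `Name` and `Functional Text` appear in the field list earlier in this thread).

```python
from typing import Dict

# URL shapes taken from the one 1HP204 example; that every card and language
# follows them is an assumption. The "2022-1HP" segment is specific to this set.
EN_URL = "https://storage.googleapis.com/fabmaster/media/images/{card_id}.width-450.png"
LOCALIZED_URL = (
    "https://storage.googleapis.com/fabmaster/cardfaces/2022-1HP/{lang}/{lang}_{card_id}.png"
)


def image_url(card_id: str, lang: str = "EN") -> str:
    """Build the card image URL for a language code such as "EN" or "FR"."""
    lang = lang.upper()
    if lang == "EN":
        return EN_URL.format(card_id=card_id)
    return LOCALIZED_URL.format(lang=lang, card_id=card_id)


def localized_row(english_row: Dict[str, str], name: str, text: str, lang: str) -> Dict[str, str]:
    """Copy the language-independent stats from the English row, then swap in
    the translated name, translated text, and localized image URL."""
    row = dict(english_row)
    row["Name"] = name
    row["Functional Text"] = text
    row["Image URL"] = image_url(english_row["Identifier"], lang)
    return row
```

With this shape, only the name and text have to come from OCR or hand entry; everything else in the translated row is inherited from the English data.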

@kirkbushell

> Hi! I can look at the OCR script, but as I see it, the only things that need OCR are Name and "Inner text". All the stats can be looked up from the card's original reference (EN: 1HP204: https://storage.googleapis.com/fabmaster/media/images/1HP204.width-450.png - FR: https://storage.googleapis.com/fabmaster/cardfaces/2022-1HP/FR/FR_1HP204.png).
>
> I'll try to do something when I have an evening to spare! :D

That's a really good point about the card stats already being in place! Ez mode then! haha

The biggest challenge will be the icons in the card text.

@CarlosGGFAB

Hello, my name is Carlos Gutiérrez and I would love to help with the Spanish translations.

@Just-a-Human96

Hi, my name is Tim. If I can help with the translations in any way, let me know.
Since I am German, that's probably where I could help the most, but if I can help in any other way I would be happy to do so.

@Mofte
Contributor

Mofte commented Feb 10, 2023

Hey there!
Not sure if needed, but I could occasionally help to input some German cards. Especially with the upcoming HP2 and Outsiders cards, there should be plenty of work. ^^
Timo

@gre99ory

gre99ory commented Mar 5, 2023

Hello everyone,
My name is Gregory, I'm French and willing to help with French if needed.
Just let me know!

@luceleaftea
Contributor Author

Hey there everyone, thank you so much for all of the offers! I'm still recovering from Outsiders spoiler season, but after I get some rest I'll finish up getting the repo ready for you all to help out 😄


8 participants