Update processing-pipelines.md to mention method for doc metadata #7480

Merged: 8 commits, Apr 19, 2021

langdonholmes
Contributor

Description

Under "things to try," inform users they can save metadata when using nlp.pipe(foobar, as_tuples=True)

Link to a new example on the attributes page detailing the following:

data = [
  ("Some text to process", {"meta": "foo"}),
  ("And more text...", {"meta": "bar"})
]

for doc, context in nlp.pipe(data, as_tuples=True):
    # Let's assume you have a "meta" extension registered on the Doc
    doc._.meta = context["meta"]

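Adapted from one of Ines' comments on Stack Overflow: https://stackoverflow.com/questions/57058798/make-spacy-nlp-pipe-process-tuples-of-text-and-additional-information-to-add-as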
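
For reference, the snippet above leaves out the setup it depends on. A minimal self-contained sketch might look like the following. The blank English pipeline, the `default=None` and `force=True` arguments, and the `docs` list are assumptions added here for illustration and are not part of the original snippet.

```python
import spacy
from spacy.tokens import Doc

# Register the custom "meta" attribute on Doc before using it.
# force=True (an assumption for convenience) allows re-running this script
# in the same session without an "extension already exists" error.
Doc.set_extension("meta", default=None, force=True)

# A blank pipeline only tokenizes, so no trained model has to be downloaded.
nlp = spacy.blank("en")

data = [
    ("Some text to process", {"meta": "foo"}),
    ("And more text...", {"meta": "bar"}),
]

docs = []
# With as_tuples=True, nlp.pipe takes (text, context) pairs and
# yields (doc, context) pairs in the same order.
for doc, context in nlp.pipe(data, as_tuples=True):
    doc._.meta = context["meta"]
    docs.append(doc)

for doc in docs:
    print(doc.text, "->", doc._.meta)
```

Each context dict is passed through untouched, so any per-text metadata (IDs, labels, source file names) can be carried alongside the text and attached to the resulting `Doc`.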

Types of change

Update the docs.

Checklist

  • I have submitted the spaCy Contributor Agreement.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

Update the attributes section with example of how extensions can be used to store metadata.
svlandeg added the "docs" (Documentation and website) label on Mar 18, 2021
@adrianeboyd
Contributor

Hi, thanks for this contribution! I think it's a great idea to have an example for as_tuples in the docs, but it's a bit out-of-place in the proposed locations.

Could you add it instead as a short paragraph + executable example at the end of the "Processing text" section? The main focus of the example would be on using as_tuples rather than the custom extension itself, since those are introduced in a separate section, but you can demonstrate how as_tuples is useful by assigning the context info to the custom extension.

Made as_tuples example executable and relocated to the end of the "Processing Text" section.
@langdonholmes
Contributor Author

Excellent suggestions. I think I have implemented them and updated the pull request, but I am a bit new to Git so I'm not 100% sure I did that correctly. I am happy to make any additional changes or a new pull request as needed.

@adrianeboyd
Contributor

Thanks for the updates! I reformatted and rephrased a bit and I'll leave this open for a little while for further feedback from others...

@svlandeg
Member

Looks good to me! I'll go ahead and merge this and push the update to the docs :-)

svlandeg merged commit df541c6 into explosion:master on Apr 19, 2021
svlandeg pushed a commit that referenced this pull request on Apr 19, 2021:

* Update processing-pipelines.md

Under "things to try," inform users they can save metadata when using nlp.pipe(foobar, as_tuples=True)

Link to a new example on the attributes page detailing the following:

> ```
> data = [
>   ("Some text to process", {"meta": "foo"}),
>   ("And more text...", {"meta": "bar"})
> ]
> 
> for doc, context in nlp.pipe(data, as_tuples=True):
>     # Let's assume you have a "meta" extension registered on the Doc
>     doc._.meta = context["meta"]
> ```

from https://stackoverflow.com/questions/57058798/make-spacy-nlp-pipe-process-tuples-of-text-and-additional-information-to-add-as

* Updating the attributes section

Update the attributes section with example of how extensions can be used to store metadata.

* Update processing-pipelines.md

* Update processing-pipelines.md

Made as_tuples example executable and relocated to the end of the "Processing Text" section.

* Update processing-pipelines.md

* Update processing-pipelines.md

Removed extra line

* Reformat and rephrase

Co-authored-by: Adriane Boyd