Add datafusion examples to docs #519

Merged: 6 commits into delta-io:main on Dec 15, 2021

matthewmturner
Contributor

Description

The description of the main changes of your pull request

Related Issue(s)

closes #515

Documentation

@matthewmturner
Contributor Author

Started putting together the docs, based on the test in delta_datafusion.rs. I haven't been able to get querying delta tables to work locally yet, but I wanted to publicize the work in case there are any comments.

@matthewmturner
Contributor Author

@houqp can delta tables be queried with the datafusion Python bindings? I can add something for that as well.

@matthewmturner
Contributor Author

And this might be a silly question, but why does the datafusion feature have the ext suffix when s3 and azure don't?

@matthewmturner
Contributor Author

I looked into this on the Python side, and it seems that the ExecutionContext is missing the register_table method that is used in Rust to query DeltaTables. I'll create an issue there to add that functionality, as it would be nice to query from Python.

Now I'm going to try querying an S3 DeltaTable in Rust and adding docs for that; then I think this should be good.

@houqp
Member

houqp commented Dec 10, 2021

> @houqp can delta tables be queried with the datafusion Python bindings? I can add something for that as well.

Probably not yet because I don't think we support taking a list of file paths yet in the current table provider implementation :(

> And this might be a silly question, but why does the datafusion feature have the ext suffix when s3 and azure don't?

Because the datafusion feature conflicts with the datafusion crate name. I learned later, though, that we could use optional dependencies as features directly. I haven't tested it myself, but if this works, it would be better to use datafusion as the feature name going forward in our docs.
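
For readers unfamiliar with how a feature name is consumed, here is a minimal sketch of Cargo feature gating. It assumes the feature is called `datafusion-ext`, as in this thread. The helper names `datafusion_enabled` and `query_backend` are hypothetical and are not part of delta-rs. If the feature were renamed to `datafusion`, only the string inside the `cfg` checks would change.

```rust
/// Hypothetical helper: reports whether this build was compiled with the
/// `datafusion-ext` Cargo feature turned on.
fn datafusion_enabled() -> bool {
    // `cfg!` expands to a compile-time boolean for the given feature name.
    cfg!(feature = "datafusion-ext")
}

/// Compiled only when the feature is enabled. In a real crate, this is where
/// code that needs the optional `datafusion` dependency would live.
#[cfg(feature = "datafusion-ext")]
fn query_backend() -> &'static str {
    "datafusion"
}

/// Fallback compiled when the feature is off.
#[cfg(not(feature = "datafusion-ext"))]
fn query_backend() -> &'static str {
    "none"
}

fn main() {
    println!("datafusion-ext enabled: {}", datafusion_enabled());
    println!("query backend: {}", query_backend());
}
```

Built without the feature, the fallback branch is the one that gets compiled. Selecting the other branch would require declaring the feature in `Cargo.toml` and building with `--features datafusion-ext`.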

@matthewmturner
Contributor Author

matthewmturner commented Dec 15, 2021

> Probably not yet because I don't think we support taking a list of file paths yet in the current table provider implementation :(

@houqp can you expand on that? I thought that, since you can query with datafusion in Rust, it was just a matter of exposing that via Python with something like a register_table method on the context.

matthewmturner marked this pull request as ready for review on December 15, 2021, 17:27
@matthewmturner
Contributor Author

@houqp ready for review

houqp merged commit 624b172 into delta-io:main on Dec 15, 2021
@houqp
Member

houqp commented Dec 15, 2021

Thank you @matthewmturner !

Development

Successfully merging this pull request may close these issues.

Add documentation for using Delta Table with datafusion
2 participants