debugger87/datafusion-rs

DataFusion: Modern Distributed Compute Platform implemented in Rust

DataFusion is a modern distributed compute platform implemented in Rust. It is heavily inspired by Apache Spark and offers a similar programming style based on DataFrames and SQL.

DataFusion can also be used as a crate dependency in your own project to run SQL queries and DataFrame-style data manipulation in-process against your own data sources. In that respect, DataFusion is inspired by Apache Calcite in the Java world.

Project Home Page

The project home page is now at https://datafusion.rs and contains the roadmap as well as documentation for using this crate or running DataFusion as a distributed cluster. I am using GitHub issues to track development tasks and feedback.

Prerequisites

  • Rust nightly
  • Thrift (required by the parquet-rs crate) - instructions here

Building DataFusion

See BUILDING.md.

Gitter

There is a Gitter channel where you can ask questions about the project or suggest features.

Contributing

Contributors are welcome! Please see CONTRIBUTING.md for details.
