Insights: pola-rs/polars
September 13, 2024 – September 20, 2024
Overview
30 Pull requests merged by 10 people
-
docs: Refactor `docs` directory hierarchy
#18773 merged
Sep 20, 2024 -
feat: Relaxed schema alignment for parquet file list read
#18803 merged
Sep 20, 2024 -
fix: Proper dtype casting for struct embedded categoricals in chunked categoricals
#18815 merged
Sep 20, 2024 -
refactor(rust): Fix new-streaming `test_parquet::test_complex_types`
#18829 merged
Sep 20, 2024 -
refactor(python): Make `NodeTraverser` struct public
#18822 merged
Sep 20, 2024 -
refactor(rust): Fix zero-length len
#18817 merged
Sep 19, 2024 -
refactor: Add panic to unchecked DataFrame constructors in debug mode
#18807 merged
Sep 18, 2024 -
fix: Fixed some error/assertion types
#18811 merged
Sep 18, 2024 -
refactor(rust): Add missing implicit datetime alias in ExprIR
#18809 merged
Sep 18, 2024 -
fix: Remove panic in `arr.to_struct`
#18804 merged
Sep 18, 2024 -
refactor(rust): Fix topological sort in new streaming engine
#18806 merged
Sep 18, 2024 -
refactor(rust): Fix new-streaming parquet `test_row_index_projection_pushdown_18463`
#18805 merged
Sep 18, 2024 -
docs: Minor improvements to contributing guide
#18777 merged
Sep 18, 2024 -
fix: Allow empty sort by columns
#18774 merged
Sep 18, 2024 -
refactor(rust): Remove short-lived / non-CPU bound task spawns on async executor in new-streaming
#18764 merged
Sep 18, 2024 -
fix: Broadcast zip_with for structs
#18770 merged
Sep 18, 2024 -
docs(python): Improve `over` docs, add example with `order_by`
#18796 merged
Sep 18, 2024 -
refactor: Fix parquet file metadata is dropped after first DSL->IR conversion
#18789 merged
Sep 17, 2024 -
refactor: Remove extra hashmap construction in new-streaming parquet
#18792 merged
Sep 17, 2024 -
refactor(rust): Fix new-streaming parquet on empty parquet
#18763 merged
Sep 17, 2024 -
fix: Dropped/shifted rows in parquet scan with `streaming=True`
#18766 merged
Sep 17, 2024 -
chore(python): Remove TODO comment regarding NumPy pinning
#18776 merged
Sep 17, 2024 -
fix: Fix `cum_max` using exception text of `cum_min` for invalid dtype
#18780 merged
Sep 17, 2024 -
refactor(rust): Ensure fallback node gets correct length df even if no columns selected
#18772 merged
Sep 16, 2024 -
refactor(rust): Fix input independence tests in new-streaming engine
#18771 merged
Sep 16, 2024 -
docs(python): Add documentation for beta gpu support
#18762 merged
Sep 16, 2024 -
fix: Fix accidental raise on shape 1
#18748 merged
Sep 15, 2024 -
feat: Always preserve sorted flag for .dt.date
#18692 merged
Sep 15, 2024 -
refactor(python): Remove unused methods
#18744 merged
Sep 15, 2024 -
refactor: Make DataFrame a Vec of `Column` instead of `Series`
#18664 merged
Sep 14, 2024
11 Pull requests opened by 10 people
-
refactor(rust): Replace `DynArgs` with an enum containing all its variants
#18746 opened
Sep 14, 2024 -
docs(python): Fix literal type mapping example in `lit` docstrings
#18756 opened
Sep 15, 2024 -
docs(python): Clarify documentation for `schema` in `read_csv` function
#18759 opened
Sep 15, 2024 -
fix: Improve histogram bin logic
#18761 opened
Sep 16, 2024 -
refactor: Keep scalar in more places
#18775 opened
Sep 16, 2024 -
fix: Struct filter by index
#18778 opened
Sep 16, 2024 -
fix(rust): Handle AnyValue::Struct to prevent null returns
#18801 opened
Sep 18, 2024 -
fix: Throw error for comparison of unequal length series
#18816 opened
Sep 18, 2024 -
fix(rust): Return empty DF when input is empty json list
#18827 opened
Sep 19, 2024 -
refactor(python): Re-export PyO3 in `polars-python` crate
#18835 opened
Sep 20, 2024
25 Issues closed by 12 people
-
Panicking with "Unreachable code" when passing a DataFrame into `DataFrame.drop`
#18837 closed
Sep 21, 2024 -
Cannot print DataFrame with dtype `List(Struct(Categorical))`
#12122 closed
Sep 20, 2024 -
Error when chaining a LazyFrame `.with_row_index()` with `.sink_csv()`
#18455 closed
Sep 20, 2024 -
Polars scan_parquet with wildcard fails where schema column index positions dont align
#18568 closed
Sep 20, 2024 -
Performance regression (particularly in q21 of the TPC-H benchmark, +60%) after specific commit
#18828 closed
Sep 20, 2024 -
`read_csv` raises ComputeError when filename contains "["
#18826 closed
Sep 19, 2024 -
`pl.exclude` used as `join` key raises PanicException
#17497 closed
Sep 18, 2024 -
`.unique_counts` not implemented for struct data types
#7915 closed
Sep 18, 2024 -
`df.sort()` PanicException when called with "bad" inputs
#15820 closed
Sep 18, 2024 -
`.by_name()` selector `PanicException`: no "columns" expected at this point
#13986 closed
Sep 18, 2024 -
`.list.gather` PanicException unreachable code when invalid argument given
#17270 closed
Sep 18, 2024 -
`.map_elements` with `return_dtype=pl.List(pl.Struct)` leads to `sink_parquet` PanicException
#17181 closed
Sep 18, 2024 -
`.arr.to_struct` PanicException for non-Array column
#18794 closed
Sep 18, 2024 -
Panic with "expected arrays of the same length"
#18673 closed
Sep 18, 2024 -
Option to use fixed point floats
#18741 closed
Sep 18, 2024 -
Python - Add support to the invert operator `~` to negate expressions and columns
#18793 closed
Sep 17, 2024 -
Panic: 'collect(streaming=True)' on 'scan_parquet' Fails for Hive-Partitioned Parquet Files in Azure Storage
#18779 closed
Sep 17, 2024 -
Parquet read with `streaming=True` shifts/drops rows
#18739 closed
Sep 17, 2024 -
`cum_max` error message for invalid datatype refers to `cum_min`
#18754 closed
Sep 17, 2024 -
Strange type validation on `when_then` expressions
#18767 closed
Sep 16, 2024 -
polars.testing assert_frame_equal raises AssertionError on identical dataframes
#18747 closed
Sep 15, 2024 -
Cannot add a Series to an empty DataFrame only if the Series length is 1
#18736 closed
Sep 15, 2024 -
Meet and Beat Pandas' Support for Nested DataFrames and Arrays
#18743 closed
Sep 14, 2024
47 Issues opened by 37 people
-
Polars produce wrong result in streaming mode
#18838 opened
Sep 21, 2024 -
Ability to `sink` lazy datasets to `STDOUT` or to files
#18834 opened
Sep 20, 2024 -
Filtering with pl.col is substantially (27x) slower than filtering with pl.Series
#18833 opened
Sep 20, 2024 -
`pl.Array` + `pl.lit` PanicException Cannot apply operation on arrays of different lengths
#18831 opened
Sep 20, 2024 -
bug: plotting breaks when `axis` is passed to `alt.X`
#18830 opened
Sep 20, 2024 -
write_csv ignores formatting when writing to io.StringIO()
#18825 opened
Sep 19, 2024 -
High memory usage when calculating variance?
#18824 opened
Sep 18, 2024 -
Row Group Based Subtotals with .group_by()
#18823 opened
Sep 18, 2024 -
Schema assumes the column order in the data when reading a CSV
#18821 opened
Sep 18, 2024 -
Join fails for scanned lazyframes when `streaming=True`
#18820 opened
Sep 18, 2024 -
write_parquet encoding no longer recognized by PBI Service parquet connector after Polars 1.5.0 onwards
#18819 opened
Sep 18, 2024 -
Install slack app to allow subscription
#18818 opened
Sep 18, 2024 -
Categorical revmaps are not merged when concatenated inside a struct
#18814 opened
Sep 18, 2024 -
`rows_by_key` works with pl.Array
#18813 opened
Sep 18, 2024 -
`read_csv` PanicException when `pl.Decimal` used in schema with invalid precision
#18812 opened
Sep 18, 2024 -
loading pickled `pl.Series` of `dtype=pl.Array(pl.Enum(...), ...)` fails
#18810 opened
Sep 18, 2024 -
pl.datetime does not respect leftmost-argument naming rule
#18808 opened
Sep 18, 2024 -
Reading from S3 compatible storage
#18802 opened
Sep 18, 2024 -
from_any_values_and_dtype converts AnyValue::Struct to null
#18800 opened
Sep 18, 2024 -
Issue when header dtype differ from the rest of the rows' dtype
#18799 opened
Sep 17, 2024 -
Parsing to `Decimal` ignores `decimal_comma` when reading from csv
#18798 opened
Sep 17, 2024 -
Parse large floats in CSVs that have periods as the thousands separator
#18797 opened
Sep 17, 2024 -
list.set_intersection operates on the wrong columns when multiple columns are selected with pl.col
#18795 opened
Sep 17, 2024 -
docs: to_datetime: document that %Y zero-pads years
#18791 opened
Sep 17, 2024 -
`reshape` in group_by context PanicException
#18788 opened
Sep 17, 2024 -
`.struct.field` + `.filter` PanicException instead of ColumnNotFoundError
#18787 opened
Sep 17, 2024 -
SQL query with WHERE clause that evaluates to true/false gives ShapeError on DataFrame with null columns
#18786 opened
Sep 17, 2024 -
Use of `sorting_columns` parquet metadata
#18785 opened
Sep 17, 2024 -
Support serializing to json in a format similar to pandas "split" orientation
#18784 opened
Sep 17, 2024 -
Explain that the trig returns and arguments are all in radians
#18783 opened
Sep 16, 2024 -
`write_excel(formulas={})` appears non-functional in latest version
#18782 opened
Sep 16, 2024 -
`str.json_decode` hangs on nested JSON structure with null values
#18781 opened
Sep 16, 2024 -
Add feature gates to `polars-stream`
#18769 opened
Sep 16, 2024 -
Parquet metadata is dropped on DSL->IR conversion cache
#18768 opened
Sep 16, 2024 -
read_json cannot parse a simple json array
#18760 opened
Sep 15, 2024 -
`AWS_ENDPOINT_URL` not inferred by cloud I/O
#18758 opened
Sep 15, 2024 -
`AWS_PROFILE` should be supported in cloud storage I/O config
#18757 opened
Sep 15, 2024 -
`Expr.shrink_dtype()` breaks broadcasting
#18755 opened
Sep 15, 2024 -
Lazy cross join + filter not optimized in non-equi join
#18753 opened
Sep 15, 2024 -
Ambiguous column names in join_where require post-join names
#18752 opened
Sep 15, 2024 -
Join_where doesn't support multiple binary comparisons in a single Expr
#18751 opened
Sep 15, 2024 -
Ability to append to an existing directory of parquet files with new partitions (mode=append)
#18750 opened
Sep 15, 2024 -
Add a non-equi joins and `join_where` to `joins` section of the user guide
#18749 opened
Sep 15, 2024 -
`join_where` ColumnNotFoundError if predicate only uses columns from one side
#18745 opened
Sep 14, 2024 -
multi comment_prefix support while parsing csv
#18742 opened
Sep 14, 2024
35 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
feat(rust, python): Support arithmetic between Series with dtype list
#17823 commented on
Sep 20, 2024 • 8 new comments -
docs(rust): Add Rust example for tolerance join using join_asof in user guide
#18696 commented on
Sep 14, 2024 • 2 new comments -
fix(python): Disallow all-null/empty Series creation with empty Enum
#14676 commented on
Sep 15, 2024 • 1 new comment -
Error reading ods file with read_ods
#14053 commented on
Sep 14, 2024 • 0 new comments -
StringCacheMismatchError when using joblib.Parallel and Categorical data
#18528 commented on
Sep 17, 2024 • 0 new comments -
Support of vertical fold and scan
#12165 commented on
Sep 17, 2024 • 0 new comments -
floor_div runtime error for i64, u32 and u64
#17238 commented on
Sep 18, 2024 • 0 new comments -
`hist` panics after creating zero bins
#18650 commented on
Sep 18, 2024 • 0 new comments -
Remove `json` support in `LazyFrame.serialize`
#18284 commented on
Sep 18, 2024 • 0 new comments -
`collect_async` is blocking
#18718 commented on
Sep 18, 2024 • 0 new comments -
Support reading directly from zipfile.Path objects.
#16758 commented on
Sep 18, 2024 • 0 new comments -
Saving parquet to Google Cloud Storage with `df.write_parquet()`
#14630 commented on
Sep 19, 2024 • 0 new comments -
Support for uint256/int256
#15443 commented on
Sep 19, 2024 • 0 new comments -
`LazyFrame::cross_join` + `concat_list` error
#18587 commented on
Sep 20, 2024 • 0 new comments -
Support lazy schema retrieval in IO Plugins
#18638 commented on
Sep 20, 2024 • 0 new comments -
Simple arithmetic operations on the "list" type columns
#8006 commented on
Sep 20, 2024 • 0 new comments -
feat: Quantile function in SQL
#18047 commented on
Sep 19, 2024 • 0 new comments -
perf: Collapse cross-joins to faster joins
#18633 commented on
Sep 18, 2024 • 0 new comments -
feature: .rolling_slope()
#8861 commented on
Sep 14, 2024 • 0 new comments -
passing str + list[Expr] to `agg()` causes PanicException
#18706 commented on
Sep 14, 2024 • 0 new comments -
Improve horizontal null detection (`df.drop_nulls`, `all_null`_, `any_null`?)
#12443 commented on
Sep 14, 2024 • 0 new comments -
`PanicException` occurs when applying a deserialized `rolling_quantile`
#18595 commented on
Sep 14, 2024 • 0 new comments -
Skip row_groups in parquet files using bloom_filters
#5332 commented on
Sep 14, 2024 • 0 new comments -
`pl.col('a').is_in(['val1', None])` does not return true for null cells in col
#18728 commented on
Sep 14, 2024 • 0 new comments -
Incorrect result when using `map_elements` with a list of list return type
#18703 commented on
Sep 15, 2024 • 0 new comments -
CSV parsing: ComputeError
#15854 commented on
Sep 15, 2024 • 0 new comments -
Hive partitioning tracking issue
#15441 commented on
Sep 16, 2024 • 0 new comments -
Parquet reader fails when file has less columns than reader_schema
#14980 commented on
Sep 16, 2024 • 0 new comments -
Group By no longer supported in Rust WASM
#17192 commented on
Sep 16, 2024 • 0 new comments -
`read_database*` methods are slow (Oracle database).
#18738 commented on
Sep 16, 2024 • 0 new comments -
Build Python polars wheels with PGO
#9702 commented on
Sep 16, 2024 • 0 new comments -
Allow arithmetic operations for list and array type
#9188 commented on
Sep 16, 2024 • 0 new comments -
rolling_min/max has quadratic worst case behavior
#12714 commented on
Sep 16, 2024 • 0 new comments -
should be f64 scalar: ComputeError(ErrString("could not extract number from any-value of dtype: 'Null'"))
#17971 commented on
Sep 16, 2024 • 0 new comments -
Use BigQuery Dataframes as Read-Connector to BigQuery
#17326 commented on
Sep 17, 2024 • 0 new comments