[SPARK-48803][SQL] Throw internal error in Orc(De)serializer to align with ParquetWriteSupport #47208

Closed
yaooqinn wants to merge 1 commit into apache:master from yaooqinn:SPARK-48803

Conversation

@yaooqinn yaooqinn commented Jul 4, 2024

What changes were proposed in this pull request?

A kind of follow-up to #44275: this PR aligns two similar code paths that raised different error messages for the same unsupported data type, so that both now throw the same internal error.

```java
24/07/03 16:29:01 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: [INTERNAL_ERROR] Unsupported data type VarcharType(64). SQLSTATE: XX000
	at org.apache.spark.SparkException$.internalError(SparkException.scala:92)
	at org.apache.spark.SparkException$.internalError(SparkException.scala:96)
```

```java
org.apache.spark.SparkUnsupportedOperationException: VarcharType(64) is not supported yet.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.dataTypeUnsupportedYetError(QueryExecutionErrors.scala:993)
	at org.apache.spark.sql.execution.datasources.orc.OrcSerializer.newConverter(OrcSerializer.scala:209)
	at org.apache.spark.sql.execution.datasources.orc.OrcSerializer.$anonfun$converters$2(OrcSerializer.scala:35)
	at scala.collection.immutable.List.map(List.scala:247)
```
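
For orientation, below is a minimal Scala sketch of the unified behaviour (a hypothetical helper name, not the actual diff): the unsupported-type fallback in the ORC (de)serializer now throws the `[INTERNAL_ERROR]` built by `SparkException.internalError`, as the Parquet write path already did, instead of the separate `dataTypeUnsupportedYetError`.

```scala
import org.apache.spark.SparkException
import org.apache.spark.sql.types.DataType

object OrcUnsupportedTypeSketch {
  // Sketch only: raise the same internal error used by the Parquet write path
  // when no converter can be created for a data type. The real change lives in
  // the newConverter methods of OrcSerializer/OrcDeserializer.
  def unsupportedTypeError(dataType: DataType): Throwable =
    SparkException.internalError(s"Unsupported data type $dataType.")
}
```

With this, both code paths surface the `[INTERNAL_ERROR] Unsupported data type ...` message shown in the first trace above.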

Why are the changes needed?

Improvement: the ORC (de)serializer now reports the same internal error for unsupported data types as ParquetWriteSupport, keeping the failure mode consistent across data sources.

Does this PR introduce any user-facing change?

No. Users should not encounter these errors in normal usage.

How was this patch tested?

Passing existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.


@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you for making these consistent across data sources, @yaooqinn .
Merged to master for Apache Spark 4.0.0-preview2.

yaooqinn commented Jul 9, 2024

Thank you @dongjoon-hyun

@yaooqinn yaooqinn deleted the SPARK-48803 branch July 9, 2024 03:17
ericm-db pushed a commit to ericm-db/spark that referenced this pull request Jul 10, 2024
jingz-db pushed a commit to jingz-db/spark that referenced this pull request Jul 22, 2024
MaxGekk pushed a commit that referenced this pull request Aug 16, 2024
…TEMP_2088`

### What changes were proposed in this pull request?
This PR is a follow-up to #47208.

### Why are the changes needed?
In #47208, the method `dataTypeUnsupportedYetError`, which used `_LEGACY_ERROR_TEMP_2088`, was removed, but the corresponding error condition `_LEGACY_ERROR_TEMP_2088` was not deleted at the same time. The Spark code repo no longer uses this error condition.
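
For context, a rough Scala sketch of the shape of the removed helper (a reconstruction for illustration, not the exact original code): it raised the `_LEGACY_ERROR_TEMP_2088` condition, which is left without callers once the ORC paths switched to `SparkException.internalError`.

```scala
import org.apache.spark.SparkUnsupportedOperationException
import org.apache.spark.sql.types.DataType

object RemovedErrorHelperSketch {
  // Approximate shape of the removed QueryExecutionErrors.dataTypeUnsupportedYetError:
  // it built a SparkUnsupportedOperationException from the now-unused
  // `_LEGACY_ERROR_TEMP_2088` error condition.
  def dataTypeUnsupportedYetError(dataType: DataType): SparkUnsupportedOperationException =
    new SparkUnsupportedOperationException(
      errorClass = "_LEGACY_ERROR_TEMP_2088",
      messageParameters = Map("dataType" -> dataType.toString))
}
```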

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #47783 from panbingkun/remove_unused_error_condition.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
IvanK-db pushed a commit to IvanK-db/spark that referenced this pull request Sep 20, 2024