Deephaven Python Integration Package

The Deephaven Python Integration Package is created by Deephaven Data Labs. It allows Python developers, including data scientists, to access data, run queries, and execute Python scripts directly inside Deephaven data servers for maximum performance. By taking advantage of Deephaven's unique streaming table capability and its many data ingestion facilities (Kafka, Parquet, CSV, SQL, etc.), Python developers can quickly put together a real-time data processing pipeline that is high-performing and easy to consume.

If you use a Windows operating system, WSL is not required to run Deephaven via pip.

Install

Because this package depends on the Deephaven server, it comes preinstalled with Deephaven Docker images and is made available at runtime in the Python console in the Deephaven Web UI.
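
If you want to run Deephaven outside of Docker, the server can also be installed with pip. The following is a minimal sketch, assuming the deephaven-server package from PyPI; the port and JVM heap size shown are illustrative values, not requirements.

# Install the server package from PyPI (run in a shell):
#   pip install deephaven-server

from deephaven_server import Server

# Start an embedded Deephaven server in this Python process
server = Server(port=10000, jvm_args=["-Xmx4g"])
server.start()

Once the server is started, the deephaven modules used in the quick start below can be imported in the same Python process.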

Quick start

from deephaven import read_csv
from deephaven.stream.kafka.consumer import kafka_consumer, TableType
from deephaven.plot import Figure, PlotStyle

# Read a static table from a local CSV file
csv_table = read_csv("data1.csv")
# Subscribe to a Kafka topic as a live, append-only table
kafka_table = kafka_consumer.consume({'bootstrap.servers': 'redpanda:29092'}, topic='realtime_feed', table_type=TableType.Append)
# Join the live Kafka table with the static CSV table on the key columns
joined_table = kafka_table.join(csv_table, on=["key_col_1", "key_col_2"], joins=["data_col1"])
# Plot the joined data as a stacked bar chart keyed by category
plot = Figure() \
    .axes(plot_style=PlotStyle.STACKED_BAR) \
    .plot_cat(series_name="Categories1", t=joined_table, category="key_col_1", y="data_col1") \
    .show()
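
Because kafka_table is a live, appending table, the join and the plot update automatically as new records arrive on the Kafka topic; no additional polling or refresh code is needed.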

Related documentation

API Reference

Start here