12 stars written in Java

OpenRefine is a free, open source power tool for working with messy data and improving it

Java · 10,820 stars · 1,947 forks · Updated Oct 2, 2024

Example code from Learning Spark book

Java · 3,884 stars · 2,421 forks · Updated Dec 16, 2023

YUI Compressor

Java · 3,013 stars · 663 forks · Updated Sep 15, 2021

Stream summarizer and cardinality estimator.

Java · 2,259 stars · 559 forks · Updated Nov 28, 2019

A platform for visualization and real-time monitoring of data workflows

Java · 1,181 stars · 210 forks · Updated Jan 22, 2020

A Java package to automatically detect anomalies in large scale time-series data

Java · 1,171 stars · 331 forks · Updated Nov 14, 2023

Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.

Java · 1,139 stars · 387 forks · Updated Apr 10, 2023

Hadoop library for large-scale data processing, now an Apache Incubator project

Java · 585 stars · 134 forks · Updated Jul 8, 2014

Kite SDK

Java · 394 stars · 262 forks · Updated Nov 1, 2022

Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.

Java · 114 stars · 83 forks · Updated Nov 12, 2015

A grouping of Apache Pig examples.

Java · 65 stars · 14 forks · Updated Oct 13, 2020

A sample maven-enabled pig project complete with example of unit tests on UDF using junit and pig scripts using pigunit.

Java · 8 stars · 6 forks · Updated Oct 23, 2013