schatzsr/CS6065-Hadoop-for-Real-Problems

Hadoop for Real Problems

CS6065 Intro to Cloud Computing Project 4 & 5

Cloned from University of Cincinnati's git site, https://github.uc.edu/

Assignment Outline: https://docs.google.com/document/d/1iEqvbRoRiLXSLaxdb2uURkyvpEjRKo6uPATHXEidWzk/

Working on problems:

Twitter:

1) [Chris McVeigh] In what hour of the day does @PrezOno tweet the most on average, using every day for which we have Twitter data?
Include a plot of the expected number of tweets for each hour of the day, counting only the days on which he did tweet.
For example, if Ono tweeted once every day at 12:30 PM, his expected number of tweets between 12 PM and 1 PM would be 1.
If he alternates between 2 and 3 tweets per day, his average would be 2.5.

2) [Anthony Kleiser] On what day of the week does @PrezOno tweet the most on average? Use the same averaging as in the example in #1, but for days of the week.

3) [Sean Schatzman] How does @PrezOno’s tweet length compare to the average of all other users? What is his average tweet length? What is the average for everyone else?

5) [Anthony Kleiser] Which Twitter user tweeted the most? Who are the top 5 users by tweet length? The bottom 5?

6) [Chris McVeigh] For each day of the week, what was the most-mentioned hashtag? What about for each hour of the day?

10) [Sean Schatzman] For each day, what was the most retweeted or most favorited tweet?
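
The averaging described in problems 1 and 2 can be illustrated with a small plain-Python sketch. This is not the project's Hadoop code; it only shows the arithmetic under stated assumptions. The function names are hypothetical, the input is assumed to be a list of `datetime` objects for one user's tweets already converted to local time, and the denominator is assumed to be the number of distinct days on which that user tweeted at least once.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def average_tweets_per_hour(timestamps):
    """Map each hour (0-23) with tweets to tweets in that hour per active day.

    Assumption: an "active day" is any calendar date with at least one tweet.
    """
    active_days = {ts.date() for ts in timestamps}
    if not active_days:
        return {}
    per_hour = Counter(ts.hour for ts in timestamps)
    return {hour: count / len(active_days)
            for hour, count in sorted(per_hour.items())}


def average_tweets_per_weekday(timestamps):
    """Map each weekday name to tweets per active date falling on that weekday."""
    per_weekday = Counter(ts.weekday() for ts in timestamps)
    # Count distinct active dates for each weekday (e.g. how many Mondays).
    active_dates = {ts.date() for ts in timestamps}
    dates_per_weekday = Counter(d.weekday() for d in active_dates)
    return {WEEKDAYS[w]: per_weekday[w] / dates_per_weekday[w]
            for w in sorted(per_weekday)}


if __name__ == "__main__":
    # Two Mondays: 2 tweets on the first, 3 on the second, all in the noon hour.
    sample = [
        datetime(2024, 1, 1, 12, 5), datetime(2024, 1, 1, 12, 40),
        datetime(2024, 1, 8, 12, 10), datetime(2024, 1, 8, 12, 20),
        datetime(2024, 1, 8, 12, 55),
    ]
    print(average_tweets_per_hour(sample))
    print(average_tweets_per_weekday(sample))
```

With the sample data, both functions report an average of 2.5, matching the "alternates between 2 and 3 tweets" example in problem 1. In a MapReduce job the same counts would typically be emitted by a mapper and divided in a reducer, but that split is left out of this sketch.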
