Hadoop for Real Problems

CS6065 Intro to Cloud Computing Project 4 & 5

Cloned from University of Cincinnati's git site, https://github.uc.edu/

Assignment Outline: https://docs.google.com/document/d/1iEqvbRoRiLXSLaxdb2uURkyvpEjRKo6uPATHXEidWzk/

Working on problems:

Twitter:

1) [Chris McVeigh] What hour of the day does @PrezOno tweet the most on average, using every day for which we have Twitter data?
Include a plot of the expected number of tweets for each hour of the day, for the days on which he did tweet.
For example, if Ono tweeted once every day at 12:30 PM, his expected number of tweets between 12 PM and 1 PM would be 1.
If he alternates between 2 and 3 tweets per day, his average would be 2.5.

2) [Anthony Kleiser] What day of the week does @PrezOno tweet the most on average? Use the same example as in #1 but for days of the week.

3) [Sean Schatzman] How does @PrezOno’s tweet length compare to the average of all others? What is his average length? All others?

5) [Anthony Kleiser] Which Twitter user tweeted the most? Who are the top 5 longest tweeters? The bottom 5?

6) [Chris McVeigh] For each day of the week, what was the most mentioned hashtag? For each hour of the day?

10) [Sean Schatzman] For each day, what was the most retweeted or most favorited tweet?
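
The averaging rule described in problem 1 can be sketched in plain Python before it is turned into a Hadoop job. This is only an illustration, not the project's actual implementation: the function name `expected_tweets_per_hour` and the sample timestamps are hypothetical, and the sketch assumes the denominator is the number of distinct days on which the user tweeted at least once.

```python
from collections import Counter
from datetime import datetime


def expected_tweets_per_hour(timestamps):
    """Return {hour: average tweets in that hour per active day}.

    Assumption: an "active day" is any calendar day with at least one
    tweet, and only active days count toward the average.
    """
    # Total tweets observed in each hour of the day (0-23).
    per_hour = Counter(ts.hour for ts in timestamps)
    # Distinct calendar days on which at least one tweet was sent.
    active_days = {ts.date() for ts in timestamps}
    if not active_days:
        return {}
    return {hour: count / len(active_days) for hour, count in per_hour.items()}


# Hypothetical data: 2 tweets on one day and 3 on the next, all in the noon hour.
sample = [
    datetime(2015, 3, 1, 12, 30),
    datetime(2015, 3, 1, 12, 45),
    datetime(2015, 3, 2, 12, 10),
    datetime(2015, 3, 2, 12, 20),
    datetime(2015, 3, 2, 12, 50),
]

averages = expected_tweets_per_hour(sample)
busiest_hour = max(averages, key=averages.get)
print(averages, busiest_hour)
```

With this sample the noon hour averages 2.5 tweets per active day, matching the "alternates between 2 and 3 tweets" example in problem 1. The same counting can be reused for problem 2 by keying on `ts.weekday()` instead of `ts.hour`.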