This document describes how trending topics are identified by geo-location from Twitter data. It walks through the data flow and pipeline: tweets are streamed from the Twitter API into Kafka, then processed with Spark on HDFS to compute hourly and daily trends. Storm topologies consume from Kafka, write tweets to storage, and aggregate minute-based trends for a live page. The document also outlines the cluster setup and its components, discusses the challenges of scaling to millions of tweets per day, which required upgrades to memory and server sizes, and notes time discrepancies between servers.
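
The minute-based aggregation can be illustrated with a minimal, in-memory Python sketch. It is not the Storm topology itself. The tweet fields (`geo`, `ts`, `hashtags`) and the function names are hypothetical, chosen only for illustration, and the sketch assumes hashtags are the unit being counted. It buckets by the timestamp carried in each tweet rather than by the processing server's clock, which is one possible way to keep buckets consistent when server clocks disagree.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def minute_bucket(ts):
    """Truncate a Unix timestamp (seconds) to its UTC minute, e.g. '1970-01-01T00:00'."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M")


def aggregate_trends(tweets):
    """Count hashtags per (geo, minute) bucket.

    Each tweet is a dict with hypothetical keys: 'geo', 'ts', 'hashtags'.
    """
    counts = defaultdict(Counter)
    for tweet in tweets:
        key = (tweet["geo"], minute_bucket(tweet["ts"]))
        # Lowercase so that '#Spark' and '#spark' count as the same topic.
        counts[key].update(tag.lower() for tag in tweet["hashtags"])
    return counts


def top_trends(counts, geo, minute, k=3):
    """Return up to k (hashtag, count) pairs for one location and minute."""
    return counts.get((geo, minute), Counter()).most_common(k)


tweets = [
    {"geo": "US", "ts": 0, "hashtags": ["Spark", "kafka"]},
    {"geo": "US", "ts": 30, "hashtags": ["spark"]},
    {"geo": "US", "ts": 60, "hashtags": ["storm"]},
    {"geo": "IL", "ts": 10, "hashtags": ["kafka"]},
]
counts = aggregate_trends(tweets)
print(top_trends(counts, "US", "1970-01-01T00:00", k=1))
```

Hourly and daily trends could be derived the same way by truncating the timestamp to a coarser bucket. In a real deployment the counters would live in the stream processor or a shared store rather than in a single process.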