3. What is HealthLive?
• Just as Doppler radar scans the skies for indicators of bad weather, HealthLive
scans the drug sales at pharmacies across the globe for indicators of illness,
allowing you to check for the chance of sickness as easily as you can check for
the chance of rain.
Stay in the know
Get real-time sickness alerts when you enter a sick zone.
Live Map
Find recent illness reports nearby or across the country.
6. You can’t manage what you can’t see
• The monitoring features in X-Pack provide visibility into important areas of
Elasticsearch like:
1) Search and indexing performance
2) Memory and garbage collection
3) Host-level system and network metrics
4) Resource saturation and errors
• Used Elasticsearch Partial Document Update for better performance.
Challenges
7. • How do I detect spikes in real time?
Future Work
8. About me
Vaibhav Lella - CS Grad @ University at Buffalo
Background:
Machine Learning
Passionate about building data pipelines that solve real-world problems
Hobbies: Cooking, Long drives, Soccer, Cricket, Travelling
Editor's Notes
A couple of years ago I was at a pharmacy and saw an unusually long queue. What was weird was that a majority of the people in the queue were looking for the same drug, and the pharmacy had just run out of it because of a sudden increase in purchases.
That’s when it struck me that sudden increases in sales of particular drugs might serve as an early indicator of community-wide disease outbreaks.
I feel this is a crucial signal, and it motivated me to develop HealthLive.
I limited the scope of the diseases to be detected to infectious diseases, as they represent a public health problem for which early and valid signal detection is of particular concern, in light of potentially rapid emergence and opportunity for control interventions.
During the literature review in the initial ideation phase, I came across several articles reporting that one of the first signs of a large waterborne outbreak in Milwaukee was newspaper reports that local pharmacies had sold out of antidiarrheal medications.
HealthLive is a real-time drug sales tracker that helps map the spread of diseases as it happens by leveraging the locations of the pharmacies where the drugs are purchased.
Visualizing where your application is spending the most time is essential to quickly identifying and addressing potential performance bottlenecks or anomalies.
This helped me discover which areas are the most meaningful for my use case and where I could optimize. Initially I was using the whole-document update API, but then I came across partial document update, where the whole process happens within the shard, so it avoids the network overhead of multiple requests. The Update API also reduces the chance of conflicts with changes made by other processes.
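As a rough illustration of what a partial document update looks like on the wire, here is a minimal sketch that only builds the HTTP request and does not send it. The host, the index name `drug_sales`, the document id, and the `units_sold` field are all hypothetical, and the URL shape shown is the one used by Elasticsearch 7 and later; older releases also include a mapping type segment in the path.

```python
import json
from urllib.request import Request


def build_partial_update(base_url, index, doc_id, changed_fields):
    """Build (but do not send) an Elasticsearch partial-update request.

    Only the fields in `changed_fields` are sent; Elasticsearch merges
    them into the existing document on the shard that holds it.
    """
    # Path shape for Elasticsearch 7+. Older releases use
    # /{index}/{type}/{id}/_update instead.
    url = f"{base_url}/{index}/_update/{doc_id}"
    body = json.dumps({"doc": changed_fields}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical index, document id, and field name, for illustration only.
req = build_partial_update(
    "http://localhost:9200", "drug_sales", "pharmacy-42", {"units_sold": 17}
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The point of the `{"doc": ...}` body is that the client sends only the changed fields in a single request, rather than fetching the document, modifying it, and re-indexing the whole thing.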
It’s easy for a human eye to pick out the spikes in the above graph, but that’s because we can see the whole data set in retrospect. Identifying if the current point in time is a spike above normal is much harder than looking back over all the data in hindsight and picking out the peaks. I wanted the code to recognise, in the moment, that there was a spike in traffic happening.
I got frustrated enough to figure there must be someone out there smarter than me (very likely) who has already solved this (also likely). After a thorough Google I found that in fact it’s an unsolved (or possibly unsolvable) problem.
In order to hone spike detection the code needs an idea of the expected behaviour of the stream, so that it can know what level of spike is significant or not. This understanding comes from observing a lot of data – either by the developer or the code itself. Either way you first have to get a feel for the data before you can define what “normal” and “abnormal” look like. Without this knowledge the code could be missing important spikes, or reporting irrelevant ones.
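One simple way to give the code an idea of "normal" is a rolling baseline: keep a window of recent values and flag a new value when it sits several standard deviations above the window's mean. The sketch below is only that heuristic, not the method HealthLive uses. The class name, the window size, the threshold, and the sample sales figures are all assumptions that would need tuning against real data.

```python
from collections import deque
from statistics import mean, pstdev


class RollingSpikeDetector:
    """Flag a value as a spike when it sits far above the recent baseline."""

    def __init__(self, window=24, threshold=3.0, min_history=5, min_spread=1.0):
        self.history = deque(maxlen=window)  # recent non-spike values
        self.threshold = threshold           # how many deviations count as a spike
        self.min_history = min_history       # values needed before judging
        self.min_spread = min_spread         # floor so a flat baseline is not over-sensitive

    def observe(self, value):
        """Return True if `value` is a spike relative to the values seen so far."""
        is_spike = False
        if len(self.history) >= self.min_history:
            baseline = mean(self.history)
            spread = max(pstdev(self.history), self.min_spread)
            is_spike = (value - baseline) / spread > self.threshold
        # Keep spikes out of the baseline so an outbreak does not become "normal".
        if not is_spike:
            self.history.append(value)
        return is_spike


# Made-up hourly unit sales for one drug at one pharmacy.
sales = [10, 12, 11, 9, 10, 11, 10, 45, 12, 10]
detector = RollingSpikeDetector()
spikes = [i for i, units in enumerate(sales) if detector.observe(units)]
print(spikes)  # prints [7]
```

The detector decides on each value as it arrives, using only earlier data, which matches the "in the moment" requirement. The trade-off is exactly the one described above: with too little history or a badly chosen threshold it will either miss important spikes or report irrelevant ones.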