What is Leeds Data Thing?
“Encouraging like-minded people to talk data over a pint in Leeds since January 2013”
www.leedsdatathing.co.uk

Who are we?
Data scientist, digital manager, marketing expert, geo-spatial expert, market researcher, data analyst
@systemspeter @becs_edwards @GrahamHyde @JenGarrick @Andy_Tweets @m_barrett

What are the group aims?
• To explore open data
• To learn from other industries
• To learn from each other
• To highlight good work
• To put Leeds on the map

Who attends?
Web developers, designers, analysts, professors, students, artists, bloggers, marketers, open data enthusiasts, and lots in between.

Our first event
3 speakers
Tim Waters on the evolution of OpenStreetMap, other Geo Visualisations and Analytics
Andy Bolton on the demographic mapping of Leeds and visualising child poverty in the city
Mark Barrett on how to be creative, and the importance of using Open Data to build things that people can understand

The Big Data Week
“Calling all data lovers, researchers, statisticians, academics, marketers, librarians, designers, developers and people who just LOVE to make and discover stuff – it’s time to get your Big Data Week 2013 hat on!
For the first time in the history of Big Data Week, Leeds is a host city for the global festival that focuses on the social, political, technological and commercial impacts of Big Data. Taking place from the 22nd-28th April 2013, Leeds is one of over 20 cities across the world who is working to bring together a community of people who are passionate about asking questions and making things from data.”

Launch night
@RobWebster_LCH was kind enough to launch the Big Data Week for us here in Leeds and spoke about what Open Data means to him
http://fettl.es/16xDHqt

Data in a day - blog posts
• The Big Data challenges facing the academic publishing community
• Leeds’ role in the data revolution
• What data can do for the second largest council in the UK
• How data is changing the community we live and work in
• Why numbers are confusing sometimes
• Turning big data into something understandable at a local level
• Using data at the largest interdisciplinary centre for water research in the UK
• How well curated data, easily available analytical tools and good data communication can aid wildlife conservation
• Data collection and insight with a fascinating project about fashion bloggers
• Using big data to solve crimes
http://fettl.es/18IM95s

Bring your own data
Karrie Liu - why ethnicity information is important to health analysis
Elly Snare - collecting data from fashion blogging
Christopher Hassall - collection, storage, visualisation and analysis of wildlife data
Malachi Rangecroft - the Leeds Observatory - data spanning from economic to crime, education to health
Sohail Rashid - the power that data and social media have to transform the property industry
Daniel Prendergast - getting to grips with data for publishing
Russel Brown - “counting is hard”
http://fettl.es/YTLxbx

The Big Data Challenge
@garrycoleman @grahamhyde

Leeds entries - Sportitude
http://fettl.es/17gFIHH
1. How sporty are different UK regions?
2. Does being sporty mean being healthy?
3. What helps or hinders a sporty place?
Aggregating and mapping all the data:
• Data about athletes from DBPedia
• Map regions from Ordnance Survey
• Regional population data from the 2011 Census
• Aggregated Health data from the Guardian Data Blog

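One way to read "how sporty is a region" is athletes per head of population. The sketch below is only an illustration of that idea, not the Sportitude team's actual method: the function name, the region labels and all the numbers are made up, and it assumes each athlete has already been matched to a region.

```python
from collections import Counter


def athletes_per_100k(athlete_regions, population_by_region):
    """Return athletes per 100,000 residents for each region.

    athlete_regions: one region label per athlete (hypothetical input,
    e.g. derived from a birthplace field).
    population_by_region: region label -> resident population.
    """
    counts = Counter(athlete_regions)
    return {
        region: 100_000 * counts.get(region, 0) / population
        for region, population in population_by_region.items()
        if population > 0  # skip regions with no usable population figure
    }


# Made-up illustrative data, not taken from DBPedia or the Census.
athlete_regions = ["Region A", "Region A", "Region B"]
population_by_region = {
    "Region A": 1_000_000,
    "Region B": 250_000,
    "Region C": 500_000,
}

rates = athletes_per_100k(athlete_regions, population_by_region)
for region, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{region}: {rate:.2f} athletes per 100,000 residents")
```

Normalising by population matters here: Region A has more athletes in absolute terms, but Region B comes out as "sportier" once its smaller population is taken into account.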
Leeds entries - Leeds is covered
http://fettl.es/15BeJqR
“What caught my eye was the dataset listing the names of the doctors surgeries, practices, medical centres. If I think about my neighbourhood I can pass about half a dozen doctors in a very small area. Leeds is well covered (or perhaps just my area is!). I was reminded of James Joyce’s quote about being unable to cross Dublin without passing a pub. Perhaps the same can be said for Leeds and doctors! The names of the surgeries were also interesting. Names such as:
Chapeloak Surgery
The Avenue Surgery
Dr Ca Hicks’ Practice
The Dekeyser Group Practice
The Highfield Medical Centre
Chapeltown Family Surgery
Wonder if the more “leafy” the name, the more “leafy” the neighbourhood it was in? Perhaps the more grandiose sounding practices had more patients? Perhaps the smaller sounding ones had better patient satisfaction reviews?
Decided to go with the concept of “Leeds is covered” and wanted something showing the labels of the practices over the areas where they were. Filling out the map, so to speak.”

Leeds entries - how healthy is your area?
http://fettl.es/15KgbY0
Scraping Twitter data to show real-time conversations, with health data overlaid onto a map of England

Leeds entries - visualising NHS data
The problem – The NHS possesses huge volumes of flat, poorly utilised data
The solution – To derive information (actionable intelligence?) from datasets put into the public domain by the NHS
The goal – To find patterns in quality of care and chronic health problems across the UK and present them accessibly
http://fettl.es/17gFPTv

Leeds entries - Leeds health visualised
http://fettl.es/10jxp9y
• Is healthy a long life with high fertility?
• Longer lives, birth control and war are seen in the global data
• > $500 per capita doesn't affect life expectancy
• In Leeds, income drives health factors across its wards
• The NHSIC data tells us: Leeds was a bit glum yesterday, with fewer children and shorter lives
• Leeds health hotspots by GP: diabetes outliers

International entries - bigdataforhealth
A Health Crisis
We have a health epidemic in the United States today.
As this visualization reveals, a number of factors combine to entrench the problem.
We know that obesity leads to diabetes, but as this scatter plot makes quite clear, income is also an important factor.
Those with more advantages have more choices in life as to the food they eat, and more leisure time to exercise and take care of their bodies.
Meanwhile the working poor and others in less advantaged positions not only suffer from worse living conditions but poorer health and wellness.
http://fettl.es/YTMHUp

International entries - neofonie
21,613,546,189 words contained in 56,800,000 German-language news articles of the years 2008 to 2013 were mined.
323,860,101 times were the German cities Berlin, Hamburg, Stuttgart, Dortmund, Frankfurt, and Leipzig mentioned in those articles.
376,595 disease-related words were found in the textual vicinity of those cities.
For each city the three most significant disease-related terms were analysed further. We manually selected catchwords that occurred frequently in the surroundings of the diseases.
http://bdw.neofonie.de

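Counting disease-related words "in the textual vicinity" of a city name can be sketched as a simple word-window count. This is a minimal illustration under stated assumptions, not neofonie's actual pipeline: the function name, the window size of five words, the term list and the sample sentence are all invented for the example, and the window ignores sentence boundaries.

```python
import re
from collections import Counter


def cooccurrence_counts(text, cities, disease_terms, window=5):
    """Count disease terms within `window` words of each city mention.

    The window size and the plain word tokenisation are assumptions
    made for this sketch.
    """
    tokens = re.findall(r"\w+", text.lower())
    cities = {c.lower() for c in cities}
    disease_terms = {d.lower() for d in disease_terms}
    counts = {city: Counter() for city in cities}
    for i, token in enumerate(tokens):
        if token in cities:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Look at the words on both sides of the city mention.
            for neighbour in tokens[lo:i] + tokens[i + 1:hi]:
                if neighbour in disease_terms:
                    counts[token][neighbour] += 1
    return counts


# Invented sample text and term list, for illustration only.
sample = (
    "In Berlin steigt die Zahl der Grippe Fälle. "
    "Hamburg meldet Masern und Grippe."
)
result = cooccurrence_counts(
    sample,
    cities=["Berlin", "Hamburg"],
    disease_terms=["Grippe", "Masern"],
)
for city, terms in sorted(result.items()):
    print(city, dict(terms))
```

Ranking each city's counter would give its most frequent disease-related terms; judging which terms are "most significant" would need a comparison against how often they occur overall, which this sketch does not attempt.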
International entries - Berlinr
What is this app all about?
How are Berliners feeling today? Are they in a good or in a bad mood? The chart quantifies the sentiment of Berlin's population. It is based on Berlin-related news stories in online newspapers (which you can see and filter by in the donut chart) and updates daily. As we were prototyping our model we realised that we were producing a lot of interesting output and that it would be a shame to condense that into a simple "yes, we're feeling great today" or "no, we're in a bad mood". Life is more than black and white. Which is how we came up with the two-dimensional chart above. The X-axis represents negative sentiment, the Y-axis positive sentiment, with each dot representing an individual news story.
http://wellberlin.herokuapp.com

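The idea of scoring each story on two separate axes, rather than collapsing it to a single good/bad value, can be sketched as below. This is a toy illustration only: the word lists, the function name and the headlines are made up, and the Berlinr team's real sentiment model is not described in the slides.

```python
import re

# Tiny hypothetical word lists; a real model would use something richer.
POSITIVE = {"good", "great", "win", "happy"}
NEGATIVE = {"bad", "crisis", "delay", "angry"}


def sentiment_point(text):
    """Return (negative, positive) word counts for one news story.

    The order matches the chart described above: X is negative
    sentiment, Y is positive sentiment.
    """
    tokens = re.findall(r"\w+", text.lower())
    negative = sum(token in NEGATIVE for token in tokens)
    positive = sum(token in POSITIVE for token in tokens)
    return negative, positive


# Invented headlines, for illustration only.
stories = [
    "Great win for the city",
    "Airport delay turns into crisis",
    "Good news despite bad weather",
]
points = [sentiment_point(story) for story in stories]
for story, (x, y) in zip(stories, points):
    print(f"x={x} (negative), y={y} (positive): {story}")
```

Keeping the two counts apart is what lets a mixed story such as the third headline sit away from both axes instead of cancelling out to "neutral".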
The results
Antonio Acuna / @diabulos - Head of data.gov.uk at the UK Cabinet Office
Dr Mark Davies / @markpricedavies - Strategy Director - HSCIC
Dr Geraint Lewis / @GeraintLewis - Chief Data Officer - NHS England
Professor Des Higham / @DesHigham - Mathematics at University of Strathclyde

Lessons learned
What worked well?
High profile judges gave gravitas to the event
International entries brought further insight
Social media spread the word well
Events building up to the main event built momentum and noise
Loading datasets onto a central SQL Server meant teams could work together and work remotely
Having HSCIC support on hand really helped
What could we improve?
Inviting a bank of public health registrars to serve as a resource for all teams, to help with issues such as association versus causation; confidence intervals; axes; confounding; risk adjustment; age and sex standardisation
Inviting a bank of interested parties to suggest some problems/issues that the teams could tackle

Why does engagement matter?
helps us understand how developers use data
helps find gaps in understanding about what data is available
helps to understand what data is needed but isn’t available
helps to understand the granularity that developers expect to get from the data
helps us understand how developers want data presented
helps to understand what systems developers need - 2* / 3* / 4* / 5* data

What next...
A Leeds Data Thing event every 6 weeks(ish)
Another data challenge in Autumn 2013
Engaging with more groups within the city
Put Leeds on the map as the leading city for data
Highlight the careers available to data analysts after study
Use resources available within the city
Make more data understandable to a wide range of people within Leeds