Data Visualization in
  Python/Django
   By KENNETH EMEKA ODOH
Table of Contents
Introduction
Motivation
Method
Appendices
Conclusion
References
Introduction
 • My background
 • Requirements: Python, Django, Matplotlib, Ajax and other
   third-party libraries.
 • What this talk is about (we will restrict ourselves to
   Python, Matplotlib and Django).
 • What this talk is not about (we are not trying
   to re-implement Google Analytics).
 • Source code is available at
   https://github.com/kenluck2001/PyCon2012_Talk
MOTIVATION
Business analytics data needs to be presented
 in graphical form, because a picture is worth
 a thousand words.




   Source: en.wikipedia.org
Where do we find
data?




   Source: en.wikipedia.org
Sources of Data

• CSV
• DATABASES
Steps for data gathering
 • Identify the data source.
 • Preprocess the data (remove
   nulls and wide characters),
   e.g. with Google Refine.
 • Process the data (perform
   some statistical analysis).
 • Present the clean data in a
   descriptive format, i.e. data
   visualization.
 • See Appendix 1
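
The four steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code from the talk: the column names `month` and `solar_radiation`, the inline sample data and the helper names `load_rows`, `clean` and `summarise` are all made up for the example.

```python
import csv
import io
import statistics

# Step 1: identify the data source (an inline CSV stands in for a real file).
RAW_CSV = """month,solar_radiation
1,10.5
2,NA
3,14.0
4,
5,21.5
"""


def load_rows(text):
    """Read CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows, x_field, y_field):
    """Step 2: drop rows where either field is empty or 'NA', keeping pairs aligned."""
    pairs = []
    for row in rows:
        x, y = row[x_field], row[y_field]
        if x in ("", "NA") or y in ("", "NA"):
            continue
        pairs.append((float(x), float(y)))
    return pairs


def summarise(pairs):
    """Step 3: a very small statistical summary of the y values."""
    ys = [y for _, y in pairs]
    return {"count": len(ys), "mean": statistics.mean(ys), "max": max(ys)}


pairs = clean(load_rows(RAW_CSV), "month", "solar_radiation")
summary = summarise(pairs)

# Step 4: present the result (a chart in the talk; plain text here).
print(pairs)
print(summary)
```

Dropping a whole row when either value is missing keeps the x and y lists the same length, which a scatter plot needs.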
Visual Representation of data
 • Charts / diagrams
 • Text formats
     • Tables
     • Log files




Source: devk2.wordpress.com   Source: elementsdatabase.com
Categorization of data
 • Real-time (charts are generated on
   demand; this can include a
   mechanism for refreshing the page to
   show the latest chart).
   See Appendix 2
 • Batch-based (charts are created from
   a CSV file; there is an example on my blog).
   See Appendix 2
Rules of Data Collection
 • Keep data in the most easily processed form,
   e.g. a database or CSV.
 • Store a timestamp with the data (the
   time it was collected or
   processed), so that it can be filtered.
 • Gather data that is relevant to the
   business needs.
 • When the data grows too large,
   prune stale or old
   data that is no longer needed.
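
The timestamp and pruning rules can be illustrated with a small sketch. It assumes an in-memory SQLite table named `reading` and helper names (`record`, `readings_since`, `prune_older_than`) invented for this example; the talk's own projects use Django models instead (see Appendix 4).

```python
import sqlite3
from datetime import datetime, timedelta


def make_store():
    """Create an in-memory table whose rows carry a collection timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reading (collected_at TEXT NOT NULL, value REAL NOT NULL)"
    )
    return conn


def record(conn, value, when):
    """Store a value together with the time it was collected."""
    conn.execute("INSERT INTO reading VALUES (?, ?)", (when.isoformat(), value))


def readings_since(conn, cutoff):
    """Filter by timestamp: return values collected at or after `cutoff`."""
    cur = conn.execute(
        "SELECT value FROM reading WHERE collected_at >= ? ORDER BY collected_at",
        (cutoff.isoformat(),),
    )
    return [row[0] for row in cur]


def prune_older_than(conn, cutoff):
    """Delete stale rows and report how many were removed."""
    cur = conn.execute(
        "DELETE FROM reading WHERE collected_at < ?", (cutoff.isoformat(),)
    )
    return cur.rowcount


# A fixed "now" keeps the example reproducible.
now = datetime(2012, 6, 1, 12, 0, 0)
conn = make_store()
record(conn, 1.0, now - timedelta(days=200))
record(conn, 2.0, now - timedelta(days=30))
record(conn, 3.0, now - timedelta(days=1))

recent = readings_since(conn, now - timedelta(days=7))
removed = prune_older_than(conn, now - timedelta(days=120))
remaining = conn.execute("SELECT COUNT(*) FROM reading").fetchone()[0]
print(recent, removed, remaining)
```

ISO 8601 strings of the same shape sort in chronological order, which is why plain string comparison works for the cut-off here.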
Where is the data
   visualization done?
 • Server
   See Appendices 2 - 6
 • Client
   Examples of JavaScript libraries:
   D3.js ( http://d3js.org/ )
   gRaphael.js ( http://g.raphaeljs.com/ )
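
When the chart is drawn on the client, the server only has to hand over the data, typically as JSON. The sketch below shows one possible shape for such a payload; the function name `to_chart_payload` and the field names `x_label`, `y_label` and `points` are assumptions for illustration and are not prescribed by D3.js or gRaphael.js.

```python
import json


def to_chart_payload(xs, ys, x_label, y_label):
    """Pair up x/y values and wrap them with axis labels for a client-side chart."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return {
        "x_label": x_label,
        "y_label": y_label,
        "points": [{"x": x, "y": y} for x, y in zip(xs, ys)],
    }


payload = to_chart_payload(
    [3, 5, 8], [1, 4, 6], "Number of online users", "Number of bids"
)
body = json.dumps(payload)  # this string would be the HTTP response body
print(body)
```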
Factors to Consider for
Choice of Visualization
 • Where do we perform the
   visualization processing:
   on the server or on the client?

It depends on:
 • Security
 • Scalability
Tools needed for data
analysis
 • csvkit
   ( http://csvkit.readthedocs.org/en/latest/ )
 • networkx (graphs)
   ( http://networkx.lanl.gov/ )
 • pySAL (spatial analysis)
   ( http://code.google.com/p/pysal/ )
Appendices


Let the code begin
Appendix 1
# A scatter plot of solar radiation against the month, illustrating the
# steps of data gathering. The CSV file comes from the Data Science
# Hackathon website. The source code is in the folder named "plotCode".
import csv
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure

def prepareList(month_most_common_list):
    '''Prepare the input for processing by removing all unnecessary
    values: every "NA" entry is dropped.'''
    output_list = []
    for x in month_most_common_list:
        if x != 'NA':
            output_list.append(x)
    # The slide ends before the return statement; the caller slices the
    # result, so the cleaned list has to be returned.
    return output_list
Appendix 1 contd.
def plotSolarRadiationAgainstMonth(filename):
    month_most_common_list = []
    Solar_radiation_64_list = []
    # Python 3: open in text mode with newline='' (the slide used 'rb',
    # which is the Python 2 form).
    with open(filename, newline='') as csvfile:
        trainRowReader = csv.reader(csvfile, delimiter=',')
        for row in trainRowReader:
            month_most_common_list.append(row[3])
            Solar_radiation_64_list.append(row[6])
    # Convert all elements in the list to float while skipping the first
    # element, because the 1st element is a description of the field.
    month_most_common_list = [float(i) for i in prepareList(month_most_common_list)[1:]]
    Solar_radiation_64_list = [float(i) for i in prepareList(Solar_radiation_64_list)[1:]]
    fig = Figure()
    ax = fig.add_subplot(111)
    title = 'Scatter Diagram of solar radiation against month of the year'
    ax.set_xlabel('Most common month')
    ax.set_ylabel('Solar Radiation')
    fig.suptitle(title, fontsize=14)
    try:
        # The slide is cut off after "try:". The body below is an assumption
        # modelled on Appendix 3; see the plotCode folder for the original.
        ax.scatter(month_most_common_list, Solar_radiation_64_list)
    except ValueError:
        pass
Appendix 2
From the project in the folder
named WebMonitor

class LoadEvent:

    def fillMonitorModel(self):
        # Build one Monitor row per collected record.
        for monObj in self.monitorObjList:
            mObj = Monitor(url=monObj[2],
                           httpStatus=monObj[0],
                           responseTime=monObj[1],
                           contentStatus=monObj[5])
            # The slide is cut off here; saving the row is assumed.
            mObj.save()
Appendix 3
from django.contrib.admin.views.decorators import staff_member_required
from django.http import HttpResponse
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure
from YAAS.stats.models import RegisteredUser, OnlineUser, StatBid

# Scatter diagram of number of bids made against number of online users
# (weekly report).
@staff_member_required
def weeklyScatterOnlinUsrBid(request, week_no):
    page_title = 'Weekly Scatter Diagram based on Online user versus Bid'
    weekno = week_no
    fig = Figure()
    ax = fig.add_subplot(111)
    # "stat" is defined elsewhere in the YAAS project (not shown on the slide).
    year = stat.getYear()
    onlUserObj = OnlineUser.objects.filter(week=weekno).filter(year=year)
    bidObj = StatBid.objects.filter(week=weekno).filter(year=year)
    onlUserlist = list(onlUserObj.values_list('no_of_online_user', flat=True))
    bidlist = list(bidObj.values_list('no_of_bids', flat=True))
    title = 'Scatter Diagram of number of online User against number of bids (week {0}){1}'.format(weekno, year)
    ax.set_xlabel('Number of online Users')
    ax.set_ylabel('Number of Bids')
    fig.suptitle(title, fontsize=14)
    try:
        ax.scatter(onlUserlist, bidlist)
    except ValueError:
        # Raised e.g. when the two lists differ in length; leave the axes empty.
        pass
    # The slide is cut off here. Returning the figure as a PNG is an assumed
    # ending, suggested by the HttpResponse and FigureCanvas imports.
    canvas = FigureCanvas(fig)
    response = HttpResponse(content_type='image/png')
    canvas.print_png(response)
    return response
Appendix 4
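# Example of how old database rows may be deleted to recover some space.
# From the folder named "YAAS"; see task.py.
from datetime import datetime, timedelta
# Celery imports are assumed (not on the slide); paths as in the Celery
# releases of the time.
from celery.schedules import crontab
from celery.task import periodic_task

@periodic_task(run_every=crontab(hour=1, minute=30, day_of_week=0))
def deleteOldItemsandBids():
    # Cut-off date: 120 days before today.
    hunderedandtwentydays = datetime.today() - timedelta(days=120)
    myItem = Item.objects.filter(end_date__lte=hunderedandtwentydays).delete()
    myBid = Bid.objects.filter(end_date__lte=hunderedandtwentydays).delete()

# populate the registereduser and onlineuser model at regular intervals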
Appendix 5

Check project in
YAAS/stats/

for more information on
statistical processing
Appendix 6
 # How to refresh the views in Django to keep the charts
 # updated. See the WebMonitor project.

 {% extends "base.html" %}

 {% block site_wrapper %}
 <div id="messages">Updating tables ...</div>
 <script>
      function refresh() {
          $.ajax({
              url: "/monitor/",
              success: function(data) {
                  $('#messages').html(data);
              }
          });
      }
      // Poll every 100000 ms.
      setInterval(refresh, 100000);
 </script>
 {% endblock %}
References
 • Python documentation ( http://www.python.org/ )
 • Django documentation ( https://www.djangoproject.com/ )
 • Stack Overflow ( http://stackoverflow.com/ )
 • Celery documentation ( http://ask.github.com/celery/ )


Pictures
 • email logo ( http://ambrosedesigns.co.uk )
 • blog logo ( http://sociolatte.com )
Thanks for listening
           Follow me using any of


                    @kenluck2001

                    kenluck2001@yahoo.com


                    http://kenluck2001.tumblr.com/

                    https://github.com/kenluck2001
