Using Django for a scientific document analysis (web) application


Published in: Technology

  • By open standards and open access I mean that the data are accessible to 'all clients', not only to 'AmCAT' or Python scripts, by means of open standards such as SQL, RDF, HTTP, XML, JSON, etc. That way a researcher can write their own script in R, Perl, etc. that communicates with AmCAT.
  • Here a codebook can also be called a taxonomy / ontology / etc.

    1. AmCAT3: Using Django for a scientific document analysis website: Tastypie, unit tests, R, open platforms and open questions. Wouter van Atteveldt (VU Amsterdam)
    2. 2. AmCAT What is AmCAT? Design considerations Open data and the publication cycle Tables, TastyPie, and R Unit tests
    3. What is AmCAT? Document management and analysis, aimed at the social sciences and humanities. Input: scraping, uploading. Management: projects, selections. Analyses: keyword analysis, linguistic processing (lemmatizing etc.), manual annotation. Open source, open standards, open access.
    4. Design choices. Default Django: a web site backed by a database. AmCAT: a database with a web front end.
    5. Design choices. Default Django: a web site backed by a database. AmCAT: a database with a web front end. Data should be accessible from outside. The ORM should be usable without web site code. The DB should be the final authority for authentication/authorisation.
    6. Design choices. Separate apps for business and presentation. Custom authentication middleware and user management: save() and update() with using=; database-specific code for creating users; we don't actually like this too much... All data and methods (should be) exposed through the web service API.
    7. Open data and publication cycle. [Diagram; recoverable labels: AmCAT Navigator (web site); REST API (web service); ORM (Django); relational DB; SPARQL endpoint; external scripts (Python, R, ...)]
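The "external scripts" box in the diagram above can be illustrated with a small client. This is only a sketch under assumptions: the host name, the `article` resource, and the `project` filter are hypothetical (the slides do not give AmCAT's URL layout), and a canned payload in Tastypie's default list envelope (`meta` plus `objects`) stands in for a live HTTP call.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not state AmCAT's real API root.
API_ROOT = "https://amcat.example.org/api/v1"


def list_url(resource, limit=20, offset=0, **filters):
    """Build a Tastypie-style list URL with paging and filter parameters."""
    params = {"format": "json", "limit": limit, "offset": offset}
    params.update(filters)
    # Sort the parameters so the resulting URL is deterministic.
    return f"{API_ROOT}/{resource}/?{urlencode(sorted(params.items()))}"


def parse_list(payload):
    """Split a Tastypie list response into (rows, total_count)."""
    doc = json.loads(payload)
    return doc["objects"], doc["meta"]["total_count"]


# Canned response standing in for a live call; the field names in
# "objects" are invented for illustration.
SAMPLE = json.dumps({
    "meta": {"limit": 2, "offset": 0, "total_count": 3,
             "next": None, "previous": None},
    "objects": [{"id": 1, "headline": "First"},
                {"id": 2, "headline": "Second"}],
})

rows, total = parse_list(SAMPLE)
```

Because the client only builds a URL and reads JSON, the same access pattern carries over to R, Perl, or any other language with an HTTP library.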
    8. Open access publication cycle. [Diagram with three stages: Source, Analysis, Publication; recoverable labels: DANS/AmCAT3; (Linked) data; R, MATLAB, ...; e.g. Sweave + LaTeX; PDF + hyperlinks; Web service + data link from site; Structured data?; Links back to]
    9. Tastypie + DataTables. Django model-based REST API; jQuery DataTables with an AJAX call. The good news: it works; unified point of entry for tables in the website and in scripts. The bad news: the Tastypie code is horribly redundant (unless we're doing it wrong!).
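One way to picture the glue between the two is a pair of translation functions. This is a sketch, not the AmCAT code: it assumes the DataTables 1.10+ server-side parameter names (`draw`, `start`, `length`, `recordsTotal`, `recordsFiltered`, `data`) and Tastypie's default list envelope; the function names and column names are made up for illustration.

```python
def datatables_to_tastypie(query):
    """Map DataTables paging parameters onto Tastypie's limit/offset."""
    return {
        "limit": int(query.get("length", 10)),
        "offset": int(query.get("start", 0)),
    }


def tastypie_to_datatables(envelope, draw, columns):
    """Reshape a Tastypie list envelope into a DataTables server-side response."""
    total = envelope["meta"]["total_count"]
    return {
        # DataTables expects the request's draw counter echoed back as an int.
        "draw": int(draw),
        "recordsTotal": total,
        # Simplification: no table-side search, so filtered == total.
        "recordsFiltered": total,
        # One row per object, cells ordered like the table's columns.
        "data": [[obj.get(col) for col in columns]
                 for obj in envelope["objects"]],
    }


# Hypothetical data for illustration.
envelope = {
    "meta": {"total_count": 3},
    "objects": [{"id": 1, "headline": "First"},
                {"id": 2, "headline": "Second"}],
}
paging = datatables_to_tastypie({"draw": "4", "start": "20", "length": "10"})
table = tastypie_to_datatables(envelope, "4", ["id", "headline"])
```

Keeping this translation in one place means the table in the web site and an external script both read the same REST resource.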
    10. Unit tests. Web pages are tough to test well. Move as much code as possible from the presentation layer to the business layer: trivial views need less testing; regular Python modules are easy to test. Our choices: put all unit tests in the target module; put more complicated integration tests in a tests/ package.
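The "unit tests in the target module" choice can be sketched with the standard `unittest` module. The `keyword_counts` function is a hypothetical stand-in for business-layer code; it is not taken from AmCAT.

```python
import re
import unittest
from collections import Counter


def keyword_counts(text, keywords):
    """Count case-insensitive whole-word occurrences of each keyword."""
    words = Counter(re.findall(r"\w+", text.lower()))
    return {kw: words[kw.lower()] for kw in keywords}


# The tests live in the same module as the code they exercise.
class TestKeywordCounts(unittest.TestCase):
    def test_counts_are_case_insensitive(self):
        self.assertEqual(
            keyword_counts("Tax cuts and tax hikes", ["tax"]), {"tax": 2})

    def test_missing_keyword_counts_zero(self):
        self.assertEqual(keyword_counts("Tax cuts", ["budget"]), {"budget": 0})


def run_module_tests():
    """Run this module's tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestKeywordCounts)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Since the function has no dependency on views or requests, the tests need no web client or database fixture.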
    11. Bonus slide: Plugins. Django (model)forms as the interface description for plugins. Plugins are callable from the web site, as a web service, and from the CLI. Single point of entry for actions (relation with REST data modification?).
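The idea of one interface description serving several entry points can be sketched without Django. Note the substitution: a plain dataclass stands in here for the Django (model)form the slide describes, so that the example runs on the standard library alone. The `lemmatize` plugin, its options, and all function names are hypothetical.

```python
import argparse
from dataclasses import MISSING, dataclass, fields


@dataclass
class LemmatizeOptions:
    """Hypothetical option set; stands in for a Django form."""
    project: int
    language: str = "nl"


def build_parser(spec):
    """Derive a command-line parser from the option description."""
    parser = argparse.ArgumentParser(prog=spec.__name__)
    for f in fields(spec):
        required = f.default is MISSING
        parser.add_argument(f"--{f.name}", type=f.type, required=required,
                            default=None if required else f.default)
    return parser


def from_mapping(spec, data):
    """Build options from a dict, e.g. decoded web-service parameters."""
    kwargs = {f.name: f.type(data[f.name])
              for f in fields(spec) if f.name in data}
    return spec(**kwargs)  # raises TypeError if a required field is missing


def run_lemmatize(options):
    """Single point of entry for the (hypothetical) action."""
    return f"lemmatizing project {options.project} ({options.language})"


def run_from_cli(argv):
    """CLI entry point: parse arguments, then call the shared action."""
    namespace = build_parser(LemmatizeOptions).parse_args(argv)
    return run_lemmatize(LemmatizeOptions(**vars(namespace)))
```

Both `run_from_cli(["--project", "5"])` and `run_lemmatize(from_mapping(LemmatizeOptions, {"project": "5"}))` end up in the same `run_lemmatize` call, which is the "single point of entry" the slide asks for.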