
Road Trip To Component

The document discusses the transition to Clojure for handling technical challenges in data processing, including importation, analysis, and storage of diverse data formats. Key reasons for the switch include the advantages of JVM, performance improvements, and enhanced data representation capabilities. It highlights organizational strategies for Clojure applications, managing state, and using lifecycle management libraries, along with some difficulties faced in implementation.


1.
Road Trip to Component Marketa Adamova
2.
NOMNOM INSIGHTS
3.
Technical challenges • Import large amounts of data in various formats • Process, analyse and store data • Present data to user • Fast search and stats on data
6.
Reasons for change • Missing NLP libraries • Better data storage • Performance issues • Prototype app
7.
Taking the Clojure turn
8.
Why Clojure? • JVM • Java Interop • Concurrent processing • Data representation • Fun to write
9.
Lukasz Korecki (CTO at NomNom) “We moved to Clojure because of JVM and we stayed for everything else.”
12.
Main road blockers • Correct JVM setup • Application structure • Managing shared application state
13.
How to structure Clojure application
14.
Organising Clojure code • Namespaces • Extract code to libraries • Protocols
15.
Protocol & types • Mechanism for abstraction • Polymorphism • Boundaries between subsystems
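A minimal sketch of the protocol + type + constructor pattern this slide refers to. The names `DocumentStore`, `InMemoryStore`, `save-doc` and `find-doc` are hypothetical and chosen only for illustration; they are not taken from the NomNom codebase.

```clojure
;; Hypothetical protocol: the boundary between processing code and storage.
;; A protocol is a specification only; it contains no implementation.
(defprotocol DocumentStore
  (save-doc [store doc] "Return a store that contains doc.")
  (find-doc [store id] "Return the doc with the given id, or nil."))

;; One implementation. A record also behaves like a hash-map,
;; so assoc-in works on it and returns a record of the same type.
(defrecord InMemoryStore [docs]
  DocumentStore
  (save-doc [store doc]
    (assoc-in store [:docs (:id doc)] doc))
  (find-doc [_ id]
    (get docs id)))

;; Constructor function: builds the record and performs no side effects.
(defn in-memory-store []
  (->InMemoryStore {}))

(def store (save-doc (in-memory-store) {:id 1 :text "great product"}))

(find-doc store 1)
;; => {:id 1, :text "great product"}
```

Callers depend only on `save-doc` and `find-doc`, so a different record (for example one backed by a database) could be swapped in without changing them.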
18.
Handling state in Clojure
19.
Mutable state • https://clojure.org/about/state • State = value associated with identity at given time • Memory cache, concurrent programming, … • Atoms, refs, agents
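The identity/state distinction can be illustrated with an atom. This is a small sketch using only clojure.core; the `cache` name and its contents are made up for the example.

```clojure
;; The identity: a cache whose state changes over time.
(def cache (atom {}))

;; A state is an immutable value; deref (@) returns a snapshot of it.
(def before @cache)

;; swap! atomically applies a pure function to the current value.
(swap! cache assoc :user-42 {:name "Ada"})

(def after @cache)

;; `before` is still the empty map. The value did not change;
;; the identity `cache` now refers to a new value.
before
;; => {}
after
;; => {:user-42 {:name "Ada"}}
```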
20.
Shared state • Accessible from various namespaces • Open connections and channels • Globally accessible configuration • Mutability not required during runtime
21.
Application configuration • Function to load env variable • Configuration in single atom
24.
Application configuration • Function to load env variable • Configuration in single atom • Config library • Mount • Component
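A rough sketch of what the first two bullets could look like: a function that reads an environment variable and a single atom holding the configuration. The variable names `DB_URL` and `RABBITMQ_URL`, the default values and the function names are assumptions made for illustration.

```clojure
;; Read an environment variable, falling back to a default value.
(defn env
  [var-name default]
  (or (System/getenv var-name) default))

;; All configuration lives in one atom; it stays empty until load-config! runs.
(defonce config (atom {}))

(defn load-config!
  "Populate the config atom. Intended to be called once at application startup."
  []
  (reset! config {:db-url     (env "DB_URL" "jdbc:postgresql://localhost/dev")
                  :rabbit-url (env "RABBITMQ_URL" "amqp://localhost")}))

(load-config!)
(:db-url @config)
```

Loading inside a function matters because top-level forms are evaluated whenever the namespace is loaded, which includes ahead-of-time compilation. A top-level `(def db-url (System/getenv "DB_URL"))` would therefore read the environment at load time rather than at application start.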
28.
Stuart Sierra’s Component
29.
What is ‘Component’ • managing lifecycle and dependencies of components with runtime state • db access, external API services, web server • system of components
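A small sketch of a system of two components, assuming the `com.stuartsierra/component` library is on the classpath. The `Database` and `Worker` records, their fields and the connection placeholder are hypothetical; a real component would open and close actual resources in `start` and `stop`.

```clojure
(require '[com.stuartsierra.component :as component])

(defrecord Database [url connection]
  component/Lifecycle
  (start [this]
    ;; A real component would open a connection here;
    ;; the keyword :open stands in for it.
    (assoc this :connection :open))
  (stop [this]
    (assoc this :connection nil)))

;; Constructor fn: no state changes, only data.
(defn new-database [url]
  (map->Database {:url url}))

(defrecord Worker [database running?]
  component/Lifecycle
  (start [this] (assoc this :running? true))
  (stop [this] (assoc this :running? false)))

(defn new-worker []
  (map->Worker {}))

;; The whole runtime state is declared in one place.
(defn new-system [config]
  (component/system-map
    :database (new-database (:db-url config))
    :worker   (component/using (new-worker) [:database])))

;; Dependencies start first and are injected before the worker starts.
(def system
  (component/start (new-system {:db-url "jdbc:postgresql://localhost/dev"})))

(get-in system [:worker :database :connection])
;; => :open

;; Components are stopped in reverse dependency order.
(def stopped (component/stop system))
```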
34.
How it solves our problem • Enforcing structure in code • State defined in single place • Better visibility of system
35.
Testing • Mock components • Integration tests for complex flow • E2E test
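One way a mock component could replace a real client in a test system, again assuming the `com.stuartsierra/component` library. `FeedbackClient`, `MockClient`, `Importer` and `run-import!` are hypothetical names; the sketch relies on Component's default no-op lifecycle for the mock, which does not implement `Lifecycle` itself.

```clojure
(require '[com.stuartsierra.component :as component])

(defprotocol FeedbackClient
  (fetch-feedback [client] "Return a sequence of feedback items."))

;; Stub implementation: no network access, canned data only.
(defrecord MockClient [items]
  FeedbackClient
  (fetch-feedback [_] items))

(defrecord Importer [client imported]
  component/Lifecycle
  (start [this] (assoc this :imported (atom [])))
  (stop [this] this))

;; The code under test only talks to the protocol, never to a concrete client.
(defn run-import! [{:keys [client imported]}]
  (swap! imported into (fetch-feedback client)))

;; A fresh system per test; the real client is swapped for the mock.
(defn test-system []
  (component/system-map
    :client   (->MockClient [{:id 1 :text "love it"}])
    :importer (component/using (map->Importer {}) [:client])))

(def imported-items
  (let [system (component/start (test-system))]
    (try
      (run-import! (:importer system))
      @(:imported (:importer system))
      (finally
        (component/stop system)))))

imported-items
;; => [{:id 1, :text "love it"}]
```

Because each test builds and starts its own system, tests do not share state, and no `with-redefs` is needed to intercept the client.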
40.
REPL interaction • Define development system • Multiple systems in single JVM • No need to restart REPL
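A sketch of a development system held in a var and restarted from the REPL, assuming the `com.stuartsierra/component` library. The `Scheduler` component and the helper names `go`, `halt` and `restart` are illustrative; many projects combine this pattern with a namespace-reloading tool, which is left out here.

```clojure
(require '[com.stuartsierra.component :as component])

(defrecord Scheduler [started?]
  component/Lifecycle
  (start [this] (assoc this :started? true))
  (stop [this] (assoc this :started? false)))

;; The development system; mocked components would be swapped in here.
(defn dev-system []
  (component/system-map
    :scheduler (map->Scheduler {})))

;; Holds the currently running system, or nil before the first start.
(def dev nil)

(defn go
  "Build and start a fresh development system."
  []
  (alter-var-root #'dev (fn [_] (component/start (dev-system)))))

(defn halt
  "Stop the current system, if there is one."
  []
  (alter-var-root #'dev (fn [s] (when s (component/stop s)))))

(defn restart
  "Stop and start again without restarting the JVM."
  []
  (halt)
  (go))
```

At the REPL, `(go)` starts the system and `(restart)` replaces it with a fresh instance, so the JVM and the REPL session stay alive between changes.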
42.
Production • Avoid accessing production system !! • Visualise system & strong subsystem boundaries • Debugging • Add ad hoc components when required
44.
The bad parts… • “all or nothing” • Failures during system startup • Trying to use “wrappers” • Integration with other libraries • OO approach
45.
End of journey
46.
Resources https://clojure.org/index https://github.com/stuartsierra/component https://github.com/tolitius/mount http://www.joyofclojure.com http://thinkrelevance.com/blog/2013/11/07/when-should-you-use-clojures-object-oriented-features https://purelyfunctional.tv/issues/clojure-gazette-180-how-do-you-structure-your-apps/ https://cb.codes/organizing-clojure-projects-and-libraries/ … and lots of other Clojure talks, articles and discussions
47.
Questions?

Editor's Notes

#2 Welcome to the talk; introduce myself; quick overview of what the talk will be about
#3 Intro to NomNom: a place to gather all your customer feedback …. explain what it does … founded April 2015, live November 2016; some of our customers: Usertesting.com, Wix, Magento Analytics, RJMetrics, Sumologic etc.; very small team of 7 people, distributed across 2 continents & 5 countries
#4 Data import: each API is different; the amount of data. TODO: add stats about traffic (number of connected integrations, number of requests we make). Analyse & process: keep useful data. Present data: UX is important; maybe ClojureScript in future :) Search data: TODO add stats about number of performance + search query time
#5 Easier to understand with visual input: play video. Search by keywords, by the source of customer feedback and even by the type of user who left the feedback
#6 Original data model (simplified version): users connect integrations (OAuth, API keys) and we download data on their behalf; data is stored in both PG and ES: query ES and retrieve docs from PG; in production we have all the extra infra stuff (RabbitMQ, Redis, webhooks, schedulers, load balancers, StatsD, S3)
#7 NLP libraries: not much support compared to Python or Java. Data model: Postgres + JSONB; turns out that frequently creating and updating millions of JSON objects can put a lot of strain on the database. Performance: Ruby is slow, single threaded … bad for async/concurrent processing; growing number of integrations/traffic. Prototype: how long to keep your original data model? Rails serves well in areas you don’t want to worry about (user management, billing)
#8 Right tool for the job! Migration is not a single step but rather many small ones!
#9 JVM: battle tested, easy to get running quickly. Java interop: NLP libraries. Concurrency: multithreading without much overhead. Data: immutability. Fun: who wants to write Java? Newbies: no previous experience in the team of writing web servers in Clojure; you don’t have to care that much about the real definition of pure functions, monads etc.; code is easy to write and reason about, and has great performance
#10 TODO spelling transducers
#11 Migration 1 - few NLP services
#12 Migration 2: main data processing/storing model in Clojure; move document storage to RethinkDB. Around this time I joined. Next challenge was to move the workers. As we started adding more services we started to see common issues
#13 JVM: crashed on first startup; over-provisioned machines; misconfigured thread pool settings in Jetty; mixed high CPU with a lot of IO; we had a memory leak due to a bug in regexes (different syntax than in Ruby); you need at least basic tuning in place. Structure: no hand-holding like in Rails; not many books around on how to structure large applications; people new to Clojure. State: why we can’t get away with pure functions; larger applications in production need state; the Clojure approach to the problem
#14 Structure your code, structure the business logic. Stuart Halloway said: ‘If your application is more than 2 weeks old, your biggest problem is the complexity of your code’
#15 Namespaces: a specific set of related data & functions; separate functionality by comment blocks. Libraries: extract code to Clojure libraries. Protocols: create boundaries in the system
#16 Protocol: specification only, no implementation; polymorphic functions + protocol object; a single type can implement multiple protocols. Dynamic polymorphism: dynamic, no compile effect; generates an interface with the same ns; a function can be used on multiple data types or behave differently based on an additional argument; dispatch on class type (90% of use cases of multimethods); multimethod = runtime polymorphism (dispatch on function); higher level abstraction/organization. deftype vs defrecord => a record gives you a hash-map
#17 Protocol & type & constructor; deftype vs defrecord => a record gives you a hash-map
#18 Protocol & type & constructor; deftype vs defrecord => a record gives you a hash-map
#19 State = value at a point in time. I’ll be talking here about mutable state & shared state
#20 In the past: state = the content of this memory block. Identity = has a state, exactly one at any point in time; the state does not change, but the identity can have a new state! This is the Clojure model (Rich Hickey). Why we need state. Clojure has a great solution
#21 GMS = global & mutable (it can be, but do we need it?). Shared state = accessible from various namespaces. Examples. Mutability: usually defined on application startup and then not required to change!
#22 First attempt at handling the shared config …
#25 Library: avoids code duplication; issues: compilation vs runtime value (TODO add code), opening too many channels to RabbitMQ. Mount: more Clojure-like approach, only solving one problem. Component: was a good fit as we were still writing many new services (explain later why full buy-in is required)
#26 Back to our application … Multiple Clojure services: service oriented rather than microservices; 90% of data processing done in Clojure; to be migrated (legacy integrations from Ruby); again the additional infra setup is not visible. Topic for another talk: overhead/advantage of managing multiple services rather than a single monolith
#27 A single Clojure service does a lot of stuff: workers = logic; clients = connect to 3rd parties, fetch/store/send data; initiate jobs (web server & schedulers). How to structure, configure and manage
#30 'Component' is a tiny Clojure framework for managing the lifecycle and dependencies of software components which have runtime state. This is primarily a design pattern with a few helper functions. It can be seen as a style of dependency injection using immutable data structures. All stateful parts are gathered together (rather than scattered atoms); group together related entities; good REPL
#31 Protocol & type & constructor; deftype vs defrecord => a record gives you a hash-map
#32 The 3 parts: protocol = define your interface; protocol + type + constructor fn; no state changes in constructor fn
#35 Structure: using protocols, people can follow the pattern + boundaries. Visibility: system map of components, libraries for visualisation (too many cross dependencies). State: configuration + open connections/channels
#36 Mock = stub implementation (better than redefs). The cost of creating and starting a system is low enough. A mock component can do real things. E2E: set up a test system with a mix of real & test components
#37 ;; simplify setup ;; start in dependencies
#39 ;; simplify setup ;; start in dependencies New instance of system for each test
#40 ;; simplify setup ;; start in dependencies New instance of system for each test
#41 Dev system: system map of components (replace those which need to be mocked). Multiple systems: test two versions. REPL restart = JVM restart (takes long); components make it unnecessary => rapid development cycle
#43 Access production = bad idea, as you can reset the whole system. You know what state is running in each component. Debugging = nREPL. Ad hoc components = data migration, long-running job
#46 Use the right tool for the right job; async processing is great; use the approach which suits you
#47 TODO
