Pithos - Architecture and .NET Technologies


Architecture and technologies used in the Windows client for Pithos, GRNET's cloud storage service

Published in: Technology

  1. PITHOS: Architecture and .NET Technologies. Panagiotis Kanavos
  2. Object storage similar to Amazon S3 and Azure Blob storage. A service of Synnefo. Written in Python. Clients for Web, Windows, iOS, Android, Linux.
  3. Synnefo
  4. Client API: REST API based on the OpenStack Object Storage API. Accounts and containers, without folders. GET for data and object info. PUT and POST for uploads and data updates.
  5. Structure
  6. API Extensions: block storage; partial upload/download; permissions and versions; metadata queries; UUIDs for object IDs; object updates (copy, move).
  7. API Characteristics: No folders! Placeholder directory objects hold metadata. Block updates ONLY. Merkle hashing to detect modified blocks. Hashes use SHA-256.
  8. Merkle Hashing (diagram): a top hash over two intermediate hashes (labelled "Hash of #1-2 Hashes" and "Hash of #2-3 Hashes" on the slide), built from one hash per block for Block #1 to Block #4.
  9. Download Process: (1) Get hashmap from server. (2) Calculate local hashmap. (3) Find different blocks. (4) Download blocks. (5) Patch local file with blocks.
  10. Upload Process: (1) Calculate local hashmap. (2) PUT to server. (3) Server responds with missing block hashes. (4) PUT missing blocks at container level. (5) Repeat from #2. (6) Server responds 201.
  11. Pithos Client: Multiple accounts per machine. Synchronize a local folder to a Pithos account. Detect local changes and upload. Detect server changes and download. Calculate a Merkle hash for each file.
  12. Client Architecture: UI (WPF, MVVM, Caliburn.Micro); Core (File Agent, Poll Agent, Network Agent, Status Agent); Networking (CloudFiles, HttpClient); Storage (SQLite, SQL Server Compact).
  13. Technologies: .NET 4, required for Windows XP support. Visual Studio 2012 + Async Targeting Pack. UI: Caliburn.Micro. Concurrency: TPL, Parallel, Dataflow. Network: HttpClient. Hashing: OpenSSL (faster than the native provider). Storage: NHibernate, SQLite/SQL Server Compact. Logging: log4net.
  14. The challenges: Handle hundreds of file events. Hashing of many large files. Multiple slow connections to the server. Unstable network. Yet it shouldn’t hang. Minimal UI with enough info for the user.
  15. Event Handling: File Agent (listen, wait for idle, process each file for hashes); Poll Agent (get server files, compare hashes, identify changes); Network Agent, the uploader/downloader (queue requests, network ops).
  16. Event Handling (2): Use producer/consumer. Store events in a ConcurrentQueue. Process ONLY after an idle timeout.
  17. Merkle Hashing: Why I hate Game of Thrones. Asynchronous reading of blocks. Block hashing in parallel. Use OpenSSL to gain SSE2 etc. Concurrency throttling. Watch the memory consumption!
  18. Memory Leaks in a Managed Environment! 4 MB blocks? Large memory, but ... Quickly reading 2 GB in 64 KB blocks? Downloading 600 MB in x KB blocks? A huge number of small objects await collection during CPU/IO-intensive processing. The poor garbage collector can’t keep up!
  19. Hashing: 100% CPU? Multicore is nice, but ... it blocks the system when processing large files! Throttle parallel block hash ops. Improvements: dynamically throttle "large" files; "throttling" of file read ops.
  20. Multiple slow network calls: Every call is a Task. Concurrent REST calls per account and shared folder. Task.WhenAll to process results at the end of a poll.
  21. Unstable network: Use System.Net.Http.HttpClient. Store downloaded blocks in a .pithos.cache folder. Check and reuse orphans. Asynchronous retry of calls.
  22. Resistance to crashes: Use Transactional NTFS where available (thanks, MS, for killing it!). Otherwise, modify a copy and use File.Replace.
  23. Should not hang: Use independent agents. Asynchronous operations wherever possible. Use async/await for more readable code. Must always .ConfigureAwait(false)! BE CAREFUL with async void.
  24. Minimal UI: Use WPF and MVVM. Use Progress to update the UI (part of .NET 4.5, backported to 4). The icon is the shell! Lack of a good WPF notification icon. Problematic data binding in menus.
  25. SQLite or Compact CE? Initially SQLite -> staleness problems (DUH!). Write-ahead logging means you can see stale data. Switched to SQL Compact to allow concurrent updates (duh?). Really needed better caching? Akavache? A document DB is better suited.
  26. Next Steps: File Manager UI. General cleanup (DUH!). Bring back unit tests (duh?). Mock server: WebAPI? scriptcs? Yumm! Create a separate Pithos library. Windows RT and Windows Phone clients, AFTER the cleanup.
  27. Links for Pithos: Pithos trial http://pithos.okeanos.io; Synnefo Documentation http://www.synnefo.org/docs/synnefo/latest/index.html; Pithos API Documentation http://www.synnefo.org/docs/pithos/latest/index.html; Pithos Windows Client https://code.grnet.gr/projects/pithos-ms-client
  28. Useful Links: Parallel FX Team blog http://blogs.msdn.com/b/pfxteam; Caliburn.Micro http://caliburnmicro.codeplex.com/; Ayende’s BufferPool http://ayende.com/blog/4827/answer-stopping-the-leaks
  29. Useful Books: C# 5 in a Nutshell, O’Reilly; Parallel Programming with .NET, Microsoft; Pro Parallel Programming with C#, Wiley; Concurrent Programming on Windows, Pearson; The Art of Concurrency, O’Reilly.
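
Slides 7 and 8 describe hashing a file block by block with SHA-256 and combining the block hashes into a single top hash. The following is a minimal Python sketch of that idea, not the client's actual C# code. The 4 MB default block size, the zero-padding of odd levels, and the function names are assumptions made for illustration; the real Pithos hashmap rules may differ.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size, for illustration only


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def top_hash(hashes):
    """Combine block digests pairwise until a single digest remains."""
    if not hashes:
        return hashlib.sha256(b"").digest()
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2:
            # Assumption: pad an odd level with an all-zero digest.
            level.append(b"\x00" * 32)
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Because a change to one block changes only that block's hash and the hashes above it, comparing two lists of block hashes is enough to tell which blocks differ.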
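
The download process on slide 9 (compare hashmaps, fetch only the differing blocks, patch the local file) could look roughly like the sketch below. It works on in-memory bytes to stay self-contained. `fetch_block` is a hypothetical callable standing in for the ranged download, and all names are illustrative.

```python
import hashlib


def hash_blocks(data, block_size):
    """Hex SHA-256 of each fixed-size block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def differing_blocks(local_hashes, server_hashes):
    """Indices of server blocks that are missing or different locally."""
    return [
        i for i, h in enumerate(server_hashes)
        if i >= len(local_hashes) or local_hashes[i] != h
    ]


def patch(local, server_size, block_size, indices, fetch_block):
    """Resize the local copy to the server size and overwrite changed blocks."""
    buf = bytearray(local[:server_size].ljust(server_size, b"\x00"))
    for i in indices:
        block = fetch_block(i)  # hypothetical download of block i
        start = i * block_size
        buf[start:start + len(block)] = block
    return bytes(buf)
```

Only the blocks returned by `differing_blocks` cross the network; everything else is reused from the local file.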
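
The upload loop on slide 10 can be sketched against a small in-memory stand-in for the server. `FakeServer`, its method names, and the 409 status used for "blocks missing" are assumptions for illustration; the slide itself only states that the final response is 201.

```python
import hashlib


class FakeServer:
    """Hypothetical in-memory stand-in for the block store."""

    def __init__(self):
        self.blocks = {}   # hash -> block bytes
        self.objects = {}  # object name -> list of block hashes

    def put_hashmap(self, name, hashes):
        missing = [h for h in dict.fromkeys(hashes) if h not in self.blocks]
        if missing:
            return 409, missing  # assumed "missing blocks" status
        self.objects[name] = list(hashes)
        return 201, []

    def put_block(self, data):
        self.blocks[hashlib.sha256(data).hexdigest()] = data


def upload(server, name, data, block_size, max_rounds=5):
    """PUT the hashmap, send whatever the server lacks, repeat until 201."""
    blocks, hashes = {}, []
    for i in range(0, len(data), block_size):
        chunk = data[i:i + block_size]
        digest = hashlib.sha256(chunk).hexdigest()
        hashes.append(digest)
        blocks[digest] = chunk
    sent = 0
    for _ in range(max_rounds):
        status, missing = server.put_hashmap(name, hashes)
        if status == 201:
            return sent  # number of blocks actually transferred
        for digest in missing:
            server.put_block(blocks[digest])
            sent += 1
    raise RuntimeError("upload did not converge")
```

Blocks the server already holds, for example from an earlier version of the file, are never sent again.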
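
Slide 16 says file events are queued and processed only after an idle timeout. A minimal sketch of the consumer side, using Python's `queue.Queue` in place of .NET's ConcurrentQueue, is shown below. Collapsing repeated events for the same path is an added assumption, not something the slide states.

```python
import queue


def collect_until_idle(events, idle_timeout):
    """Block for the first event, then keep draining the queue until no
    new event arrives within idle_timeout seconds."""
    batch = {}
    path, kind = events.get()  # wait for activity to start
    batch[path] = kind
    while True:
        try:
            path, kind = events.get(timeout=idle_timeout)
        except queue.Empty:
            return batch  # the burst is over; safe to process
        # Assumption: only the latest event per path matters.
        batch[path] = kind
```

A file-watcher thread would `put((path, kind))` on the queue, while a worker loops over `collect_until_idle` and hashes each file in the returned batch once.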
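
Slides 17 to 19 call for hashing blocks in parallel while throttling concurrency and watching memory. One way to sketch this in Python is a thread pool plus a semaphore that caps how many blocks are read but not yet hashed. The function name and the `max_in_flight` parameter are illustrative assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from threading import BoundedSemaphore


def hash_blocks_throttled(stream, block_size, max_in_flight=4):
    """Hash a stream block by block, keeping at most max_in_flight
    blocks in memory at any time."""
    slots = BoundedSemaphore(max_in_flight)

    def work(block):
        try:
            return hashlib.sha256(block).hexdigest()
        finally:
            slots.release()  # free a slot so the reader may continue

    futures = []
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        while True:
            slots.acquire()  # throttle reads, not just hashing
            block = stream.read(block_size)
            if not block:
                slots.release()
                break
            futures.append(pool.submit(work, block))
    # Futures are kept in submission order, so hashes match block order.
    return [f.result() for f in futures]
```

Throttling the reader as well as the hashers is what bounds memory: without it, a fast disk can queue blocks far faster than they are hashed.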
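
Slide 20 starts one task per REST call and uses Task.WhenAll to collect the results at the end of a poll. The closest Python analogue is `asyncio.gather`, sketched below. `list_container` is a hypothetical coroutine standing in for the real listing call.

```python
import asyncio


async def poll_containers(list_container, names):
    """Start every listing call at once and wait for all of them."""
    results = await asyncio.gather(
        *(list_container(name) for name in names),
        return_exceptions=True,  # one slow or failed call must not sink the poll
    )
    listings, failures = {}, {}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            failures[name] = result
        else:
            listings[name] = result
    return listings, failures
```

The total poll time is then close to the slowest single call rather than the sum of all calls.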
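
The "asynchronous retry of calls" on slide 21 can be sketched as a small helper that retries a failing coroutine without blocking other work. The attempt count, the linear backoff, and the choice to retry on `OSError` are assumptions for illustration.

```python
import asyncio


async def retry_async(operation, attempts=3, base_delay=0.5):
    """Await operation(), retrying on network-style errors."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except OSError:  # includes ConnectionError and TimeoutError
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            # Sleeping asynchronously lets other tasks keep running.
            await asyncio.sleep(base_delay * attempt)
```

Combined with a cache of already-downloaded blocks, a retry only has to repeat the call that failed, not the whole transfer.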
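
The fallback on slide 22 (modify a copy, then swap it in with File.Replace) has a close Python analogue in `os.replace`. The sketch below writes the new content to a temporary file in the same directory and then replaces the target in one step. The function name is illustrative.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write data to a temporary sibling file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory, so the final rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        os.replace(tmp_path, path)  # overwrites the target in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies while writing, the original file is untouched and only a stray temporary file is left behind.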