
Share and Share Alike



Using System V Shared Memory in MRI Ruby Projects

Published in: Technology

  1. Share and Share Alike: Using System V shared memory constructs in MRI Ruby projects
  2. Who Am I?
     ● Jeremy Holland
     ● Senior Lead Developer at CentreSource in beautiful Nashville, TN
     ● Math and Algorithms nerd
     ● Scotch drinker
     ● @awebneck, freenode: awebneck, etc.
  3. The Problem
     ● FREAKIN
     ● HUGE
     ● BINARY
     ● TREE
  4. How huge?
     ● Huge. Millions of nodes, each node holding ~500 bytes
     ● i.e. gigabytes of data
     ● K-d tree of non-negligible dimension (varied, around 6-10)
     ● No efficient existing implementation that would serve the purposes needed
       ○ Fast search
       ○ Reasonably fast consistency
  5. Things we considered ...and discarded
     ● Index the tree, persist to disk
       ○ Loading umpteen gigs of data from disk takes a spell.
       ○ Reload it for each query
       ○ WAY TOO SLOW
  6. Things we considered ...and discarded
     ● Index once and hold in memory
       ○ Issues both with maintaining index consistency and balance
       ○ Difficult to share among many processes / threads without duplicating in memory.
  7. Things we considered ...and discarded
     ● DRb
       ○ Simulates memory shared by multiple processes, but not really
       ○ While the interface to search the tree is available to many different processes, the actual search takes place in the single, server-based process
  8. Enter Shared Memory
     ● Benefits
       ○ Shared segment actually accessible by multiple, wholly separate processes
       ○ Built-in access control and permissions
       ○ Built-in semaphores (a sibling System V IPC facility)
     ● Drawbacks
       ○ With great power comes great responsibility
       ○ Acts like a bytearray: manual serialization
  9. Ruby-level memory paradigm vs C-level memory paradigm
     ● Ruby:
       ○ Everything goes on the heap
       ○ Garbage collected: no explicit freeing of memory
     ● C:
       ○ Local vars, function call frames, etc. on the stack
       ○ Explicit allocations on the heap (malloc)
       ○ Explicit freeing of heap: no GC
  10. Ruby
      ● Before start of process
  11. Ruby
      ● Process starts
      ● Heap begins to grow
  12. Ruby
      ● Process runs
      ● Heap continues to grow with additional allocations
  13. Ruby
      ● Process runs
      ● GC frees allocated memory no longer needed...
  14. Ruby
      ● ...so it can be reallocated for new objects
  15. Ruby
      ● Process ends
      ● Heap freed
  16. C
      ● Process starts
      ● Stack grows to hold function call frames, local vars
  17. C
      ● Process runs
      ● Memory is explicitly allocated from the heap in the form of arrays, structs, etc.
  18. C
      ● Process runs
      ● A function is called, and goes on the stack
  19. C
      ● Process runs
      ● The function returns, and is popped off the stack
  20. C
      ● Process runs
      ● The item in the heap, no longer needed, is explicitly freed
  21. C
      ● Process runs
      ● A new array is allocated from the heap
  22. C
      ● Process ends (untidily)
      ● The stack and heap are reclaimed by the OS as free
  23. TRUTH: Ruby itself has no concept of shared memory.
  24. TRUTH: C does.
  25. Shared Memory
      ● A running process (as viewed from the C level)
  26. Shared Memory
      ● A shared segment is created with an explicit size, like allocating an array
  27. Shared Memory
      ● The segment is "attached" to the process at a virtual address
  28. Shared Memory
      ● Yielding to the process a pointer to the beginning of the segment
  29. Shared Memory
      ● A new process starts, wishing to attach to the same segment.
  30. Shared Memory
      ● It asks the OS for the identifier of the segment based on an integer key ("Are you there?" "Yup!")
  31. Shared Memory
      ● ...and attaches it to itself in a fashion similar to the original.
  32. Shared Memory
      ● Both processes can now, depending on permissions, read from and write to the segment simultaneously!
  33. Shared Memory
      ● The first process finishes with the segment and detaches it.
  34. Shared Memory
      ● And thereafter, ends.
  35. Shared Memory
      ● ...leaving only the second process, still attached
  36. Shared Memory
      ● Now, the second process detaches...
  37. Shared Memory
      ● ...and subsequently ends
  38. Shared Memory
      ● Note that the shared segment still persists in memory
      ● Can be reattached to another process with permission to do so
  39. Shared Memory
      ● Later, a new process comes along and explicitly destroys the segment, all processes being finished with it.
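The walkthrough above can be sketched in C roughly as follows. This is a minimal sketch, assuming Linux and an unused key chosen by the caller; `shared_lifecycle_demo` is a hypothetical helper, not something from the talk. A forked child stands in for the "second process": it looks the segment up by key, attaches, writes, detaches and exits, after which the parent shows the segment still persists by attaching, reading, and finally destroying it.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: copies what the child wrote into `out`.
 * Returns 0 on success, -1 on any failure. */
int shared_lifecycle_demo(key_t key, char *out, size_t out_len)
{
    /* First process: create the segment with an explicit size. */
    int shmid = shmget(key, 4096, IPC_CREAT | IPC_EXCL | 0600);
    if (shmid == -1) return -1;

    pid_t pid = fork();
    if (pid == -1) { shmctl(shmid, IPC_RMID, NULL); return -1; }

    if (pid == 0) {
        /* Second process: ask the OS for the identifier by key... */
        int id = shmget(key, 0, 0);
        if (id == -1) _exit(1);
        /* ...attach wherever the system finds room... */
        char *seg = shmat(id, NULL, 0);
        if (seg == (void *)-1) _exit(1);
        strcpy(seg, "hello from the child");
        /* ...then detach and end. */
        shmdt(seg);
        _exit(0);
    }

    int status = 0;
    int rc = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* No process is attached any more, yet the segment persists. */
        char *seg = shmat(shmid, NULL, SHM_RDONLY);
        if (seg != (void *)-1) {
            snprintf(out, out_len, "%s", seg);
            shmdt(seg);
            rc = 0;
        }
    }

    /* Explicitly destroy the segment now that everyone is finished. */
    shmctl(shmid, IPC_RMID, NULL);
    return rc;
}
```

The parent deliberately attaches only after the child has exited, to make the persistence point from slide 38 visible.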
  40. How it's done: Configuration
      ● Precisely how much memory can be drafted into service for sharing purposes is controlled by kernel parameters
        ○ kernel.shmall: the maximum number of memory pages available for sharing (should be at least ceil(shmmax / PAGE_SIZE))
        ○ kernel.shmmax: the maximum size in bytes of a single shared segment
        ○ kernel.shmmni: the maximum number of shared segments allowed.
  41. How it's done: Configuration
      ● To view your current settings:
  42. How it's done: Configuration
      ● Or...
  43. How it's done: Configuration
      ● Setting the values temporarily can be accomplished with sysctl...
  44. How it's done: Configuration
      ● ...or more permanently by editing /etc/sysctl.conf
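The commands shown on these slides did not survive in the transcript. As a C-level alternative, and assuming a Linux system where the limits are exposed under /proc/sys/kernel/, the sketch below reads one limit and works out the ceil(shmmax / PAGE_SIZE) figure mentioned for kernel.shmall. Both function names are hypothetical.

```c
#include <stdio.h>

/* Pages needed to hold `bytes`: ceil(bytes / page_size) in integer math,
 * written so it cannot overflow for very large limits. */
unsigned long long pages_needed(unsigned long long bytes,
                                unsigned long long page_size)
{
    return bytes / page_size + (bytes % page_size != 0);
}

/* Reads /proc/sys/kernel/<name> (Linux-specific path, an assumption here)
 * into *out. Returns 0 on success, -1 if the file is missing or unparsable. */
int read_shm_limit(const char *name, unsigned long long *out)
{
    char path[128];
    snprintf(path, sizeof path, "/proc/sys/kernel/%s", name);

    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;

    int ok = fscanf(f, "%llu", out) == 1;
    fclose(f);
    return ok ? 0 : -1;
}
```

For example, a hypothetical 1 GiB shmmax with 4096-byte pages gives pages_needed(1073741824, 4096) = 262144 as the lower bound for shmall.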
  45. How it's done: Creating New and Acquiring Existing Segments
      ● int shmget(key_t key, size_t size, int shmflg)
        ○ key_t key: integer key identifying the segment, or IPC_PRIVATE
        ○ size_t size: integer size of segment in bytes (will be rounded up to the next multiple of PAGE_SIZE)
        ○ int shmflg: mode flag consisting of the standard owner/group/world permission bits, plus IPC_CREAT (to create the segment, or acquire it if it already exists) and optionally IPC_EXCL (to fail with an error if it already exists)
  46. How it's done: Creating New and Acquiring Existing Segments
      ● int shmget(key_t key, size_t size, int shmflg)
        ○ Returns: a valid integer segment identifier on success, or -1 on error
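A minimal sketch of the two flag combinations described above, assuming Linux and owner-only permissions (0600). The wrapper names are hypothetical; only `shmget` and its flags come from the slides.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create the segment if it is missing, otherwise return the identifier
 * of the existing one. Returns -1 on error (see errno). */
int acquire_segment(key_t key, size_t size)
{
    return shmget(key, size, IPC_CREAT | 0600);
}

/* Create the segment only if it does not exist yet.
 * Returns -1 with errno == EEXIST when the key is already taken. */
int create_segment_exclusively(key_t key, size_t size)
{
    return shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);
}
```

Calling `create_segment_exclusively` first and `acquire_segment` afterwards with the same key should yield the same identifier, since the second call only looks the segment up.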
  47. How it's done: Attaching segments
      ● void *shmat(int shmid, const void *shmaddr, int shmflg)
        ○ shmid: integer identifier returned by a call to shmget
        ○ shmaddr: pointer to the address at which to attach the memory. You almost always want to leave this NULL, so that the system will attach the segment wherever there's room for it.
  48. How it's done: Attaching segments
      ● void *shmat(int shmid, const void *shmaddr, int shmflg)
        ○ shmflg: several flags for controlling the attachment; most importantly, SHM_RDONLY (does what it looks like: attaches read-only)
        ○ returns: a void pointer to the start of the attached segment, or (void *)-1 on error
  49. How it's done: Detaching segments
      ● int shmdt(const void *shmaddr)
        ○ shmaddr: pointer returned by the call to shmat
        ○ returns: 0 on success, or -1 on error
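To make the attach and detach calls concrete, here is a small sketch, assuming Linux. `attach_roundtrip` is a hypothetical helper: it writes through a read-write attachment, detaches, then reads the same bytes back through a separate SHM_RDONLY attachment.

```c
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper. The caller guarantees that `msg` (plus its NUL)
 * fits in the segment. Returns 0 on success, -1 on failure. */
int attach_roundtrip(int shmid, const char *msg, char *out, size_t out_len)
{
    if (out_len == 0) return -1;

    /* NULL address: let the system pick where the segment is mapped. */
    char *rw = shmat(shmid, NULL, 0);
    if (rw == (void *)-1) return -1;      /* note: error is (void *)-1, not NULL */
    strcpy(rw, msg);
    if (shmdt(rw) == -1) return -1;

    /* Attach again, this time read-only; the address may differ. */
    const char *ro = shmat(shmid, NULL, SHM_RDONLY);
    if (ro == (void *)-1) return -1;
    strncpy(out, ro, out_len - 1);
    out[out_len - 1] = '\0';
    return shmdt(ro);
}
```

Writing through the read-only attachment would fault, which is the point of SHM_RDONLY for processes that only search the shared structure.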
  50. How it's done: Getting segment information
      ● int shmctl(int shmid, int cmd, struct shmid_ds *buf)
        ○ shmid: the identifier returned by shmget
        ○ cmd: the command to execute; for this purpose, IPC_STAT
        ○ buf: a shmid_ds struct to be filled in
  51. How it's done: Getting segment information
      struct shmid_ds {
        struct ipc_perm shm_perm;    /* permissions/ownership */
        size_t          shm_segsz;   /* size of segment in bytes */
        time_t          shm_atime;   /* last attachment time */
        time_t          shm_dtime;   /* last detachment time */
        time_t          shm_ctime;   /* last change time */
        pid_t           shm_cpid;    /* pid of creator */
        pid_t           shm_lpid;    /* pid of last process to attach or detach */
        shmatt_t        shm_nattch;  /* # of current attachments */
      };
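A short sketch of an IPC_STAT query, assuming Linux. `segment_info` is a hypothetical wrapper that pulls out just two of the fields listed above.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical wrapper: reports the segment size and the number of
 * current attachments. Returns 0 on success, -1 on error. */
int segment_info(int shmid, size_t *size, unsigned long *attached)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) == -1) return -1;

    *size = ds.shm_segsz;
    *attached = (unsigned long)ds.shm_nattch;
    return 0;
}
```

The attachment count is handy before destroying a segment, since it shows whether any process is still using it.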
  52. How it's done: Destroying segments
      ● int shmctl(int shmid, int cmd, struct shmid_ds *buf)
        ○ shmid: the identifier returned by shmget
        ○ cmd: IPC_RMID
        ○ buf: a shmid_ds struct (you can ignore it afterwards; Linux documents it as ignored for IPC_RMID, but always providing one is the safe habit)
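Destruction can be sketched as below, assuming Linux. Both function names are hypothetical. Note that when processes are still attached, the kernel only marks the segment and removes it after the last detach; with no attachments it disappears immediately.

```c
#include <sys/ipc.h>
#include <sys/shm.h>

/* Marks the segment for destruction. Returns 0 on success, -1 on error. */
int destroy_segment(int shmid)
{
    struct shmid_ds ds;  /* provided as the slide recommends; not read back */
    return shmctl(shmid, IPC_RMID, &ds);
}

/* Returns 1 if the identifier still names a live segment, 0 otherwise. */
int segment_exists(int shmid)
{
    struct shmid_ds ds;
    return shmctl(shmid, IPC_STAT, &ds) == 0;
}
```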
  53. Examples
  54. Challenges and Caveats
      ● Addressing
        ○ Segments are attached wherever there is room for them in the attaching process's address space
  55. Challenges and Caveats
      ● Maybe here in one process... (0x7f195bda2000)
  56. Challenges and Caveats
      ● ...maybe here in another (0x73f882c1f000)
  57. Challenges and Caveats
      ● So if you store an absolute pointer (0x7f195bda2004) in the segment that points somewhere else in the segment...
  58. Challenges and Caveats
      ● It's not terribly likely to point where you think it should (0x7f195bda2004) when referenced in a separate process
  59. Challenges and Caveats
      ● Addressing
        ○ Segments are attached wherever there is room for them in the attaching process's address space
        ○ Absolute pointers are effectively useless
        ○ Relative pointers, i.e. offsets
        ○ BSTs as heaps (the data structure).
        ○ Serialization.
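One way to read "BSTs as heaps" is to store the tree in a flat array where the children of slot i live at slots 2i+1 and 2i+2, so no pointers are stored at all. The sketch below is an illustration of that idea under stated assumptions, not the speaker's implementation: keys are non-negative ints, -1 marks an empty slot, `capacity` is large enough for the tree (2^h - 1 slots for h levels), and every name is hypothetical. Because only indices relative to the buffer start are used, the structure works at whatever address the segment is attached.

```c
#include <stddef.h>
#include <string.h>

#define EMPTY_SLOT (-1)   /* sentinel; assumes real keys are non-negative */

/* Header stored at the very start of the buffer (e.g. a shared segment). */
struct tree_header {
    size_t count;         /* number of int slots that follow the header */
};

/* Places the midpoint of sorted[lo..hi) at slot i, then recurses so that
 * the left half lands under slot 2i+1 and the right half under 2i+2. */
static void place(int *slots, size_t count, size_t i,
                  const int *sorted, size_t lo, size_t hi)
{
    if (lo >= hi || i >= count) return;
    size_t mid = lo + (hi - lo) / 2;
    slots[i] = sorted[mid];
    place(slots, count, 2 * i + 1, sorted, lo, mid);
    place(slots, count, 2 * i + 2, sorted, mid + 1, hi);
}

/* Builds the implicit tree inside `base` from an ascending key array. */
void tree_build(void *base, size_t capacity, const int *sorted, size_t n)
{
    struct tree_header *h = base;
    int *slots = (int *)((char *)base + sizeof *h);

    h->count = capacity;
    /* All-ones bytes give -1 in every int slot, i.e. EMPTY_SLOT. */
    memset(slots, 0xFF, capacity * sizeof *slots);
    place(slots, capacity, 0, sorted, 0, n);
}

/* Searches using slot indices only; valid at any attach address. */
int tree_contains(const void *base, int key)
{
    const struct tree_header *h = base;
    const int *slots = (const int *)((const char *)base + sizeof *h);
    size_t i = 0;

    while (i < h->count && slots[i] != EMPTY_SLOT) {
        if (key == slots[i]) return 1;
        i = key < slots[i] ? 2 * i + 1 : 2 * i + 2;
    }
    return 0;
}
```

Copying the whole buffer to another address with memcpy and searching the copy behaves the same way, which is exactly what an absolute pointer stored in the segment could not offer.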
  60. Challenges and Caveats
      ● Duplication and copying
        ○ Ruby's primitive-like values (numerics, strings, etc.) are all allocated on the heap
        ○ Shared data must be effectively copied
        ○ Diminishes the usefulness of the tool for certain applications (large data sharing)
        ○ Not everything is a nail
  61. Challenges and Caveats
      ● Duplication and copying
        ○ But... fantastic for certain applications
        ○ Search
          - Search the shared structure at the C level
          - Copy and coerce results to Ruby objects
          - |results| << |data to be searched| (the results are far smaller than the data)
        ○ Semaphore, interprocess messaging
          - Built in to the IPC/SHM lib!
  62. Semaphore
      ● Tracking resource allocation
        ○ Effectively an integer checked when a process allocates some resource
        ○ If nonzero, decrement
        ○ If zero, the resource isn't available
      ● Simple, but slightly weird API.
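The "nonzero: decrement, zero: unavailable" behaviour can be sketched with the System V semaphore calls. This is a minimal sketch assuming Linux with glibc, where the caller has to define `union semun` itself (some other systems already define it in the header). The `sysv_sem_*` names are hypothetical wrappers.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Fourth argument of semctl; must be defined by the caller on Linux/glibc. */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Creates a set holding one semaphore with the given starting count.
 * Returns the set identifier, or -1 on error. */
int sysv_sem_create(int initial)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1) return -1;

    union semun arg;
    arg.val = initial;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* If the count is nonzero, decrement it and return 0. If it is zero,
 * return -1 with errno == EAGAIN instead of blocking. */
int sysv_sem_try_acquire(int semid)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    return semop(semid, &op, 1);
}

/* Gives one unit back. */
int sysv_sem_release(int semid)
{
    struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* Removes the semaphore set. */
int sysv_sem_remove(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

Dropping IPC_NOWAIT would make the acquire block until another process releases, which is usually what guarding a shared segment calls for.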
  63. Message Queues
      ● Push bytearray/string messages into the queue, shift 'em off
      ● Simple, slightly less bizarre API
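Push and shift map onto `msgsnd` and `msgrcv`. The sketch below assumes Linux and short text messages; `struct text_msg`, `MSG_TEXT_MAX`, `queue_push`, and `queue_shift` are hypothetical names chosen for illustration.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define MSG_TEXT_MAX 64

/* Message layout required by msgsnd/msgrcv: a long type, then the payload. */
struct text_msg {
    long mtype;                  /* must be greater than zero */
    char mtext[MSG_TEXT_MAX];
};

/* Pushes a NUL-terminated string onto the queue. Returns 0 or -1. */
int queue_push(int msqid, const char *text)
{
    struct text_msg m;
    size_t len = strlen(text);

    if (len >= MSG_TEXT_MAX) return -1;
    m.mtype = 1;
    memcpy(m.mtext, text, len + 1);
    return msgsnd(msqid, &m, len + 1, 0);
}

/* Shifts the oldest message off the queue without blocking.
 * Returns 0 on success, -1 if the queue is empty or on error. */
int queue_shift(int msqid, char *out, size_t out_len)
{
    struct text_msg m;

    if (out_len == 0) return -1;
    ssize_t n = msgrcv(msqid, &m, sizeof m.mtext, 0, IPC_NOWAIT);
    if (n == -1) return -1;

    /* Terminate defensively in case a sender omitted the NUL. */
    m.mtext[n < MSG_TEXT_MAX ? n : MSG_TEXT_MAX - 1] = '\0';
    strncpy(out, m.mtext, out_len - 1);
    out[out_len - 1] = '\0';
    return 0;
}
```

A message type of 0 in `msgrcv` takes whatever is first in the queue, which gives the plain first-in, first-out behaviour the slide describes.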
  64. In closing...
      ● Quite exciting, this computer magic
      ● Don't just use it because it's there
        ○ Have a NEED
      ● Don't be afraid to drop to C
        ○ Don't know C?
        ○ Learn it: a pretty simple language, when all's said and done
        ○ Building Ruby C extensions is actually pretty painless
  65. Questions / Comments: in which I probably get trolled
  66. Thanks for listening! Enjoy the rest of the conference!