Parallel External Memory Algorithms Applied to Generalized Linear Models

Uploaded on Aug 22, 2012


Presentation by Lee Edlefsen, Revolution Analytics, to JSM 2012, San Diego, CA, July 30, 2012

For the past several decades, the rising tide of technology has allowed the same data analysis code to handle the growing size of typical data sets. That era is ending: data sets are now growing much faster than the speed of single cores, RAM, and hard drives. To keep up, statistical software must be able to use multiple cores and multiple computers. Parallel external memory algorithms (PEMAs) provide the foundation for such software. External memory algorithms (EMAs) are those that do not require all data to be in RAM; they are widely available. Parallel implementations of EMAs allow them to run on multiple cores and computers and to process an unlimited number of rows of data. This paper describes a general approach to efficiently parallelizing EMAs, using an R and C++ implementation of GLM as a detailed example. It examines the requirements for efficient PEMAs, the arrangement of code for automatic parallelization, efficient threading, and efficient inter-process communication. It includes billion-row benchmarks showing linear scaling with rows and nodes and demonstrating that extremely high performance is achievable.
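
To make the idea concrete, here is a minimal sketch of the external-memory pattern applied to a logistic-regression GLM fitted by iteratively reweighted least squares (IRLS). It is not the R and C++ implementation the presentation describes. The function names (`chunk_stats`, `combine`, `solve`, `fit_logistic`) are hypothetical, the "chunks" are plain in-memory lists standing in for blocks read from disk, and a thread pool stands in for the cores or nodes that would process chunks in parallel. The point it illustrates is that each chunk only contributes small sums (X'WX and X'Wz) that can be added together, so no step needs all rows in RAM at once.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def chunk_stats(chunk, beta):
    """Process one chunk: return its contribution to X'WX and X'Wz."""
    p = len(beta)
    xtwx = [[0.0] * p for _ in range(p)]
    xtwz = [0.0] * p
    for x, y in chunk:
        eta = sum(b * xi for b, xi in zip(beta, x))
        mu = 1.0 / (1.0 + math.exp(-eta))      # logistic mean
        w = max(mu * (1.0 - mu), 1e-10)        # IRLS weight, guarded against 0
        z = eta + (y - mu) / w                 # working response
        for i in range(p):
            xtwz[i] += x[i] * w * z
            for j in range(p):
                xtwx[i][j] += x[i] * w * x[j]
    return xtwx, xtwz

def combine(a, b):
    """Merge two intermediate results by elementwise addition."""
    (axx, axz), (bxx, bxz) = a, b
    return ([[u + v for u, v in zip(ra, rb)] for ra, rb in zip(axx, bxx)],
            [u + v for u, v in zip(axz, bxz)])

def solve(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(chunks, p, iterations=25, tol=1e-8, workers=4):
    """One pass over all chunks per IRLS iteration; only p-by-p sums are kept."""
    beta = [0.0] * p
    for _ in range(iterations):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            parts = list(pool.map(lambda c: chunk_stats(c, beta), chunks))
        total = parts[0]
        for part in parts[1:]:
            total = combine(total, part)
        new_beta = solve(*total)
        converged = max(abs(n - o) for n, o in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta
```

Calling `fit_logistic` with the same rows split into one chunk or into several should give the same coefficients up to floating-point rounding, which is the property that lets the work be spread across cores or machines. In CPython, threads do not speed up CPU-bound work like this; a real implementation would use processes, native code, or separate nodes.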

Upload Details

Uploaded via SlideShare as Adobe PDF

Usage Rights

© All Rights Reserved


Presentation Transcript