An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

P Y TABLES & Family
Analyzing and Sharing HDF5 Data wi...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Who are we?
Cárabos is the company committed to the P ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Who are we?
Cárabos is the company committed to the P ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Who are we?
Cárabos is the company committed to the P ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Outline
1

An Introduction to the Python Language
Fact...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Outl...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Main...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Main...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Main...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Main...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Very...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Very...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Very...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Very...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Outl...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Basi...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Basi...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Basi...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Nume...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Nume...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Plot...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Plot...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Plot...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Outl...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Prog...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Prog...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Facts about Python
Scientific Packages
An example

Reso...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Outline
1

An ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

What is P Y TA...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Why HDF5?
I st...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Why HDF5?
I st...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

What Does Extr...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

What Does Extr...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

What Does Extr...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Which ELD Feat...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Which ELD Feat...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Which ELD Feat...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Which ELD Feat...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Which ELD Feat...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Why Buffered I...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Why a Cache Sy...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Why a Cache Sy...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Why Fast Index...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Ease of Use
Na...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Ease of Use
Na...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Ease of Use
Na...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Outline
1

An ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Supported Obje...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

What’s New in ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

What’s New in ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Some Metrics o...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Outline
1

An ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

What is P Y TA...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Index Selectio...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Index Selectio...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Usage Examples
P Y TABLES Pro

Current Status...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Outline
1

An I...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

CSTABLES: P Y T...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Outline
1

An I...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Design Goals
Co...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Concurrency Iss...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Concurrency Iss...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Concurrency Iss...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Client Cache Mo...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Client Cache Mo...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Client Cache Mo...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Client Cache Mo...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Outline
1

An I...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Two Different A...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

Two Different A...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

CSTABLES Status...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

The P Y TABLES ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

The P Y TABLES ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

The P Y TABLES ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Overview
Design goals
Examples of Use

The P Y TABLES ...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Summary
Python is a joy for scientific computing. Start...
An Introduction to the Python Language
P Y TABLES
CSTABLES
Summary

Thank You!

Thanks also to:
Elena Pourmal, for invitin...
Upcoming SlideShare
Loading in …5
×

Analyzing and Sharing HDF5 Data with Python

878 views

Published on

Analyzing and sharing large amounts of data is a must for people working in many fields of scientific applications. It is usual that compiled languages are preferred in such environments, mainly because they came first, but also importantly, because of their performance. Nonetheless, interpreted languages like Perl or Python are gaining an important share of scientific users mainly because of their flexibility and greater productivity when writing code.

During this talk, the advantages of using Python will be stressed in scenarios where data should be analyzed interactively and shared between a number of users, even on remote locations. In particular, we will show PyTables, a Python library that is meant to access HDF5 data on a convenient, but also efficient, way. CSTables, a library that works on top of PyTables allowing remote access to HDF5 file repositories, will also be described.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
878
On SlideShare
0
From Embeds
0
Number of Embeds
5
Actions
Shares
0
Downloads
15
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Analyzing and Sharing HDF5 Data with Python

  1. 1. An Introduction to the Python Language P Y TABLES CSTABLES Summary P Y TABLES & Family Analyzing and Sharing HDF5 Data with Python Francesc Altet Cárabos Coop. V. HDF Workshop November 30, 2005 - December 2, 2005. Francesc Altet P Y TABLES & Family
  2. 2. An Introduction to the Python Language P Y TABLES CSTABLES Summary Who are we? Cárabos is the company committed to the P Y TABLES suite development and deployment. We have years of experience in designing software solutions for handling extremely large datasets. What we provide: Commercial support for the P Y TABLES suite. P Y TABLES-based applications. Consulting services for managing complex data environments. Francesc Altet P Y TABLES & Family
  3. 3. An Introduction to the Python Language P Y TABLES CSTABLES Summary Who are we? Cárabos is the company committed to the P Y TABLES suite development and deployment. We have years of experience in designing software solutions for handling extremely large datasets. What we provide: Commercial support for the P Y TABLES suite. P Y TABLES-based applications. Consulting services for managing complex data environments. Francesc Altet P Y TABLES & Family
  4. 4. An Introduction to the Python Language P Y TABLES CSTABLES Summary Who are we? Cárabos is the company committed to the P Y TABLES suite development and deployment. We have years of experience in designing software solutions for handling extremely large datasets. What we provide: Commercial support for the P Y TABLES suite. P Y TABLES-based applications. Consulting services for managing complex data environments. Francesc Altet P Y TABLES & Family
  5. 5. An Introduction to the Python Language P Y TABLES CSTABLES Summary Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  6. 6. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  7. 7. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Main Features Interpreted ⇒ Allows interactivity Flexible data structures ⇒ Malleability Minimalistic grammar ⇒ Easy to learn Very complete library (Batteries included) ⇒ Boosts productivity Francesc Altet P Y TABLES & Family
  8. 8. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Main Features Interpreted ⇒ Allows interactivity Flexible data structures ⇒ Malleability Minimalistic grammar ⇒ Easy to learn Very complete library (Batteries included) ⇒ Boosts productivity Francesc Altet P Y TABLES & Family
  9. 9. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Main Features Interpreted ⇒ Allows interactivity Flexible data structures ⇒ Malleability Minimalistic grammar ⇒ Easy to learn Very complete library (Batteries included) ⇒ Boosts productivity Francesc Altet P Y TABLES & Family
  10. 10. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Main Features Interpreted ⇒ Allows interactivity Flexible data structures ⇒ Malleability Minimalistic grammar ⇒ Easy to learn Very complete library (Batteries included) ⇒ Boosts productivity Francesc Altet P Y TABLES & Family
  11. 11. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Very Suitable for Scientific/Engineer Fields Interactivity has always been very appreciated for increasing productivity. Being interpreted does not necessarily mean being inefficient. Python has solutions for linking C & Fortran libraries in an easy way. Programming is normally considered a necessary evil. Rich expressivity of Python normally reduces the amount of code to solve real problems. Many efficient scientific libraries have been developed for Python: Numeric, numarray, SciPy, Scientific-Python, matplotlib... Francesc Altet P Y TABLES & Family
  12. 12. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Very Suitable for Scientific/Engineer Fields Interactivity has always been very appreciated for increasing productivity. Being interpreted does not necessarily mean being inefficient. Python has solutions for linking C & Fortran libraries in an easy way. Programming is normally considered a necessary evil. Rich expressivity of Python normally reduces the amount of code to solve real problems. Many efficient scientific libraries have been developed for Python: Numeric, numarray, SciPy, Scientific-Python, matplotlib... Francesc Altet P Y TABLES & Family
  13. 13. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Very Suitable for Scientific/Engineer Fields Interactivity has always been very appreciated for increasing productivity. Being interpreted does not necessarily mean being inefficient. Python has solutions for linking C & Fortran libraries in an easy way. Programming is normally considered a necessary evil. Rich expressivity of Python normally reduces the amount of code to solve real problems. Many efficient scientific libraries have been developed for Python: Numeric, numarray, SciPy, Scientific-Python, matplotlib... Francesc Altet P Y TABLES & Family
  14. 14. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Very Suitable for Scientific/Engineer Fields Interactivity has always been very appreciated for increasing productivity. Being interpreted does not necessarily mean being inefficient. Python has solutions for linking C & Fortran libraries in an easy way. Programming is normally considered a necessary evil. Rich expressivity of Python normally reduces the amount of code to solve real problems. Many efficient scientific libraries have been developed for Python: Numeric, numarray, SciPy, Scientific-Python, matplotlib... Francesc Altet P Y TABLES & Family
  15. 15. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  16. 16. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Basic Matrix Handling Numeric Very mature but lacking some features. Still has a huge user base. scipy.core Called to substitute both Numeric & numarray. When finished, it is supposed to have all the advantages of Numeric & numarray together. numarray Created to overcome some of the limitations of Numeric. Better handling of large arrays, heterogeneous data... Francesc Altet P Y TABLES & Family
  17. 17. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Basic Matrix Handling Numeric Very mature but lacking some features. Still has a huge user base. scipy.core Called to substitute both Numeric & numarray. When finished, it is supposed to have all the advantages of Numeric & numarray together. numarray Created to overcome some of the limitations of Numeric. Better handling of large arrays, heterogeneous data... Francesc Altet P Y TABLES & Family
  18. 18. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Basic Matrix Handling Numeric Very mature but lacking some features. Still has a huge user base. scipy.core Called to substitute both Numeric & numarray. When finished, it is supposed to have all the advantages of Numeric & numarray together. numarray Created to overcome some of the limitations of Numeric. Better handling of large arrays, heterogeneous data... Francesc Altet P Y TABLES & Family
  19. 19. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Numerical Supplements SciPy Optimization, integration, special functions Signal and image processing Genetic algorithms, ODE solvers... Scientific Python Quaternions, automatic derivatives, (linear) interpolation, polinomials, ... Overlaps somewhat SciPy, but it has... A nice netCDF module for I/O Francesc Altet P Y TABLES & Family
  20. 20. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Numerical Supplements SciPy Optimization, integration, special functions Signal and image processing Genetic algorithms, ODE solvers... Scientific Python Quaternions, automatic derivatives, (linear) interpolation, polinomials, ... Overlaps somewhat SciPy, but it has... A nice netCDF module for I/O Francesc Altet P Y TABLES & Family
  21. 21. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Plotting (2D) matplotlib example matplotlib Great interactivity. Produces publication quality figures. Francesc Altet P Y TABLES & Family
  22. 22. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Plotting (2D) PyQwt example PyQwt Fast plotting. Nice to be embedded in Qt apps. Francesc Altet P Y TABLES & Family
  23. 23. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Plotting (3D) MayaVi example MayaVi 3D-oriented scientific data visualizer. Uses the Visualization Toolkit (VTK). Francesc Altet P Y TABLES & Family
  24. 24. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  25. 25. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Programming as a Necessary Evil Evaluate En (x) = ∞ e−xt 1 t n dt for each value of n from scipy import * from scipy.integrate import quad, Inf def integrand(t,n,x): return exp(-x*t) / t**n def expint(n,x): return quad(integrand, 1, Inf, args=(n, x))[0] v_expint = vectorize(expint) print v_expint(3,arange(1.0,4.0,0.5)) The output [ 0.10969197, 0.05673949, 0.03013338, 0.01629537, 0.00893065, 0.00494538,] Francesc Altet P Y TABLES & Family
  26. 26. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Programming as a Necessary Evil The output The plotting code (matplotlib) xrng = arange(1.0,4.0,0.1) plot(xrng, v_expint(1, xrng)) plot(xrng, v_expint(2, xrng)) plot(xrng, v_expint(3, xrng)) legend(("n=1", "n=2", "n=3")) title(’Exponential Integral’) Francesc Altet P Y TABLES & Family
  27. 27. An Introduction to the Python Language P Y TABLES CSTABLES Summary Facts about Python Scientific Packages An example Resources Introductory material: docs.python.org/tut/tut.html www.python.org/doc/Intros.html Lutz & Ascher, "Learning Python": good introduction. Martelli, "Python in a Nutshell": useful reference. Martelli & Ascher, "Python Cookbook": more specialized, useful recipes for particular problems. Hans P. Langtangen, "Python Scripting for Computational Science": teaches computational scientists and engineers how to write small Python scripts efficiently. Francesc Altet P Y TABLES & Family
  28. 28. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  29. 29. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro What is P Y TABLES? Simply stated: a database for Python based on HDF5. Designed to deal with extremely large datasets. Provides an easy-to-use interface. Supports many of the features of HDF5 and others that are not present in it (e.g. indexation). It is Open Source software (BSD license). Francesc Altet P Y TABLES & Family
  30. 30. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Why HDF5? I started looking for different backends for saving large datasets, but HDF5 was the final winner. Thought out for managing very large datasets in an efficient way. Let you organize datasets hierchically. Very flexible and well tested in scientific environments. Good maintenance and improvement rate. It is Open Source software. Francesc Altet P Y TABLES & Family
  31. 31. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Why HDF5? I started looking for different backends for saving large datasets, but HDF5 was the final winner. Thought out for managing very large datasets in an efficient way. Let you organize datasets hierchically. Very flexible and well tested in scientific environments. Good maintenance and improvement rate. It is Open Source software. Francesc Altet P Y TABLES & Family
  32. 32. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro What Does Extremely Large Exactly Mean? The Hitch Hiker’s Guide to the Galaxy offers this definition of the word “Infinite”: Infinite: Bigger than the biggest thing ever and then some. Much bigger than that in fact, really amazingly immense, a totally stunning size, real “wow, that’s big”. Infinity is just so big that, by comparison, bigness looks really titchy. Gigantic multiplied by colossal multiplied by staggeringly huge is the sort of concept we’re trying to get across here. Disclaimer Agreed, ’Extremely Large’ may not exactly mean ’Infinite’, although it is pretty close to this definition. Francesc Altet P Y TABLES & Family
  33. 33. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro What Does Extremely Large Exactly Mean? The Hitch Hiker’s Guide to the Galaxy offers this definition of the word “Infinite”: Infinite: Bigger than the biggest thing ever and then some. Much bigger than that in fact, really amazingly immense, a totally stunning size, real “wow, that’s big”. Infinity is just so big that, by comparison, bigness looks really titchy. Gigantic multiplied by colossal multiplied by staggeringly huge is the sort of concept we’re trying to get across here. Disclaimer Agreed, ’Extremely Large’ may not exactly mean ’Infinite’, although it is pretty close to this definition. Francesc Altet P Y TABLES & Family
  34. 34. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro What Does Extremely Large Exactly Mean? The Hitch Hiker’s Guide to the Galaxy offers this definition of the word “Infinite”: Infinite: Bigger than the biggest thing ever and then some. Much bigger than that in fact, really amazingly immense, a totally stunning size, real “wow, that’s big”. Infinity is just so big that, by comparison, bigness looks really titchy. Gigantic multiplied by colossal multiplied by staggeringly huge is the sort of concept we’re trying to get across here. Disclaimer Agreed, ’Extremely Large’ may not exactly mean ’Infinite’, although it is pretty close to this definition. Francesc Altet P Y TABLES & Family
  35. 35. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Which ELD Features Does P Y TABLES Support? A good base library (HDF5). Support for 64-bit in all data addressing. Need to overcome a Python slicing limitation: only 32-bit addresses are supported. Buffered I/O. A LRU cache system for an efficient reuse of objects. Fast indexing and searching capabilities for tables. Francesc Altet P Y TABLES & Family
  36. 36. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Which ELD Features Does P Y TABLES Support? A good base library (HDF5). Support for 64-bit in all data addressing. Need to overcome a Python slicing limitation: only 32-bit addresses are supported. Buffered I/O. A LRU cache system for an efficient reuse of objects. Fast indexing and searching capabilities for tables. Francesc Altet P Y TABLES & Family
  37. 37. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Which ELD Features Does P Y TABLES Support? A good base library (HDF5). Support for 64-bit in all data addressing. Need to overcome a Python slicing limitation: only 32-bit addresses are supported. Buffered I/O. A LRU cache system for an efficient reuse of objects. Fast indexing and searching capabilities for tables. Francesc Altet P Y TABLES & Family
  38. 38. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Which ELD Features Does P Y TABLES Support? A good base library (HDF5). Support for 64-bit in all data addressing. Need to overcome a Python slicing limitation: only 32-bit addresses are supported. Buffered I/O. A LRU cache system for an efficient reuse of objects. Fast indexing and searching capabilities for tables. Francesc Altet P Y TABLES & Family
  39. 39. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Which ELD Features Does P Y TABLES Support? A good base library (HDF5). Support for 64-bit in all data addressing. Need to overcome a Python slicing limitation: only 32-bit addresses are supported. Buffered I/O. A LRU cache system for an efficient reuse of objects. Fast indexing and searching capabilities for tables. Francesc Altet P Y TABLES & Family
  40. 40. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Why Buffered I/O? Because of speed (what else?): Francesc Altet P Y TABLES & Family
  41. 41. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Why a Cache System? 1.- To achieve better open file times: Francesc Altet P Y TABLES & Family
  42. 42. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Why a Cache System? 2.- To achieve a conservative usage of the memory: Francesc Altet P Y TABLES & Family
  43. 43. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Why Fast Indexing? It is not trifling in terms of time when you have to index tables with more than one billion of rows: Francesc Altet P Y TABLES & Family
  44. 44. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Ease of Use Natural naming # access to file:/group1/table table = file.root.group1.table Support for generalized slicing # step means a stride in the slice table[start:stop:step] Support for iterators # get the values in col1 that satisfy the # (1.3 < col3 <= 2.) condition in table col3 = table.cols.col3 [r[’col1’] for r in table.where(1.3 < col3 <= 2.)] Francesc Altet P Y TABLES & Family
  45. 45. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Ease of Use Natural naming # access to file:/group1/table table = file.root.group1.table Support for generalized slicing # step means a stride in the slice table[start:stop:step] Support for iterators # get the values in col1 that satisfy the # (1.3 < col3 <= 2.) condition in table col3 = table.cols.col3 [r[’col1’] for r in table.where(1.3 < col3 <= 2.)] Francesc Altet P Y TABLES & Family
  46. 46. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Ease of Use Natural naming # access to file:/group1/table table = file.root.group1.table Support for generalized slicing # step means a stride in the slice table[start:stop:step] Support for iterators # get the values in col1 that satisfy the # (1.3 < col3 <= 2.) condition in table col3 = table.cols.col3 [r[’col1’] for r in table.where(1.3 < col3 <= 2.)] Francesc Altet P Y TABLES & Family
  47. 47. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  48. 48. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Supported Objects in P Y TABLES Table: Lets you deal with heterogeneous datasets. Chunked. Enlargeable. Support for nested types. Array: Provides quick and dirty array handling. Not chunked. Non enlargeable. CArray: Provides compressed array support. Chunked. Not enlargeable. EArray: Most general array support. Chunked. Enlargeable. VLArray: Collections of homogeneous data with a variable number of entries. Chunked. Enlargeable. Group: The structural component. Francesc Altet P Y TABLES & Family
  49. 49. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro What’s New in P Y TABLES 1.2? Added a new cache for the object tree. Allows a contained use of memory (even with huge trees). Almost instantaneous opening of files (good news for interactive use!). New NetCDF module (contributed by Jeff Whitaker). Provides API emulation for Scientific.IO.NetCDF. netCDF-3 ⇔ HDF5 conversions. tables.NetCDF datasets can be shared over the internet with the OPeNDAP protocol. Plans to write data natively in netCDF-4 format in the future. Francesc Altet P Y TABLES & Family
  50. 50. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro What’s New in P Y TABLES 1.2? Added a new cache for the object tree. Allows a contained use of memory (even with huge trees). Almost instantaneous opening of files (good news for interactive use!). New NetCDF module (contributed by Jeff Whitaker). Provides API emulation for Scientific.IO.NetCDF. netCDF-3 ⇔ HDF5 conversions. tables.NetCDF datasets can be shared over the internet with the OPeNDAP protocol. Plans to write data natively in netCDF-4 format in the future. Francesc Altet P Y TABLES & Family
  51. 51. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Some Metrics on P Y TABLES 1.2 Core Library ~ 7.5 thousand LOC (Python) ~ 2.8 thousand LOC (Pyrex) ~ 4.0 thousand LOC (C) Test Units ~ 3.1 thousand test units (Python) ~ 23 thousand LOC (Python) Documentation ~ 5.0 thousand lines of on-line doc strings ~ 160 pages of Users’ Guide in PDF (and HTML) Francesc Altet P Y TABLES & Family
  52. 52. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  53. 53. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro What is P Y TABLES P RO? It’s just a regular P Y TABLES but with enhanced features: Improved search speed: Selections in tables with > 1 billion rows will be typically done in less than 1 second. Complex queries: Supports an unlimited combination of conditions. Query optimizer: Queries are analyzed, reordered and classified to get an optimum response time. Support for complex indices: Expressions like col1 + col2*col3 + col4**3 can be indexed and used for selections afterwards. Francesc Altet P Y TABLES & Family
  54. 54. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Index Selection Speed Francesc Altet P Y TABLES & Family
  55. 55. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Index Selection Speed: The Goal Francesc Altet P Y TABLES & Family
  56. 56. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Usage Examples P Y TABLES Pro Current Status for P Y TABLES P RO Improving the query response time. Remains to be done: Complex queries. Query optimizer. Date of release: 2nd quarter 2006 (tentative). Francesc Altet P Y TABLES & Family
  57. 57. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  58. 58. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use CSTABLES: P Y TABLES Goes Client-Server Francesc Altet P Y TABLES & Family
  59. 59. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  60. 60. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Design Goals Concurrency: Allows multiple access to the same file at the same time. Locking system: Avoids corruption of data. High Throughput: Data is transmitted in large blocks. High-speed compressor/decompressor used. Client Cache: Metadata is kept in the client-side when an object is first accessed. Full P Y TABLES API compatibility: Existing P Y TABLES programs can be re-used to access remote files without changing virtually anything. Francesc Altet P Y TABLES & Family
  61. 61. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Concurrency Issues Question: CSTABLES does not provide threading or asynchronous features yet. So, how does it deal with several requests at a time? Answer: Large data read and write requests are split into small chunks. This considerably improves server response time. In addition, CSTABLES provides a lock mechanism that allows applications to explicitly put a lock on a node or on an entire subtree. Different locking access modes: READ, WRITE and ALL Francesc Altet P Y TABLES & Family
  62. 62. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Concurrency Issues Question: CSTABLES does not provide threading or asynchronous features yet. So, how does it deal with several requests at a time? Answer: Large data read and write requests are split into small chunks. This considerably improves server response time. In addition, CSTABLES provides a lock mechanism that allows applications to explicitly put a lock on a node or on an entire subtree. Different locking access modes: READ, WRITE and ALL Francesc Altet P Y TABLES & Family
  63. 63. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Concurrency Issues Question: CSTABLES does not provide threading or asynchronous features yet. So, how does it deal with several requests at a time? Answer: Large data read and write requests are split into small chunks. This considerably improves server response time. In addition, CSTABLES provides a lock mechanism that allows applications to explicitly put a lock on a node or on an entire subtree. Different locking access modes: READ, WRITE and ALL Francesc Altet P Y TABLES & Family
  64. 64. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Client Cache More in Depth CSTABLES caches some of the metadata of the P Y TABLES object tree. When a client makes a change to the metadata, this change is pushed to the other clients’ caches. Easy to control which attributes should be cached and which should not. You should try to cache primarily read-only attributes. Caveat emptor: Caching attributes that are updated very often might generate more traffic than attributes that are not cached at all! Francesc Altet P Y TABLES & Family
  65. 65. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Client Cache More in Depth CSTABLES caches some of the metadata of the P Y TABLES object tree. When a client makes a change to the metadata, this change is pushed to the other clients’ caches. Easy to control which attributes should be cached and which should not. You should try to cache primarily read-only attributes. Caveat emptor: Caching attributes that are updated very often might generate more traffic than attributes that are not cached at all! Francesc Altet P Y TABLES & Family
  66. 66. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Client Cache More in Depth CSTABLES caches some of the metadata of the P Y TABLES object tree. When a client makes a change to the metadata, this change is pushed to the other clients’ caches. Easy to control which attributes should be cached and which should not. You should try to cache primarily read-only attributes. Caveat emptor: Caching attributes that are updated very often might generate more traffic than attributes that are not cached at all! Francesc Altet P Y TABLES & Family
  67. 67. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Client Cache More in Depth CSTABLES caches some of the metadata of the P Y TABLES object tree. When a client makes a change to the metadata, this change is pushed to the other clients’ caches. Easy to control which attributes should be cached and which should not. You should try to cache primarily read-only attributes. Caveat emptor: Caching attributes that are updated very often might generate more traffic than attributes that are not cached at all! Francesc Altet P Y TABLES & Family
  68. 68. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Outline 1 An Introduction to the Python Language Facts about Python Scientific Packages An example 2 P Y TABLES Overview Usage Examples P Y TABLES Pro 3 CSTABLES Overview Design goals Examples of Use Francesc Altet P Y TABLES & Family
  69. 69. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Two Different APIs P Y TABLES API (prefix with csclient -ip=server_IP) import tables fileh = tables.openFile("file.h5") print fileh.root.table.cols.col1[:] fileh.close() CSTABLES API (no need to be prefixed) from cstables.client import client c=client.Session() app=c.connect("your_FQDN_server") fileh=app.openFile("file.h5") print fileh.root.table.cols.col1[:] fileh.close() Francesc Altet P Y TABLES & Family
  70. 70. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use Two Different APIs P Y TABLES API (prefix with csclient -ip=server_IP) import tables fileh = tables.openFile("file.h5") print fileh.root.table.cols.col1[:] fileh.close() CSTABLES API (no need to be prefixed) from cstables.client import client c=client.Session() app=c.connect("your_FQDN_server") fileh=app.openFile("file.h5") print fileh.root.table.cols.col1[:] fileh.close() Francesc Altet P Y TABLES & Family
  71. 71. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use CSTABLES Status & Availability The main design features are already implemented and working. Beta available (caveat: only works against P Y TABLES 1.0). Focus now is on checking & debugging possible errors, improving the throughput and bettering the Users’ Guide. Future directions: threading, asynchronous communications. Francesc Altet P Y TABLES & Family
  72. 72. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use The P Y TABLES Suite Is Getting Shape P Y TABLES The basic layer for the other components. P Y TABLES Pro P Y TABLES with a twist: Complex searches and ultra-fast selections in tables. CSTABLES The client-server P Y TABLES. It supports as well generic HDF5 files. V I TABLES A data viewer for P Y TABLES (and HDF5) files. Francesc Altet P Y TABLES & Family
  73. 73. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use The P Y TABLES Suite Is Getting Shape P Y TABLES The basic layer for the other components. P Y TABLES Pro P Y TABLES with a twist: Complex searches and ultra-fast selections in tables. CSTABLES The client-server P Y TABLES. It supports as well generic HDF5 files. V I TABLES A data viewer for P Y TABLES (and HDF5) files. Francesc Altet P Y TABLES & Family
  74. 74. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use The P Y TABLES Suite Is Getting Shape P Y TABLES The basic layer for the other components. P Y TABLES Pro P Y TABLES with a twist: Complex searches and ultra-fast selections in tables. CSTABLES The client-server P Y TABLES. It supports as well generic HDF5 files. V I TABLES A data viewer for P Y TABLES (and HDF5) files. Francesc Altet P Y TABLES & Family
  75. 75. An Introduction to the Python Language P Y TABLES CSTABLES Summary Overview Design goals Examples of Use The P Y TABLES Suite Is Getting Shape P Y TABLES The basic layer for the other components. P Y TABLES Pro P Y TABLES with a twist: Complex searches and ultra-fast selections in tables. CSTABLES The client-server P Y TABLES. It supports as well generic HDF5 files. V I TABLES A data viewer for P Y TABLES (and HDF5) files. Francesc Altet P Y TABLES & Family
  76. 76. An Introduction to the Python Language P Y TABLES CSTABLES Summary Summary Python is a joy for scientific computing. Start using it now! The P Y TABLES suite is designed to work with HDF5 files in an interactive, efficient and, most importantly, easy way. Outlook Working hard to release CSTABLES in the first quarter of 2006. P Y TABLES P RO will come later on (~ 2nd quarter of 2006). It would be nice to produce a parallel version of P Y TABLES (long term goal). Francesc Altet P Y TABLES & Family
  77. 77. An Introduction to the Python Language P Y TABLES CSTABLES Summary Thank You! Thanks also to: Elena Pourmal, for inviting us to the Workshop. Frank Baker, for being kind. The HDF Group for pushing forward P Y TABLES. Francesc Altet P Y TABLES & Family

×