Data Processing with Ruby
Brian Chapados
http://chapados.org

SDRuby
April 3, 2008
Understanding Proteins
sequence: 1-D linear chain
     > Archaeoglobus PCNA
     MIDVIMTGELLKTVTRAIVALVSEARIHFLEKGLHSRAVDPANVAMVIVDIPK
     DSFEVYNIDEEKTIGVDMDRIFDISKSISTKDLVELIVEDESTLKVKFGSVEYK
     VALIDPSAIRKEPRIPELELPAKIVMDAGEFKKAIAAADKISDQVIFRSDKEGF
     RIEAKGDVDSIVFHMTETELIEFNGGEARSMFSVDYLKEFCKVAGSGDLLTI
     HLGTNYPVRLVFELVGGRAKVEYILAPRIESE

structure: 3-D after folding
Hard to determine structures with several components
X-ray scattering
C. Trame, personal communication.
Sousa et al. 2000. Cell 103: 633-643.
Raw Data
Distance distribution function of particle

     R            P(R)         ERROR
0.0000E+00   0.0000E+00   0.0000E+00
0.5000E+00   0.3157E-02   0.0000E+00
0.1000E+01   0.6069E-02   0.0000E+00
0.1500E+01   0.8740E-02   0.0000E+00
0.2000E+01   0.1118E-01   0.0000E+00
0.2500E+01   0.1339E-01   0.0000E+00
0.3000E+01   0.1538E-01   0.0000E+00
0.3500E+01   0.1718E-01   0.0000E+00
0.4000E+01   0.1879E-01   0.0000E+00
0.4500E+01   0.2023E-01   0.0000E+00
0.5000E+01   0.2153E-01   0.0000E+00
0.5500E+01   0.2269E-01   0.0000E+00
0.6000E+01   0.2374E-01   0.0000E+00
0.6500E+01   0.2471E-01   0.0000E+00
0.7000E+01   0.2560E-01   0.0000E+00
0.7500E+01   0.2645E-01   0.0000E+00
0.8000E+01   0.2727E-01   0.0000E+00
0.8500E+01   0.2809E-01   0.0000E+00
0.9000E+01   0.2891E-01   0.0000E+00
0.9500E+01   0.2976E-01   0.0000E+00
0.1000E+02   0.3065E-01   0.0000E+00
0.1050E+02   0.3160E-01   0.0000E+00
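
A table like this is easy to load into Ruby. The sketch below is only an illustration, not code from the talk: the names `PrPoint`, `parse_float`, and `parse_pr` are made up here, and it assumes every data row has exactly three whitespace-separated numeric columns (R, P(R), ERROR), with anything else (header, blank lines) skipped.

```ruby
# Minimal sketch: parse an R / P(R) / ERROR table into Ruby objects.
# All names here are illustrative assumptions, not from the talk.
PrPoint = Struct.new(:r, :p, :error)

# Convert a field such as "0.3157E-02" to a Float, or nil if it is not numeric.
def parse_float(field)
  Float(field)
rescue ArgumentError
  nil
end

# Return one PrPoint per data row; header and blank lines are skipped.
def parse_pr(text)
  text.each_line.map do |line|
    values = line.split.map { |field| parse_float(field) }
    next unless values.size == 3 && values.none?(&:nil?)
    PrPoint.new(*values)
  end.compact
end

SAMPLE = <<~DATA
     R        P(R)      ERROR

  0.0000E+00   0.0000E+00   0.0000E+00
  0.5000E+00   0.3157E-02   0.0000E+00
  0.1000E+01   0.6069E-02   0.0000E+00
DATA

parse_pr(SAMPLE).each do |point|
  puts format("R = %5.2f   P(R) = %.6f", point.r, point.p)
end
```

Once the rows are plain objects, later steps (filtering, plotting, writing input files for other programs) can work with them directly instead of re-reading the text.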
Existing Software
Svergun group @ EMBL
http://www.embl-hamburg.de/ExternalInfo/Research/Sax/software.html

Works well, but...
    requires running each program multiple times
    “interactive” interfaces
    not easily scriptable
    no really... you have to see it to believe it
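
One common way to script a program that insists on prompting is to feed it the answers on standard input. The sketch below shows the idea with `Open3.capture2` from Ruby's standard library. The prompting program is a stand-in written in Ruby (`PROMPTER`), and its prompts, the helper `run_with_answers`, and the sample answers are all assumptions made for this illustration. They do not reflect how the EMBL programs actually behave.

```ruby
require "open3"
require "rbconfig"

# Stand-in for an interactive program (hypothetical): it asks two questions
# on the terminal and then reports what it was told.
PROMPTER = <<~'RUBY'
  print "Input file: "
  input = STDIN.gets.chomp
  print "Max distance: "
  dmax = STDIN.gets.chomp
  puts
  puts "processing #{input} with dmax=#{dmax}"
RUBY

# Run the stand-in program, answering each prompt with one line of stdin.
def run_with_answers(script, answers)
  stdin_text = answers.join("\n") + "\n"
  output, status = Open3.capture2(RbConfig.ruby, "-e", script, stdin_data: stdin_text)
  raise "program failed" unless status.success?
  output
end

output = run_with_answers(PROMPTER, ["sample.dat", "105"])
puts output.lines.last
```

This only works when the program reads its answers from standard input in a fixed order. Programs that draw their own screens or read directly from the terminal need a different approach.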
Help from Ruby
We want to use Linux clusters with hundreds of CPUs

Ruby
 wrap external programs
 write shell scripts to run external programs
Rake
 define relationships between inputs/outputs of different programs
 launch external programs after dependencies are satisfied
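
The Rake idea can be sketched with `file` tasks, where each output file names the input it depends on and Rake runs the steps in order. This is a minimal illustration under stated assumptions, not the pipeline from the talk: the file names are invented, and the "external program" is a stand-in child Ruby process (`STEP`) that copies its input and appends a tag. The script also invokes the task itself so it can run standalone. In a real Rakefile you would leave that to the `rake` command.

```ruby
require "rake"
require "rbconfig"
require "tmpdir"

# Work in a scratch directory so the example leaves nothing behind in the project.
WORK   = Dir.mktmpdir("rake-sketch")
RAW    = File.join(WORK, "sample.dat")    # hypothetical raw data file
FITTED = File.join(WORK, "sample.fit")    # hypothetical output of step 1
MODEL  = File.join(WORK, "sample.model")  # hypothetical output of step 2

# Stand-in for an external program: copy input to output and append a tag.
STEP = 'File.write(ARGV[1], File.read(ARGV[0]) + ARGV[2] + "\n")'

# Pretend the raw data came from the instrument.
file RAW do |t|
  File.write(t.name, "raw\n")
end

# Each output depends on the previous file, so Rake only launches a
# program once its input exists and is up to date.
file FITTED => RAW do |t|
  sh RbConfig.ruby, "-e", STEP, t.source, t.name, "fitted"
end

file MODEL => FITTED do |t|
  sh RbConfig.ruby, "-e", STEP, t.source, t.name, "modelled"
end

task default: MODEL

# Normally `rake` does this; invoked here so the sketch runs as a script.
Rake::Task[:default].invoke
puts File.read(MODEL)
```

Because the dependencies are declared rather than hard-coded as a sequence, rerunning after a partial failure only repeats the steps whose outputs are missing or stale.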
Do more with Ruby
quick and dirty...
    Define input parameters in a script
    Define common tasks in a library

more robust...
    Ruby API for running commands
    More sophisticated information processing
    Evolve towards a micro-framework
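
A first step from "quick and dirty" towards "more robust" might look like the sketch below: parameters live in one hash, and a small library module runs commands and reports failures. Everything named here (`Pipeline`, `CommandError`, `arguments`, `PARAMS`, and the `key=value` argument convention) is a hypothetical illustration rather than the API from the talk, and the external program is again a stand-in child Ruby process that just echoes its arguments.

```ruby
require "open3"
require "rbconfig"

# Hypothetical mini-library for running external programs.
module Pipeline
  class CommandError < StandardError; end

  # Run a program with arguments and return its standard output.
  # Raises CommandError if the program exits with a nonzero status.
  def self.run(program, *args)
    stdout, stderr, status = Open3.capture3(program, *args.map(&:to_s))
    return stdout if status.success?
    raise CommandError, "#{program} failed (#{status.exitstatus}): #{stderr.strip}"
  end

  # Turn a parameter hash into "key=value" arguments (an assumed convention).
  def self.arguments(params)
    params.map { |key, value| "#{key}=#{value}" }
  end
end

# Input parameters defined in one place, as a script would do.
PARAMS = { input: "sample.dat", repeats: 3 }

# Stand-in for a real program: print the arguments it received.
ECHO = ["-e", "puts ARGV.join(' ')"]

puts Pipeline.run(RbConfig.ruby, *ECHO, *Pipeline.arguments(PARAMS))
```

Raising on a nonzero exit status keeps a failed step from silently feeding bad output into the next one, which matters more as the number of chained programs grows.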
Acknowledgements
Lab (Scripps Research Institute)
 John Tainer
 Scott Williams
 Chris Putnam

Data Collection
 Beamline 12.3.1
 The Advanced Light Source (ALS, LBNL)

Funding
 NIH, DOE, NCI
