Big Data

Big Data at the Barcelona Supercomputing Center

  1. Activity in Big Data (26/03/2012)
  2. Previous work in Big Data
  3. Scenario
     • Application placement and scheduling: MapReduce
     • Data management: Key-Value storage
     • Target applications: Data Analytics, Bioinformatics
  4. Big Data Papers
     High level performance goals and Big Data
     • Resource-aware Adaptive Scheduling for MapReduce Clusters. J. Polo, C. Castillo, D. Carrera, Y. Becerra, I. Whalley, M. Steinder, J. Torres, E. Ayguadé. In the ACM/IFIP/USENIX 12th International Middleware Conference (Middleware 2011).
     • Performance-Driven Task Co-Scheduling for MapReduce Environments. J. Polo, D. Carrera, Y. Becerra, J. Torres, E. Ayguadé, M. Steinder, I. Whalley. In the 12th IEEE/IFIP Network Operations and Management Symposium (NOMS 2010).
     Hybrid Hardware and Big Data
     • Speeding Up Distributed MapReduce Applications Using Hardware Accelerators. Y. Becerra, V. Beltran, D. Carrera, M. González, J. Torres, E. Ayguadé. In the 38th International Conference on Parallel Processing (ICPP 2009).
     • Performance Management of Accelerated MapReduce Workloads in Heterogeneous Clusters. J. Polo, D. Carrera, Y. Becerra, V. Beltran, J. Torres, E. Ayguadé. In the 39th International Conference on Parallel Processing (ICPP 2010).
     Big Data and Energy
     • Towards Energy-Efficient Management of MapReduce Workloads. J. Polo, Y. Becerra, D. Carrera, V. Beltran, J. Torres, E. Ayguadé. In the First International Conference on Energy-Efficient Computing and Networking (e-Energy 2010).
     • GreenHadoop: Leveraging Green Energy in Data-Processing Frameworks. Í. Goiri, K. Le, T. D. Nguyen, J. Guitart, J. Torres, R. Bianchini. In the European Conference on Computer Systems (EuroSys 2012).
  5. Ongoing research in Big Data
  6. New challenges in Big Data: Our vision
     [Chart: Execution Time against Data Volume, from GBs to PBs. Region shown: Conventional Storage Systems.]
     • Large data sets, too big for conventional storage/tools
     • New, growing requirements for real-time decisions
  7. New challenges in Big Data: Our approach
     [Chart: Execution Time against Data Volume, from GBs to PBs. Regions shown: Conventional Storage Systems, MapReduce & NoSQL, In-memory.]
  8. Ongoing research projects
     MapReduce & NoSQL
     • Technology involved: Hadoop & Cassandra. Goal: snapshot isolation (support to online data generation). Use case: Data Analytics. Collaborators: IBM.
     • Technology involved: Hadoop & Cassandra. Goal: high level performance goal and automatic query configuration. Use case: Data Analytics and Bioinformatics (support to drug discovery). Collaborators: Life Science Dept. (BSC).
     • Technology involved: Cassandra. Goal: automatic configuration and data organization to meet high level performance goals. Use case: Bioinformatics (support to drug discovery). Collaborators: Life Science Dept. (BSC).
     In-Memory
     • Technology involved: PIMD. Goal: in-memory Bioinformatics workflows (index construction, alignment, sorting, data processing). Use case: Bioinformatics (genomic sequencing). Collaborators: IBM and Life Science Dept. (BSC).
  9. Next planned research in Big Data
  10. New challenges in Big Data: Our approach
      [Chart: Execution Time against Data Volume, from GBs to PBs. Regions shown: Conventional Storage Systems, MapReduce & NoSQL, In-memory. Added labels: Application, In-Memory RDBMS, Storage Hierarchy Management.]
  11. Our Big Data resource management picture
      • Application domains: Data Analytics, Genomic Sequencing, Drug Discovery, Air Quality Forecasting, Business
      • Intelligent Resource Management:
        - Highly scalable data management: automatic data organization and configuration (NoSQL/In-Memory/Hierarchy management)
        - Application placement and scheduling: multi-job performance goals, resource awareness, hybrid hardware
        - To meet performance goals such as: Consistency, Availability, Partitioning Tolerance, Energy Consumption, Response Time, …
      • Data stores: In-Memory DB, NoSQL, SQL
      • Heterogeneous Compute Nodes
      • Storage Hierarchy: mix of Mechanical + Flash + SCM
  12. Collaboration with other BSC departments
      • Heterogeneous Application Flows (Domain Specific, Differentiated Resource Requirements)
      • Legacy Code (MPI), Prog. Models, eDSL
      • Data-centric Resource Manager
      • Custom Data Mgmt., In-mem Key/Val, NoSQL
      • Persistent Objects, FileSystems
      • Storage: mix of Mechanical + Flash + SCM
      • Compute nodes
  13. Autonomic & eBusiness Group
  14. Group Goal
      To research autonomic and intelligent resource management for today's business applications. The objective is to create new components at middleware level that provide holistic solutions for some of the new IT challenges in the industry.
      [Diagram: Autonomic and Intelligent Resource Management at the center, surrounded by Cloud Computing, Sustainable Computing, Big Data, High Performance Computing, and Business Analytics.]
  15. Current main interrelated areas
      [Diagram: Middleware that provides a holistic solution, surrounded by the areas below.]
      • BLO-driven Management
      • High performance architectures for Big Data
      • Exploiting Heterogeneous Hardware
      • Embedded Domain Specific Languages for HPC
      • Online predictors
      • Massively Distributed Data Stores
      • Service-aware VM Management
      • Workload Management
      • Energy-aware Management
  16. Group members: www.bsc.es/eBusiness
