This presentation is about MPI programming. It covers collective communication, Open MPI, debugging with DDD, and using gfilt and valgrind as helper tools.
MPI Sweden Student Club: How Can I Be Social - a Social Media Checklist for m... - Gerrit Heijkoop
This document provides a social media checklist for meetings. It discusses exploring one's social media presence and doing a self-reflection. The checklist covers choosing the right accounts and hashtags, defining social media roles and coordinating efforts. It emphasizes telling stories with photos and keeping content relevant, learnable, lovable, laughable and location-based. The document concludes by asking the reader what they will do with their social media presence.
Open MPI new version number scheme and roadmap - Jeff Squyres
The document discusses Open MPI's transition to a new version numbering scheme and release planning roadmap. Open MPI will move from an "odd/even" numbering scheme to a "A.B.C" scheme where A changes for backwards incompatible releases, B changes for new features, and C changes for bug fixes. Version 1.10.0 will start using the new scheme, with larger new features planned for version 2.0.0 later in the year. Future release series are planned to be supported for around 2 years each.
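The "A.B.C" rule described above can be sketched as a small comparison. This is only an illustration of the rule as summarized here, not anything taken from Open MPI itself; the function names `parse_version` and `classify_bump` and the returned labels are hypothetical.

```python
def parse_version(text):
    """Split an 'A.B.C' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def classify_bump(old, new):
    """Name the kind of release implied by moving from old to new.

    Assumption: the leftmost component that changed decides the label,
    following the A / B / C description in the summary above.
    """
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError("new version must be greater than old version")
    if new_v[0] != old_v[0]:
        return "backwards-incompatible"  # A changed
    if new_v[1] != old_v[1]:
        return "new features"            # B changed
    return "bug fixes"                   # only C changed


if __name__ == "__main__":
    print(classify_bump("1.10.0", "2.0.0"))
    print(classify_bump("1.10.0", "1.10.1"))
```

Comparing tuples of integers rather than strings matters here: as text, "1.10.0" would sort before "1.9.0".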
In this slidecast, Jeff Squyres from Cisco describes the proposed successor to the Linux verbs API that is designed to better serve the needs of MPI.
Learn more: http://blogs.cisco.com/performance/a-fun-thing-happened-on-the-way-to-the-openframeworks-discussion-today/#more-134672
Watch the video presentation: http://wp.me/p3RLHQ-beT
The document discusses using MPI (Message Passing Interface) for parallel programming on high performance computing systems, describing key MPI concepts like point-to-point communication, collective operations, and I/O functions. It also provides examples of how to implement simple MPI programs in C/C++/Fortran using libraries like MPICH2 and how MPI can scale to large clusters with millions of processes.
Write optimization in external memory data structures - leifwalsh
After a long reign as the dominant on-disk data structure for databases and filesystems, B-trees are slowly being replaced by write-optimized data structures, to handle ever-growing volumes of data. Some write optimization techniques, like LSM-trees, give up some of the query performance of B-trees in order to achieve this.
A Fractal Tree is a write-optimized data structure that matches the insertion performance of an LSM-tree while maintaining the optimal query performance of a B-tree. It's inspired by many data structures (Buffered Repository Trees, B^ε trees, ...) but the real definition is just what we've implemented at Tokutek.
I'll provide background on B-trees and LSM-trees, an overview of how Fractal Trees work, where they differ from B-trees and LSM-trees, and how we use their performance advantages in some obvious and some surprising ways to power new MySQL and MongoDB features in TokuDB and TokuMX.
This document discusses analyzing the efficiency of algorithms. It begins by explaining how to measure algorithm efficiency using Big O notation, which estimates how fast an algorithm's execution time grows as the input size increases. Common growth rates like constant, logarithmic, linear, and quadratic time are described. Examples are provided to demonstrate determining the Big O of various algorithms. Specific algorithms analyzed in more depth include binary search, selection sort, insertion sort, and Towers of Hanoi. The document aims to introduce techniques for developing efficient algorithms using approaches like dynamic programming, divide-and-conquer, and backtracking.
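To make the logarithmic growth rate mentioned above concrete, here is a minimal sketch of binary search that also counts how many loop iterations it performs. It is an illustrative version, not the code from the summarized document; the step counter is added only for demonstration.

```python
def binary_search(items, target):
    """Search a sorted list; return (index, steps), index -1 if absent."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        steps += 1                      # one halving of the search range
        mid = (low + high) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            low = mid + 1               # discard the lower half
        else:
            high = mid - 1              # discard the upper half
    return -1, steps


if __name__ == "__main__":
    data = list(range(1024))
    print(binary_search(data, 1023))
    print(binary_search(data, -5))
```

Each iteration halves the remaining range, so a list of 1024 items needs at most 11 iterations, which is the O(log n) behavior the Big O analysis predicts.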
Training Deep Neural Networks has been a difficult task for a long time. Recently diverse approaches have been presented to tackle these difficulties, showing that deep models improve the performance of shallow ones in some areas like signal processing, signal classification or signal segmentation, whatever type of signals, e.g. video, audio or images. One of the most important methods is greedy layer-wise unsupervised pre-training followed by a fine-tuning phase. Despite the advantages of this procedure, it does not fit some scenarios where real time learning is needed, as for adaptation of some time-series models. This paper proposes to couple both phases into one, modifying the loss function to mix together the unsupervised and supervised parts. Benchmark experiments with MNIST database prove the viability of the idea for simple image tasks, and experiments with time-series forecasting encourage the incorporation of this idea into on-line learning approaches. The interest of this method in time-series forecasting is motivated by the study of predictive models for domotic houses with intelligent control systems.
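One simple way to picture "mixing" an unsupervised and a supervised term in a single loss is a weighted sum. The sketch below is an assumption made for illustration: the abstract does not say which loss terms or which mixing rule the paper uses, so the mean squared error terms, the weight `alpha`, and the function names are all hypothetical.

```python
def mse(predicted, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def combined_loss(reconstruction, inputs, prediction, labels, alpha=0.5):
    """Weighted sum of a supervised and an unsupervised term.

    Assumptions (not from the paper): both terms are MSE, and alpha in
    [0, 1] sets how much weight the supervised term receives.
    """
    unsupervised = mse(reconstruction, inputs)  # how well inputs are rebuilt
    supervised = mse(prediction, labels)        # how well labels are predicted
    return alpha * supervised + (1.0 - alpha) * unsupervised


if __name__ == "__main__":
    print(combined_loss([1.0, 2.0], [1.0, 2.0], [0.0], [1.0], alpha=0.5))
```

With a single objective of this kind, both parts can be optimized in one training pass instead of a pre-training phase followed by fine-tuning.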
Introducing TokuMX: The Performance Engine for MongoDB (NYC.rb 2013-12-10) - leifwalsh
TokuMX is a drop-in replacement for MongoDB that provides improved storage and performance. It uses its own indexing architecture called Fractal Tree indexing to allow for highly compressed and concurrent operations on indexes larger than available RAM. TokuMX offers features like ACID compliance and MVCC for increased consistency compared to MongoDB. It provides significantly higher compression ratios than MongoDB, reducing storage requirements by over 10x in some workloads.
A New MongoDB Sharding Architecture for Higher Availability and Better Resour... - leifwalsh
Most modern databases concern themselves with their ability to scale a workload beyond the power of one machine. But maintaining a database across multiple machines is inherently more complex than it is on a single machine. As soon as scaling out is required, suddenly a lot of scaling out is required, to deal with new problems like index suitability and load balancing.
Write optimized data structures are well-suited to a sharding architecture that delivers higher efficiency than traditional sharding architectures. This talk describes a new sharding architecture for MongoDB applications that can be achieved with write optimized storage like TokuMX's Fractal Tree indexes.
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr... - Sri Ambati
Top 10 Performance Gotchas in scaling in-memory Algorithms
Abstract:
Math Algorithms have primarily been the domain of desktop data science. With the success of scalable algorithms at Google, Amazon, and Netflix, there is an ever growing demand for sophisticated algorithms over big data. In this talk, we get a ringside view in the making of the world's most scalable and fastest machine learning framework, H2O, and the performance lessons learnt scaling it over EC2 for Netflix and over commodity hardware for other power users.
Top 10 Performance Gotchas is about the white hot stories of i/o wars, S3 resets, and muxers, as well as the power of primitive byte arrays, non-blocking structures, and fork/join queues. Of good data distribution & fine-grain decomposition of Algorithms to fine-grain blocks of parallel computation. It's a 10-point story of the rage of a network of machines against the tyranny of Amdahl while keeping the statistical properties of the data and accuracy of the algorithm.
Track: Scalability, Availability, and Performance: Putting It All Together
Time: Wednesday, 11:45am - 12:35pm
Mejora del reconocimiento de palabras manuscritas aisladas mediante un clasif... - Francisco Zamora-Martinez
This document describes a study on improving the recognition of short isolated handwritten words by combining different classifiers. HMM-based and HMM-MLP-based classifiers were used for words of any length, and MLP-based classifiers for short words of up to 3 letters. Combining the outputs of the different classifiers with a Borda count method yielded a significant improvement in word error rate compared with the individual classifiers.
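A Borda count, the combination method named above, can be sketched in a few lines. This is the textbook form of the method, under the assumption that each classifier outputs a best-first ranked list of candidate words; the study's exact variant is not given here, and the function name `borda_combine` and the example words are hypothetical.

```python
from collections import defaultdict


def borda_combine(rankings):
    """Combine best-first candidate rankings from several classifiers.

    In a ranking of n candidates the top one earns n - 1 points, the next
    n - 2, and so on down to 0. Candidates are returned by total score,
    highest first, with ties broken alphabetically.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position
    return sorted(scores, key=lambda c: (-scores[c], c))


if __name__ == "__main__":
    outputs = [
        ["the", "he", "she"],   # hypothetical output of classifier 1
        ["he", "the", "she"],   # hypothetical output of classifier 2
        ["he", "she", "the"],   # hypothetical output of classifier 3
    ]
    print(borda_combine(outputs))
```

Because only rank positions are used, classifiers whose raw scores are not comparable can still be combined.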
The document discusses visualization of large finite element method (FEM) models. It describes the motivation, typical FEM model structure including nodes, elements and components, the FEM analysis process of importing CAD data, meshing, simulation, and results visualization. It outlines the requirements for out-of-core visualization including interactive manipulation and cutting planes. The design partitions the FEM model spatially and stores elements in a binary forest for efficient out-of-core access and simplification. OpenSceneGraph is used for rendering elements paged in from files on demand.
ScicomP 2015 presentation discussing best practices for debugging CUDA and OpenACC applications with a case study on our collaboration with LLNL to bring debugging to the OpenPOWER stack and OMPT.
The document discusses two classic sorting algorithms: mergesort and quicksort. It provides information on the basic functioning of mergesort, including that it works by dividing an array into halves, recursively sorting each half, and then merging the two sorted halves. The time complexity of mergesort is analyzed and shown to be O(n log n) through solving a recurrence relation. Quicksort is also mentioned as another commonly used sorting algorithm.
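The divide, recurse, and merge steps described above can be sketched as follows. This is a standard illustrative version, not the code from the summarized document.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:         # <= keeps equal items in order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])             # at most one of these is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new sorted list; the input is left unchanged."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


if __name__ == "__main__":
    print(merge_sort([5, 2, 9, 1, 5, 6]))
```

The two recursive calls on halves plus the linear merge give the recurrence T(n) = 2T(n/2) + O(n), which solves to O(n log n).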
In this paper we propose a family of Viterbi algorithms specialized for lexical tree based FSA and HMM acoustic models. Two algorithms to decode a tree lexicon with left-to-right models with or without skips, and another algorithm that takes a directed acyclic graph as input and performs error correcting decoding, are presented. They store the set of active states topologically sorted in contiguous memory queues. The number of basic operations needed to update each hypothesis is reduced and also more locality in memory is obtained, reducing the expected number of cache misses and achieving a speed-up over other implementations.
Write-optimization in external memory data structures (Highload++ 2014) - leifwalsh
After a long reign as the dominant on-disk data structure for databases and filesystems, B-trees are slowly being replaced by write-optimized data structures, to handle ever-growing volumes of data. Some write optimization techniques, like LSM-trees, give up some of the query performance of B-trees in order to achieve this.
A Fractal Tree is a write-optimized data structure that matches the insertion performance of an LSM-tree while maintaining the optimal query performance of a B-tree. It's inspired by many data structures (Buffered Repository Trees, B^ε trees, ...) but the real definition is just what we've implemented at Tokutek.
I'll provide background on B-trees and LSM-trees, an overview of how Fractal Trees work, where they differ from B-trees and LSM-trees, and how we use their performance advantages in some obvious and some surprising ways to power new MySQL and MongoDB features in TokuDB and TokuMX.
In this paper, we describe a novel approach to Part-Of-Speech tagging based on neural networks. Multilayer perceptrons are used following corpus-based learning from contextual and lexical information. The Penn Treebank corpus has been used for the training and evaluation of the tagging system. The results show that the connectionist approach is feasible and comparable with other approaches.
Covers the basics of algorithms and algorithm analysis, including time complexity, space complexity, the three cases (best, average, worst), and an analysis of insertion sort.
*For knowledge purpose only*
*Hope you'll come up with better one*
Some empirical evaluations of a temperature forecasting module based on Art... - Francisco Zamora-Martinez
This document describes experiments using artificial neural networks (ANNs) to forecast indoor temperature in a "domotic" smart home environment. ANNs were trained on historical temperature and time data, and evaluated on their ability to predict temperature values up to 3 hours in the future. Creating an ensemble model combining ANNs trained for different forecast horizons improved accuracy over individual models. The best-performing ensemble model achieved mean absolute errors between 0.027-0.352°C on validation and test data for forecasts up to 3 hours ahead.
The document discusses parallel programming using the Message Passing Interface (MPI). It provides an overview of MPI, including what MPI is, common implementations like OpenMPI, the general MPI API, point-to-point and collective communication functions, and how to perform basic operations like send, receive, broadcast and reduce. It also covers MPI concepts like communicators, blocking vs non-blocking communication, and references additional resources for learning more about MPI programming.
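The effect of the broadcast and reduce collectives mentioned above can be pictured with a toy model. This is not MPI code and calls no MPI library: a plain Python list stands in for the set of ranks (index = rank), and the function names `broadcast` and `reduce_to_root` are hypothetical. It only shows what each rank holds after the operation.

```python
from functools import reduce as fold
import operator


def broadcast(values_by_rank, root):
    """After a broadcast, every rank holds the root rank's value."""
    return [values_by_rank[root]] * len(values_by_rank)


def reduce_to_root(values_by_rank, op, root):
    """After a reduce, only the root holds the combined value.

    Other ranks are shown as None here, meaning "no result delivered".
    """
    combined = fold(op, values_by_rank)
    return [combined if rank == root else None
            for rank in range(len(values_by_rank))]


if __name__ == "__main__":
    print(broadcast([7, 0, 0, 0], root=0))
    print(reduce_to_root([1, 2, 3, 4], operator.add, root=0))
```

In a real MPI program each rank is a separate process and the library moves the data between them; the model above only captures the before and after state.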
The document discusses abapGit, an open source project that implements Git version control in ABAP. It was created in 2014 to address limitations in SAP's native version control. AbapGit allows developers to work with online and offline repositories, supports over 60 object types, and has been adopted by some companies like emineo to improve transparency, enable experiments, and facilitate code reviews. A live demo is then shown of using abapGit functionality like committing changes, viewing diffs, and linking a local repository to one hosted on GitHub.
Marchand leny mass digitization systems and open source software - FIAT/IFTA
The document discusses a mass digitization project undertaken by INA to digitize 135,000 Beta SP tapes within 3 years. INA partnered with Ektacom to build an efficient, high-quality, and flexible technical solution. The system aimed to digitize 120 tapes per day using 4 encoding servers and open-source FFmpeg software for flexibility. After one year, the system has proven reliable and compatible with non-compliant tapes, requiring fewer VTRs and saving operator time while meeting productivity goals. Open-source tools have also proven useful for various tasks at INA like transcoding, analyzing metadata, and infrastructure components.
The document summarizes the itsme technical development seminar, which outlines the itsme team, current development status, architecture, and roadmap. The team is working on requirements definition, prototyping using Python, and developing a graphical toolkit. The architecture aims for logical separation between the UI and back-end data management. Major milestones include alpha testing in January 2010 and a beta release in April 2010.
Simplifying and accelerating converged media with Open Visual Cloud - Liz Warner
Challenges exist with media transformation into Visual Cloud services and the flexibility to migrate those services to new HW platforms. Learn how Intel and partners are solving these challenges with highly optimized cloud native media processing, media analytics, and graphics/rendering components to quickly and easily deliver end-to-end visual cloud services with scalable open source software. Two visual cloud services around media delivery and media analytics will be demonstrated to showcase how to enable faster time to market for innovative “new media” services.
The document discusses a talk on OpenACC, OpenMP, offloading and GCC. The agenda includes an introduction and history, GCC's offloading implementation, OpenMP updates in GCC 13, OpenACC updates, OpenMP memory management and unified shared memory, AMD GCN port updates, and nvptx port updates.
The document discusses OpenNMS reporting enhancements. It describes the current status of reporting in OpenNMS, including performance data stored in RRD files and other data in SQL databases. It then discusses why a reporting engine would be useful, including easier customized reports, scheduling, and deployment. The document outlines how the OpenNMS reporting engine would work using JasperServer and its web service API to generate reports from OpenNMS data. Finally, it provides an example report and discusses alternatives to JasperServer before concluding with future perspectives.
Container Attached Storage (CAS) with OpenEBS - SDC 2018 - OpenEBS
The document discusses container attached storage (CAS), which aims to provide storage for containers in a container-native way. CAS is designed to run in containers for containers in user space, using the Kubernetes substrate. It addresses challenges like small working sets, ephemeral storage, and cloud lock-in by keeping data local to workloads and allowing per-workload optimization and migration. The document outlines the CAS design and implementation, including using an input/output container to handle storage IO in user space and leveraging technologies like SPDK, virtio, and Kubernetes custom resources.
This document discusses the LLVM compiler system and its approach. LLVM aims to build modular compiler components that implement modern techniques, integrate well together, have few dependencies, and integrate with existing tools. This allows compilers built with LLVM components to share code and improvements, choose the best components, and be constructed quickly. The document provides llvm-gcc as an example client that uses GCC's front-end but replaces the optimizer and code generator with LLVM's modern ones to gain benefits like interprocedural optimizations, aggressive loop optimizations, and retargetable code generation.
Programming Models for High-performance ComputingMarc Snir
This document discusses programming systems for high-performance computing. It begins by describing how distributed memory parallel systems replaced vector machines in the early 1990s. Message passing interface (MPI) became the dominant programming model. The document then examines reasons for MPI's success, including quick implementation, low cost, and good performance. It also discusses factors needed for a new programming system to succeed, such as addressing compelling needs and backward compatibility. While MPI will continue to be used, its scalability may become an issue for exascale systems.
Open Process Automation: Status of the O-PAS™ Standard, Conformance Certifica...Yokogawa1
Today, end users in the energy and chemical industries must work with and integrate multiple proprietary systems in almost every process plant or facility. These systems include manufacturing execution systems (MES), distributed control systems (DCS), human-machine interfaces (HMI), programmable logic controllers (PLC) and inputs/outputs (I/O). These multiple proprietary systems, and the integration thereof, result in elevated capital costs on new projects and high total cost of ownership through the asset lifecycle, especially in the operation and maintenance of such systems. The Open Process Automation™ Forum (OPAF) is an international forum of end users, system integrators, suppliers, academia, and standards organizations who are working together to develop the specifications for open process control systems. OPAF’s goal is to enable more open and modular systems that support integration of best-in-class components. This architecture will provide both configuration and application portability across components from different suppliers, thereby reducing system capital cost and total cost of ownership. The vision is a standards-based, open, secure and interoperable process control architecture that reduces the cost of control system upgrades and replacements, as well as removes barriers to technology insertion, with adaptable cybersecurity designed in. This keynote presentation will outline the Open Process Automation initiative, standard and status of industry prototyping, as well as share evidence of commercialization.
APIs and SDKs: Breaking Into and Succeeding in a Specialty MarketScott Abel
This document provides an overview of writing documentation for APIs and SDKs. It discusses typical users and producers of APIs/SDKs, ideal information to include in SDK and API documentation, common documentation deliverables, programming concepts to cover, and help authoring tools. The document also outlines benefits and drawbacks to technical writers in this specialty, ways to break into the market including education and training options, and resources for API/SDK documentation writers.
Amora is a mobile remote assistant created to address limitations of existing free and open-source remote control software in 2007. It was developed using C, Python, and C++ to provide a stable, high-performance, and easy-to-use interface for remote control across Linux, Windows and mobile devices. The project is open source under GPL 2.0 and has been officially packaged for several Linux distributions.
This is a reupload of the talk I delivered at the Spark London Meetup group, November 2016. Original link to the event: https://www.meetup.com/Spark-London/events/235626954/
I share observations and best practices.
Python uses dynamic typing and a combination of reference counting and a cycle-detecting garbage collector for memory management. It also features dynamic name resolution (late binding), which binds method and variable names during program execution.
As we know, whenever we run an application it is loaded into RAM and the OS allocates some memory for that application.
Learn more about Python programming with Learnbay.
Visit: https://www.learnbay.co/data-science-course/
The next web will be about flow: user-generated pipelines through applications and services. Unlike before, these pipelines will be definable, non-proprietary and shareable by anyone.
LAS16-108: JerryScript and other scripting languages for IoTLinaro
Speakers: Paul Sokolovsky
Date: September 26, 2016
★ Session Description ★
Overview of small-size/low-resource VHLL (very high-level languages)/scripting languages available for embedded/IoT usage (JavaScript, Python, Lua, etc.). Typical/possible usage scenarios and benefits. Challenges of running VHLLs in deeply embedded/very resource-constrained environments. Progress reports on porting JerryScript to Zephyr. (Possibly, architecture comparison of JerryScript and MicroPython).
★ Resources ★
Etherpad: pad.linaro.org/p/las16-108
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-108/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
Advanced technologies and techniques for debugging HPC applicationsRogue Wave Software
Presented at Supercomputing 18. Debugging and analyzing today's HPC applications requires a tool with capabilities and features to support the demands of today’s complex HPC applications. Debugging tools must be able to handle the extensive use of C++ templates and the STL, use of many shared libraries, optimized code, code leveraging GPU accelerators and applications constructed with multiple languages.
This presentation walks through the different advanced technologies provided by the debugger, TotalView for HPC, and shows how they can be used to easily understand complex code and quickly solve difficult problems. Showcasing TotalView’s new user interface, you will learn how to leverage the amazing technology of reverse debugging to replay how your program ran. You will also see how TotalView provides a unified view across applications that utilize Python and C++, debug CUDA applications, find memory leaks in your HPC codes and other powerful techniques for improving the quality of your code.
FIWARE Global Summit - Real-time Media Stream Processing Using KurentoFIWARE
Kurento Media Server is an open source platform for processing audio and video streams. It allows input streams to be processed and output streams to be manipulated or redistributed. The server has endpoints to receive media and filters that can transform or process the media. Client applications connect to Kurento to build processing pipelines with these components and control the streaming applications.
The document describes an IBM workshop on CAPI and OpenCAPI technologies. It provides an overview of FPGA acceleration using SNAP, including how SNAP simplifies FPGA programming using a C/C++ based approach. Examples of use cases for FPGA acceleration like video processing and machine learning inference are also presented.
OpenID AuthZEN Interop Read Out - AuthorizationDavid Brossard
During Identiverse 2024 and EIC 2024, members of the OpenID AuthZEN WG got together and demoed their authorization endpoints conforming to the AuthZEN API
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Things to Consider When Choosing a Website Developer for your Website | FODUUFODUU
Choosing the right website developer is crucial for your business. This article covers essential factors to consider, including experience, portfolio, technical skills, communication, pricing, reputation & reviews, cost and budget considerations and post-launch support. Make an informed decision to ensure your website meets your business goals.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
What do a Lego brick and the XZ backdoor have in common?Speck&Tech
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might have in common the fact that both are building blocks, or dependencies of creative and software projects. In reality, a Lego brick and the XZ backdoor case have much more than that in common.
Join the presentation to dive into a story of interoperability, standards and open formats, and then discuss the important role that contributors play in a sustainable open source community.
BIO: An advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in various events, migrations and training related to LibreOffice. She previously worked on LibreOffice migrations and training courses for several public administrations and private organizations. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not following her passion for computers and for Geeko she cultivates her curiosity for astronomy (from which her nickname deneb_alpha derives).
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for the life science domain, where you retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility for addressing climate warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfTechgropse Pvt.Ltd.
In this blog post, we'll delve into the intersection of AI and app development in Saudi Arabia, focusing on the food delivery sector. We'll explore how AI is revolutionizing the way Saudi consumers order food, how restaurants manage their operations, and how delivery partners navigate the bustling streets of cities like Riyadh, Jeddah, and Dammam. Through real-world case studies, we'll showcase how leading Saudi food delivery apps are leveraging AI to redefine convenience, personalization, and efficiency.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Advanced MPI
1. Advanced MPI
metu-ceng
ts@TayfunSen.com
28 November 2008
2. Outline
• Quick Introduction to MPI
• Collective Communication
• OpenMPI
• MPI using C++: Boost Libraries
• User Defined Types
• Tools of Trade & Debugging
• References
• Q&A
28/11/2008 Advanced MPI 2/20
3. Quick Intro to MPI
• MPI is a standard with many implementations (libraries)
• Open MPI (which grew out of LAM/MPI and related projects) and MVAPICH (derived from MPICH) are among the bigger ones
• Basically a message passing API
• Some basic requirements: high performance, scalability and portability.
4. Collective Comm.
• Point-to-point communication was covered in previous seminars
• Communicator: groups processes so you can communicate with all of them at the same time (custom groups are possible)
• Synchronization, data sharing and reduce operations are all examples of collective communication
5. Collective Comm.
• Demo time!
Example collective communication programs on localhost
Connecting to NAR: the best guide is the web site, hpc.ceng.metu.edu.tr
Which MPI to use, and how to set up the environment?
Using man pages
6. OpenMPI
• Both OpenMPI (beware: not to be confused with OpenMP) and MVAPICH are good implementations and both are installed on NAR; choose according to your taste
• No proper documentation at the moment, use the FAQ page
• Can also do MPMD:
$ mpirun -np 2 a.out : -np 2 b.out
Ranks 0 and 1 run a.out and ranks 2 and 3 run b.out, all within one MPI_COMM_WORLD.
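The rank numbering in an MPMD launch can be easy to lose track of once there are more than two program blocks. The sketch below models it in plain C++, without calling MPI at all. It assumes ranks are handed out block by block in command-line order, as in the two-program example on this slide. `AppContext` and `program_for_rank` are hypothetical names invented for this illustration and are not part of Open MPI.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One "-np N program" block from an MPMD mpirun command line
// (hypothetical type, for illustration only).
struct AppContext {
    int np;               // number of processes requested for this block
    std::string program;  // executable name for this block
};

// Returns the program that a given MPI_COMM_WORLD rank would run, assuming
// ranks are assigned block by block in command-line order.
// Returns an empty string if the rank is beyond the last block.
std::string program_for_rank(const std::vector<AppContext>& apps, int rank) {
    assert(rank >= 0);
    int first_rank = 0;  // lowest rank belonging to the current block
    for (const AppContext& app : apps) {
        if (rank < first_rank + app.np) {
            return app.program;
        }
        first_rank += app.np;
    }
    return std::string();
}
```

For `mpirun -np 2 a.out : -np 2 b.out` the model would be built as `{{2, "a.out"}, {2, "b.out"}}`. In a real job each process would simply call `MPI_Comm_rank` to learn its own rank.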
7. OpenMPI
• OpenMPI is highly configurable
• MCA (modular component architecture) parameters for run-time tuning
• Check the following commands for these parameters:
$ ompi_info --param all all
$ ompi_info --param btl tcp
These show the parameters and their explanations. Check as needed.
8. OpenMPI
• $ mpirun --mca btl self,sm,gm ...
• The above says to use only loopback communication (self), shared memory (sm) or Myrinet/GM (gm) for this run
• While communicating on the same machine, shared memory is used automatically
• TCP is used by default, if it is available, for node-to-node communication. Check
http://www.open-mpi.org/faq/?category=tuning
for more info
9. MPI with some Boost
• Boost C++ libraries complement the STL
• Some are to be included in the C++ standard
10. MPI with some Boost
• Pretty easy to set up and install
• Some libraries are header only
• No Boost.MPI library on NAR (nothing matching /usr/lib/libboo*) and no headers in /usr/include/boost/
• Interesting libraries are Boost.MPI and Boost.Serialization
• Set it up in your home directory on NAR or ask the sys admins
• Can use any underlying MPI implementation
11. MPI with some Boost
• Previous code becomes:
#include <boost/mpi.hpp>
#include <iostream>
namespace mpi = boost::mpi;
int main(int argc, char* argv[]) {
  mpi::environment env(argc, argv);  // initializes MPI; finalizes on destruction
  mpi::communicator world;           // defaults to MPI_COMM_WORLD
  std::cout << "I am process " << world.rank() << " of " << world.size() << "." << std::endl;
  return 0;
}
• Some examples
• You need a Makefile that adds the required libraries and header files when compiling
• Python bindings also exist
12. User Defined Data Types
• The best thing about Boost is that it makes transferring complex types much easier
• Uses boost::serialization
• Quick and easy
• An example
• Some thoughts: what happens when there are pointers? (like when transmitting trees?)
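On the pointer question: a raw pointer is only meaningful inside the address space of the process that created it, so a tree cannot be sent by copying its nodes byte for byte. The structure first has to be turned into a position-independent form, which is the job a serialization library does for you. The sketch below shows the idea by hand in plain C++, using only the standard library. `Node`, `flatten` and `rebuild` are hypothetical names invented for this illustration, and the encoding is not Boost.Serialization's archive format.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical binary tree node. The child links are pointers, which are
// only valid inside the process that allocated them.
struct Node {
    int value;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
    explicit Node(int v) : value(v) {}
};

// Flat encoding: pre-order walk, each node written as {1, value} and each
// missing child written as {0}. No addresses appear in the buffer.
void flatten(const Node* node, std::vector<int>& out) {
    if (node == nullptr) {
        out.push_back(0);
        return;
    }
    out.push_back(1);
    out.push_back(node->value);
    flatten(node->left.get(), out);
    flatten(node->right.get(), out);
}

// Rebuilds the tree from the flat encoding, advancing `pos` as it reads.
std::unique_ptr<Node> rebuild(const std::vector<int>& in, std::size_t& pos) {
    assert(pos < in.size());
    if (in[pos++] == 0) {
        return nullptr;  // marker for a missing child
    }
    assert(pos < in.size());
    auto node = std::make_unique<Node>(in[pos++]);
    node->left = rebuild(in, pos);
    node->right = rebuild(in, pos);
    return node;
}
```

The sender would call `flatten` and transmit the resulting `std::vector<int>` as an ordinary array of integers. The receiver would call `rebuild` with `pos` starting at 0 and get a tree of the same shape, with fresh pointers that are valid in its own process.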
13. Tools of Trade
• gfilt for intelligible template errors
• Extremely useful when coding in C++
• For debugging, one can use gdb and for memory checking, valgrind
14. Tools of Trade
[Figure: BEFORE and AFTER screenshots]
15. Debugging
• Parallel debuggers? TotalView? DDT? Expensive, they cost $$$
• It is simple enough to use a conventional debugger. GDB is your best friend.
• If using Boost libraries, C++ or more complex structures like trees: use a graphical debugger such as DDD or xxgdb; choose your favourite.
16. Debugging
• Demo time! Using DDD to debug parallel programs.
17. Debugging
• What else? Valgrind for detecting memory leaks and the memory errors that can lead to core dumps. Just use
• $ mpirun -np 2 valgrind --leak-check=full ./main
• Beware! Programs run extremely slowly under valgrind
18. Some left out bits
• There are more advanced topics such as Parallel I/O, RDMA etc.
• Different hardware and network characteristics enable different methods
• Myrinet, OpenFabrics, Quadrics...
• Interconnects are becoming much more complex
19. References & Notes
• The presentation template is from:
http://www.presentationhelper.co.uk/free-open-offic
• Hardware information of NAR retrieved from
http://hpc.ceng.metu.edu.tr/system/hardware/
• A great tutorial on MPI can be found at NCSA:
http://webct.ncsa.uiuc.edu:8900/public/MPI/
• Boost homepage is at: http://www.boost.org/
• OpenMPI home page can be found at:
http://www.open-mpi.org/
• This presentation can be obtained from
slideshare at: http://www.slideshare.net/tayfun/
• gfilt can be obtained from
http://www.bdsoft.com/tools/stlfilt.html
20. The End
Thanks for your time.
Any questions?