This document discusses combining event processing and big data using the Lambda architecture. It begins with definitions of big data and fast data. It then discusses architectures for data systems, highlighting the need for immutable data and views to provide human fault tolerance. The Lambda architecture is introduced as providing both batch and real-time processing of all incoming data to generate views over the data for querying. Implementation of the Lambda architecture combines batch processing using Hadoop with real-time processing using stream processing.
Facilitating Collaborative Life Science Research in Commercial & Enterprise E..., Chris Dagdigian
This is a talk I put together for a http://www.neren.org/ seminar called "Bridging the Gap: Research Facilitation". Tried to give a biotech/pharma view for a mostly academic audience.
Introduction to data governance, Philippe Bourgeois, Senior Consultant, Trivadis. Talk given at the Swiss Data Forum on 24 November 2015 in Lausanne.
This is a custom "Bio IT trends/problems" deck that I did for a general but highly technical audience at the 2014 Internet2 Technology Exchange conference.
Download of the raw PPT is disabled; contact me at chris@bioteam.net if a direct copy or PDF of the presentation would be useful.
BioIT World 2016 - HPC Trends from the Trenches, Chris Dagdigian
As presented at BioIT World 2016. In one of the more popular presentations of the Expo, Chris delivers a candid assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. He’ll cover what has changed (or not) in the past year around infrastructure, storage, computing, and networks. This presentation will help you understand IT to build and support data intensive science.
Video link from the presentation: biote.am/bs
[Note: email chris@bioteam.net if you would like a PDF copy of this presentation]
This was a 30 min talk intended as one of the opening/overview presentations before a full-day deep dive into ScienceDMZ design patterns and architectures.
Direct downloads are not enabled. Contact me directly (chris@bioteam.net) if you for some odd reason want a copy of this slide deck!
Bio-IT Trends From The Trenches (digital edition), Chris Dagdigian
Note: Contact me directly at dag@bioteam.net if you would like a PDF download of these slides
This is Chris Dagdigian’s 10th year delivering his no holds barred, candid state of the industry address at BioIT World, and we are not going to let a pandemic stop him.
Instead of his typical talk, five distinguished panelists will join Chris for a spirited discussion on Current Events and Scientific Computing and the impacts of the COVID-19 Pandemic:
Cloud Sobriety for Life Science IT Leadership (2018 Edition), Chris Dagdigian
Candid/blunt AWS advice for research IT and life science IT leadership. Hard lessons learned from many years of AWS consulting. Contact dag@bioteam.net if you want a PDF copy of this presentation
Mapping Life Science Informatics to the Cloud, Chris Dagdigian
Infrastructure cloud platforms such as those offered by Amazon Web Services are not designed and built with scientific research as the primary use case. These presentation slides cover the current state of mapping life science research and HPC technique onto “the cloud” and how to work around the common engineering, orchestration and data movement problems.
[Note: I've replaced the 2011 version of this talk deck with a slightly updated version as delivered at the AIRI Petabyte Challenge Meeting]
This is a very short slide deck I did for a 10-minute slot on a http://pistoiaalliance.org/ webinar. The slides do not fully cover what I intend to talk about so if the webinar is recorded and available afterwards I'll update this description with the recording URL.
PDF copy of the slides available upon request ("chris@bioteam.net")
Tiny slide deck from a 5-min lightning talk covering a recent project involving live replication of 2-petabytes of scientific data.
Please leave feedback if you'd like to see this as a long-form technical blog article or conference talk, thanks!
Annual address covering trends, emerging requirements, pain points and infrastructure issues in the "Bio-IT" aka life science informatics and HPC realm; Email me if you want a PDF of this talk - chris@bioteam.net
This session describes the roles and skill sets required when building a Data Science team, and starting a data science initiative, including how to develop Data Science capabilities, select suitable organizational models for Data Science teams, and understand the role of executive engagement for enhancing analytical maturity at an organization.
After this session you will be able to:
Objective 1: Understand the knowledge and skills needed for a Data Science team and how to acquire them.
Objective 2: Learn about the different organizational models for forming a Data Science team and how to choose the best for your organization.
Objective 3: Understand the importance of Executive support for Data Science initiatives and the role it plays in their successful deployment.
BioITWorld 2013 presentation - Best practices for building multi-tenant HPC clusters for Pharma/BioTech
Essentially a mini case study of a recent deployment of a multi-petabyte, 1000+ CPU core Linux cluster in the Boston area.
Please email me at: chris@bioteam.net if you would like the actual PDF file itself.
The talk presents the evolution of Big-Data systems from single-purpose MapReduce frameworks to fully general computational infrastructures. In particular, I will follow the evolution of Hadoop, and show the benefits and challenges of a new architectural paradigm that decouples the resource management component (YARN) from the specifics of the application frameworks (e.g., MapReduce, Tez, REEF, Giraph, Naiad, Dryad, Spark,...). We argue that besides the primary goals of increasing scalability and programming model flexibility, this transformation dramatically facilitates innovation.
In this context, I will present some of our contributions to the evolution of Hadoop (namely: work-preserving preemption, and predictable resource allocation), and comment on the fascinating experience of working on open-source technologies from within Microsoft. The current Hadoop APIs (HDFS and YARN) provide the cluster equivalent of an OS API. With this as a backdrop, I will present our attempt to create the equivalent of stdlib for the cluster: the REEF project.
Carlo A. Curino received a PhD from Politecnico di Milano, and spent two years as Post Doc Associate at CSAIL MIT leading the relational cloud project. He worked at Yahoo! Research as Research Scientist focusing on mobile/cloud platforms and entity deduplication at scale. Carlo is currently a Senior Scientist at Microsoft in the Cloud and Information Services Lab (CISL) where he is working on big-data platforms and cloud computing.
Bio-IT & Cloud Sobriety: 2013 Beyond The Genome Meeting, Chris Dagdigian
October 2013 "Beyond the Genome" presentation slides. Talk is mostly focused on issues around IaaS cloud usage for "Bio-IT" and life science informatics & scientific computing.
PDF SLIDES AVAILABLE DIRECTLY - PLEASE EMAIL "CHRIS@BIOTEAM.NET" FOR SLIDES
The way we make decisions has changed. The data we use has changed. The techniques we can apply to data and decisions have changed. Yet what we build and how we build it has barely changed in 20 years.
The definition of madness is doing more of what you already do and expecting different results. The threat to the data warehouse is not from new technology that will replace the data warehouse. It is from destabilization caused by new technology as it changes the architecture, and from failure to adapt to those changes.
The technology that we use is problematic because it constrains and sometimes prevents necessary activities. We don’t need more technology and bigger machines. We need different technology that does different things. More product features from the same vendors won’t solve the problem.
The data we want to use is challenging. We can’t model and clean and maintain it fast enough. We don’t need more data modeling to solve this problem. We need less modeling and more metadata.
And lastly, a change in scale has occurred. It isn’t a simple problem of “big”. The problem with current workloads has been solved, despite the performance problems that many people still have today. Scale has many dimensions – important among them are the number of discrete sources and structures, the rate of change of individual structures, the rate of change in data use, the variety of uses and the concurrency of those uses.
In short, we need new architecture that is not focused on creating stability in data, but one that is adaptable to continuous and rapidly changing uses of data.
In this presentation, we will introduce basic data science concepts and discuss a project carried out for one of our clients.
We will see how data science projects can easily be carried out using the statistical programming language R and its integration into the new Microsoft SQL Server 2016 suite.
BI isn't big data and big data isn't BI (updated), Mark Madsen
Big data is hyped, but isn't hype. There are definite technical, process and business differences in the big data market when compared to BI and data warehousing, but they are often poorly understood or explained. BI isn't big data, and big data isn't BI. By distilling the technical and process realities of big data systems and projects we can separate fact from fiction. This session examines the underlying assumptions and abstractions we use in the BI and DW world, the abstractions that evolved in the big data world, and how they are different. Armed with this knowledge, you will be better able to make design and architecture decisions. The session is sometimes conceptual, sometimes detailed technical explorations of data, processing and technology, but promises to be entertaining regardless of the level.
Yes, it’s about the data normally called “big”, but it’s not Hadoop for the database crowd, despite the prominent role Hadoop plays. The session will be technical, but in a technology preview/overview fashion. I won’t be teaching you to write MapReduce jobs or anything of the sort.
The first part will be an overview of the types, formats and structures of data that aren’t normally in the data warehouse realm. The second part will cover some of the basic technology components, vendors and architecture.
The goal is to provide an overview of the extent of data available and some of the nuances or challenges in processing it, coupled with some examples of tools or vendors that may be a starting point if you are building in a particular area.
Everything Has Changed Except Us: Modernizing the Data Warehouse, Mark Madsen
Keynote, Munich, June 2016
The way we make decisions has changed. The data we use has changed. The techniques we can apply to data and decisions have changed. Yet what we build and how we build it has barely changed in 20 years.
The definition of madness is doing more of what you already do and expecting different results. The threat to the data warehouse is not from new technology that will replace the data warehouse. It is from destabilization caused by new technology as it changes the architecture, and from failure to adapt to those changes.
The technology that we use is problematic because it constrains and sometimes prevents necessary activities. We don’t need more technology and bigger machines. We need different technology that does different things. More product features from the same vendors won’t solve the problem.
The data we want to use is challenging. We can’t model and clean and maintain it fast enough. We don’t need more data modeling to solve this problem. We need less modeling and more metadata.
And lastly, a change in scale has occurred. It isn’t a simple problem of “big”. The problem with current workloads has been solved, despite the performance problems that many people still have today. Scale has many dimensions – important among them are the number of discrete sources and structures, the rate of change of individual structures, the rate of change in data use, the variety of uses and the concurrency of those uses.
In short, we need new architecture that is not focused on creating stability in data, but one that is adaptable to continuous and rapidly changing uses of data.
Briefing room: An alternative for streaming data collection, Mark Madsen
Knowing what’s happening in your enterprise right now can mark the difference between success and failure. The key is to have a rich view of activity, such that analysts and others can explore in a fully multidimensional fashion. Benefiting from such a detailed perspective can help professionals identify the exact nature of problems or opportunities, thus enabling precise actions that make a difference quickly.
Register for this episode of The Briefing Room to hear veteran Analyst Mark Madsen of Third Nature explain how a nexus of innovations for analyzing network traffic can help companies stay on top of their game. He’ll be briefed by Erik Giesa of ExtraHop, who will showcase his company’s stream analytics technology for wire data, which provides real-time, multidimensional views of network traffic. He’ll share success stories of how ExtraHop has solved otherwise intractable problems and enabled a new level of root-cause analysis.
Disruptive Innovation: how do you use these theories to manage your IT?, Mark Madsen
The term disruptive innovation was popularized by Harvard professor Clayton Christensen in his 1997 book “The Innovator’s Dilemma.” Nearly 20 years later “Disrupt!” is a popular leadership mantra that is more frequently uttered than experienced. You can't productize it. You can't always control it – at least what effects it has in practice. You aren't necessarily going to like every product of innovation. So are you sure you want it? If so, how do you promote a culture in which innovation can flower – and, potentially, thrive? Because that's probably the best that you can do.
Perhaps there's a better framing for innovation than just "disruption." This session is an overview of commoditization and innovation theories followed by basic things you can do to apply that theory to your daily job architecting, choosing and managing a data environment in your company.
7 Big Data Challenges and How to Overcome Them, Qubole
Implementing a big data project is difficult. Hadoop is complex, and data governance is crucial. Learn common big data challenges and how to overcome them.
Trivadis TechEvent 2016 Analyzing Oracle related issues using TFACTL by Raine..., Trivadis
One of the biggest challenges when maintaining Oracle software is the number of components to be considered and the corresponding number of trace files. This is especially the case if you are operating Grid Infrastructure, which adds further layers of complexity. Analysing Real Application Clusters with two or more nodes involves additional analysis because of the growing number of trace logs. This session introduces you to Trace File Analyzer Collector ("TFACTL"), a CLI tool that integrates several analysis tools to monitor various GI/DB components, even across multiple nodes.
Trivadis TechEvent 2016 Der Trivadis Weg mit der Cloud von Florian van Keulen..., Trivadis
Cloud is not the goal: cloud is a means to an end! The goal is agility, functionality, scalability and cost savings for the digital business transformation.
Trivadis TechEvent 2016 Useful Oracle 12c Features for Data Warehousing by Da..., Trivadis
Oracle Database 12c contains many new features and extensions. Some of them are often mentioned: Pluggable Databases, Information Lifecycle Management, In-Memory Option. In addition to these "big" features, there are a lot of little, often unknown extensions that are very practical for our daily business in developing and operating Data Warehouses. In this session, Dani Schnider will present several nice little 12c features that are useful for developing ETL processes, for SQL queries on Data Marts and for the administration of Data Warehouse databases.
Trivadis TechEvent 2016 IoT Portal with PowerBI and SharePoint by Jens Berten..., Trivadis
Showing reports of data is only part of the story. To make correct decisions, additional information is needed. But most of that information, specifically documents and information outside the database, is not captured by BI reports. With the portal idea, we visualize the IoT data with PowerBI and add value by showing reports, documents and additional information in one portal. Users get a real "single point of information" for that topic. An example with a demo will be shown.
Trivadis TechEvent 2016 Die Rolle der Unterschrift bei der Digitalisierung vo..., Trivadis
The signature is of central importance in many business processes. In digitalization projects, such as the document services that Arizon AG provides for the Raiffeisen banks of Switzerland, the paper-based signature initially stands in the way. The cryptographic method of the digital signature is a wonderful aid, but it is only ever one building block, because the functions the signature serves in our processes are as varied as the ways of implementing it, as the Arizon example shows.
Trivadis TechEvent 2016 Big Data Cassandra, wieso brauche ich das? by Jan OttTrivadis
First Steps of an Oracle-expert in the Big Data World. Everyone speaks about Big Data. But what does it mean? This speech focuses on one animal of the Big Data Zoo - Cassandra and answers the following questions:
- Why another database?
- There is Impala and Spark. Why would I need Cassandra?
- New database - do I need to learn a new language?
- How do I get the data in?
- Can I use SQL?
- Is it part of a distribution, for example Cloudera?
Demos will explain the theory.
Big Data and Fast Data - big and fast combined, is it possible?Guido Schmutz
Big Data (volume) and real-time information processing (velocity) are two important aspects of Big Data systems. At first sight, these two aspects seem to be incompatible. Are traditional software architectures still the right choice? Do we need new, revolutionary architectures to tackle the requirements of Big Data? This presentation discusses the idea of the so-called lambda architecture for Big Data, which assumes that data processing is split in two: in a batch phase, a temporally bounded, large dataset is processed either through traditional ETL or MapReduce. In parallel, real-time, online processing constantly calculates the values of the new data coming in during the batch phase. Combining the two results, batch and online, gives a constantly up-to-date view. This talk presents how such an architecture can be implemented using Oracle products such as Oracle NoSQL, Hadoop and Oracle Event Processing.
Big Data and Fast Data - Lambda Architecture in ActionGuido Schmutz
Big Data (volume) and real-time information processing (velocity) are two important aspects of Big Data systems. At first sight, these two aspects seem to be incompatible. Are traditional software architectures still the right choice? Do we need new, revolutionary architectures to tackle the requirements of Big Data?
This presentation discusses the idea of the so-called lambda architecture for Big Data, which assumes that data processing is split in two: in a batch phase, a temporally bounded, large dataset is processed either through traditional ETL or MapReduce. In parallel, real-time, online processing constantly calculates the values of the new data coming in during the batch phase. Combining the two results, batch and online, gives a constantly up-to-date view.
This talk presents how such an architecture can be implemented using Oracle products such as Oracle NoSQL, Hadoop and Oracle Event Processing as well as some selected products from the Open Source Software community. While this session mostly focuses on the software architecture of BigData and FastData systems, some lessons learned in the implementation of such a system are presented as well.
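The split into a batch phase and a parallel real-time phase described in this abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Oracle NoSQL, Hadoop or Oracle Event Processing implementation the talk refers to. The names `Event`, `build_batch_view`, `SpeedLayer` and `query` are hypothetical, and the use case (counting events per key) is an assumption chosen only to keep the example small.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical immutable event; the timestamp is a simple integer."""
    key: str
    timestamp: int


def build_batch_view(master_dataset, cutoff):
    """Batch phase: recompute counts from all events older than the cutoff."""
    return Counter(e.key for e in master_dataset if e.timestamp < cutoff)


class SpeedLayer:
    """Real-time phase: incrementally counts events at or after the cutoff."""

    def __init__(self, cutoff):
        self.cutoff = cutoff
        self.view = Counter()

    def on_event(self, event):
        # Only events the batch view has not covered yet are counted here.
        if event.timestamp >= self.cutoff:
            self.view[event.key] += 1


def query(batch_view, speed_layer, key):
    """Combine the batch result and the real-time result into one answer."""
    return batch_view[key] + speed_layer.view[key]


# Assumed sample data: two events before the batch cutoff, one after it.
events = [Event("page_a", 1), Event("page_b", 2), Event("page_a", 3)]
batch_view = build_batch_view(events, cutoff=3)
speed_layer = SpeedLayer(cutoff=3)
for event in events:
    speed_layer.on_event(event)

print(query(batch_view, speed_layer, "page_a"))  # 2
```

When the next batch run finishes with a later cutoff, the real-time view can be discarded and rebuilt from that cutoff onward, which is why the sketch keeps the two views separate instead of updating a single counter.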
Big Data and Fast Data combined – is it possible? Introduction to Big Data architectures. Ulises Fasoli, Senior Consultant, Trivadis. Talk given at the Swiss Data Forum on 24 November 2015 in Lausanne.
Data, Interconnectedness & The Internet of Things Software AG
Innovation World 2013 presentation.
The key to deriving value from fast data is being able to access, analyze and respond to it in real-time. Robin Gilthorpe explores the deep capabilities and synergies of Real-Time Analytics (Apama) and In-Memory (Terracotta) Platforms, sharing a breadth of insights around use cases and customer successes.
Speaker:
Robin Gilthorpe
CEO, Terracotta
Watch full webinar here: https://bit.ly/2Y0vudM
What is Data Virtualization and why do I care? In this webinar we intend to help you understand not only what Data Virtualization is but why it's a critical component of any organization's data fabric and how it fits. We also cover how data virtualization liberates and empowers your business users, from data discovery and data wrangling to the generation of reusable reporting objects and data services. Digital transformation demands that we empower all consumers of data within the organization; it also demands agility. Data Virtualization gives you meaningful access to information that can be shared by a myriad of consumers.
Register to attend this session to learn:
- What is Data Virtualization?
- Why do I need Data Virtualization in my organization?
- How do I implement Data Virtualization in my enterprise?
Big Data with Data Virtualization (session 3 from Packed Lunch Webinar Series)Denodo
Over the last couple of years, Big Data has been big news. If you read the press, it seems that everyone is using Big Data – either that or they are getting left behind. However, the Big Data products, such as Hadoop, are not for the faint-hearted! They introduce new technologies, new data models, and new non-standard APIs into your data infrastructure. This runs the risk of creating more data silos, integration problems, and data governance nightmares. Data Virtualization can eliminate these risks and allow all users to leverage your Big Data assets.
More information and FREE registrations to this webinar: http://goo.gl/91YUsZ
Landing page for the entire Packed Lunch webinar series: http://goo.gl/NATMHw
Attend & get unique insights into:
- Typical Big Data use cases
- How Data Virtualization integrates Big Data into your existing data infrastructure
- How Data Virtualization increases effectiveness and penetration of Big Data initiatives by enabling non-technical users to access Big Data result sets.
- Case studies that demonstrate how Data Virtualization has helped companies tackle their Big Data challenges
Watch full webinar here: https://bit.ly/3puUCIc
What is Data Virtualization and why do I care? In this webinar we intend to help you understand not only what Data Virtualization is but why it's a critical component of any organization's data fabric and how it fits. We also cover how data virtualization liberates and empowers your business users, from data discovery and data wrangling to the generation of reusable reporting objects and data services. Digital transformation demands that we empower all consumers of data within the organization; it also demands agility. Data Virtualization gives you meaningful access to information that can be shared by a myriad of consumers.
Watch on-demand this session to learn:
- What is Data Virtualization?
- Why do I need Data Virtualization in my organization?
- How do I implement Data Virtualization in my enterprise? Where does it fit?
Big Data Management: A Unified Approach to Drive Business ResultsCA Technologies
Traditional data management is changing rapidly, attributed to significant changes brought on by evolving big data environments. IT complexity is on the rise as businesses choose the technologies they need to support their big data strategies and targeted business outcomes. Now, more than ever, we need IT management tools that can accommodate and effectively manage these evolving, complex environments to ensure that enterprises can move forward with their preferred technology and vendor choices.
For more information on Mainframe solutions from CA Technologies, please visit: http://bit.ly/1wbiPkl
Innovation med big data – chr. hansens erfaringerMicrosoft
In many places, Big Data is still the new and unknown thing that is not a top priority for IT, because "we do not have large volumes of data". But Big Data is much more than large volumes of data. At Chr. Hansen A/S, the Research and Development (Innovation) department has worked with the value of data and, as a result, established an interdisciplinary bioinformatics program on Big Data technologies from Microsoft.
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY
With the aid of any number of data management and processing tools, data flows through multiple on-prem and cloud storage locations before it’s delivered to business users. As a result, IT teams — including IT Ops, DataOps, and DevOps — are often overwhelmed by the complexity of creating a reliable data pipeline that includes the automation and observability they require.
The answer to this widespread problem is a centralized data pipeline orchestration solution.
Join Stonebranch’s Scott Davis, Global Vice President and Ravi Murugesan, Sr. Solution Engineer to learn how DataOps teams orchestrate their end-to-end data pipelines with a platform approach to managing automation.
Key Learnings:
- Discover how to orchestrate data pipelines across a hybrid IT environment (on-prem and cloud)
- Find out how DataOps teams are empowered with event-based triggers for real-time data flow
- See examples of reports, dashboards, and proactive alerts designed to help you reliably keep data flowing through your business — with the observability you require
- Discover how to replace clunky legacy approaches to streaming data in a multi-cloud environment
- See what’s possible with the Stonebranch Universal Automation Center (UAC)
Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...Trivadis
During major irregularities, the service desks of airline companies are heavily overloaded for short periods of time. A chatbot could help out during these peak hours. In this session we show how SWISS International Airlines developed a chatbot for irregularity handling. We shed light on the challenges, such as sensitive customer data and a company starting its journey into the cloud.
Azure Days 2019: Trivadis Azure Foundation – Das Fundament für den ... (Nisan...Trivadis
Trivadis Azure Foundation: the foundation for successfully using the Azure cloud
The Azure cloud is approaching its 10th anniversary and has arrived in Switzerland. Compared with operating on-premises solutions, the cloud offers numerous advantages. Many tasks from the on-premises world are taken over by the provider in cloud computing.
But the freedoms that cloud computing offers are very powerful and the best recipe for uncontrolled growth and chaos. Many of our customers are only now becoming aware of the tasks they should have taken care of five years ago. The Trivadis Azure Foundation is our field-proven approach for making the most of all the cloud's advantages without losing control. In this session you get an insight into our Azure Foundation methodology, and we also report on our customers' experiences with Azure.
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Trivadis
In this session we present a project in which we built a comprehensive BI system for and in the Azure cloud using Azure Blob Storage, Azure SQL, Azure Logic Apps and Azure Analysis Services. We report on the challenges, how we solved them, and which learnings and best practices we took away.
Azure Days 2019: Master the Move to Azure (Konrad Brunner)Trivadis
The Azure cloud has established itself over the last 10 years and is now available both globally and locally, but the move to the cloud must be well planned. In this talk we share our experience from various projects with you. We show what you need to pay particular attention to so that your move to the cloud is a success.
Azure Days 2019: Keynote Azure Switzerland – Status Quo und Ausblick (Primo A...Trivadis
The Azure cloud has arrived in Switzerland. In this session, Primo Amrein, Cloud Lead at Microsoft Switzerland, looks at the introduction of the Azure cloud in Switzerland and reports on the success stories and lessons learned. The session concludes with a look ahead at the roadmap.
Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)Trivadis
"Modern" data warehouse and data lake architectures are often bristling with layers and services. Such systems can manage and analyze petabytes of data. But all of this comes at a price (complexity, latency, stability), and not every project is happy with this approach.
The talk shows the journey from a technology-infatuated solution to an environment tailored to user needs. It shows the bright and dark sides of massively parallel systems and aims to sharpen the senses for capturing real customer requirements.
Azure Days 2019: Get Connected with Azure API Management (Gerry Keune & Stefa...Trivadis
API Management provides an integrated environment for creating, running, managing and securing enterprise APIs for modern digital applications. Vinci Energies Schweiz has been using the Azure API Management service successfully in various projects for several years. A field report that shows the possibilities, but also the limits, of Azure API Management.
Azure Days 2019: Infrastructure as Code auf Azure (Jonas Wanninger & Daniel H...Trivadis
Nowadays, code is not only used to write applications. Thanks to the cloud, the configuration of infrastructure such as virtual machines or networks is defined in code and delivered automatically. This is called Infrastructure as Code, or IaC for short. There are many tools for Infrastructure as Code on Azure, such as Ansible, Puppet, Chef, etc. Two solutions stand out because of their different approaches: Azure Resource Manager (ARM) templates as the Microsoft-native solution, always up to date but tied to Azure; and, on the other side, Terraform from HashiCorp, based on a descriptive language but with fewer features in the security area. We compared the two technologies for a major customer. We present the results in this session with live demos.
Azure Days 2019: Wie bringt man eine Data Analytics Plattform in die Cloud? (...Trivadis
What were the learnings and challenges in providing a modern, Azure-based data analytics platform as a service for a large corporation and integrating it into the enterprise? A project with many interesting aspects: Azure BI services such as HDInsight, integration into an enterprise in an "as a service" model, management of costs and chargeback of the services, and much more. This session offers insights into one of our projects that will help you in your next project.
Azure Days 2019: Azure@Helsana: Die Erweiterung von Dynamics CRM mit Azure Po...Trivadis
Helsana (https://www.helsana.ch), number 2 among Switzerland's largest health insurers, pursues a modern cloud-first strategy. To run complex marketing campaigns with a high degree of automation, Helsana evaluated various products. Unfortunately, none met all the requirements. In close cooperation with Microsoft, the 100% Azure-based application CRM-Analytics (CRMa) was created, which distributes leads and tasks from Dynamics CRM to the regions, branches and customer advisors according to complex distribution rule sets. The results and performance of the campaigns can be analyzed via a data analytics pipeline and visualized in PowerBI. Manual processes for target group selection were automated, and the time from the idea to the selection of the target group was reduced from 10(!) days to a few minutes. With the introduction of CRMa, Helsana has taken a significant step towards digitalization and holistic campaign management.
TechEvent 2019: Kundenstory - Kein Angebot, kein Auftrag – Wie Du ein individ...Trivadis
TechEvent 2019: Customer story - No offer, no order: how to formulate an individual offer in 5 seconds; Martin Kortstiege, Ronny Bauer - Trivadis
TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing Postgr...Trivadis
TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing PostgreSQL to Oracle, the best kept secrets; Konrad Häfeli, Jan Karremans - Trivadis
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...Trivadis
TechEvent 2019: Customer story - From the Captain of Köpenick to the police officer of 2020 - from classic to agile processes; Martin Moog, Esther Trapp, Norbert Ziebarth - Trivadis
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this is illustrated with link prediction over knowledge graphs, but the argument is general.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.