Let us take you with us on our journey to redmesh, Bosch Digital's realization of the data mesh concept as an enabler for our digital business and a foundation for a data-driven enterprise. Follow us from local databases, via data lakes, to data mesh: a federated sociotechnical approach to sharing, accessing and managing analytical data in enterprise and large-scale environments that provides maximum flexibility and autonomy to our customers while ensuring interoperability and standardization. The redmesh self-service data platform provides almost a petabyte of enterprise data stored in distributed object stores, enabling a significant increase in data consumption and faster ad-hoc analysis based on Data Products. We innovate products, processes and the company through data! We maximize the value of data for Bosch!
Presentation at Data Science and Engineering Club looking at ways to create a Data Analytics Portfolio to demonstrate the skills that add direct value to customers and organisations.
Architect’s Open-Source Guide for a Data Mesh Architecture (Databricks)
Data Mesh is an innovative concept addressing many data challenges from an architectural, cultural, and organizational perspective. But is the world ready to implement Data Mesh?
In this session, we will review the importance of the core Data Mesh principles, what they can offer, and when it is a good idea to try a Data Mesh architecture. We will discuss common challenges in implementing Data Mesh systems and focus on the role open-source projects play. Projects like Apache Spark can play a key part in implementing the standardized infrastructure platform of a Data Mesh. We will examine the landscape of useful data engineering open-source projects to utilize in several areas of a Data Mesh system in practice, along with an architectural example. We will touch on what work (culture, tools, mindset) needs to be done to make Data Mesh more accessible for engineers in the industry.
The audience will leave with a good understanding of the benefits of Data Mesh architecture, common challenges, and the role of Apache Spark and other open-source projects for its implementation in real systems.
This session is targeted at architects, decision-makers, data engineers, and system designers.
Leveraging the Power of the ServiceNow® Platform with Mainframe and IBM i Sys... (Precisely)
ServiceNow is a recognized leader transforming the impact, speed and delivery of IT by breaking down silos and providing visibility across the enterprise. Meanwhile, more than 2.5 billion business transactions run on mainframes each day and over 100,000 companies use IBM i technology to run their business. Yet, until recently, these critical systems have been disconnected from the ServiceNow platform – leaving a significant blind spot in the enterprise-wide view of IT infrastructure.
View this webinar on-demand to learn about Syncsort Ironstream, the first product to seamlessly integrate IBM mainframe and IBM i systems into the ServiceNow platform to support IT Operations and Service Management.
Product experts will discuss the value this integration delivers to your business, as well as show how mainframe and IBM i data is used within the ServiceNow platform to deliver high-performance business services.
During this webinar, we cover:
• The benefits – and challenges – of including mainframe and IBM i data in the ServiceNow platform
• How Syncsort Ironstream integrates with ServiceNow Discovery and ServiceNow Event Management
• A demonstration of how mainframe and IBM i data works within ServiceNow to address top ITSM use cases, including change management, incident management and event management
Agile Data Warehouse Design for Big Data Presentation (Vishal Kumar)
Synopsis:
[Video link: http://www.youtube.com/watch?v=ZNrTxSU5IQ0 ]
Jim Stagnitto and John DiPietro of consulting firm a2c will discuss Agile Data Warehouse Design, a step-by-step method for data warehousing / business intelligence (DW/BI) professionals to better collect and translate business intelligence requirements into successful dimensional data warehouse designs.
The method utilizes BEAM✲ (Business Event Analysis and Modeling) - an agile approach to dimensional data modeling that can be used throughout analysis and design to improve productivity and communication between DW designers and BI stakeholders. BEAM✲ builds upon the body of mature "best practice" dimensional DW design techniques, and collects "just enough" non-technical business process information from BI stakeholders to allow the modeler to slot their business needs directly and simply into proven DW design patterns.
BEAM✲ encourages DW/BI designers to move away from the keyboard and their entity relationship modeling tools and begin "white board" modeling interactively with BI stakeholders. With the right guidance, BI stakeholders can and should model their own BI data requirements, so that they can fully understand and govern what they will be able to report on and analyze.
The BEAM✲ method is fully described in Agile Data Warehouse Design, a text co-written by Lawrence Corr and Jim Stagnitto.
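BEAM✲ itself is a whiteboard technique, but the dimensional designs it produces follow the familiar star-schema pattern of proven DW design. As a toy illustration (all table and column names below are invented for this sketch, not taken from the book or talk), a "customer buys product" business event might land in a fact table surrounded by dimension tables:

```python
import sqlite3

# Toy star schema for a "customer buys product" business event.
# Names are illustrative, not from the BEAM method itself.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        product_key  INTEGER REFERENCES dim_product(product_key),
        sale_date    TEXT,
        quantity     INTEGER,
        amount       REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Acme Corp'), (2, 'Globex');
    INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO fact_sales   VALUES (1, 1, '2024-01-05', 10, 100.0),
                                    (2, 1, '2024-01-06', 5, 50.0),
                                    (1, 2, '2024-01-07', 2, 80.0);
""")

# A typical BI stakeholder question slots straight into the pattern: revenue by product.
rows = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p USING (product_key)
    GROUP BY p.name ORDER BY p.name
""").fetchall()
print(rows)  # [('Gadget', 80.0), ('Widget', 150.0)]
```

The point of the whiteboard session is exactly to agree on which business events become facts and which descriptive nouns become dimensions before any such schema is built.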
About the speakers:
Jim Stagnitto, Director of the a2c Data Services Practice
Data Warehouse Architect: specializing in powerful designs that extract the maximum business benefit from Intelligence and Insight investments.
Master Data Management (MDM) and Customer Data Integration (CDI) strategist and architect.
Data Warehousing, Data Quality, and Data Integration thought-leader: co-author with Lawrence Corr of "Agile Data Warehouse Design", guest author of Ralph Kimball’s “Data Warehouse Designer” column, and contributing author to Ralph Kimball and Joe Caserta's latest book, “The Data Warehouse ETL Toolkit”.
John DiPietro, Chief Technology Officer at A2C IT Consulting
John DiPietro is the Chief Technology Officer for a2c. Mr. DiPietro is responsible for setting the vision, strategy, delivery, and methodologies for a2c’s Solution Practice Offerings for all national accounts. The a2c CTO brings with him an expansive depth and breadth of specialized skills in his field.
Sponsor Note:
Thanks to:
Microsoft NERD for providing an awesome venue for the event.
http://A2C.com IT Consulting for providing the food and drinks.
http://Cognizeus.com for providing a book to give away as a raffle prize.
Introduction to Integration Technologies (BizTalk360)
In this presentation, Arunkumar Kumaresan highlights how integration technologies have evolved over the last few years and cites a few interesting examples.
Data Architecture Best Practices for Advanced Analytics (DATAVERSITY)
Many organizations are immature when it comes to data and analytics use. The answer to this immaturity lies in delivering a greater level of insight from data, straight to the point of need.
There are many Data Architecture best practices today, accumulated from years of practice. In this webinar, William will look at some Data Architecture best practices that he believes have emerged in the past two years and are not yet worked into many enterprise data programs. These are keepers that organizations will need to move towards by one means or another, so it's best to work them into the environment mindfully.
The Data Driven University - Automating Data Governance and Stewardship in Au... (Pieter De Leenheer)
Data Governance and Stewardship requires automation of business semantics management at its nucleus in order to achieve data trust between business and IT communities in the organization. University divisions operate highly autonomously and in a decentralized fashion, and are often geographically distributed. Hence, they benefit more from a collaborative and agile approach to Data Governance and Stewardship that adapts to their nature.
In this lecture, we start by reviewing the 'C' in ICT and reflect on a dilemma: what is the most important quality of shared data, truth or trust? We review the wide spectrum of business semantics. We visit the different phases of growing data pain as an organization expands, and we map each phase onto this spectrum of semantics.
Next, we introduce our principles and framework for business semantics management to support Data Governance and Stewardship focusing on the structural (what), processual (how) and organizational (who) components. We illustrate with use cases from Stanford University, George Washington University and Public Science and Innovation Administrations.
Databricks is a Software-as-a-Service-like experience (or Spark-as-a-service): a tool for curating and processing massive amounts of data, developing, training and deploying models on that data, and managing the whole workflow process throughout the project. It is for those who are comfortable with Apache Spark, as it is 100% based on Spark and is extensible with support for Scala, Java, R, and Python alongside Spark SQL, GraphX, Streaming and the Machine Learning Library (MLlib). It has built-in integration with many data sources, has a workflow scheduler, allows for real-time workspace collaboration, and has performance improvements over traditional Apache Spark.
The Data Operating System: Changing the Digital Trajectory of Healthcare (Health Catalyst)
In 1989, John Reed, the CEO of Citibank and an early pioneer of ATMs, said, “I can see a future in which the data and information that is exchanged in our transactions are worth more than the transactions themselves.” We are at an interesting digital nexus in healthcare. Few of us would argue against the notion that data and digital health will play a bigger and bigger role in the future. But are we on the right track to deliver on that future? It required $30B in federal incentive money to subsidize the uptake of Electronic Health Records (EHRs). You could argue that the federal incentives stimulated the first major step towards the digitization of health, but few physicians would celebrate its value in comparison to its expense. As the healthcare market consolidates through mergers and acquisitions (M&A), patching disparate EHRs and other information systems together becomes even more important, and challenging. An organization is not integrated until its data is integrated, but costly forklift replacements of these transaction information systems and consolidation onto a single EHR solution are not financially viable.
Enterprise data literacy. A worthy objective? Certainly! A realistic goal? That remains to be seen. As companies consider investing in data literacy education, questions arise about its value and purpose. While the destination – having a data-fluent workforce – is attractive, we wonder how (and if) we can get there.
Kicking off this webinar series, we begin with a panel discussion to explore the landscape of literacy, including expert positions and results from focus groups:
- why it matters,
- what it means,
- what gets in the way,
- who needs it (and how much they need),
- what companies believe it will accomplish.
In this engaging discussion about literacy, we will set the stage for future webinars to answer specific questions and feature successful literacy efforts.
QuerySurge - the automated Data Testing solution (RTTS)
QuerySurge is the leading Data Testing solution built specifically to automate the testing of Data Warehouses & Big Data. QuerySurge ensures that the data extracted from data sources remains intact in the target data store by analyzing and pinpointing any differences quickly.
QuerySurge makes it easy for both novice and experienced team members to validate their organization's data quickly through Query Wizards, while still allowing power users the flexibility they need.
All this comes with deep-dive reporting and data health dashboards that quickly provide a holistic view of your project's data.
Types of Automated Data Testing
--------------------------------------------
QuerySurge provides data testing solutions for all of your automated data testing needs:
- Data Warehouse testing & ETL testing
- Big Data (Hadoop, NoSQL) testing
- Data Interface testing
- Data Migration testing
- Database Upgrade testing
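QuerySurge's internals are proprietary, but the core pattern behind all of these testing types is the same: run comparable queries against the source and the target, then diff the result sets. A minimal sketch of that pattern, using two in-memory SQLite databases with invented table names and data:

```python
import sqlite3

def fetch(conn, sql):
    """Run a query and return its rows as a set, for order-independent comparison."""
    return set(conn.execute(sql).fetchall())

# Invented example: a source system and a target warehouse after an ETL load.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE orders (id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
""")
target = sqlite3.connect(":memory:")
target.executescript("""
    CREATE TABLE dw_orders (id INTEGER, amount REAL);
    INSERT INTO dw_orders VALUES (1, 10.0), (2, 99.0);  -- row 3 lost, row 2 corrupted
""")

src_rows = fetch(source, "SELECT id, amount FROM orders")
tgt_rows = fetch(target, "SELECT id, amount FROM dw_orders")

# Set differences pinpoint exactly which rows the load lost, changed, or invented.
missing_in_target = src_rows - tgt_rows
unexpected_in_target = tgt_rows - src_rows

print(sorted(missing_in_target))     # [(2, 20.0), (3, 30.0)]
print(sorted(unexpected_in_target))  # [(2, 99.0)]
```

A dedicated tool adds query wizards, scheduling, scale-out execution and reporting on top, but the source-minus-target comparison is the heart of it.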
Free trial: www.QuerySurge.com
DataEd Webinar: Reference & Master Data Management - Unlocking Business Value (DATAVERSITY)
Data tends to pile up and can be rendered unusable or obsolete without careful maintenance processes. Reference and Master Data Management (MDM) has been a popular Data Management approach to effectively gain mastery over not just the data but the supporting architecture for processing it. This webinar presents MDM as a strategic approach to improving and formalizing practices around those data items that provide context for many organizational transactions—its master data. Too often, MDM has been implemented technology-first and achieved the same very poor track record (one-third succeeding on-time, within budget, and achieving planned functionality). MDM success depends on a coordinated approach typically involving Data Governance and Data Quality activities.
Learning Objectives:
- Understand foundational reference and MDM concepts based on the Data Management Body of Knowledge (DMBOK)
- Understand why these are an important component of your Data Architecture
- Gain awareness of Reference and MDM Frameworks and building blocks
- Know what MDM guiding principles consist of and best practices
- Know how to utilize reference and MDM in support of business strategy
Good data is like good water: best served fresh, and ideally well-filtered. Data Management strategies can produce tremendous procedural improvements and increased profit margins across the board, but only if the data being managed is of high quality. Organizations must understand what it means to utilize Data Quality engineering in support of business strategy. This webinar will illustrate how organizations with chronic business challenges can often trace the root of the problem to poor Data Quality. Showing how Data Quality should be engineered provides a useful framework in which to develop an effective approach. This, in turn, allows organizations to more quickly identify business problems, distinguish data problems caused by structural issues from practice-oriented defects, and prevent these issues from recurring.
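As a purely illustrative sketch of what "engineering" Data Quality can mean in practice, quality expectations can be written as executable rules and run over records, separating structural defects (wrong shape or type) from practice-oriented defects (valid shape, bad values). The field names and rules below are invented for this example:

```python
# Toy Data Quality rule engine; field names and rules are invented for illustration.
STRUCTURAL_RULES = {
    "has_email_field": lambda rec: "email" in rec,
    "amount_is_number": lambda rec: isinstance(rec.get("amount"), (int, float)),
}
PRACTICE_RULES = {
    "email_has_at_sign": lambda rec: "@" in str(rec.get("email", "")),
    "amount_non_negative": lambda rec: rec.get("amount", 0) >= 0,
}

def audit(records):
    """Count failures per rule; practice rules run only on structurally sound records."""
    report = {"structural": {}, "practice": {}}
    for rec in records:
        structurally_ok = True
        for name, rule in STRUCTURAL_RULES.items():
            if not rule(rec):
                structurally_ok = False
                report["structural"][name] = report["structural"].get(name, 0) + 1
        if not structurally_ok:
            continue  # no point checking values when the shape itself is wrong
        for name, rule in PRACTICE_RULES.items():
            if not rule(rec):
                report["practice"][name] = report["practice"].get(name, 0) + 1
    return report

records = [
    {"email": "a@example.com", "amount": 10},   # clean
    {"email": "not-an-email", "amount": -5},    # practice defects
    {"amount": "12.50"},                        # structural defects
]
report = audit(records)
print(report)
```

Splitting the report this way gives the delineation the webinar describes: structural failures point at Data Management defects, while practice failures point at how data is being entered and used.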
Learning objectives:
-Help you understand foundational Data Quality concepts for improving Data Quality at your organization
-Demonstrate how chronic business challenges for organizations are often rooted in poor Data Quality
-Share case studies illustrating the hallmarks and benefits of Data Quality success
Unified Big Data Processing with Apache Spark (QCON 2014) (Databricks)
While early big data systems, such as MapReduce, focused on batch processing, the demands on these systems have quickly grown. Users quickly needed to run (1) more interactive ad-hoc queries, (2) sophisticated multi-pass algorithms (e.g. machine learning), and (3) real-time stream processing. The result has been an explosion of specialized systems to tackle these new workloads. Unfortunately, this means more systems to learn, manage, and stitch together into pipelines. Spark is unique in taking a step back and trying to provide a *unified* post-MapReduce programming model that tackles all these workloads. By generalizing MapReduce to support fast data sharing and low-latency jobs, we achieve best-in-class performance in a variety of workloads, while providing a simple programming model that lets users easily and efficiently combine them.
Today, Spark is the most active open source project in big data, with high activity in both the core engine and a growing array of standard libraries built on top (e.g. machine learning, stream processing, SQL). I'm going to talk about the latest developments in Spark and show examples of how it can combine processing algorithms to build rich data pipelines in just a few lines of code.
Talk by Databricks CTO and Apache Spark creator Matei Zaharia at QCON San Francisco 2014.
Build Real-Time Applications with Databricks Streaming (Databricks)
In this presentation, we will study a use case we implemented recently, working with a large metropolitan fire department. Our company has already created a complete analytics architecture for the department based upon Azure Data Factory, Databricks, Delta Lake, Azure SQL and Azure SQL Server Analysis Services (SSAS). While this architecture works very well for the department, they would like to add a real-time channel to their reporting infrastructure.
This channel should serve up the following information:
• The most up-to-date locations and status of equipment (fire trucks, ambulances, ladders, etc.)
• The current locations and status of firefighters, EMT personnel and other relevant fire department employees
• The current list of active incidents within the city
The above information should be visualized through an automatically updating dashboard. The central component of the dashboard will be a map which automatically updates with the locations and incidents. This view should be as close to real time as possible and will be used by the fire chiefs to assist with real-time decision-making on resource and equipment deployments.
In this presentation, we will leverage Databricks, Spark Structured Streaming, Delta Lake and the Azure platform to create this real-time delivery channel.
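The Databricks and Delta Lake specifics are beyond a short example, but the aggregation the dashboard needs, keeping the last known location and status per unit as events stream in, is essentially a keyed upsert (last write wins). A plain-Python sketch with invented unit IDs and event fields; in the real architecture this role would be played by Spark Structured Streaming writing to a Delta table:

```python
# Toy last-known-state aggregation for a live map.
# In the described architecture, Structured Streaming + Delta Lake do this at scale.
def apply_events(state, events):
    """Upsert each event into per-unit state keyed by unit_id (last write wins)."""
    for evt in events:
        state[evt["unit_id"]] = {"location": evt["location"], "status": evt["status"]}
    return state

# A small batch of invented status events, in arrival order.
stream = [
    {"unit_id": "engine-7",    "location": (40.71, -74.00), "status": "en-route"},
    {"unit_id": "ambulance-2", "location": (40.73, -73.99), "status": "available"},
    {"unit_id": "engine-7",    "location": (40.72, -74.01), "status": "on-scene"},
]
live_state = apply_events({}, stream)
print(live_state["engine-7"]["status"])  # on-scene
```

Each micro-batch of events overwrites the per-unit row, so the map always renders the newest known position and status for every truck and crew.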
Data Lakehouse Symposium | Day 1 | Part 2 (Databricks)
The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse.
Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today.
Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow.
This is an educational event.
Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.
Design Guidelines for Data Mesh and Decentralized Data Organizations (Denodo)
Watch full webinar here: https://bit.ly/3Ek4gUb
In recent years, there has been a significant push towards decentralized data organizations where different domains are partially or fully responsible for exposing their own data for analytics.
Join us in this session with Daniel Tenreiro, Sales Engineer at Denodo, in which he will share important design guidelines and best practices that can be used to implement many decentralization principles, such as those defined by the popular data mesh paradigm, using the Denodo Platform, powered by data virtualization.
Watch On-Demand & Learn:
- An overview of the features of decentralized data organizations
- Implementation best practices using data virtualization
Discover the keynote by Helmut Reisinger, CEO of Orange Business Services, at the Gartner ITxpo.
How can you accelerate the convergence of OT and IT teams, systems and data to create new value with IoT- and AI-enabled business processes and products? This session will help you overcome data integration, analytics and connectivity challenges to combat new cybersecurity threats that come from linking your production systems and supply chains via the Internet.
This presentation reports on data governance best practices. Based on a definition of fundamental terms and the business rationale for data governance, a set of case studies from leading companies is presented. The content of this presentation is a result of the Competence Center Corporate Data Quality (CC CDQ) at the University of St. Gallen, Switzerland.
Governance, Risk and Compliance and you | CollabDays Bletchley Park 2022 (Nikki Chapple)
5 October 2022: CollabDays Bletchley Park 2022 - October edition | In-person event United Kingdom
Governance, Risk and Compliance and you – Microsoft Purview and beyond | Simon Hudson & Nikki Chapple
Governance, Risk and Compliance; it’s not nice to have, It’s The Law. Every organisation needs to pay attention to GRC, but not everyone has the tools, expertise or strategy. Microsoft Purview is a surprisingly capable tool in your organisation’s GRC tool bag when combined with a broad & competent approach. This session will provide:
– an overview of GRC obligations and approaches
– what’s in Purview
– pragmatic approaches to elevating your Compliance Score
– wider technical and business thinking for de-risking your operations and organisation
– thoughts on using the Maturity Model for Microsoft 365 GRC Competency to set your objectives.
This describes a conceptual model approach to designing an enterprise data fabric: the set of hardware and software infrastructure, tools and facilities to implement, administer, manage and operate data operations across the entire span of the data within the enterprise. It covers all data activities, including data acquisition, transformation, storage, distribution, integration, replication, availability, security, protection, disaster recovery, presentation, analytics, preservation, retention, backup, retrieval, archival, recall, deletion, monitoring and capacity planning, across all data storage platforms, enabling use by applications to meet the data needs of the enterprise.
The conceptual data fabric model represents a rich picture of the enterprise’s data context. It embodies an idealised and target data view.
Designing a data fabric enables the enterprise to respond to and take advantage of key related data trends:
• Internal and External Digital Expectations
• Cloud Offerings and Services
• Data Regulations
• Analytics Capabilities
It enables the IT function to demonstrate positive data leadership. It shows the IT function is able and willing to respond to business data needs. It allows the enterprise to meet data challenges:
• More and more data of many different types
• Increasingly distributed platform landscape
• Compliance and regulation
• Newer data technologies
• Shadow IT, where the IT function cannot deliver IT change and new data facilities quickly
It is concerned with designing an open and flexible data fabric that improves the responsiveness of the IT function and reduces shadow IT.
The Data Driven University - Automating Data Governance and Stewardship in Au...Pieter De Leenheer
Data Governance and Stewardship requires automation of business semantics management at its nucleus, in order to achieve data trust between business and IT communities in the organization. University divisions operate highly autonomously and decentralized, and are often geographically distributed. Hence, they benefit more from an collaborative and agile approach to Data Governance and Stewardship approach that adapts to its nature.
In this lecture, we start by reviewing 'C' in ICT and reflect on the dilemma: what is the most important quality of data being shared: truth or trust? We review the wide spectrum of business semantics. We visit the different phases of growing data pain as an organization expands, and we map each phase on this spectrum of semantics.
Next, we introduce our principles and framework for business semantics management to support Data Governance and Stewardship focusing on the structural (what), processual (how) and organizational (who) components. We illustrate with use cases from Stanford University, George Washington University and Public Science and Innovation Administrations.
Databricks is a Software-as-a-Service-like experience (or Spark-as-a-service) that is a tool for curating and processing massive amounts of data and developing, training and deploying models on that data, and managing the whole workflow process throughout the project. It is for those who are comfortable with Apache Spark as it is 100% based on Spark and is extensible with support for Scala, Java, R, and Python alongside Spark SQL, GraphX, Streaming and Machine Learning Library (Mllib). It has built-in integration with many data sources, has a workflow scheduler, allows for real-time workspace collaboration, and has performance improvements over traditional Apache Spark.
The Data Operating System: Changing the Digital Trajectory of HealthcareHealth Catalyst
In 1989, John Reed, the CEO of Citibank and the early pioneer for ATMs, said, “I can see a future in which the data and information that is exchanged in our transactions are worth more than the transactions themselves.” We are at an interesting digital nexus in healthcare. Few of us would argue against the notion that data and digital health will play a bigger and bigger role in the future. But, are we on the right track to deliver on that future? It required $30B in federal incentive money to subsidize the uptake of Electronic Health Records (EHRs). You could argue that the federal incentives stimulated the first major step towards the digitization of health, but few physicians would celebrate its value in comparison to its expense. As the healthcare market consolidates through mergers and acquisitions (M&A), patching disparate EHRs and other information systems together becomes even more important, and challenging. An organization is not integrated until its data is integrated, but costly forklift replacements of these transaction information systems and consolidating them with a single EHR solution is not a viable financial solution.
Enterprise data literacy. A worthy objective? Certainly! A realistic goal? That remains to be seen. As companies consider investing in data literacy education, questions arise about its value and purpose. While the destination – having a data-fluent workforce – is attractive, we wonder how (and if) we can get there.
Kicking off this webinar series, we begin with a panel discussion to explore the landscape of literacy, including expert positions and results from focus groups:
- why it matters,
- what it means,
- what gets in the way,
- who needs it (and how much they need),
- what companies believe it will accomplish.
In this engaging discussion about literacy, we will set the stage for future webinars to answer specific questions and feature successful literacy efforts.
QuerySurge - the automated Data Testing solutionRTTS
QuerySurge is the leading Data Testing solution built specifically to automate the testing of Data Warehouses & Big Data. QuerySurge ensures that the data extracted from data sources remains intact in the target data store by analyzing and pinpointing any differences quickly.
And QuerySurge makes it easy for both novice and experienced team members to validate their organization's data quickly through Query Wizards while still allowing power users the flexibility they need.
All with deep-dive reporting and data health dashboards that quickly provide you with a holistic view of your project’s data.
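The source-to-target comparison this kind of data testing automates can be illustrated with a minimal, tool-agnostic sketch. This is plain Python and SQLite, not QuerySurge itself (which runs against real warehouses and big data stores), and the table and column names are hypothetical:

```python
import sqlite3

# Toy "source" and "target" stores; in practice these would be
# separate systems (e.g., an OLTP database and a data warehouse).
src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")

src.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
tgt.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
src.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Ada"), (2, "Grace"), (3, "Edsger")])
tgt.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Ada"), (2, "Grace")])          # row 3 lost in the ETL

def rows(conn):
    # Set comparison pinpoints exactly which rows differ, not just counts.
    return set(conn.execute("SELECT id, name FROM customers"))

missing_in_target = rows(src) - rows(tgt)
extra_in_target = rows(tgt) - rows(src)

print("missing in target:", missing_in_target)  # {(3, 'Edsger')}
print("extra in target:", extra_in_target)      # set()
```

A real tool generalizes this with wizards, scheduling, and reporting, but the core check is the same: extract comparable row sets from both sides and diff them.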
Types of Automated Data Testing
--------------------------------------------
QuerySurge provides data testing solutions for all of your automated data testing needs:
- Data Warehouse testing & ETL testing
- Big Data (Hadoop, NoSQL) testing
- Data Interface testing
- Data Migration testing
- Database Upgrade testing
FREE TRIAL
www.QuerySurge.com
DataEd Webinar: Reference & Master Data Management - Unlocking Business ValueDATAVERSITY
Data tends to pile up and can be rendered unusable or obsolete without careful maintenance processes. Reference and Master Data Management (MDM) has been a popular Data Management approach to effectively gain mastery over not just the data but the supporting architecture for processing it. This webinar presents MDM as a strategic approach to improving and formalizing practices around those data items that provide context for many organizational transactions—its master data. Too often, MDM has been implemented technology-first and achieved the same very poor track record (one-third succeeding on-time, within budget, and achieving planned functionality). MDM success depends on a coordinated approach typically involving Data Governance and Data Quality activities.
Learning Objectives:
- Understand foundational reference and MDM concepts based on the Data Management Body of Knowledge (DMBOK)
- Understand why these are an important component of your Data Architecture
- Gain awareness of Reference and MDM Frameworks and building blocks
- Know what MDM guiding principles consist of and best practices
- Know how to utilize reference and MDM in support of business strategy
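One classic MDM building block the objectives above touch on is "golden record" consolidation: merging conflicting records for the same entity from several systems under an explicit survivorship rule. A minimal sketch, with hypothetical source names and a simple newest-non-null-value-wins rule (real MDM platforms support far richer rules):

```python
# Toy golden-record survivorship: merge conflicting customer records
# from several systems, preferring the most recently updated value per field.
records = [
    {"source": "crm",     "updated": 2021, "name": "Acme Corp.", "phone": None},
    {"source": "billing", "updated": 2023, "name": "ACME Corp",  "phone": "555-0100"},
    {"source": "support", "updated": 2022, "name": "Acme",       "phone": "555-0199"},
]

def golden_record(records, fields):
    merged = {}
    for field in fields:
        # Newest non-null value wins (a common, deliberately simple rule).
        candidates = [r for r in records if r.get(field) is not None]
        candidates.sort(key=lambda r: r["updated"], reverse=True)
        merged[field] = candidates[0][field] if candidates else None
    return merged

master = golden_record(records, ["name", "phone"])
print(master)  # {'name': 'ACME Corp', 'phone': '555-0100'}
```

The governance dimension the webinar stresses shows up precisely here: someone accountable for the domain must decide and maintain the survivorship rule, which is why MDM fails when attempted technology-first.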
Good data is like good water: best served fresh, and ideally well-filtered. Data Management strategies can produce tremendous procedural improvements and increased profit margins across the board, but only if the data being managed is of high quality. Determining how Data Quality should be engineered provides a useful framework for applying Data Quality management effectively in support of business strategy. This, in turn, allows organizations to quickly identify business problems, distinguish structural defects from practice-oriented defects in Data Management, and proactively prevent future issues. This webinar will illustrate how organizations with chronic business challenges can often trace the root of the problem to poor Data Quality, and what it means to utilize Data Quality engineering in support of business strategy.
Learning objectives:
-Help you understand foundational Data Quality concepts for improving Data Quality at your organization
-Demonstrate how chronic business challenges for organizations are often rooted in poor Data Quality
-Share case studies illustrating the hallmarks and benefits of Data Quality success
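As a concrete illustration of "engineering" quality rather than inspecting it ad hoc, Data Quality rules can be expressed as executable checks that run against every data delivery. A minimal sketch in plain Python; the field names and rules are hypothetical, and real deployments would attach these checks to pipelines and dashboards:

```python
# Each rule returns the offending records, so failures are
# actionable rather than a single pass/fail flag.
records = [
    {"customer_id": 1, "email": "ada@example.com", "country": "UK"},
    {"customer_id": 2, "email": None,              "country": "US"},
    {"customer_id": 2, "email": "gh@example.com",  "country": "US"},  # duplicate id
]

def check_not_null(rows, field):
    return [r for r in rows if r[field] is None]

def check_unique(rows, field):
    seen, dupes = set(), []
    for r in rows:
        if r[field] in seen:
            dupes.append(r)
        seen.add(r[field])
    return dupes

violations = {
    "email not null": check_not_null(records, "email"),
    "customer_id unique": check_unique(records, "customer_id"),
}
for rule, bad in violations.items():
    print(rule, "->", len(bad), "violation(s)")
```

Rules like the uniqueness check tend to surface structural defects (bad keys, broken joins), while null-rate rules often expose practice-oriented defects (fields skipped at data entry), which maps onto the structural-versus-practice distinction the webinar draws.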
Unified Big Data Processing with Apache Spark (QCON 2014)Databricks
While early big data systems, such as MapReduce, focused on batch processing, the demands on these systems have quickly grown. Users quickly needed to run (1) more interactive ad-hoc queries, (2) sophisticated multi-pass algorithms (e.g. machine learning), and (3) real-time stream processing. The result has been an explosion of specialized systems to tackle these new workloads. Unfortunately, this means more systems to learn, manage, and stitch together into pipelines. Spark is unique in taking a step back and trying to provide a *unified* post-MapReduce programming model that tackles all these workloads. By generalizing MapReduce to support fast data sharing and low-latency jobs, we achieve best-in-class performance in a variety of workloads, while providing a simple programming model that lets users easily and efficiently combine them.
Today, Spark is the most active open source project in big data, with high activity in both the core engine and a growing array of standard libraries built on top (e.g. machine learning, stream processing, SQL). I'm going to talk about the latest developments in Spark and show examples of how it can combine processing algorithms to build rich data pipelines in just a few lines of code.
Talk by Databricks CTO and Apache Spark creator Matei Zaharia at QCON San Francisco 2014.
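The "fast data sharing" idea behind Spark's unified model can be sketched in miniature: once a dataset is computed and cached, batch aggregation, interactive ad-hoc queries, and multi-pass algorithms all reuse the same in-memory collection instead of re-reading storage. The following is a toy, single-machine analogue in plain Python, not Spark's actual API, though the operation names mirror the RDD interface:

```python
from functools import reduce

class ToyRDD:
    """A tiny stand-in for a cached, distributed dataset."""
    def __init__(self, data):
        self._data = list(data)   # "cached" in memory once

    def map(self, f):
        return ToyRDD(f(x) for x in self._data)

    def filter(self, p):
        return ToyRDD(x for x in self._data if p(x))

    def reduce(self, f):
        return reduce(f, self._data)

    def count(self):
        return len(self._data)

# One cached dataset serves several workload styles:
events = ToyRDD(range(1, 101))

# "Batch" aggregation.
total = events.reduce(lambda a, b: a + b)           # 5050

# "Interactive" ad-hoc query.
big = events.filter(lambda x: x > 90).count()       # 10

# "Multi-pass" computation (two passes over the cached data,
# as an iterative ML algorithm would make many).
mean = total / events.count()
var = events.map(lambda x: (x - mean) ** 2).reduce(lambda a, b: a + b) / events.count()
```

In real Spark the same generalization (plus lineage-based fault tolerance and a scheduler for low-latency jobs) is what lets SQL, streaming, and MLlib share one engine and one cached dataset.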
Build Real-Time Applications with Databricks StreamingDatabricks
In this presentation, we will study a use case we implemented recently. In this use case we are working with a large, metropolitan fire department. Our company has already created a complete analytics architecture for the department based on Azure Data Factory, Databricks, Delta Lake, Azure SQL and SQL Server Analysis Services (SSAS). While this architecture works very well for the department, they would like to add a real-time channel to their reporting infrastructure.
This channel should serve up the following information:
• The most up-to-date locations and status of equipment (fire trucks, ambulances, ladders, etc.)
• The current locations and status of firefighters, EMT personnel and other relevant fire department employees
• The current list of active incidents within the city
The above information should be visualized through an automatically updating dashboard. The central component of the dashboard will be a map that automatically updates with the locations and incidents. This view should be as close to real time as possible and will be used by the fire chiefs to assist with real-time decision-making on resource and equipment deployments.
In this presentation, we will leverage Databricks, Spark Structured Streaming, Delta Lake and the Azure platform to create this real-time delivery channel.
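At its core, the real-time channel described above is a streaming upsert: for each incoming event, keep only the latest known location and status per unit, and let the dashboard render that state. A minimal, framework-free sketch of the logic (the event fields are hypothetical; the talk's actual implementation uses Spark Structured Streaming and Delta Lake):

```python
# Maintain the most recent state per vehicle/person as events arrive;
# this is what the dashboard's map would render on each refresh.
latest = {}  # unit_id -> most recent event

def on_event(event):
    unit = event["unit_id"]
    # Events can arrive out of order from the field; keep the newest.
    if unit not in latest or event["ts"] >= latest[unit]["ts"]:
        latest[unit] = event

stream = [
    {"unit_id": "engine-7", "ts": 100, "status": "en-route",  "lat": 40.71, "lon": -74.00},
    {"unit_id": "medic-3",  "ts": 101, "status": "available", "lat": 40.73, "lon": -73.99},
    {"unit_id": "engine-7", "ts": 105, "status": "on-scene",  "lat": 40.72, "lon": -74.01},
    {"unit_id": "engine-7", "ts": 103, "status": "en-route",  "lat": 40.71, "lon": -74.00},  # late event
]
for e in stream:
    on_event(e)

print(latest["engine-7"]["status"])  # on-scene
```

A streaming engine adds what this sketch omits: distributed state, checkpointed fault tolerance, and watermarks for bounding how late an event may arrive.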
Data Lakehouse Symposium | Day 1 | Part 2Databricks
The world of data architecture began with applications. Next came data warehouses. Then text was organized into a data warehouse.
Then one day the world discovered a whole new kind of data that was being generated by organizations. The world found that machines generated data that could be transformed into valuable insights. This was the origin of what is today called the data lakehouse. The evolution of data architecture continues today.
Come listen to industry experts describe this transformation of ordinary data into a data architecture that is invaluable to business. Simply put, organizations that take data architecture seriously are going to be at the forefront of business tomorrow.
This is an educational event.
Several of the authors of the book Building the Data Lakehouse will be presenting at this symposium.
Design Guidelines for Data Mesh and Decentralized Data OrganizationsDenodo
Watch full webinar here: https://bit.ly/3Ek4gUb
In recent years, there has been a significant push towards decentralized data organizations where different domains are partially or fully responsible for exposing their own data for analytics.
Join us in this session with Daniel Tenreiro, Sales Engineer at Denodo, in which he will share important design guidelines and best practices that can be used to implement many of the decentralization principles, such as the ones defined by the popular data mesh paradigm, using the Denodo Platform, powered by data virtualization.
Watch On-Demand & Learn:
- Overview of decentralized data organizations features
- Implementation best practices using data virtualization
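The core idea of data virtualization, a single query layer over physically separate sources with no data copied into a central store, can be illustrated with SQLite's ATTACH, which lets one connection join across independent database files. This is only a toy analogue; the Denodo Platform federates heterogeneous enterprise sources, and the table names here are invented:

```python
import os
import sqlite3
import tempfile

# Two independent "domain" databases, each owned by a different team.
tmp = tempfile.mkdtemp()
sales_db = os.path.join(tmp, "sales.db")
crm_db = os.path.join(tmp, "crm.db")

with sqlite3.connect(sales_db) as c:
    c.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    c.executemany("INSERT INTO orders VALUES (?, ?)",
                  [(1, 120.0), (2, 80.0), (1, 50.0)])
with sqlite3.connect(crm_db) as c:
    c.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    c.executemany("INSERT INTO customers VALUES (?, ?)",
                  [(1, "Acme"), (2, "Globex")])

# The "virtual layer": one connection that federates both sources.
virtual = sqlite3.connect(sales_db)
virtual.execute(f"ATTACH DATABASE '{crm_db}' AS crm")
result = virtual.execute("""
    SELECT crm.customers.name, SUM(orders.amount)
    FROM orders JOIN crm.customers ON orders.customer_id = crm.customers.id
    GROUP BY crm.customers.name ORDER BY crm.customers.name
""").fetchall()
print(result)  # [('Acme', 170.0), ('Globex', 80.0)]
```

In a decentralized or data mesh setup, each domain keeps ownership of its database while the virtual layer exposes governed, combined views, which is exactly the decoupling the design guidelines above aim for.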
Discover here the keynote at the Gartner ITxpo of Helmut Reisinger, CEO at Orange Business Services.
How can you accelerate the convergence of OT and IT teams, systems and data to create new value with IoT- and AI-enabled business processes and products? This session will help you overcome data integration, analytics and connectivity challenges to combat new cybersecurity threats that come from linking your production systems and supply chains via the Internet.
This presentation reports on data governance best practices. Based on a definition of fundamental terms and the business rationale for data governance, a set of case studies from leading companies is presented. The content of this presentation is a result of the Competence Center Corporate Data Quality (CC CDQ) at the University of St. Gallen, Switzerland.
Governance, Risk and Compliance and you | CollabDays Bletchley Park 2022Nikki Chapple
5 October 2022: CollabDays Bletchley Park 2022 - October edition | In-person event United Kingdom
Governance, Risk and Compliance and you – Microsoft Purview and beyond | Simon Hudson & Nikki Chapple
Governance, Risk and Compliance: it’s not a nice-to-have, it’s the law. Every organisation needs to pay attention to GRC, but not everyone has the tools, expertise or strategy. Microsoft Purview is a surprisingly capable tool in your organisation’s GRC tool bag when combined with a broad and competent approach. This session will provide:
- an overview of GRC obligations and approaches
- what’s in Purview
- pragmatic approaches to elevating your Compliance Score
- wider technical and business thinking for de-risking your operations and organisation
- thoughts on using the Maturity Model for Microsoft 365 GRC Competency to set your objectives
This describes a conceptual model approach to designing an enterprise data fabric: the set of hardware and software infrastructure, tools and facilities used to implement, administer, manage and operate data operations across the entire span of data within the enterprise. It covers all data activities, including data acquisition, transformation, storage, distribution, integration, replication, availability, security, protection, disaster recovery, presentation, analytics, preservation, retention, backup, retrieval, archival, recall, deletion, monitoring and capacity planning, across all data storage platforms, enabling use by applications to meet the data needs of the enterprise.
The conceptual data fabric model represents a rich picture of the enterprise’s data context. It embodies an idealised and target data view.
Designing a data fabric enables the enterprise to respond to and take advantage of key related data trends:
• Internal and External Digital Expectations
• Cloud Offerings and Services
• Data Regulations
• Analytics Capabilities
It enables the IT function to demonstrate positive data leadership. It shows the IT function is able and willing to respond to business data needs. It allows the enterprise to meet data challenges such as:
• More and more data of many different types
• Increasingly distributed platform landscape
• Compliance and regulation
• Newer data technologies
• Shadow IT where the IT function cannot deliver IT change and new data facilities quickly
It is concerned with the design of an open and flexible data fabric that improves the responsiveness of the IT function and reduces shadow IT.
Webinar presented live on August 11, 2017
Today, the majority of big data and analytics use cases are built on hybrid cloud infrastructure. A hybrid cloud is a combination of on-premises and local cloud resources integrated with one or more dedicated cloud(s) and one or more public cloud(s). Hybrid cloud computing has matured to support data security and privacy requirements as well as increased scalability and computational power needed for big data and analytics solutions.
This webinar summarizes what hybrid cloud is, explains why it is important in the context of big data and analytics, and discusses implementation considerations unique to hybrid cloud computing.
The presentation draws from the CSCC's deliverable, Hybrid Cloud Considerations for Big Data and Analytics:
http://www.cloud-council.org/deliverables/hybrid-cloud-considerations-for-big-data-and-analytics.htm
Download the presentation deck here:
http://www.cloud-council.org/webinars/hybrid-cloud-considerations-for-big-data-and-analytics.htm
Denodo Partner Connect: A Review of the Top 5 Differentiated Use Cases for th...Denodo
Watch full webinar here: https://buff.ly/46pRfV7
This Denodo session explores the power of data virtualization, shedding light on its architecture, customer value, and a diverse range of use cases. Attendees will discover how the Denodo Platform enables seamless connectivity to various data sources while effortlessly combining, cleansing, and delivering data through 5 differentiated use cases.
Architecture: Delve into the core architecture of the Denodo Platform and learn how it empowers organizations to create a unified virtual data layer. Understand how data is accessed, integrated, and delivered in a real-time, agile manner.
Value for the Customer: Explore the tangible benefits that Denodo offers to its customers. From cost savings to improved decision-making, discover how the Denodo Platform helps organizations derive maximum value from their data assets.
Five Different Use Cases: Uncover five real-world use cases where Denodo's data virtualization platform has made a significant impact. From data governance to analytics, Denodo proves its versatility across a variety of domains.
- Logical Data Fabric
- Self Service Analytics
- Data Governance
- 360-Degree View of Entities
- Hybrid/Multi-Cloud Integration
Watch this illuminating session to gain insights into the transformative capabilities of the Denodo Platform.
stackconf 2022: Scaling the Grail – Cloud-Native Computing on Encrypted Data ...NETWAYS
Computing on Encrypted Data (CoED) is considered a holy grail of data security. A major roadblock for the adoption of CoEDs is a lack of integration with cloud technologies to enable scalable, resilient, and easy to operate deployments. The Carbyne Stack open-source project has set out to close this gap. This talk will take the audience down the rabbit hole of CoED technologies and explain how Carbyne Stack blends cloud-native technologies to solve the challenges of scaling sensitive workloads.
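One family of CoED techniques the MPC-oriented stacks in this space build on is secret sharing, where each party holds a random-looking share of a value and computation happens on shares, so no single party ever sees the plaintext. A toy, single-process sketch of additive secret sharing (illustrative only; this is not the Carbyne Stack API, and production systems add malicious-security protocols on top):

```python
import random

# Toy additive secret sharing over a prime field: each party holds a
# random-looking share; only the sum of all shares reveals the value.
P = 2**61 - 1  # a Mersenne prime as the field modulus

def share(secret, n_parties=3):
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)  # last share fixes the total
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Two inputs from different data owners:
a_shares = share(42)
b_shares = share(100)

# Each party adds its own two shares locally; no party ever sees 42 or 100.
sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]
print(reconstruct(sum_shares))  # 142
```

Addition is "free" in this scheme because it is purely local; multiplication requires interaction between parties, and orchestrating that interaction at scale is precisely the cloud-native deployment problem the talk addresses.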
Industrial Data Space Association - New Members, New Insights, New Future Dir...Thorsten Huelsmann
Digitalisation is both an enabler and a driving force behind innovative business models. A key ability for innovative business models is to be able to combine data in one “ecosystem”: Services are decoupled from physical platforms/products, The architecture levels are decoupled, Products become platforms and vice versa, “Ecosystems“ develop around platforms, Innovation takes place cooperatively.
Data as strategic resource enables smart services, products and our desired lifestyle of the future.
Proposte ORACLE per la modernizzazione dello sviluppo applicativoJürgen Ambrosi
Topics covered in the session:
• the goals of the Oracle / CRUI collaboration; overview of the proposed solutions
• the evolution of the Oracle offering, on-premises and in the Cloud
• AgID CSP certification and the Cloud pricing model
• solutions for modernizing application development (products, services and training)
• the “multi-model” database (relational, non-relational / JSON, REST): what’s new in Oracle DB
• rapid development of APIs and “digital” UIs on Oracle DB: what’s new in APEX 18.2
• “polyglot” development on Docker and Kubernetes, with continuous integration and deployment
• enriching applications with advanced “in-database” analytics features
• technology and frameworks for meeting basic GDPR requirements
• federated identity management (SPID, social login)
• Survey
• Q&A
Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...Jürgen Ambrosi
Agenda
the goals of the Oracle / CRUI collaboration; overview of the proposed solutions
the evolution of the Oracle offering, on-premises and in the Cloud
AgID CSP certification and the Cloud pricing model
solutions for “digital” communication (products, services and training)
collaborative authoring and management of digital content; integration with productivity tools such as Office365 and Google
rapid, self-service development of microsites and APIs for digital front ends
digital assistants
solutions for scientific research and technological innovation
the Oracle Cloud for HPC
on-premises and Cloud solutions for Big Data and Data Science / Deep Learning
Cloud solutions for IoT and Blockchain
Survey
Q&A
Maximizing Oil and Gas (Data) Asset Utilization with a Logical Data Fabric (A...Denodo
Watch full webinar here: https://bit.ly/3g9PlQP
It is no news that Oil and Gas companies constantly face immense pressure to stay competitive, especially in the current climate, while striving to become data-driven at the heart of their processes in order to scale and gain greater operational efficiencies across the organization.
Hence, the need for a logical data layer to help Oil and Gas businesses move towards a unified secure and governed environment to optimize the potential of data assets across the enterprise efficiently and deliver real-time insights.
Tune in to this on-demand webinar where you will:
- Discover the role of data fabrics and Industry 4.0 in enabling smart fields
- Understand how to connect data assets and the associated value chain to high impact domain areas
- See examples of organizations accelerating time-to-value and reducing NPT
- Learn best practices for handling real-time/streaming/IoT data for analytical and operational use cases
Webinar presentation March 9, 2017
IT environments are now fundamentally hybrid in nature – devices, systems, and people are spread across the globe, and at the same time virtualized. Achieving integration across this ever changing environment, and doing so at the pace of modern digital initiatives, is a significant challenge.
This presentation introduces a hybrid integration reference architecture published by the Cloud Standards Customer Council. Learn best practices from leading-edge enterprises that are starting to leverage a hybrid integration platform to take advantage of best of breed cloud-based and on-premises integration approaches.
This webinar draws from the CSCC's deliverable, Cloud Customer Architecture for Hybrid Integration. Read it here: http://www.cloud-council.org/deliverables/cloud-customer-architecture-for-hybrid-integration.htm
Qualifications and Expertise
Cloud and infrastructure architecture, design and operations management with an emphasis in Enterprise Systems and Cloud Management Systems. Extensive experience in operations management, DevOps, systems design and testing, technical writing, and customer communications. Over 20 years’ experience providing IT solutions to customers in commercial and federal markets.
Cloud computing is a growing field in computer science, and this presentation can help beginners understand it. It covers PaaS, IaaS, SaaS and other cloud computing concepts, and also includes a video on cloud computing.
Overcoming Data Gravity in Multi-Cloud Enterprise ArchitecturesVMware Tanzu
Enterprise architectures never sleep because cloud-first strategies must also become multi-cloud-first strategies. Public cloud providers such as Microsoft Azure are providing compelling services and pricing. And, most enterprises now consider their own datacenter a private cloud.
This is not a one-cloud playing field and enterprise architects must develop strategies, standards, and policies about how their data is being used, moved, and created across multiple cloud infrastructures.
Join Pivotal’s Jag Mirani and Mike Stolz along with guest, Forrester Vice President and Principal Analyst, Mike Gualtieri, as they examine the trends driving multi-cloud adoption and more importantly how to architect technical solutions to make data free to roam among them safely.
Speakers:
Mike Gualtieri, VP, PRINCIPAL ANALYST, Forrester
Jag Mirani, Product Marketing, Data Services, Pivotal
Mike Stolz, Product Lead, GemFire, Pivotal
The cloud has come a long way since it was first introduced as a computing utility, being paid for only when and in the amount it was used.
While the cloud's future is wide open, with the variety of workload types growing with no end in sight, the hybrid cloud is going to be the dominant option over the next couple of years.
This white paper discusses why open source is going to be a key component of cloud computing as a gateway to innovation.
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...actualtechmedia
More and more companies are leveraging the cloud for disaster recovery. After all, the limitless compute resources of the cloud are perfectly suited for disaster recovery. Learn how to easily leverage the cloud for DR.
Redefining the Cloud with AI – State & Use Cases | SoftCloudsSoftClouds LLC
AI made significant headlines in 2023 and is expected to see an annual growth rate of 37.3% from 2023 to 2030. That said, AI is poised to be truly transformative, and most enterprises are just starting to explore how they will apply this technology. AI has tremendous potential to revolutionize how we live and work.
While AI has been the hype, cloud and cloudification have matured over the years. The cloud has evolved from being a technology enabler to a business disruptor. As this transition is happening and with AI starting to revolutionize businesses, IT leaders must ensure they understand their organization’s business strategy and seek opportunities to leverage new and emerging cloud capabilities with AI to accelerate that strategy.
This PPT will cover the following topics:
- Cloud & Cloudification in 2023
- Current State of AI with Futuristic Use Cases
- CX Platforms (Oracle, Salesforce), Cloud, & AI
- AI for IT/Software Development – CI/CD, Migration
- Ensuring a Successful Cloud + AI Journey
For more information,
please contact: info-at-softclouds.com
From the Network to Multi-Cloud: How to Chart an Integrated StrategyXO Communications
This presentation served as a basis for the November 2013 webinar featuring David Linthicum, cloud technology expert, and Sam Koetter, Sr. Product Manager, Ethernet Services, XO Communications. The speakers discussed the emerging patterns of multi-clouds and their applications within the enterprise. They also looked at the importance of the network in support of cloud services, and why selecting the right network infrastructure is as important as selecting the right cloud providers.
Topics explored include:
• The emerging use of multi-cloud solutions and the changing network requirements around this movement
• How to define your network strategy with a cloud strategy in mind. A stepwise approach that most enterprises should follow
• How to select a strategic network partner around your multi-cloud services. What you should look for to be successful the first time
• How to create a master implementation plan and budget. A strategy to make sure both cloud and network resources will be there to support the core business.
Find out -- from cloud industry insiders -- how to navigate the confluence of network and multi-cloud solutions.
Find out more about XO's network solutions: http://bit.ly/1g6QYLr.
View the entire webinar replay on the XO Communications YouTube channel: http://youtu.be/PaGkYmFuq6k.
Similar to [DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh - The Evolution of redmesh
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdfDataScienceConferenc1
In this talk, I'll journey from my time as a Research Assistant at the Bernoulli Institute, delving into the classification of neurodegenerative diseases, to my encounters with groundbreaking biotechnology and AI companies like Proteinea, AlProtein, Rology, and Natrify in Egypt. These innovative ventures are reshaping industries from their Egyptian hub. Join me as I illuminate the transformative power of this thriving ecosystem, showcasing Egypt's remarkable strides in biotech and AI on the global stage.
Building a big-scale data product doesn't rely only on sophisticated modeling. It also requires an agile methodology, an iterative research & development process, a versatile big data stack, and a value-oriented mindset. I'll discuss how we, at Dsquares, build a big-scale AI product that leverages clients' data from different industries to deliver business-critical value to the end customer. I'll cover the process of product discovery, R&D tasks for unsolved problems, and mapping business requirements into big data technical requirements.
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptxDataScienceConferenc1
Innovation thrives at the intersection of data and creativity. While brainstorming has traditionally fueled the generation of new ideas, leveraging data alongside creative techniques empowers organizations to develop more effective and impactful innovations.
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...DataScienceConferenc1
In today's fast-paced and competitive business environment, harnessing the power of data is essential for staying ahead. Building a data-driven culture within an organization is not just a strategic advantage, but a necessity for those who wish to thrive and innovate. In this insightful talk, our esteemed speaker, a Chief Data Scientist with a decade of experience in the financial services sector, will unravel the complexities of embedding data into the DNA of your organization. The speaker will explore the key tenets of establishing a data-centric mindset, the importance of executive support, and the need for enhancing data literacy across the company. Practical solutions and real-world examples will be provided, demonstrating how to overcome obstacles and successfully integrate a data-driven approach. Attendees will learn strategies for empowering every team member to use data effectively and how to leverage technology to facilitate this cultural shift. The session promises to be a guide for those looking to champion data within their organizations, offering actionable insights for transformation.
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdfDataScienceConferenc1
The use of Artificial Intelligence (AI) is rapidly transforming the recruitment landscape. This talk explores the various ways AI is being used in hiring, from candidate sourcing and screening to skills assessments and interview preparation. We'll discuss the benefits of AI, such as increased efficiency and reduced bias, but also address potential drawbacks like ethical considerations and the human touch.
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...DataScienceConferenc1
In today's business landscape, data strategy plays a pivotal role in driving innovation within business models. This talk explores how organizations can leverage data effectively to transform their operations, products, and services.
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...DataScienceConferenc1
Delve into the unexplored potential of scene graphs in the realms of Generative AI and innovative data product development. This session unveils the intricate role of scene graphs in generating realistic content and driving advancements in computer vision, and automated content creation. Join us for a journey into the intersection of scene graphs and cutting-edge AI, gaining insights into their pivotal role in reshaping the landscape of data-centric innovation. This talk is your gateway to understanding how structured visual representations are shaping the future of AI and revolutionizing the creation of data-driven solutions.
This presentation will delve into the transformative role of Artificial Intelligence in reshaping social media landscapes. We'll explore cutting-edge AI technologies that are integrating with social media platforms, altering how we interact, consume content, and perceive digital communities. The talk will also cast a visionary eye towards future trends, discussing potential impacts on user experience, content creation, digital marketing, and privacy concerns. Join us to uncover how AI is not just a tool but a game-changer in the evolving narrative of social media.
Supercharge your software development with Azure OpenAI Service! Azure cloud platform provides access to cutting-edge AI models for diverse tasks. Explore different models for generating content, translating languages, and even generating code. Leverage data grounding to fine-tune models for your specific needs. Discover how Azure OpenAI Service accelerates innovation and injects intelligence into your software creations.
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...DataScienceConferenc1
In this insightful talk, we'll embark on a journey from the origins of programming in 1883 and the conceptualization of AI in the 1950s, to the current explosion of AI applications reshaping our world. We'll unravel why AI has surged to prominence in the last decade, driven by unprecedented data generation and significant hardware advancements. With examples ranging from individual email filtering to complex supply chain optimizations, we'll explore AI's pervasive impact across various sectors including finance, manufacturing, healthcare, and media. The talk will address the challenges of AI implementation, such as the high cost of AI teams and the quest for universally applicable models, while highlighting the promising horizon of no-code AI platforms democratizing access. Furthermore, we'll delve into the ethical dimensions of AI, from biases to privacy concerns, and the pressing question of AI's potential to replace human roles. Lastly, we'll discuss the transformative potential of language models and generative AI, underscoring the importance of understanding and integrating AI into our lives and businesses for a future that's both scalable and sustainable.
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...DataScienceConferenc1
Transitioning to a career in data science requires careful planning and smart choices. In this session, I'll help you understand how to switch to data science. Using my own experiences and what I've learned from the industry, we'll break down the important steps for a successful transition. We'll cover everything from figuring out which skills you can carry over to learning the technical stuff and connecting with other professionals. By the end, you'll have the knowledge and tools you need to start your journey into data science, whether you're a seasoned professional looking for something new or just starting out in the field.
[DSC MENA 24] Ramy Agieb - Advancements in Artificial Intelligence for Cybers...
With the continuous growth of the digital environment, the risks in the online realm also increase. This calls for strong security measures to safeguard valuable information and essential systems. Artificial Intelligence (AI) has become a powerful weapon in the fight against cyber threats. This talk presents a thorough examination of the most recent algorithms and applications of artificial intelligence in the field of cybersecurity.
[DSC MENA 24] Sohaila Diab - Let's Talk Gen AI
What is Generative AI and how does it work? Could it eventually replace us? Let's delve deep into the heart of this groundbreaking technology and uncover the truths and myths surrounding Generative AI and how to make the most of it.
Background: The digital twin paradigm holds great promise for healthcare, most importantly for efficiently integrating many disparate healthcare data sources and servicing complex tasks like personalizing care, predicting health outcomes, and planning patient care, even though many technical and scientific challenges remain to be overcome.
Objective: As part of the QUALITOP project, we conducted a comprehensive analysis of diverse healthcare data, encompassing both prospective and retrospective datasets, along with an in-depth examination of the advanced analytical needs of medical institutions across five European Union countries. Through these endeavors, we systematically developed and refined a formal Personal Medical Digital Twin (PMDT) model, subjected to iterative validation by medical institutions to ensure its applicability, efficacy, and utility.
Findings: The PMDT is based on an interconnected set of expressive knowledge structures calibrated to capture an individual patient's psychosomatic, cognitive, biometric, and genetic information in one personal digital footprint, in a manner that allows medical professionals to run various models to predict an individual's health issues over time and intervene early with personalized preventive care.
Conclusion: At the forefront of digital transformation, the PMDT emerges as a pivotal entity at the convergence of Big Data and Artificial Intelligence. This paper introduces a PMDT environment that lays the foundation for comprehensive big data analytics, continuous monitoring, cognitive simulations, and AI techniques. By integrating stakeholders across the care continuum, including patients, this system enables the derivation of insights and facilitates informed decision-making for personalized preventive care.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision-making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
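The automated data validation idea in point 4 can be sketched minimally in Python. The column names, rules, and sample rows below are illustrative assumptions, not taken from any specific dataset; the point is simply that quality checks run at the source, before bad rows flow downstream.

```python
# Minimal sketch of an automated data-quality check at the source.
# Field names and rules below are illustrative only.

def validate_rows(rows, rules):
    """Apply per-field rules to each row; return (valid_rows, errors)."""
    valid, errors = [], []
    for i, row in enumerate(rows):
        problems = [f"{field}: {msg}"
                    for field, (check, msg) in rules.items()
                    if not check(row.get(field))]
        if problems:
            errors.append((i, problems))   # rejected with reasons
        else:
            valid.append(row)              # clean row passes through
    return valid, errors

rules = {
    "age":   (lambda v: isinstance(v, int) and 0 <= v <= 120, "out of range"),
    "email": (lambda v: isinstance(v, str) and "@" in v,      "not an email"),
}

rows = [{"age": 34, "email": "a@b.com"}, {"age": -5, "email": "oops"}]
valid, errors = validate_rows(rows, rules)
```

In practice such checks would be wired into the ingestion pipeline, with the rejection reasons feeding the lineage-tracking described in the same point.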
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Techniques to optimize the PageRank algorithm usually fall into two categories: reducing the work per iteration, and reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices that have already converged can save iteration time. Skipping in-identical vertices (those with the same in-links) avoids duplicate computations and thus can also reduce iteration time. Road networks often contain chains that can be short-circuited before the PageRank computation, since the final ranks of chain nodes are easy to calculate; this can reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order, which can reduce the iteration time and the number of iterations, and also enables multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
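The first of these work-reduction techniques, skipping vertices whose rank has already converged, can be sketched in plain Python. This is a simplified illustration, not the STICD algorithm itself; the damping factor, tolerance, and toy graph are assumed values, and skipping converged vertices is a heuristic that trades a little accuracy for less per-iteration work.

```python
# PageRank power iteration that stops recomputing vertices once their
# rank change falls below a tolerance (per-iteration work reduction).
# Graph: dict of vertex -> list of out-neighbours.

def pagerank_skip_converged(graph, damping=0.85, tol=1e-6, max_iter=100):
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    # Precompute in-neighbours so each vertex pulls rank from its sources.
    in_nbrs = {v: [] for v in graph}
    for u, outs in graph.items():
        for v in outs:
            in_nbrs[v].append(u)
    active = set(graph)  # vertices still being recomputed
    for _ in range(max_iter):
        new_rank = dict(rank)
        for v in active:
            s = sum(rank[u] / len(graph[u]) for u in in_nbrs[v] if graph[u])
            new_rank[v] = (1 - damping) / n + damping * s
        # Converged vertices drop out of the working set.
        active = {v for v in active if abs(new_rank[v] - rank[v]) >= tol}
        rank = new_rank
        if not active:
            break
    return rank

ranks = pagerank_skip_converged({"a": ["b"], "b": ["c"], "c": ["a"]})
```

On the symmetric three-vertex cycle every rank stays at 1/3, so the working set empties after one pass; on skewed graphs the set shrinks gradually instead.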
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Adjusting primitives for graph: SHORT REPORT / NOTES (Subhajit Sahu)
Graph algorithms, like PageRank, commonly operate on Compressed Sparse Row (CSR), an adjacency-list based graph representation.
Multiply with different modes (map)
1. Performance of sequential vs OpenMP-based vector multiply.
2. Comparison of various launch configs for CUDA-based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential vs OpenMP-based vector element sum.
2. Performance of memcpy-based vs in-place CUDA vector element sum.
3. Comparison of various launch configs for CUDA-based vector element sum (memcpy).
4. Comparison of various launch configs for CUDA-based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparison of various launch configs for CUDA-based vector element sum (in-place).
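The CSR layout these benchmark notes build on can be sketched minimally in Python. The vertex count and edge list below are illustrative; the two arrays (`offsets` and `targets`) mirror the standard CSR convention where the neighbours of vertex v sit in `targets[offsets[v]:offsets[v+1]]`.

```python
# Minimal sketch of the Compressed Sparse Row (CSR) adjacency layout:
# an offsets array plus one flat array of edge targets.

def build_csr(num_vertices, edges):
    """edges: list of (src, dst) pairs -> (offsets, targets) arrays."""
    degree = [0] * num_vertices
    for u, _ in edges:
        degree[u] += 1
    # Prefix-sum degrees: offsets[v+1] - offsets[v] == out-degree of v.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]
    # Scatter each edge into its vertex's slice.
    targets = [0] * len(edges)
    fill = offsets[:-1]  # slicing copies: next free slot per vertex
    for u, v in edges:
        targets[fill[u]] = v
        fill[u] += 1
    return offsets, targets

offsets, targets = build_csr(3, [(0, 1), (0, 2), (1, 2)])
# Neighbours of vertex 0 are targets[offsets[0]:offsets[1]]
```

Because both arrays are flat and contiguous, this layout maps directly onto the OpenMP and CUDA kernels the experiments above compare.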
[DSC Europe 23] Predrag Ilic & Simeon Rilling - From Data Lakes to Data Mesh - The Evolution of redmesh
1. From Data Lakes to Data Mesh
The evolution of redmesh
DSC Europe 23
Predrag Ilić – Cloud Tech Lead
Simeon Rilling – Product Owner Enterprise Data Mesh