Agile Big Data Analytics Development: An Architecture-Centric Approach (SoftServe)
Presented at The Hawaii International Conference on System Sciences by Hong-Mei Chen and Rick Kazman (University of Hawaii), Serge Haziyev (SoftServe).
The Modern Data Architecture for Predictive Analytics with Hortonworks and Re... (Revolution Analytics)
Hortonworks and Revolution Analytics have teamed up to bring the predictive analytics power of R to Hortonworks Data Platform.
Hadoop, a disruptive data processing framework, has made a large impact on today's data ecosystems. Enabling business users to translate existing skills to Hadoop is necessary to encourage adoption and allow businesses to get value out of their Hadoop investment quickly. R, a prolific and rapidly growing data analysis language, now has a place in the Hadoop ecosystem.
This presentation covers:
- Trends and business drivers for Hadoop
- How Hortonworks and Revolution Analytics play a role in the modern data architecture
- How you can run R natively in Hortonworks Data Platform to easily move your R-powered analytics to Hadoop
Presentation replay at:
http://www.revolutionanalytics.com/news-events/free-webinars/2013/modern-data-architecture-revolution-hortonworks/
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra... (Data Con LA)
Why and how has the Big Data-based Enterprise Data Lake solution, built on both NoSQL and SQL technologies, become significantly more effective at solving enterprise data challenges than its predecessor, the EDW, which tried and failed to solve the same problem using SQL databases alone?
Designing Fast Data Architecture for Big Data using Logical Data Warehouse a... (Denodo)
Companies such as Autodesk are fast replacing the once tried-and-true physical data warehouses with logical data warehouses/data lakes. Why? Because they are able to accomplish the same results in one-sixth of the time and with one-quarter of the resources.
In this webinar, Autodesk's Platform Lead, Kurt Jackson, will describe how they designed a modern fast data architecture as a single unified logical data warehouse/data lake using data virtualization and contemporary big data analytics technologies like Spark.
A logical data warehouse / data lake is a virtual abstraction layer over the physical data warehouse, big data repositories, cloud, and other enterprise applications. It unifies both structured and unstructured data in real time to power analytical and operational use cases.
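The abstraction-layer idea described above can be sketched in a few lines of plain Python: queries are answered by federating live sources at request time, with no physical copy. This is a minimal illustration only; the class, source names, and fields are all hypothetical, not the API of Denodo or any real product.

```python
# Minimal sketch of a "logical data warehouse": one virtual query layer
# federates several physical stores at request time (no data is copied).
class LogicalDataWarehouse:
    def __init__(self):
        self.sources = {}  # name -> callable returning rows (list of dicts)

    def register(self, name, fetch):
        self.sources[name] = fetch

    def query(self, field, value):
        # Federate: pull matching rows from every registered source live.
        results = []
        for name, fetch in self.sources.items():
            for row in fetch():
                if row.get(field) == value:
                    results.append({**row, "_source": name})
        return results

# Two simulated physical stores: a relational warehouse and a data lake.
warehouse_rows = [{"customer": "acme", "revenue": 100}]
lake_rows = [{"customer": "acme", "clickstream_events": 42}]

ldw = LogicalDataWarehouse()
ldw.register("edw", lambda: warehouse_rows)
ldw.register("lake", lambda: lake_rows)

# One logical query returns rows from both stores in a single unified view.
unified = ldw.query("customer", "acme")
print(len(unified))
```

The design point is that consumers see one query interface while each source keeps its own storage and freshness, which is what lets structured and unstructured data be combined in real time.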
This presentation will help you understand the basic building blocks of Business Intelligence. Learn how decisions are triggered, the complete decision process and who makes decisions in the corporate world.
More importantly, understand the core components of a Business Intelligence architecture, such as a data warehouse, data mining, OLAP (Online Analytical Processing), OLTP (Online Transaction Processing) and data reporting. Each component plays an integral part in enabling today's managers and decision makers to collect, analyze and interpret data and make it actionable for decision making.
Business intelligence has become an integral capability for ensuring business survival. It is a tool that helps you analyze historical data and forecast the future so that you are always one step ahead in your business.
Please feel free to like, share, and comment!
Data Lakehouse, Data Mesh, and Data Fabric (r1) (James Serra)
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. I’ll include use cases so you can see what approach will work best for your big data needs.
In this presentation at DAMA New York, Joe started by asking a key question: why are we doing this? Why analyze and share all these massive amounts of data? Basically, it comes down to the belief that in any organization, in any situation, if we can get the data and make it correct and timely, insights from it will become instantly actionable for companies to function more nimbly and successfully. Enabling the use of data can be a world-changing, world-improving activity, and this session presents the steps necessary to get you there. Joe explained the concept of the "data lake" and also emphasized the role of a strong data governance strategy that incorporates seven components needed for a successful program.
For more information on this presentation or Caserta Concepts, visit our website at http://casertaconcepts.com/.
Data Lakes are early in the Gartner hype cycle, but companies are getting value from their cloud-based data lake deployments. Break through the confusion between data lakes and data warehouses and seek out the most appropriate use cases for your big data lakes.
Big Data, IoT, data lake, unstructured data, Hadoop, cloud, and massively parallel processing (MPP) are all just fancy words unless you can find use cases for all this technology. Join me as I talk about the many use cases I have seen, from streaming data to advanced analytics, broken down by industry. I'll show you how all this technology fits together by discussing various architectures and the most common approaches to solving data problems, and hopefully set off light bulbs in your head on how big data can help your organization make better business decisions.
Creating a Next-Generation Big Data Architecture (Perficient, Inc.)
If you've spent time investigating Big Data, you quickly realize that the issues surrounding Big Data are often complex to analyze and solve. The sheer volume, velocity and variety change the way we think about data, including how enterprises approach data architecture.
Significant reduction in costs for processing, managing, and storing data, combined with the need for business agility and analytics, requires CIOs and enterprise architects to rethink their enterprise data architecture and develop a next-generation approach to solve the complexities of Big Data.
Creating the data architecture while integrating Big Data into the heart of the enterprise data architecture is a challenge. This webinar covered:
-Why Big Data capabilities must be strategically integrated into an enterprise’s data architecture
-How a next-generation architecture can be conceptualized
-The key components to a robust next generation architecture
-How to incrementally transition to a next generation data architecture
Data Summit Connect Fall 2020 - Rise of DataOps (Ryan Gross)
Data governance teams attempt to apply manual control at various points for consistency and quality of the data. By thinking of our machine learning data pipelines as compilers that convert data into executable functions and leveraging data version control, data governance and engineering teams can engineer the data together, filing bugs against data versions, applying quality control checks to the data compilers, and other activities. This talk illustrates how innovations are poised to drive process and cultural changes to data governance, leading to order-of-magnitude improvements.
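The "pipelines as compilers" idea above can be made concrete with a small sketch: a versioned dataset is "compiled" through quality checks, and any failure is recorded against that data version, much like a bug filed against a code commit. Everything here (function names, check names, record shapes) is an illustrative assumption, not the speaker's actual tooling.

```python
import hashlib
import json

def version_id(rows):
    # Content-addressed data version: identical data always gets the same id,
    # so bugs and check results can be filed against a stable version.
    payload = json.dumps(rows, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]

def check_no_nulls(rows, field):
    return all(row.get(field) is not None for row in rows)

def compile_dataset(rows, checks):
    # The "data compiler": input is a versioned dataset, output is either a
    # usable artifact or a failure report tied to the exact data version.
    vid = version_id(rows)
    failures = [name for name, check in checks.items() if not check(rows)]
    if failures:
        # "File a bug" against this data version instead of loading bad data.
        return {"version": vid, "status": "failed", "bugs": failures}
    return {"version": vid, "status": "ok", "rows": rows}

checks = {"no_null_amounts": lambda rows: check_no_nulls(rows, "amount")}
good = compile_dataset([{"amount": 10}, {"amount": 20}], checks)
bad = compile_dataset([{"amount": 10}, {"amount": None}], checks)
print(good["status"], bad["status"])
```

The key design choice is that quality control attaches to an immutable data version rather than to a point in a manual review process, which is what lets governance and engineering teams collaborate the way they already do on code.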
Building an Enterprise Advanced Analytics Platform (Haoran Du)
By Raymond Fu - Practice Architect
This lecture talks about the best practices in building an advanced analytics platform to help companies apply machine learning, deep learning and data science to their structured and unstructured data.
At Southern California Data Science Conference Sept.25.2016 at USC
http://socaldatascience.org/
http://www.datalaus.com/en/
The importance of efficient data management for Digital Transformation (MongoDB)
Digital Transformation has developed from hype into a "standard" tool for businesses that need to modernise and compete. Experiencing pressure from new market entrants, incumbents are challenged on a daily basis to redefine their ways of doing business. This doesn't only include people and processes, but of course also the underlying technology. With data being the force behind the most successful transformation stories of the past years, we explore some of the challenges of legacy Information Management Systems and look at new ways of managing Data in Motion, Data at Rest, and Data in Use to drive a successful Digital Transformation programme and gain a competitive advantage.
BI is the “Gathering of data from multiple sources to present it in a way that allows executives to make better business decisions”. I will describe in more detail exactly what BI is, what encompasses the Microsoft BI stack, why it is so popular, and why a BI career pays so much. I will review specific examples from previous projects of mine that show the benefits of BI and its huge return-on-investment. I'll go into detail on the components of a BI solution, and I will discuss key concepts for successfully implementing BI in your organization.
Use cases for Hadoop and Big Data Analytics - InfoSphere BigInsights (Gord Sissons)
This presentation is from TDWI's event in Boston during the summer of 2014. IBM InfoSphere BigInsights is IBM's enterprise-grade Hadoop offering. It combines the best of open-source Hadoop with advanced capabilities, including Big SQL, that clients can optionally deploy to get to market faster with a variety of big data and analytic applications.
Big data architectures and the data lake (James Serra)
With so many new technologies, it can get confusing to find the best approach to building a big data architecture. The data lake is a great new concept, usually built in Hadoop, but what exactly is it and how does it fit in? In this presentation I'll discuss the four most common patterns in big data production implementations, the top-down vs. bottom-up approach to analytics, and how you can use a data lake and an RDBMS data warehouse together. We will go into detail on the characteristics of a data lake and its benefits, and how you still need to perform the same data governance tasks in a data lake as you do in a data warehouse. Come to this presentation to make sure your data lake does not turn into a data swamp!
Incorporating the Data Lake into Your Analytic Architecture (Caserta)
Joe Caserta, President at Caserta Concepts presented at the 3rd Annual Enterprise DATAVERSITY conference. The emphasis of this year's agenda is on the key strategies and architecture necessary to create a successful, modern data analytics organization.
Joe Caserta presented Incorporating the Data Lake into Your Analytics Architecture.
For more information on the services offered by Caserta Concepts, visit our website at http://casertaconcepts.com/.
Shows some advanced REXX techniques to make your programs more efficient and more readable for easier debugging. Also describes some tips for creating file and program structures not discussed in a typical REXX class.
Transform your DBMS to drive engagement innovation with Big Data (Ashnikbiz)
Erik Baardse and Ajit Gadge from EDB Postgres presented on how to transform your DBMS in order to drive digital business: how Postgres enables you to support a wider range of workloads with your relational database, which opens the Big Data doors. They also cover EnterpriseDB's strategy around Big Data, which focuses on three areas, and finally, last but not least, how to find money in IT with Big Data and digital transformation.
Customer value analysis of big data products (Vikas Sardana)
Business value analysis through Customer Value Model for software technology choices with a case study from Mobile Advertising industry for Big Data use case.
The Common BI/Big Data Challenges and Solutions presented by seasoned experts, Andriy Zabavskyy (BI Architect) and Serhiy Haziyev (Director of Software Architecture).
This was a complimentary workshop where attendees had the opportunity to learn, network and share knowledge during the lunch and education session.
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios (kcmallu)
What's the origin of Big Data? What are the real life usage scenarios where Hadoop has been successfully adopted? How do you get started within your organizations?
The next-generation user experience should move to customer engagement zones along customers' preferred channels, with desired action-to-outcome approaches. With a wealth of information at its disposal, ranging from inventory to inquiry, weather to warehouse alerts, and product to promotion data, enterprise digitization can create value at every customer touch point. Attendees witnessed the manifestation of TCS' thought leadership in the Game of Retail.
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization (Denodo)
Watch here: https://bit.ly/2NGQD7R
In an era increasingly dominated by advancements in cloud computing, AI and advanced analytics it may come as a shock that many organizations still rely on data architectures built before the turn of the century. But that scenario is rapidly changing with the increasing adoption of real-time data virtualization - a paradigm shift in the approach that organizations take towards accessing, integrating, and provisioning data required to meet business goals.
As data analytics and data-driven intelligence take centre stage in today's digital economy, logical data integration across the widest variety of data sources, with a proper security and governance structure in place, has become mission-critical.
Attend this session to learn:
- How you can meet cloud and data science challenges with data virtualization
- Why data virtualization is increasingly finding enterprise-wide adoption
- How customers are reducing costs and improving ROI with data virtualization
Horses for Courses: Database Roundtable (Eric Kavanagh)
The blessing and curse of today's database market? So many choices! While relational databases still dominate the day-to-day business, a host of alternatives has evolved around very specific use cases: graph, document, NoSQL, hybrid (HTAP), column store, the list goes on. And the database tools market is teeming with activity as well. Register for this special Research Webcast to hear Dr. Robin Bloor share his early findings about the evolving database market. He'll be joined by Steve Sarsfield of HPE Vertica, and Robert Reeves of Datical in a roundtable discussion with Bloor Group CEO Eric Kavanagh. Send any questions to info@insideanalysis.com, or tweet with #DBSurvival.
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En... (MapR Technologies)
In this webinar, Carl W. Olofson, Research Vice President, Application Development and Deployment for IDC, and Dale Kim, Director of Industry Solutions for MapR, will provide an insightful outlook for Hadoop in 2015, and will outline why enterprises should consider using Hadoop as a "Decision Data Platform" and how it can function as a single platform for both online transaction processing (OLTP) and real-time analytics.
So you've got a handle on what Big Data is and how you can use it to find business value in your data. Now you need an understanding of the Microsoft products that can be used to create a Big Data solution. Microsoft has many pieces of the puzzle, and in this presentation I will show how they fit together. How does Microsoft enhance and add value to Big Data? From collecting data, transforming it, and storing it, to visualizing it, I will show you Microsoft's solutions for every step of the way.
Big Data Made Easy: A Simple, Scalable Solution for Getting Started with Hadoop (Precisely)
With so many new, evolving frameworks, tools, and languages, a new big data project can lead to confusion and unwarranted risk.
Many organizations have found Data Warehouse Optimization with Hadoop to be a good starting point on their Big Data journey. Offloading ETL workloads from the enterprise data warehouse (EDW) into Hadoop is a well-defined use case that produces tangible results for driving more insights while lowering costs. You gain significant business agility, avoid costly EDW upgrades, and free up EDW capacity for faster queries. This quick win builds credibility and generates savings to reinvest in more Big Data projects.
A proven reference architecture that includes everything you need in a turnkey solution, including the Hadoop distribution, data integration software, servers, networking and services, makes it even easier to get started.
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ... (Precisely)
Tackling the challenge of designing a machine learning model and putting it into production is the key to getting value back – and the roadblock that stops many promising machine learning projects. After the data scientists have done their part, engineering robust production data pipelines has its own set of challenges. Syncsort software helps the data engineer every step of the way.
Building on the process of finding and matching duplicates to resolve entities, the next step is to set up a continuous streaming flow of data from data sources so that as the sources change, new data automatically gets pushed through the same transformation and cleansing data flow – into the arms of machine learning models.
Some of your sources may already be streaming, but the rest are sitting in transactional databases that change hundreds or thousands of times a day. The challenge is that you can't affect the performance of data sources that run key applications, so putting something like database triggers in place is not the best idea. Using Apache Kafka or similar technologies as the backbone for moving data around doesn't by itself solve the problem of grabbing changes from the source, pushing them into Kafka, and consuming the data from Kafka for processing. If something unexpected happens, like connectivity being lost on either the source or the target side, you don't want to have to fix it or start over because the data is out of sync.
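Why a durable log with consumer offsets survives that failure mode can be sketched in plain Python (no real Kafka involved): if the consumer dies, it resumes from its last committed offset instead of starting over or losing sync. The list-backed log, the `produce`/`consume` helpers, and the change records are all illustrative stand-ins, not Kafka's actual API.

```python
# Stand-ins for a Kafka topic partition and a consumer group's stored offset.
log = []
committed = {"offset": 0}

def produce(change):
    # The CDC layer appends each source change to the durable log.
    log.append(change)

def consume(process):
    # Read from the last committed offset; advance only after processing
    # succeeds, so a crash mid-stream never skips or re-applies a change.
    while committed["offset"] < len(log):
        process(log[committed["offset"]])
        committed["offset"] += 1

applied = []
produce({"op": "insert", "id": 1})
produce({"op": "update", "id": 1})
consume(applied.append)

# Simulated outage: the consumer is down while new changes accumulate...
produce({"op": "delete", "id": 1})
# ...and on restart it picks up exactly where it left off.
consume(applied.append)
print(len(applied), committed["offset"])
```

The design point is that durability plus committed offsets decouples source and target availability: either side can disconnect and recover without manual resynchronization.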
View this 15-minute webcast on-demand to learn how to tackle these challenges in large scale production implementations.
Choosing technologies for a big data solution in the cloud (James Serra)
Has your company been building data warehouses for years using SQL Server? And are you now tasked with creating or moving your data warehouse to the cloud and modernizing it to support "Big Data"? What technologies and tools should you use? That is what this presentation will help you answer. First we will cover what questions to ask concerning data (type, size, frequency), reporting, performance needs, on-prem vs. cloud, staff technology skills, OSS requirements, cost, and MDM needs. Then we will show you common big data architecture solutions and help you to answer questions such as: Where do I store the data? Should I use a data lake? Do I still need a cube? What about Hadoop/NoSQL? Do I need the power of MPP? Should I build a "logical data warehouse"? What is this lambda architecture? Can I use Hadoop for my DW? Finally, we'll show some architectures of real-world customer big data solutions. Come to this session to get started down the path to making the proper technology choices in moving to the cloud.
Hadoop and the Data Warehouse: When to Use Which (DataWorks Summit)
In recent years, Apache™ Hadoop® has emerged from humble beginnings to disrupt the traditional disciplines of information management. As with all technology innovation, hype is rampant, and data professionals are easily overwhelmed by diverse opinions and confusing messages.
Even seasoned practitioners sometimes miss the point, claiming for example that Hadoop replaces relational databases and is becoming the new data warehouse. It is easy to see where these claims originate since both Hadoop and Teradata® systems run in parallel, scale up to enormous data volumes and have shared-nothing architectures. At a conceptual level, it is easy to think they are interchangeable, but the differences overwhelm the similarities. This session will shed light on the differences and help architects, engineering executives, and data scientists identify when to deploy Hadoop and when it is best to use MPP relational database in a data warehouse, discovery platform, or other workload-specific applications.
Two of the most trusted experts in their fields, Steve Wooledge, VP of Product Marketing from Teradata and Jim Walker of Hortonworks will examine how big data technologies are being used today by practical big data practitioners.
Health care, or healthcare, is the maintenance or improvement of health via the diagnosis, treatment, and prevention of disease, illness, injury, and other physical and mental impairments in human beings.
Manufacturing is the production of merchandise for use or sale using labour and machines, tools, chemical and biological processing, or formulation. The term may refer to a range of human activity, from handicraft to high tech, but is most commonly applied to industrial production, in which raw materials are transformed into finished goods on a large scale.
Logistics is the function of making goods and other resources physically available for use as and when required. This generally includes two basic activities: moving or transporting these resources, and storing them at different locations until required for use or further transportation.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
3. Big Data is a Hot Topic Because Technology Makes it Possible to Analyze ALL Available Data
Cost-effectively manage and analyze all available data in its native form: unstructured, structured, and streaming.
Typical sources: ERP, CRM, RFID, websites, network switches, social media, billing.
4. BIG DATA is not just HADOOP
• Manage and store huge volumes of any data: Hadoop File System, MapReduce
• Manage streaming data: Stream Computing
• Analyze unstructured data: Text Analytics Engine
• Structure and control data: Data Warehousing
• Integrate and govern all data sources: Integration, Data Quality, Security, Lifecycle Management, MDM
• Understand and navigate federated big data sources: Federated Discovery and Navigation
5. Business-Centric Big Data Enables You to Start With a Critical Business Pain and Expand the Foundation for Future Requirements
“Big data” isn’t just a technology; it’s a business strategy for capitalizing on information resources.
• Getting started is crucial
• Success at each entry point is accelerated by products within the Big Data platform
• Build the foundation for future requirements by expanding further into the big data platform
6. 1 – Unlock Big Data
Customer need
• Understand existing data sources
• Search and navigate data within existing systems
• No copying of data
Value statement
• Get up and running quickly
• Discover and retrieve big data
• Business users can work directly even with big data sources
Solution
• Vivisimo Velocity, renamed to IBM InfoSphere DataDiscovery
7. 2 – Analyze Raw Data
Customer need
• Ingest data as-is into Hadoop
• Combine it with data from the DWH
• Process very large volumes of data
Value statement
• Gain new insight
• Overcome the high cost of converting data from unstructured to structured format
• Experiment with analysis on different data sets and combine them with other sources
Solution
• IBM InfoSphere BigInsights
8. Merging the Traditional and Big Data Approaches
Traditional approach: structured and repeatable analysis
• Business users determine what question to ask
• IT structures the data to answer that question
• Examples: monthly sales reports, profitability analysis, customer surveys
Big data approach: iterative and exploratory analysis
• IT delivers a platform to enable creative discovery
• Business explores what questions could be asked
• Examples: brand sentiment, product strategy, maximum asset utilization
9. InfoSphere BigInsights is more than just HADOOP
• IBM InfoSphere BigInsights is much more than Hadoop
• The IBM Big Data platform includes much more than IBM InfoSphere BigInsights
10. Hadoop
Open-source software framework from Apache, inspired by Google MapReduce and GFS (the Google File System).
Core components:
• HDFS
• MapReduce
11. InfoSphere BigInsights: Platform for volume, variety, velocity
Enhanced Hadoop foundation:
• Analytics: text analytics and tooling, application accelerators
• Usability: web console, spreadsheet-style tool, ready-made “apps”
• Enterprise class: storage, security, cluster management
• Integration: connectivity to Netezza, DB2, JDBC databases, etc.
Editions (both can also run on top of Apache Hadoop):
• Basic Edition: free download, integrated install, online InfoCenter, BigData University
• Enterprise Edition (licensed): adds breadth of capabilities and enterprise-class features, including application accelerators, pre-built applications, text analytics, the spreadsheet-style tool, RDBMS and warehouse connectivity, administrative tools and security, Eclipse development tools, and performance enhancements
12. Spreadsheet-style Analysis
• Web-based analysis and visualization
• Spreadsheet-like interface
• Define and manage long-running data collection jobs
• Analyze the text content of the pages that have been retrieved
13. Build a Big Data Program – MapReduce example
Eclipse tools for Jaql, Hive, Pig, Java MapReduce, BigSheets plug-ins, text analytics, etc.
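Real Hadoop MapReduce jobs are written in Java (or generated from higher-level languages such as Jaql, Hive, and Pig). Purely to illustrate the programming model, here is a minimal sketch of the canonical MapReduce example, word count, with the map, shuffle/sort, and reduce stages simulated in plain Python; input splits, task distribution, and everything else a real cluster does are deliberately elided.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every input document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    """Shuffle/sort pairs by key, then reduce: sum the counts per word."""
    for word, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        yield (word, sum(count for _, count in group))

docs = ["big data is a hot topic", "big data is not just hadoop"]
counts = dict(reduce_phase(map_phase(docs)))
print(counts["big"], counts["data"], counts["hadoop"])  # 2 2 1
```

The shape is the same in a real job: the framework sorts and groups the mapper output by key before handing each group to a reducer.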
14. Jaql – IBM’s programming language in the Hadoop world
Jaql is a complete solutions environment supporting all other BigInsights components.
Integration point for various analytics:
– Text analytics (BigInsights Text Analytics)
– Statistical analysis (R module)
– Machine learning (SystemML)
– Ad-hoc analysis (BigSheets)
Integration point for various data sources:
– Local and distributed file systems
– NoSQL databases
– Content repositories
– Relational sources (warehouses, operational databases; integration with DB2, Netezza, Streams, …)
Architecture: Jaql modules and the Jaql core operators sit on top of the Jaql I/O layer, which connects to DFS, NoSQL, RDBMS, and local file systems.
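Jaql queries JSON data through a pipe syntax (read, filter, transform). As a rough illustration of that idea only (this is Python, not Jaql syntax, and the records are invented):

```python
import json

# JSON-like records, of the kind Jaql would read from HDFS or a local file
employees = [
    {"name": "Jon",  "dept": "sales",   "income": 20000},
    {"name": "Jana", "dept": "finance", "income": 45000},
    {"name": "Petr", "dept": "sales",   "income": 52000},
]

# The Jaql-style pipeline: filter records, then transform (project) fields
high_earners = [
    {"name": e["name"], "dept": e["dept"]}
    for e in employees
    if e["income"] > 30000
]

print(json.dumps(high_earners))
```

In Jaql itself, the same filter/transform steps would be chained with the pipe operator and pushed down to MapReduce over the cluster.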
16. 3 – Simplify your warehouse
Customer need
• SIGNIFICANTLY improve DWH performance
• SIGNIFICANTLY reduce DWH administration costs
Value statement
• Speed: 10–100x better performance
• Simplicity: administration costs reduced by 75%–90%
• Scalability
• Smart system
• In-database analytics
• Out-of-the-box integration with SPSS
Solution
• IBM Netezza, renamed to PureData System for Analytics
17. Analyst: I need to evaluate the possible relationship between client salary and overdrafts.
IT: OK. We have to evaluate a lot of statistics and set the correct DB indexes and partitioning. It will take us 5 days.
18. IT: Done. You can run your analytical query.
Analyst: Great, thanks a lot. I’m going to check the results.
19. Analyst: Great, I can see some nice correlations here. Now I need to look at it from a different perspective.
IT: Ohhh, welcome, dear friend. Understood. So, it’s … another 5 days of our work.
Analyst: Noooo!!! It’s not possible to work here!
21. Analyst: I need to evaluate the possible relationship between client salary and overdrafts. I will use Netezza.
22. Analyst: Great, I can see some nice correlations here. Now I need to look at it from a different perspective. With Netezza I can run the query immediately and get the response just as fast.
Meanwhile, IT can do something else that is much more useful.
24. Built-In Expertise Makes This as Simple as an Appliance
• Dedicated device
• Optimized for purpose
• Complete solution
• Fast installation
• Very easy operation
• Standard interfaces
• Low cost
25. IBM Netezza was renamed to IBM PureData System for Analytics in October 2012.
26. Netezza Genesis in T-Mobile CZ
Proof-of-concept project:
– New Enterprise Data Warehouse platform selection
– Comparison of the existing platform with other platforms
– Selection criteria:
• Performance
• Operational savings
… and the winner was: Netezza
27. Netezza Genesis in T-Mobile CZ
Expectations:
• Significant response improvement: a faster platform means better report response times
• Direct data availability: higher trust in data, one version of the truth
• Aggregation reduction: any attribute available
• Operational benefits: storage savings (no data replicas), administration (DBA) cost reduction
• Infrastructure simplification: lower environment complexity
29. Netezza Genesis in T-Mobile CZ
Actual status:
• All relevant ETL processing redesigned
• Parallel run of the original and Netezza platforms finished
• Netezza is now the only primary platform
30. Real Netezza experience from T-Mobile Czech Republic: RESPONSE TIME MASSIVELY IMPROVED

Report                                          | Original platform | Netezza
Workflow reporting                              | 2 hours           | 1 minute
Invoicing and payments reporting:
  Payment discipline of current-month invoices  | 33 minutes        | 17 seconds
  Overdue debt of invoices in current month     | 10 hours          | 23 seconds
  Average monthly invoice figures               | 50 minutes        | 38 seconds
31. 4 – Reduce costs with Hadoop
Customer need: data is SIGNIFICANTLY too expensive
• Too much data: too expensive to store and maintain
• A big portion is kept “just in case”
• Data volume keeps growing, making it ever more expensive
• Keeping all data in a standard DWH is therefore too expensive
Value statement
• Leverage Hadoop’s parallel-processing architecture
• Hadoop uses cheap commodity hardware
• Business users can keep working in the same or a similar way
Solution
• IBM InfoSphere BigInsights
32. BigInsights and the data warehouse
• BigInsights serves as a query-ready archive for “cold” warehouse data
• Big data analytic applications run directly against BigInsights
• Traditional analytic tools (e.g. from Cognos BI) reach it via Hive JDBC
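The “cold archive” pattern amounts to partitioning warehouse data by age: recent rows stay in the DWH, old rows move to cheap Hadoop storage but remain queryable through Hive. A hedged sketch of the selection logic only; the field names, dates, and the 90-day threshold are invented for illustration:

```python
from datetime import date, timedelta

def split_hot_cold(rows, today, keep_days=90):
    """Partition warehouse rows by age: rows newer than the cutoff stay
    in the DWH ("hot"); older rows are candidates for the Hadoop-based
    cold archive."""
    cutoff = today - timedelta(days=keep_days)
    hot = [r for r in rows if r["sale_date"] >= cutoff]
    cold = [r for r in rows if r["sale_date"] < cutoff]
    return hot, cold

rows = [
    {"id": 1, "sale_date": date(2013, 1, 5)},   # old  -> archive candidate
    {"id": 2, "sale_date": date(2013, 5, 20)},  # recent -> keep in DWH
]
hot, cold = split_hot_cold(rows, today=date(2013, 6, 1))
print([r["id"] for r in hot], [r["id"] for r in cold])  # [2] [1]
```

In practice the split is usually expressed as date-partitioned tables, with old partitions exported to HDFS and registered as Hive external tables.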
33. Future: the SQL interface
Architecture: an application issues SQL through a JDBC/ODBC driver to a JDBC/ODBC server; the SQL engine inside InfoSphere BigInsights then queries the underlying data sources (Hive tables, HBase tables, CSV files).
• Rich SQL query capabilities: SQL ’92 and 2011 features, correlated subqueries, windowed aggregates
• SQL access to all data stored in InfoSphere BigInsights
• Robust JDBC/ODBC support
• Takes advantage of key features of each data source
• Leverages MapReduce parallelism OR achieves low latency
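The two SQL features named above, windowed aggregates and correlated subqueries, can be demonstrated with any SQL engine. A minimal sketch using Python’s built-in sqlite3 in place of the BigInsights JDBC driver (the table and data are invented; window functions require SQLite 3.25+):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("east", 1, 100), ("east", 2, 150), ("west", 1, 80), ("west", 2, 60),
])

# Windowed aggregate: running total of sales per region
running = con.execute("""
    SELECT region, month,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales ORDER BY region, month
""").fetchall()

# Correlated subquery: regions whose month-1 sales exceed their own average
above_avg = con.execute("""
    SELECT region FROM sales s
    WHERE month = 1
      AND amount > (SELECT AVG(amount) FROM sales WHERE region = s.region)
""").fetchall()

print(running)
print(above_avg)  # [('west',)]
```

Against BigInsights the same statements would go through the JDBC/ODBC driver, with the engine deciding between MapReduce parallelism and a low-latency path.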
34. 5 – Analyze Streaming Data
Customer need
• Process and leverage streaming data
• Select valuable data from the stream for future processing
• Quickly process data that becomes useless if not processed immediately
Value statement
• React in real time to seize an opportunity before it expires
• Periodically adjust streaming models based on analysis of data at rest
Solution
• IBM InfoSphere Streams: streams computing that turns streaming data sources into ACTION
35. Why and when to use InfoSphere Streams?
Typical data sources:
• Sensors: environmental, industrial, GPS, …
• Images, videos, …
• Data exhaust: network data, system logs (web server, app server), …
• High-rate transaction data: financial transactions, CDRs
At least 2 criteria from the list below should be fulfilled:
• Processing in isolation, or in limited windows (time / number of records)
• Non-traditional formats included: spatial data, images, text, voice, …
• Integration challenges: different connection methods, different data rates, different processing requirements
• Multiple processing nodes: volume or rate so high that scalability is required
• Sub-millisecond latency: immediate analysis and response
• The store-and-mine approach doesn’t work because of the very high volume of data (and its rates)
Streams suits applications needing on-the-fly processing, filtering, and analysis of streaming data.
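The “limited windows (time / number of records)” criterion is the heart of stream processing: state is bounded, so the store-and-mine problem never arises. A hedged sketch of a count-based sliding window computing a moving average, with plain Python standing in for Streams’ windowed operators (the readings are invented):

```python
from collections import deque

def windowed_average(stream, size):
    """Process a stream in a count-based sliding window: as each tuple
    arrives, emit the average of the last `size` values."""
    window = deque(maxlen=size)   # oldest tuples fall out automatically
    for value in stream:
        window.append(value)
        if len(window) == size:   # emit only once the window is full
            yield sum(window) / size

# e.g. sensor readings arriving one by one (50 is a spike)
readings = [10, 12, 11, 50, 13, 12]
averages = list(windowed_average(readings, size=3))
print(averages)  # four window averages; the first is 11.0
```

Because the deque is bounded, memory stays constant no matter how long the stream runs, which is exactly why windowing makes sub-millisecond, always-on analysis feasible.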
36. Streams and BigInsights: Integrated Analytics on Data in Motion and Data at Rest
1. Data ingest: InfoSphere Streams handles data ingest, preparation, online analysis, and model validation.
2. Bootstrap/enrich: InfoSphere BigInsights, databases, and warehouses handle data integration, data mining, machine learning, and statistical modeling.
3. Adaptive analytics model: a control flow feeds the updated model back to Streams, with visualization of real-time and historical insights.
37. The Platform Advantage
The IBM Big Data Platform underpins a range of analytic applications: BI/reporting, exploration/visualization, functional apps, industry apps, predictive analytics, and content analytics. It provides systems management, application development, visualization & discovery, and accelerators on top of three engines (Hadoop system, stream computing, data warehouse), with information integration & governance across all of them.
BENEFITS IN DETAIL
• Increase value over time by moving from an entry point to a 2nd and 3rd project
• Lower deployment costs through shared components and integration
• Points of leverage: shared text analytics for Streams and BigInsights; HDFS connectors (data integration/ETL, Streams); accelerators built across multiple engines
engines