Real-time web app integration with Hadoop on Docker.
Apache Spark – for large-scale data processing
Apache Sqoop – to integrate OLTP databases
Apache HBase – NoSQL store
Apache Kafka – acts as a distributed ESB between the real-time application and the cluster
Apache Flume – web application log streaming
Hadoop core [HDFS, YARN].
Reference architecture for running Hadoop in Docker.
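A stack like this is often wired together with Docker Compose. The sketch below is purely illustrative: image names, ports, and dependencies are hypothetical, not a tested reference architecture.

```yaml
# Hypothetical docker-compose sketch of the stack above; images and ports
# are placeholders, not a vetted reference architecture.
version: "3"
services:
  namenode:
    image: hadoop-namenode          # HDFS metadata node
    ports: ["9870:9870"]
  datanode:
    image: hadoop-datanode
    depends_on: [namenode]
  resourcemanager:
    image: hadoop-resourcemanager   # YARN
    depends_on: [namenode]
  zookeeper:
    image: zookeeper:3.8
  kafka:
    image: kafka                    # distributed ESB between web app and cluster
    depends_on: [zookeeper]
  hbase:
    image: hbase
    depends_on: [zookeeper, namenode]
  spark:
    image: spark
    depends_on: [resourcemanager]
  flume:
    image: flume                    # streams web application logs into the cluster
    depends_on: [kafka, namenode]
```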
Volodymyr Lyubinets, "Introduction to big data processing with Apache Spark" (IT Event)
In this talk we’ll explore Apache Spark — the most popular cluster computing framework right now. We’ll look at the improvements that Spark brought over Hadoop MapReduce and what makes Spark so fast; explore the Spark programming model and RDDs; and look at some sample use cases for Spark and big data in general.
This talk will be interesting for people who have little or no experience with Spark and would like to learn more about it. It will also be interesting to a general engineering audience as we’ll go over the Spark programming model and some engineering tricks that make Spark fast.
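The lazy-evaluation idea behind RDDs can be illustrated with a toy sketch in plain Python (this is not the Spark API; ToyRDD and its methods are invented for illustration): transformations are only recorded, and an action such as collect() runs the whole pipeline at once, which is part of what lets Spark avoid materializing intermediate results between steps the way MapReduce does between jobs.

```python
# Toy illustration (plain Python, NOT the Spark API) of lazy RDD-style
# transformations: map/filter only record work; collect() executes it.

class ToyRDD:
    def __init__(self, data, ops=None):
        self.data = data
        self.ops = ops or []          # recorded transformations, not yet run

    def map(self, f):
        return ToyRDD(self.data, self.ops + [("map", f)])

    def filter(self, p):
        return ToyRDD(self.data, self.ops + [("filter", p)])

    def collect(self):                # the "action" that triggers execution
        out = self.data
        for kind, f in self.ops:
            out = [f(x) for x in out] if kind == "map" else [x for x in out if f(x)]
        return out

rdd = ToyRDD(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
# Nothing has executed yet; collect() runs the whole recorded pipeline.
print(rdd.collect())  # [0, 4, 16, 36, 64]
```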
Scan, Import, and Automatically File documents to Google Drive and Docs with ... (Capture Components LLC)
ccScan Advanced for Google Drive and Docs is best explained with some examples: run unattended jobs to import electronic faxes, upload them to automatically selected Google Docs folders, and automatically name the document, all based on information extracted from barcodes and text pattern searches in text obtained through OCR. Or scan documents and automatically create document names and descriptions. Since ccScan is highly configurable it can be applied to many scenarios where opportunities for automation and large time savings are present.
ccScan Advanced features are based upon sophisticated technologies such as barcode detection, OCR, and Text Pattern search with Regular Expressions. These capabilities are applied to both the scanning of paper documents in the paper-based office and the processing of electronic documents in the paperless office.
ccScan Standard for Google Drive and Docs is similar to ccScan Advanced but without the automation capabilities. ccScan Standard is typically used in a paper-based office to efficiently scan paper documents to Google. Use ccScan Standard to eliminate wasted time and the following extra steps: scan a document to your PC using any TWAIN scanner, name the document, create a folder in Google Docs, upload the document to Google Docs, and finally set up Google Docs document properties such as Description, Visibility, and Sharing mode. Instead, ccScan Standard does all of the above in a single-step operation.
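The kind of text-pattern rule described above can be illustrated with a small sketch (hypothetical Python, not ccScan's actual configuration): extract fields from OCR text with regular expressions and build a document name from them.

```python
import re

# Hypothetical illustration of a ccScan-style automation rule: pull an
# invoice number and a date out of OCR text and derive a document name.
# The field labels and formats below are invented for the example.
ocr_text = """ACME Corp
Invoice No: INV-2024-0817
Date: 03/15/2024
Amount due: $1,250.00"""

invoice = re.search(r"Invoice No:\s*(\S+)", ocr_text)
date = re.search(r"Date:\s*(\d{2}/\d{2}/\d{4})", ocr_text)

# Build a filesystem-safe name from the extracted fields.
doc_name = f"{invoice.group(1)}_{date.group(1).replace('/', '-')}.pdf"
print(doc_name)  # INV-2024-0817_03-15-2024.pdf
```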
Windows Azure and SQL Database Tutorials; Jonathan Gao. These Windows Azure and SQL Database (formerly SQL Azure) tutorials are
designed for beginners who have some .NET development experience. Using a common
scenario, each tutorial introduces one or two Windows Azure features or components.
Even though each tutorial builds upon the previous ones, the tutorials are self-contained
and can be used without completing the previous tutorials.
In this presentation we introduce the basic concepts around SQL Server Azure: the database in the cloud.
Regards,
Ing. Eduardo Castro, PhD
http://ecastrom.blogspot.com
http://comunidadwindows.org
Hadoop, Evolution of Hadoop, Features of Hadoop (Dr Neelesh Jain)
This presentation explains Hadoop, its evolution, and its features, as per the syllabus of RGPV, BU, and MCU for students of BCA, MCA, and B.Tech.
Getting Started with Azure SQL Database, presented at Pittsburgh TechFest 2018 (Chad Green)
Are you still hosting your databases on your own SQL Server? Would you like to consider putting those up in the cloud? Then come and learn what exactly Azure SQL can do for you and how to go about moving your databases to the cloud.
Ingesting streaming data into a Graph Database (Guido Schmutz)
This talk presents the experience of a customer project where we built stream-based ingestion into a graph database. It is one thing to load the graph first and then query it. But it is another story if the data to be added to the graph is constantly streaming in while you are querying it. Data is easy to add if each single message ends up as a new vertex in the graph. But if a message consists of hierarchical information, it most often means creating multiple new vertices as well as adding edges to connect this information. What if a node already exists in the graph? Do we create it again, or do we rather add edges which link to the existing node? Creating multiple nodes for the same real-life entity is not the best choice, so we have to check for existence first. We end up requiring multiple operations against the graph, which proved to be a bottleneck. This talk presents the implementation of an ingestion pipeline and the design choices we made to improve performance.
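The check-for-existence logic described above can be sketched with a minimal in-memory model (illustrative Python, not a real graph database client; production systems express this natively, e.g. Gremlin's fold()/coalesce()/unfold() idiom or Cypher's MERGE).

```python
# Minimal in-memory sketch of the "check before create" upsert discussed
# above; the Graph class and message shape are invented for illustration.

class Graph:
    def __init__(self):
        self.vertices = {}   # key -> properties
        self.edges = set()   # (from_key, label, to_key)

    def upsert_vertex(self, key, **props):
        # Look up first, so the same real-life entity never gets
        # duplicated across messages.
        if key not in self.vertices:
            self.vertices[key] = {}
        self.vertices[key].update(props)
        return key

    def add_edge(self, src, label, dst):
        self.edges.add((src, label, dst))

def ingest(graph, message):
    # A hierarchical message becomes several vertices plus connecting edges.
    person = graph.upsert_vertex(("person", message["user"]))
    city = graph.upsert_vertex(("city", message["city"]))
    graph.add_edge(person, "LIVES_IN", city)

g = Graph()
ingest(g, {"user": "alice", "city": "Zurich"})
ingest(g, {"user": "bob", "city": "Zurich"})   # "Zurich" is reused, not recreated
print(len(g.vertices))  # 3
```

In a real streaming pipeline each upsert is a round trip to the graph, which is exactly the bottleneck the talk describes; batching and server-side upsert primitives reduce it.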
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka (Guido Schmutz)
Apache Kafka is a popular distributed streaming data platform. A Kafka cluster stores streams of records (messages) in categories called topics. It is the architectural backbone for integrating streaming data with a data lake, microservices, and stream processing. Data sources flowing into Kafka are often native data streams such as social media streams, telemetry data, financial transactions, and many others. But these data streams contain only part of the information. A lot of the data needed in stream processing is stored in traditional systems backed by relational databases. To implement new, modern real-time solutions, an up-to-date view of that information is needed. So how do we make sure that information can flow between the RDBMS and Kafka, so that changes are available in Kafka in near real time? This session will present different approaches for integrating relational databases with Kafka, such as Kafka Connect, Oracle GoldenGate, and bridging Kafka with Oracle Advanced Queuing (AQ).
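As a sketch of the Kafka Connect approach mentioned above, a JDBC source connector can poll an Oracle table and publish new or changed rows to a topic. Property names follow Confluent's JDBC source connector; the connection details, table, and column names below are placeholders.

```json
{
  "name": "oracle-orders-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1",
    "connection.user": "kafka_reader",
    "connection.password": "changeme",
    "table.whitelist": "ORDERS",
    "mode": "timestamp+incrementing",
    "timestamp.column.name": "UPDATED_AT",
    "incrementing.column.name": "ORDER_ID",
    "topic.prefix": "oracle-",
    "poll.interval.ms": "5000"
  }
}
```

Note that JDBC polling is query-based; log-based change data capture (the GoldenGate approach) avoids the polling latency and also captures deletes.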
Azure SQL Database (SQL DB) is a database-as-a-service (DBaaS) that provides nearly full T-SQL compatibility so you can gain tons of benefits for new databases or by moving your existing databases to the cloud. Those benefits include provisioning in minutes, built-in high availability and disaster recovery, predictable performance levels, instant scaling, and reduced overhead. And gone will be the days of getting a call at 3am because of a hardware failure. If you want to make your life easier, this is the presentation for you.
Seminar given as part of the CANS master's program at the Facultad de Informática de Barcelona.
Anatomy of a web application
Too many writes to the database: what can I do?
How can I take advantage of the "Cloud"?
Optimizing Facebook applications
CloudStack Metering - Working with Usage Data #CCCNA14 (ShapeBlue)
Organisations looking to build and offer cloud services on Apache CloudStack need to be able either to monetize their offerings and charge for usage, or to monitor and report on their cloud's consumption. The majority of such organisations already have existing billing or business support systems and do not require an integrated billing or reporting system, provided the usage data can be exported from CloudStack in a standard, structured format such as XML, JSON, or CSV.
CloudStack includes a Usage Server that creates summary usage records for the various resources consumed in CloudStack. Tariq covers how usage of such resources is metered in CloudStack and also:
· What usage metrics are recorded
· Configuration of the Usage Server
· Creation of the Usage Data
· Explore various methods of accessing the Usage Data
· Overview of solutions for analysing or processing the Usage Data such as MS Excel, CloudPortal (CPBM), Splunk, Amysta.
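Accessing the usage data programmatically follows CloudStack's documented API signing scheme: sort the parameters, lowercase the query string, HMAC-SHA1 it with the secret key, and base64-encode the result. A minimal sketch with placeholder host, keys, and dates:

```python
import base64
import hashlib
import hmac
import urllib.parse

# Sketch of calling CloudStack's listUsageRecords API command. The signing
# steps follow CloudStack's documented scheme; host, keys, and dates are
# placeholders.

def sign(params, secret_key):
    # Sorted key=value pairs, URL-encoded values, lowercased before hashing.
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='')}"
        for k, v in sorted(params.items())
    )
    digest = hmac.new(
        secret_key.encode(), query.lower().encode(), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode()

params = {
    "command": "listUsageRecords",
    "startdate": "2014-01-01",
    "enddate": "2014-01-31",
    "response": "json",
    "apikey": "YOUR_API_KEY",
}
signature = sign(params, "YOUR_SECRET_KEY")
url = ("https://cloud.example.com/client/api?"
       + urllib.parse.urlencode(params)
       + "&signature=" + urllib.parse.quote(signature))
# Fetching this URL returns usage records ready for export to a billing system.
```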
Data Con LA 2019 - Data warehouse and Kubernetes: Lessons from ClickHouse Ope... (Data Con LA)
Kubernetes Operators allow you to create custom resources in Kubernetes. They are popular for managing databases, which tend to be complex to manage. Our team built an operator to stand up ClickHouse, a popular open source data warehouse, in Kubernetes clusters. We'll share major learnings from this experience which we feel are applicable generally to running scalable, high performance databases in this environment. The talk starts with a level-set of Kubernetes, ClickHouse, and what an operator does. We'll then jump into the design of the ClickHouse operator example, covering challenges associated with the following problems:
· Reducing the complexity of Kubernetes through definition of new resources for databases
· Defining and managing storage
· Performance, including comparative results which look pretty good
· Monitoring
· Upgrade and configuration changes
Kubernetes is not free from challenges, and we'll cover these as we touch on each point above. We'll conclude with a summary of reasons that we think Kubernetes is a great environment for data warehouses, based on our experience to date.
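For reference, a minimal custom resource in the style the Altinity ClickHouse operator defines looks roughly like this (field names may vary across operator versions; the metadata and cluster names are placeholders):

```yaml
# Sketch of a ClickHouseInstallation custom resource for the Altinity
# ClickHouse operator; names are illustrative.
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo"
spec:
  configuration:
    clusters:
      - name: "demo-cluster"
        layout:
          shardsCount: 2
          replicasCount: 2
```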
Data Con LA 2019 - MetaConfig driven FeatureStore with Feature compute & Serv... (Data Con LA)
MakeMyTrip, India's #1 online travel platform with more than 70% of its traffic from mobile apps, embarked on a journey to revolutionize its customer experience by building a scalable, personalized, machine-learning-based platform which powers onboarding, in-funnel and post-funnel engagement flows, such as ranking, dynamic pricing, persuasions, cross-sell and propensity models. For a company like MakeMyTrip, the next wave of consumer growth is driven and powered by data products for personalization and context-aware mobile experiences. A better data architecture for ingesting user activity streams (events), processing, and data APIs provides a foundation for real-time feature generation for machine learning models. Topics include:
· Why a common feature store, removing the dataset fragmentation caused by a use-case-by-use-case approach
· Productionizing ML via standardization: MetaConfigs & FeatureCatalog | Reducing data-tech debt
· Developing a real-time serving store over Spark Streaming, Kafka, RocksDB, Akka HTTP data APIs
· Lifecycle of feature generation | Online (near-real-time) & historical (batch) compute
· Consistent feature engineering & model deployment for DSA: Data Science Automation
As technology we leverage Kafka, Spark (Streaming, SQL), Scala, Python, AWS (S3, EMR, Glue and other services), Druid, Hive, Presto, Cassandra, RocksDB, Redis, and Akka HTTP.
An overview of the current big data technology landscape, prepared for the companies V.I.Tech and Wellcentive. It answers, at a very high level, why we chose these products and what we actually do with them.
This webinar (from December 2007) shows how the new Data Services capability in WSO2's Web Services Application Server can become a key component in your SOA/data strategy. Using simple screens and a basic knowledge of SQL, any database programmer or administrator can configure and expose Data Services. In addition to major databases such as Oracle, DB2, and MySQL, you can also extract data from Excel and CSV files.
Docker Java App with MariaDB – Deployment in Less than a Minute (DCHQ)
DCHQ is a deployment automation, life-cycle management & governance platform for Docker-based applications. Developers can model, deploy, backup, update and monitor container-based applications in seconds.
Data warehouse on Kubernetes - gentle intro to Clickhouse Operator, by Robert... (Altinity Ltd)
San Diego Cloud Native Computing Meetup, January 23, 2020
Presented by Robert Hodges, Altinity CEO
Data services are the latest wave of applications to catch the Kubernetes bug, but how many people would guess that includes data warehouses? We proved it works by developing the ClickHouse Kubernetes operator, which is now in production use at companies like Mux.com. It's an open source operator to stand up and run ClickHouse, a popular Apache 2.0 data warehouse that can return queries on trillions of rows in seconds or less. This talk introduces ClickHouse and shows why it's a 'cloud friendly' DBMS. We'll go mano-a-mano with the ClickHouse operator, showing how you can spin up data warehouses in 60 seconds or less. We'll cover issues like storage management, monitoring and upgrade. In short, everything you need to know to try running your own ClickHouse data warehouses on Kubernetes.
Docker containers have been making inroads into Windows and Azure world. Docker has now replaced the traditional Azure IaaS & PaaS services, offering superior container versions which are more responsive, cost effective, and agile. In this session for Charlotte Azure User Group, we will take an in-depth look at the intersection of Docker and Azure, and how Docker is empowering next gen Azure services.
Here's the link to CAG meetup for the event - https://www.meetup.com/Charlotte-Microsoft-Azure/events/fpftgmyxjbjb/
This report describes how the Aucfanlab team used Azure’s Data Factory service to
implement the orchestration and monitoring of all data pipelines for our “Aucfan
Datalake” project.
The slide deck used in the Apache Camel / Syndesis Seminar at Red Hat, K.K., Ebisu --
https://jcug-oss.connpass.com/event/99168/
Uploaded with permission of Christina Lin
Schema-based multi-tenant architecture using Quarkus & Hibernate ORM (seo18)
Architecture design is a must while developing a SaaS application, to ensure scalability and optimise infrastructure costs. In this blog, we discuss the implementation of one such architecture with the Quarkus Java framework and Hibernate ORM.
If you are planning on building a Connect integration for any of Atlassian's cloud offerings, growth, performance, and stability should be your highest priorities. In addition, you have to think of keeping the cost down, delivering the product on time, and keeping both users and developers happy.
In this session, Nathan Burrell will talk about the architecture of Bitbucket Pipelines (Beta), a feature of Bitbucket Cloud that is integrated via Connect, runs on AWS, and heavily utilises Docker. He will walk you through examples that show how one can implement a solid integration while staying aligned and meeting all previously mentioned priorities. You will learn about best practices, software architecture insights, and the technologies that are readily available to assist you in your endeavours.
Products covered:
Bitbucket
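For context, a Pipelines build is configured in a bitbucket-pipelines.yml file at the repository root, with each step running inside a Docker image, which is how Pipelines leans on Docker as described above. A minimal sketch (the image and commands are illustrative):

```yaml
# Sketch of a bitbucket-pipelines.yml; image and script are placeholders.
image: maven:3.9-eclipse-temurin-17

pipelines:
  default:
    - step:
        name: Build and test
        caches:
          - maven
        script:
          - mvn -B verify
```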
Container orchestration from theory to practice (Docker, Inc.)
"Join Laura Frank and Stephen Day as they explain and examine technical concepts behind container orchestration systems, like distributed consensus, object models, and node topology. These concepts build the foundation of every modern orchestration system, and each technical explanation will be illustrated using SwarmKit and Kubernetes as a real-world example. Gain a deeper understanding of how orchestration systems work in practice and walk away with more insights into your production applications."
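The distributed-consensus idea mentioned above rests on majority quorums: any two majorities of a cluster overlap, so the cluster cannot commit conflicting values. A toy sketch in Python (not SwarmKit or Kubernetes code; node names are invented):

```python
# Toy sketch of majority-quorum consensus: a value is committed only once
# a majority of nodes acknowledge it, so any two committing majorities
# share at least one node and cannot disagree.

def quorum(n_nodes):
    # Smallest majority of an n-node cluster.
    return n_nodes // 2 + 1

def committed(acks, n_nodes):
    return len(acks) >= quorum(n_nodes)

# A 5-manager cluster, as in a SwarmKit or etcd/Raft deployment:
print(quorum(5))                          # 3
print(committed({"n1", "n2", "n3"}, 5))   # True
print(committed({"n1", "n2"}, 5))         # False: the cluster loses
                                          # availability, never consistency
```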
Tuning and optimizing WebCenter Spaces application white paper (Vinay Kumar)
This white paper focuses on Oracle WebCenter Spaces performance problems and their analysis after production deployment. We will tune the JVM (JRockit), WebCenter Portal, WebCenter Content, and ADF task flows.
Best Hadoop institutes: Kelly Technologies is a Hadoop training institute in Bangalore, providing Hadoop courses taught by real-time faculty.
Similar to Real-time web app integration with Hadoop on Docker (20)
Climate Impact of Software Testing at Nordic Testing Days (Kari Kakkonen)
My slides from Nordic Testing Days, 6.6.2024.
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... (Neo4j)
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 (Neo4j)
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
UiPath Test Automation using UiPath Test Suite series, part 5 (DianaGray10)
Welcome to part 5 of the UiPath Test Automation using UiPath Test Suite series. In this session, we will cover CI/CD with DevOps.
Topics covered:
CI/CD within UiPath
End-to-end overview of a CI/CD pipeline with Azure DevOps
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
SAP Sapphire 2024 - ASUG301 Building Better Apps with SAP Fiori (Peter Spielvogel)
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Real-Time Web App Integration with Hadoop on Docker
Reference Architecture
Author: Rajasekaran Kandhasamy
Introduction
Big Data Deployment
  Data Exchange/Gateway Layer [Docker Host 1]
  Data Center/Store/Processing Layer [Docker Host 2]
  Packaging and bundling
Real-Time Web Application - Big Data: Integration Architecture
  Broker.jar
  ml-jobs.jar
Hadoop Job Configurations through Real-Time Web Application
  Manage Jobs UI Layout
    Future Enhancements
    Add/Edit Job UI Layout
    Execute Button Action
    View Job Status Button Action
  Sample real-time web application job details table structures
    Job table
    Job status table
  Monitor Job Status from Real-Time Web Application
    Current Implementation [Spark Listener Instrumentation]
    Design option 2 [Polling HDFS Consumer]
    Design option 3 [UI Uses HBase for Job status]
    Design option 4 [UI Uses Spark History Server REST API]
Introduction
The emergence of Docker containers simplifies the process of building and
shipping apps. A Hadoop Docker deployment provides a lightweight, disposable environment for
learning and exploring new technology, playing with new ideas, and for doing continuous
integration before testing at scale.
For the single-node setup we have used the Hortonworks Docker image; for multi-node, two Docker
hosts are used. [Setting this up is a separate process.]
The proposed big data system consists of Hadoop supporting components such as:
o Apache Spark - for large data processing
o Apache Sqoop - to integrate OLTP databases
o Apache HBase - NoSQL store
o Apache Kafka - acts as a distributed ESB between the real-time application and the cluster
o Apache Flume - web application log streaming
o Hadoop core [DFS, YARN].
In the document below, we are going to see:
o how Hadoop modules are deployed in a cluster environment with Docker support
o how each module interacts with the others
o how we configure different Hadoop-related jobs through the real-time web application
o how we execute Hadoop jobs through the real-time web application
o how we monitor job submission status through the real-time web application
Big Data Deployment
The proposed big data system consists of one "Data Exchange/Gateway Layer [Docker Host 1]" and
can have one or many "Data Center/Data Store/Data Processing Layer [Docker Host 2(N)]" instances.
Horizontal scaling can be achieved in the "Data Store/Processing Layer" by adding an additional
Docker host using the "docker-machine create <HOST-NAME>" command.
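As a sketch, provisioning an additional host for the store/processing layer could look like the following; the driver and host name here are illustrative assumptions, and the actual provisioning is environment-specific:

```shell
# Provision an additional Docker host for the Data Store/Processing Layer.
# The "virtualbox" driver and the host name are illustrative assumptions.
docker-machine create --driver virtualbox datastore-host-3

# Point the local Docker client at the new host before starting containers on it.
eval "$(docker-machine env datastore-host-3)"
```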
No separate cluster setup for Spark. Why?
o Leverage the YARN environment for data processing,
o better hardware utilization,
o no separate machines for Spark.
Data Exchange/Gateway Layer [Docker Host 1]
Communication between the real-time web application and the big data cluster/ecosystem happens
through this layer.
Mediator components like Flume, Kafka, Map-Reduce clients and Spark clients are deployed here
as individual Docker containers in Docker Host 1.
There is no data storage or distributed computation here.
All big data job submissions from the real-time web application [this includes Spark jobs, Map-Reduce
jobs and Sqoop jobs] are done through Kafka to enforce a generic adapter design pattern.
Data ingestion is done through Flume sink(s) and the Sqoop server.
Data processing is initiated through Kafka and Spark.
Data Center/Store/Processing Layer [Docker Host 2]
This layer is specifically for data storage and distributed data processing.
Data is more secured in access, and cloud-based multi-tenancy can be enabled for data as well as
for processing.
Use a horizontal scaling approach to add a new Docker host/hardware or to add a new tenant.
Packaging and bundling:
Sqoop:
Sqoop server artifacts are packaged together and deployed in the Docker Host 1 Sqoop server
location.
Sqoop client artifacts are packaged together and deployed in the Docker Host 1 Kafka location.
Kafka:
Kafka producers and consumers are packaged separately and deployed in the Docker Host 1 Kafka
location.
Flume:
The Flume sink is packaged separately and deployed in the Docker Host 1 Flume location.
Note: Some Kafka producers and Flume agents may run in the real-time web application server; they
need to be deployed appropriately.
Real-Time Web Application - Big Data: Integration Architecture
[Architecture diagram: the web application and its modules database connect to the Hadoop
ecosystem through Docker Host 1 (Data Exchange/Gateway Layer), which runs the Kafka server with
Broker.jar, the Job-Submit-Queue and Job-Status-Queue, Kafka bootstrap, SparkLauncher, the Sqoop
server, a Flume HBase sink, and the Spark server with ml-jobs.jar (SparkListener, ML logic classes,
job status updater). Docker Hosts 2..N (Data Store/Processing Layer) run YARN (RM/NM), HDFS
(NN/DN), HBase (Master/Region) and ZooKeeper, with job status coming from YARN; both layers
scale horizontally.]
Broker.jar:
This jar runs as a service to start Kafka consumers [Java programs] and is deployed in the Kafka server
location [Docker Host 1]. Note: the Kafka server is already started.
One of the Kafka consumers inside this jar, "SparkJobExecutor.java", listens to the queue "Job-
Submit-Queue".
Whenever a job is submitted through the Kafka producer available in the real-time web application,
the above consumer reads the message and launches the Spark client based on the incoming parameters.
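As a rough, self-contained illustration of this broker pattern (not the actual Broker.jar code), the sketch below uses a java.util.concurrent.BlockingQueue as a stand-in for the Kafka "Job-Submit-Queue"; the pipe-delimited message format and the class names are assumptions:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the broker pattern: a consumer takes job-submission messages off a
// queue and turns them into Spark launch parameters. A BlockingQueue stands in
// for Kafka's "Job-Submit-Queue"; the message format "name|FQCN|mode" is assumed.
public class BrokerSketch {
    static final BlockingQueue<String> jobSubmitQueue = new LinkedBlockingQueue<>();

    // Build the launch invocation from an incoming message.
    static String launchSparkJob(String message) {
        String[] parts = message.split("\\|");
        // In the real broker, Spark's SparkLauncher API would be invoked here
        // with these parameters instead of composing a command string.
        return "spark-submit --class " + parts[1]
                + " --deploy-mode " + parts[2] + " ml-jobs.jar";
    }

    public static void main(String[] args) throws InterruptedException {
        // The web application's Kafka producer would publish this message.
        jobSubmitQueue.put("churn-model|com.example.ml.ChurnJob|cluster");
        // The consumer side: take the message and launch the Spark client.
        System.out.println(launchSparkJob(jobSubmitQueue.take()));
    }
}
```

The key design point is that the web application never talks to Spark directly; every job type goes through the same queue and message contract, which is what the document calls the generic adapter design pattern.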
ml-jobs.jar:
This jar is deployed in the Spark server location [Docker Host 1].
It contains a list of machine learning implementations that can run on the Hadoop distributed
environment.
It is mandatory that all machine learning implementations implement "SparkListener" in
order to publish the job status to "Job-Status-Queue", so the real-time web application can
consume job messages from this queue and update the OLTP database.
Hadoop Job Configurations through Real-Time Web Application:
Introduce a "Manage Big Data" link in the UI main menu.
The "Manage Big Data" page consists of a "Manage Jobs" tab.
Manage Jobs UI Layout:
Contains a "Job Search" widget and a "Job Lists" data table, where the data table shows the list of
jobs configured.
"Job Lists" contains "Add", "Delete", "Edit", "Execute" and "View Job Status" buttons to perform the
appropriate actions on configured jobs.
The "Add" or "Edit" job dialog screen provides job name, description, job type, FQCN and execution
mode configurations.
Future Enhancements:
o Enable rule-based routing information for jobs,
o enable rule-based alerts or events for jobs,
o more UI options on Camel route configurations.
Add/Edit Job UI Layout:
Clicking "Add/Save" should save the above configuration in the real-time web application OLTP
database.
The user can see the saved data in the table view.
Execute Button Action:
View Job Status Button Action:
The user can view executed job status and log information from the clusters.
This information is retrieved from the job status table.
There is a Kafka consumer running inside the real-time web application which listens to the "Job-
Status-Queue" from the big data cluster periodically and updates the job status in the OLTP database.
Sample real-time web application job details table structures:
Job table:
Job status table:
Monitor Job Status from Real-Time Web Application
Current Implementation [Spark Listener Instrumentation]:
The executable Spark job should register a Spark Listener.
Spark Listener event handler(s) will be triggered for each job event.
The event handler should publish the job status message to "Job-Status-Queue".
The user can view the job status in the "View Job Status" tab in the real-time web application UI from
the OLTP database.
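A minimal sketch of such a listener, assuming Spark is on the classpath; the publish() helper, which would hand the status to a Kafka producer for "Job-Status-Queue", is hypothetical:

```java
import org.apache.spark.scheduler.SparkListener;
import org.apache.spark.scheduler.SparkListenerJobEnd;
import org.apache.spark.scheduler.SparkListenerJobStart;

// Sketch of the instrumentation: each Spark job event is turned into a status
// message for "Job-Status-Queue". The publish() body is a hypothetical stand-in
// for a Kafka producer send.
public class JobStatusListener extends SparkListener {
    @Override
    public void onJobStart(SparkListenerJobStart jobStart) {
        publish("job " + jobStart.jobId() + " STARTED");
    }

    @Override
    public void onJobEnd(SparkListenerJobEnd jobEnd) {
        publish("job " + jobEnd.jobId() + " " + jobEnd.jobResult());
    }

    private void publish(String status) {
        // Hypothetical: a KafkaProducer would send `status` to "Job-Status-Queue".
        System.out.println(status);
    }
}
// Registration inside the ml-jobs.jar driver (sketch):
//   sparkContext.addSparkListener(new JobStatusListener());
```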
Design option 2 [Polling HDFS Consumer]:
The executable Spark job should write job status to HDFS.
A Camel HDFS polling consumer in the real-time web application keeps checking the above location;
once it finds a job status file, it reads the content from the file and writes it to the database.
The user can view the job status in the "Job Status" tab in the real-time web application UI from the database.
Design option 3 [UI Uses HBase for Job status]:
The executable Spark job should write job status to HBase.
The user can view the job status in the "Job Status" tab in the real-time web application UI from HBase.
Design option 4 [UI Uses Spark History Server REST API]:
Enable the Spark history server.
The user can view the job status in the "Job Status" tab in the real-time web application UI by using
the Spark history server REST API.
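The history server exposes a monitoring REST API under /api/v1 that the web application could call; for example (port 18080 is the Spark default, the host name and application id are illustrative assumptions):

```shell
# List applications known to the Spark history server (default port 18080).
curl http://spark-server:18080/api/v1/applications

# Job-level status for one application (the application id is illustrative).
curl http://spark-server:18080/api/v1/applications/app-20160101000000-0001/jobs
```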