Build 2017 - P4010 - A lap around Azure HDInsight and Cosmos DB Open Source A...Windows Developer
Recently, we released the Spark Connector for our distributed NoSQL service – Azure Cosmos DB (formerly known as Azure DocumentDB). By connecting Apache Spark running on top Azure HDInsight to Azure Cosmos DB, you can accelerate your ability to solve fast-moving data science problems and machine learning. The Spark to Azure Cosmos DB connector efficiently exploits the native Cosmos DB managed indexes and enables updateable columns when performing analytics, push-down predicate filtering against fast-changing globally-distributed data, ranging from IoT, data science, and analytics scenarios. Come learn how you can perform blazing fast planet-scale data processing with Azure Cosmos DB and HDInsight.
Move your on prem data to a lake in a Lake in CloudCAMMS
With the boom in data; the volume and its complexity, the trend is to move data to the cloud. Where and How do we do this? Azure gives you the answer. In this session, I will give you an introduction to Azure Data Lake and Azure Data Factory, and why they are good for the type of problem we are talking about. You will learn how large datasets can be stored on the cloud, and how you could transport your data to this store. The session will briefly cover Azure Data Lake as the modern warehouse for data on the cloud,
SQL-based databases have been around for decades and they power a wide range of applications. So what exactly do NoSQL databases bring to the table? In this webcast, you'll find out how NoSQL can liberate your development cycle, allow your application to scale and improve your system's uptime.
In the world of NoSQL, each database has its own strengths and weaknesses. Understanding which open source database is "the right tool for the job" is half the battle if you want to start building better applications quickly. IBM developer advocate Glynn Bird explores practical examples of how two popular NoSQL databases - the Cloudant JSON document store and the Redis in-memory key-value store - can be used together to create performant and scalable Web applications. It also includes real world use cases you can try today, for free, using the IBM Cloud Data Services suite of fully managed NoSQL databases-as-a-service.
Build 2017 - P4010 - A lap around Azure HDInsight and Cosmos DB Open Source A...Windows Developer
Recently, we released the Spark Connector for our distributed NoSQL service – Azure Cosmos DB (formerly known as Azure DocumentDB). By connecting Apache Spark running on top Azure HDInsight to Azure Cosmos DB, you can accelerate your ability to solve fast-moving data science problems and machine learning. The Spark to Azure Cosmos DB connector efficiently exploits the native Cosmos DB managed indexes and enables updateable columns when performing analytics, push-down predicate filtering against fast-changing globally-distributed data, ranging from IoT, data science, and analytics scenarios. Come learn how you can perform blazing fast planet-scale data processing with Azure Cosmos DB and HDInsight.
Move your on prem data to a lake in a Lake in CloudCAMMS
With the boom in data; the volume and its complexity, the trend is to move data to the cloud. Where and How do we do this? Azure gives you the answer. In this session, I will give you an introduction to Azure Data Lake and Azure Data Factory, and why they are good for the type of problem we are talking about. You will learn how large datasets can be stored on the cloud, and how you could transport your data to this store. The session will briefly cover Azure Data Lake as the modern warehouse for data on the cloud,
SQL-based databases have been around for decades and they power a wide range of applications. So what exactly do NoSQL databases bring to the table? In this webcast, you'll find out how NoSQL can liberate your development cycle, allow your application to scale and improve your system's uptime.
In the world of NoSQL, each database has its own strengths and weaknesses. Understanding which open source database is "the right tool for the job" is half the battle if you want to start building better applications quickly. IBM developer advocate Glynn Bird explores practical examples of how two popular NoSQL databases - the Cloudant JSON document store and the Redis in-memory key-value store - can be used together to create performant and scalable Web applications. It also includes real world use cases you can try today, for free, using the IBM Cloud Data Services suite of fully managed NoSQL databases-as-a-service.
The slides will walk thru the requirements and will enable to pick appropriate No SQL database as a best match for the requirements i.e. fitness for the purpose
Find out how NoSQL can help your application with practical examples and use-cases from our Cloud Data Services Developer Advocate Glynn Bird. This webinar won't dwell on the science behind the database, but will walk you through real-life use-cases for NoSQL technologies that you can start using today.
Webinar: https://youtu.be/M_Jqw
Azure Cosmos DB is Microsoft’s globally-distributed database service "for managing data at planet-scale" launched in May 2017. It builds upon and extends the earlier Azure DocumentDB, which was released in 2014. It is schema-less and generally classified as a NoSQL database. Please refer git@github.com:NexThoughts/Cosmos-DB.git
SQL, NoSQL, Distributed SQL: Choose your DataStore carefullyMd Kamaruzzaman
In modern Software Development and Software Architecture, selecting the right DataStore is one of the most challenging and important task. In this presentation, I have summarized the major DataStores and the decision criteria to select the right DataStore according to the use case.
Scylla Summit 2018: Scaling your time series data with NewtsScyllaDB
Today's datasets are growing at an exponential rate. Collection, storage, analysis, and reporting are becoming more challenging, and the results more valued. A decade ago, RRDTool's algorithms were well-suited to our requirements, but they fall short of scaling to current demands. A new direction is needed, one that prioritizes write-optimized storage, and that scales beyond a single host.
This presentation will provide an overview of Newts, a distributed time-series data store based on ScyllaDB, show how it compares to other solutions, and take a look at how it is integrated in OpenNMS.
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...ScyllaDB
AdTech requires high speed at massive scale. Sizmek serves millions of requests every second. Requests need to be processed in tens of milliseconds, while involving 10 simultaneous lookups into a database that contains tens of billions of profiles. In this presentation, you will discover how Scylla enables Sizmek’s real-time bidders to query a gigantic user profile store quickly and reliably with only a few nodes. We’ll discuss data modeling, server and driver configuration, techniques to minimize disk access, as well as considerations for leveraging Spark while migrating from HBase.
Why MongoDB over other Databases - HabilelabsHabilelabs
MongoDB is the faster-growing database. It is an open-source document and leading NoSQL database with the scalability and flexibility that you want with the querying and indexing that you need. In this Document, I presented why to choose MongoDB is over another database.
In this session, we will discuss DataGrid usage and integration patterns that go beyond the typical Cache#put and Cache#get. Attendees will share their own experiences and we will discuss:
• How DataGrid can be used with ESB to provide a reliable and fault tolerant SEDA implementation (based on Mule);
• How a DataGrid can be used to execute dynamic jobs ala MapReduce using scripting languages;
• How a DataGrid can be used to scale Lucene index storage, as well as use Compass to index the DataGrid.
Narasimhan Sampath and Avinash Ramineni share how Choice Hotels International used Spark Streaming, Kafka, Spark, and Spark SQL to create an advanced analytics platform that enables business users to be self-reliant by accessing the data they need from a variety of sources to generate customer insights and property dashboards and enable data-driven decisions with minimal IT engagement. Narasimhan and Avinash highlight the architecture, lessons learned, and the challenges that were overcome on both the business and technology fronts.
The analytics platform is designed as a framework to enable self-service data intake, data processing, and report/model generation by the business users. The data-driven framework consists of a distributed hybrid-cloud data ingestor for data intake and a Cloudera CDH cluster with Spark as the distributed compute engine. The solution is built in such a way that storage and compute have been decoupled and encourages the concept of BYOC (bring your own compute). The platform uses EC2 instances to run CDH and leverages Amazon S3 as a data warehouse storage layer (data lake), Spark as an ETL engine, and Spark SQL as a distributed query engine. Results (computations/derived tables) are exposed to the end users via Spark SQL and are discovered via Tableau. The platform supports both batch and streaming use cases and is built on the following technology stack: AWS (S3, EC2, SQS, SNS), Cloudera CDH (YARN, Navigator, Sentry), Spark, Kafka, Spark SQL, and Spark Streaming.
The slides will walk thru the requirements and will enable to pick appropriate No SQL database as a best match for the requirements i.e. fitness for the purpose
Find out how NoSQL can help your application with practical examples and use-cases from our Cloud Data Services Developer Advocate Glynn Bird. This webinar won't dwell on the science behind the database, but will walk you through real-life use-cases for NoSQL technologies that you can start using today.
Webinar: https://youtu.be/M_Jqw
Azure Cosmos DB is Microsoft’s globally-distributed database service "for managing data at planet-scale" launched in May 2017. It builds upon and extends the earlier Azure DocumentDB, which was released in 2014. It is schema-less and generally classified as a NoSQL database. Please refer git@github.com:NexThoughts/Cosmos-DB.git
SQL, NoSQL, Distributed SQL: Choose your DataStore carefullyMd Kamaruzzaman
In modern Software Development and Software Architecture, selecting the right DataStore is one of the most challenging and important task. In this presentation, I have summarized the major DataStores and the decision criteria to select the right DataStore according to the use case.
Scylla Summit 2018: Scaling your time series data with NewtsScyllaDB
Today's datasets are growing at an exponential rate. Collection, storage, analysis, and reporting are becoming more challenging, and the results more valued. A decade ago, RRDTool's algorithms were well-suited to our requirements, but they fall short of scaling to current demands. A new direction is needed, one that prioritizes write-optimized storage, and that scales beyond a single host.
This presentation will provide an overview of Newts, a distributed time-series data store based on ScyllaDB, show how it compares to other solutions, and take a look at how it is integrated in OpenNMS.
Scylla Summit 2018: Adventures in AdTech: Processing 50 Billion User Profiles...ScyllaDB
AdTech requires high speed at massive scale. Sizmek serves millions of requests every second. Requests need to be processed in tens of milliseconds, while involving 10 simultaneous lookups into a database that contains tens of billions of profiles. In this presentation, you will discover how Scylla enables Sizmek’s real-time bidders to query a gigantic user profile store quickly and reliably with only a few nodes. We’ll discuss data modeling, server and driver configuration, techniques to minimize disk access, as well as considerations for leveraging Spark while migrating from HBase.
Why MongoDB over other Databases - HabilelabsHabilelabs
MongoDB is the faster-growing database. It is an open-source document and leading NoSQL database with the scalability and flexibility that you want with the querying and indexing that you need. In this Document, I presented why to choose MongoDB is over another database.
In this session, we will discuss DataGrid usage and integration patterns that go beyond the typical Cache#put and Cache#get. Attendees will share their own experiences and we will discuss:
• How DataGrid can be used with ESB to provide a reliable and fault tolerant SEDA implementation (based on Mule);
• How a DataGrid can be used to execute dynamic jobs ala MapReduce using scripting languages;
• How a DataGrid can be used to scale Lucene index storage, as well as use Compass to index the DataGrid.
Narasimhan Sampath and Avinash Ramineni share how Choice Hotels International used Spark Streaming, Kafka, Spark, and Spark SQL to create an advanced analytics platform that enables business users to be self-reliant by accessing the data they need from a variety of sources to generate customer insights and property dashboards and enable data-driven decisions with minimal IT engagement. Narasimhan and Avinash highlight the architecture, lessons learned, and the challenges that were overcome on both the business and technology fronts.
The analytics platform is designed as a framework to enable self-service data intake, data processing, and report/model generation by the business users. The data-driven framework consists of a distributed hybrid-cloud data ingestor for data intake and a Cloudera CDH cluster with Spark as the distributed compute engine. The solution is built in such a way that storage and compute have been decoupled and encourages the concept of BYOC (bring your own compute). The platform uses EC2 instances to run CDH and leverages Amazon S3 as a data warehouse storage layer (data lake), Spark as an ETL engine, and Spark SQL as a distributed query engine. Results (computations/derived tables) are exposed to the end users via Spark SQL and are discovered via Tableau. The platform supports both batch and streaming use cases and is built on the following technology stack: AWS (S3, EC2, SQS, SNS), Cloudera CDH (YARN, Navigator, Sentry), Spark, Kafka, Spark SQL, and Spark Streaming.
Blueprints enable quick creation of governed subscriptions. This allows Cloud Architects to design environments that comply with organizational standards and best practices – enabling your app teams to get to production faster.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofsAlex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
GridMate - End to end testing is a critical piece to ensure quality and avoid...ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Building RAG with self-deployed Milvus vector database and Snowpark Container...Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
6. SQL
• RDBMS
• SCALE UP/Vertically
Scalable
• Structured or Schema
• Atomic Transaction
• Difficult for programmer
• Eg. CAR Disassembled in
and reassembled when
required.
NOSQL
• NonRelational
• Scale OUT/Horizontally
Scalable
• Semi or unstructured data
• EVENTUAL CONSISTENCY
• Easy for Programmers
• Eg. Park a car in parking