The document discusses new technologies that are emerging beyond Hadoop to process big data more effectively. It summarizes Google's Percolator, Dremel, and Pregel systems, which provide incremental processing, ad-hoc querying, and graph processing respectively. These new approaches help address some of Hadoop's limitations and represent the next generation of big data technologies on the horizon.
This document provides a summary of Mike Browning's experience, education, and objectives. It outlines his previous roles in sales and account management from 1984 to 2009 at companies like Dialogic Communications, Dell Computer Corp, Southeast Appliance Distributing, Tennessee Mat Company, and L'Aire Liquide. It also lists his education as a Bachelor of Science degree in Marketing from Brigham Young University. His stated objective presents him as an enterprising and adaptive team player with 27 years of sales experience.
This document discusses new technologies that are emerging beyond Hadoop and the original "Google canon" to process big data more efficiently. It introduces Percolator for incremental processing, Dremel for ad-hoc queries on nested data, and Pregel for large-scale graph processing. These new systems provide benefits like lower latency, easier analysis of diverse data types, and scalability to analyze graphs with billions of vertices.
The scheduled Redis speaker was sick, so I whipped this up in about an hour and filled in on a different subject. It's a bit crude, but you get a big-picture view of how to build a simple AI application using BigCouch. The accompanying video is up at http://www.youtube.com/watch?v=QEBDNxbSRuk
The document describes revisions made to the table of contents page for a music magazine called "Flava." Key details include:
- Placing the magazine cover image and title "Flava" on the left third of the page.
- Adding editorial pillars ("News, Features, Music") and 2-4 cover lines below each pillar.
- Including an editorial letter and image of featured artist "Pretty In Pink" on the right third.
- Revising layout and formatting elements like fonts, sizes, alignments to improve professional appearance and readability.
Study: The Future of VR, AR and Self-Driving Cars LinkedIn
We asked LinkedIn members worldwide about their levels of interest in the latest wave of technology: whether they’re using wearables, and whether they intend to buy self-driving cars and VR headsets as they become available. We asked them too about their attitudes to technology and to the growing role of Artificial Intelligence (AI) in the devices that they use. The answers were fascinating – and in many cases, surprising.
This SlideShare explores the full results of this study, including detailed market-by-market breakdowns of intention levels for each technology – and how attitudes change with age, location and seniority level. If you’re marketing a tech brand – or planning to use VR and wearables to reach a professional audience – then these are insights you won’t want to miss.
The future of cloud computing - Jisc Digifest 2016 Jisc
In Jisc's future of cloud computing horizon scan report, we identified three strategic areas where Jisc could support universities and colleges in moving to the cloud – cloud as a utility, app as a service, and working to build capability in cloud technologies.
Come along to this session to hear more about this work from Jisc futurist Martin Hamilton, and find out how you can get involved.
This document provides an overview and introduction to CouchDB, a document-oriented NoSQL database. It discusses CouchDB's core features like its schema-free document model, RESTful API, views and indexing with MapReduce, replication, and how it can be scaled horizontally. The document also provides examples of real-world use cases for CouchDB and takeaways for learning more about CouchDB.
Reliability & Scale in AWS while letting you sleep through the night Jos Boumans
More and more startups/companies are deploying their infrastructure directly and exclusively in EC2 or a similar cloud provider. With that comes a whole new set of challenges and paradigms around scalability, reliability and availability.
This talk will focus on how to leverage all the infrastructure parts of AWS, augment them with great (affordable) third party services and solid Open Source Software to create an operations environment that will scale with you, be as reliable as it can be, providing you and your peers with all the data you need to make good decisions to support (rapid) changes while letting you sleep through the night. And all that using a tiny operations team.
It may make you coffee in the morning too.
The document proposes atomizing student data and applications into Docker containers to create a hybrid cloud campus environment. Key points include:
- Using multiple containers per student instead of individual VMs for better resource utilization and mobility across campus infrastructure.
- Student web applications and data can remain unchanged and be accessed via container IDs instead of direct hardware.
- Docker provides benefits like layered builds to simplify deployments across heterogeneous hardware like PCs and Raspberry Pis.
- Performance tests show Docker containers have comparable or better performance than VMs, though Raspberry Pis have limitations due to their hardware.
Building Cloud-Native Applications in MiCADO - MiCADO webinar No.2/4 - 09/2019 Project COLA
2/4 Webinar: How to Automate Deployment and Orchestration of Application (MiCADO introduction)
This part of the webinar provides information on how to develop cloud-native applications in MiCADO. It was presented by Jay DesLauriers (University of Westminster). The webinar took place on the 26th of September 2019. If you would like to have more information visit: https://micado-scale.eu
MiCADO is an open-source, highly customisable multi-cloud orchestration and auto-scaling framework for Docker containers, orchestrated by Kubernetes.
Developed by Project COLA funded by the European Commission (grant agreement no: 731574). https://project-cola.eu
This document provides an overview and agenda for a Docker and cloud native training. It introduces Brian Christner as the trainer and his background. It then covers various cloud native topics that will be discussed including containers, microservices, DevOps, and orchestration. The remainder of the document demonstrates Docker concepts hands-on and discusses container architecture, portability, and monitoring. It also briefly explores future directions like serverless and concludes by providing additional Docker resources.
This document provides a summary of a presentation about Full Stack Reactivity using the Meteor framework. It includes a definition of full-stack reactivity as allowing every level of a web application's stack to respond in real-time to changes. The presentation demonstrates a sample Meteor application, discusses key Meteor concepts like publications and subscriptions, and argues that Meteor's approach could help transform how Plone applications are developed. The goal is to integrate Meteor's Distributed Data Protocol into Plone to provide real-time reactivity across the stack using ZODB events.
Containerizing couchbase with microservice architecture on mesosphere.pptx Ravi Yadav
Ravi Yadav, Mesosphere
Anil Kumar, Couchbase
Organizations focused on delivering exceptional customer experiences are building applications using microservice architectures because of the flexibility, speed of delivery, and maintainability that they provide. In this session, you will learn how Couchbase can fit into a microservice architecture using containers and orchestration. We will explore how Couchbase and Mesosphere work together to simplify application development and delivery. Additionally, you will see a demonstration of exactly how to create a Couchbase cluster on Mesosphere DC/OS Enterprise.
Mike Miller is the Co-Founder and Chief Scientist of Cloudant, a company that provides a globally distributed data layer for web applications. He has a background in machine learning, analysis, big data, and distributed systems. Cloudant was founded in 2009 by MIT data scientists and provides a hyper-scalable document database and analytics platform that runs across multiple data centers.
This presentation was given at the Fachhochschule Bern. The course was part of the Master program and covered the topics of Cloud Native and Docker.
OpenEBS; asymmetrical block layer in user-space breaking the million IOPS bar... MayaData
Presented at FOSDEM 2019
K8s as a universal control plane to deploy containerised applications • Public cloud is moving on premises (GKE, Outpost) • K8s capable of doing more than containers due to controllers (VMs)
The document discusses several advanced data and computing architectures, including data lake architectures, lambda architectures, serverless architectures, and self-organizing architectures. It also covers specific implementations like COAST, Zeta, Unikernels, ClickOS, and MirageOS. Deep learning architectures and efforts to use deep learning to generate code are briefly mentioned at the end.
This presentation covers the Cloud Native standards for building applications and emphasizes how important Docker and containers are when building Cloud Native applications and architecture. Next, we cover how to use Docker to build a serverless infrastructure.
This document discusses integrating the BlobSeer data management system with the HGMDS distributed metadata management system to build a global file system deployed across multiple sites. Two approaches are considered: having multiple BlobSeer instances, one per site, or having a single BlobSeer-WAN instance spanning sites. The authors propose the latter approach with distributed version managers and provider managers to leverage data locality. Preliminary evaluations show BlobSeer-WAN performance on two sites is similar to vanilla BlobSeer on a single site. Next steps include further evaluation, integrating BlobSeer-WAN with HGMDS, and submitting a co-authored paper.
CodeFutures - Scaling Your Database in the Cloud RightScale
RightScale Conference Santa Clara 2011: Scaling an application in the cloud often hits the most common bottleneck – the database tier. Not only is database performance the number one cause of poor application performance, but the issue is also magnified in cloud environments, where I/O and bandwidth are generally slower and less predictable than in dedicated data centers. Database sharding is a highly effective method of removing the database scalability barrier, operating on top of proven RDBMS products such as MySQL and Postgres – as well as the new NoSQL database platforms. One critical aspect often given too little consideration is monitoring and continuous operation of your databases, including the full lifecycle, to ensure that they stay up.
SQL? NoSQL? NewSQL?!? What’s a Java developer to do? - JDC2012 Cairo, Egypt Chris Richardson
This document discusses different database options for developers including SQL, NoSQL, and NewSQL databases. It provides an overview of why developers may choose NoSQL databases like MongoDB or Cassandra over traditional SQL databases, as well as when a NewSQL database could be a better option. The document uses a fictional example of a food delivery application to illustrate how different database choices would work for persisting and querying application data.
This document discusses various research topics related to proactive networking and edge computing. It begins with an outline of topics including edge caching, mobile edge computing (MEC), 5G vehicle-to-everything (V2X), virtual reality (VR), unmanned aerial vehicles (UAVs), and ultra-reliable low-latency communications (URLLC). It then discusses the need to move from reactive to proactive networking approaches to meet new requirements from applications like VR and industry 4.0. Key challenges discussed include time-varying content popularity, hierarchical caching, fog/edge computing with mobility, and ultra-reliable low-latency networking.
This document provides a high-level overview of a cloud architecture design. It discusses considerations for the design including service assurance, high availability, secure tenant segregation, and data center scalability. It then describes the proposed design which includes pods, availability zones, and regions to provide modular scalability, redundancy, and tenant isolation. Management servers and databases are separated for control and data planes.
This document summarizes the history and development of the Gluster distributed storage system. It describes how Gluster began in Bangalore and has grown to support large scale deployments. Key developments included creating a unified global namespace, adding features through a stackable architecture, and developing templates to simplify common use cases. Gluster now provides a simple, scalable, and low-cost storage solution that is well suited for the growing data demands of cloud computing through its scale-out architecture and open source model.
The document provides an overview of 14 topics related to Oracle Autonomous Database. It begins with how to get started with the Autonomous Database free tier and Oracle Machine Learning. It then discusses cross region data guard, exporting data as JSON to object storage, wallet rotation, partitions with external tables in cloud, set patch level when cloning, performance monitoring, data safe audit retention time increase, change concurrency limits via console, SQL monitor report, ASH analytics in performance hub, workload metrics on performance hub, and customer managed keys.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
UiPath Test Automation using UiPath Test Suite series, part 6 DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
The UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Communications Mining Series - Zero to Hero - Session 1 DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024 Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Generative AI Deep Dive: Advancing from Proof of Concept to Production Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring & observability to the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
Removing Uninteresting Bytes in Software Fuzzing Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs - Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Enhancing adoption of Open Source Libraries. A case study on Albumentations.AI - Vladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
GridMate - End to end testing is a critical piece to ensure quality and avoid... - ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Full-RAG: A modern architecture for hyper-personalization - Zilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
2. INTRODUCTIONS
• Mike Miller
• not an Apache CouchDB committer
• Particle Physics faculty at U. of Washington
• Cloudant cofounder, Chief Scientist
• Background: machine learning, analysis, big data, globally distributed systems
STS 3/10/2010 2 Mike Miller UW/Cloudant
3. NEW PROBLEMS => NEW SOLUTIONS
(http://www.economist.com/specialreports/PrinterFriendly.cfm?story_id=15557443)
Big Data Dynamic, Realtime
4. NEW PROBLEMS => NEW SOLUTIONS
“old” “new”
Scale this horizontally/linearly
5. CAP THEOREM
[Diagram: consistency, availability, partition tolerance; labels: read record, write record]
masterless, scalable
Brewer: Pick two (at a time) (e.g., Cloudant’s Couch)
6. NoSQL: Not Only SQL
• Name blame: Eric Evans from Rackspace
• Pragmatic response to new class of problems poorly fit to RDBMS:
• heterogeneous/dynamic data, geographic/network separation, scaling/concurrency
• Not an abandonment of 40+ years of DB research
• From home-brew => general solutions
• Best collection of online talks: https://nosqleast.com/2009/#agenda
7. TAXONOMY: 1
E. Eifrem, no:sql(east)
8. TAXONOMY: 2
E. Eifrem, no:sql(east)
9. EMIL’S CLASSIFICATION
but neglects: concurrency, performance, reliability...
11. CouchDB
• easy, flexible, robust, concurrent, replicates
• Document: Schema-free database server
• RESTful JSON API
• views: MapReduce system for generating custom
• replication: Bi-directional incremental
• couchapp: lightweight HTML+JavaScript apps served directly from CouchDB using views to transform JSON
12. OF THE WEB
Django may be built for the Web, but
CouchDB is built of the Web. I've never
seen software that so completely embraces
the philosophies behind HTTP ... this is
what the software of the future looks like.
Jacob Kaplan-Moss
October 17 2007
http://jacobian.org/writing/of-the-web/
13. SHOULD YOU CARE?
The internet happened, and we ignored it.
In retrospect that was a mistake.
Paraphrased from Bill Warner (Founder Avid, Wildfire)
YCombinator Summer, 2008
CouchDB: An enabling technology
14. FROM INTEREST TO ADOPTION
• 100+ production apps
• 3 books being written
• Vibrant, open community
• Active commercial development
• Rapidly maturing
15. LET’S TAKE A LOOK
16. DOCUMENTS
• Documents are JSON Objects
• Underscore-prefixed fields are reserved
• Documents can have binary attachments
• MVCC _rev deterministically generated from doc content
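The points above can be illustrated with a small Python sketch. The field values and the helper names `reserved_fields` and `sketch_rev` are hypothetical, and the hashing scheme is only a stand-in: CouchDB computes `_rev` internally, and this code does not reproduce its algorithm. It only shows the idea that the same content always yields the same token.

```python
import hashlib
import json


def reserved_fields(doc):
    """Return the underscore-prefixed (reserved) field names of a document."""
    return sorted(name for name in doc if name.startswith("_"))


def sketch_rev(doc, generation=1):
    """Illustrative stand-in for a content-derived revision token.

    Not CouchDB's real algorithm: it hashes a canonical JSON encoding of
    the document (minus any existing _rev) so equal content gives equal tokens.
    """
    body = {name: value for name, value in doc.items() if name != "_rev"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
    return f"{generation}-{digest}"


# A document is just a JSON object; here it is modelled as a Python dict.
doc = {
    "_id": "mydocid",
    "type": "talk",
    "speaker": "Mike Miller",
    "tags": ["couchdb", "nosql"],
}
doc["_rev"] = sketch_rev(doc)
```

Two replicas that save identical content would compute the same token under this scheme, which is the property the conflict-resolution slides rely on.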
17. REST API
• Create
PUT /mydb/mydocid
• Retrieve
GET /mydb/mydocid (ETag header: plug-n-play cache)
• Update
PUT /mydb/mydocid
• Delete
DELETE /mydb/mydocid
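A minimal sketch of the four calls above, using only Python's standard library. The server address, document bodies, and revision strings are assumptions for illustration, and `build_request` is a hypothetical helper. The slide does not show it, but updates and deletes also carry the current revision (`_rev` in the body, or a `rev` query parameter for DELETE). The requests are only constructed here, not sent.

```python
import json
import urllib.request

BASE = "http://localhost:5984"  # assumed address of a local CouchDB server


def build_request(method, db, doc_id, body=None, rev=None):
    """Build (but do not send) an HTTP request for a single document."""
    url = f"{BASE}/{db}/{doc_id}"
    if method == "DELETE" and rev is not None:
        url += f"?rev={rev}"  # deletes name the revision being removed
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Revision strings below are made-up placeholders.
create = build_request("PUT", "mydb", "mydocid", {"title": "hello"})
retrieve = build_request("GET", "mydb", "mydocid")
update = build_request("PUT", "mydb", "mydocid", {"title": "hello again", "_rev": "1-abc"})
delete = build_request("DELETE", "mydb", "mydocid", rev="2-def")
```

Against a running server, each request could be sent with `urllib.request.urlopen(create)` and so on.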
19. ROBUST
• Never overwrite previously committed data
• Documents stored in append-only B+trees, ‘copy-on-write’
• In the event of a server crash or power failure, just restart
CouchDB -- there is no “repair”
• Take snapshots with “cp”
• Configurable levels of durability: can choose to fsync after
every update, or less often to gain better throughput
20. CONCURRENT
• Erlang approach: lightweight processes to model the natural
concurrency in a problem
• For CouchDB that means one process per TCP connection
• Lock-free architecture; each process works with an MVCC
snapshot of a DB.
• Performance degrades gracefully under heavy concurrent load
21. CONCURRENCY IN ACTION: BBC
• WYSIWYG: graceful at scale: load, volume, concurrency
22. VIEWS: B.Y.O. INDEX
• Custom, persistent representations of document data
• “Close to the metal” -- no dynamic queries in production, so
you know exactly what you’re getting
• Generated using MapReduce functions written in JavaScript
(and other languages)
• Each view must have a map function and may also have a
reduce function
• Leverages view collation, rich view query API
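To make the map/reduce idea concrete, here is a rough Python emulation. Real CouchDB views are typically written in JavaScript and run by a view server; the documents, `map_fn`, `reduce_fn`, `build_view`, and `query_grouped` below are all hypothetical, and plain Python sorting stands in for CouchDB's view collation.

```python
from collections import defaultdict

# Made-up example documents.
docs = [
    {"_id": "a", "type": "talk", "speaker": "Mike", "minutes": 30},
    {"_id": "b", "type": "talk", "speaker": "Ada", "minutes": 45},
    {"_id": "c", "type": "talk", "speaker": "Mike", "minutes": 15},
    {"_id": "d", "type": "note", "text": "not a talk"},
]


def map_fn(doc):
    """Yield (key, value) rows for one document, like emit(key, value)."""
    if doc.get("type") == "talk":
        yield doc["speaker"], doc["minutes"]


def reduce_fn(values):
    """Combine the values that share a key."""
    return sum(values)


def build_view(all_docs):
    """Run the map function over every document and keep rows sorted by key."""
    return sorted(
        (key, value, doc["_id"]) for doc in all_docs for key, value in map_fn(doc)
    )


def query_grouped(rows):
    """Apply the reduce function per key, similar to a grouped view query."""
    grouped = defaultdict(list)
    for key, value, _doc_id in rows:
        grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


rows = build_view(docs)
totals = query_grouped(rows)
```

Because the rows are stored sorted by key, range queries over keys become cheap lookups rather than scans, which is what "bring your own index" amounts to.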
23. VIEWS: INCREMENTAL
• Computing a view can be expensive, so CouchDB saves the
result in a B-tree and keeps it up-to-date
• Leaf nodes store map results, inner nodes store reductions of
children
http://horicky.blogspot.com/2008/10/couchdb-implementation.html
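The caching idea can be sketched with a toy two-level tree in Python. This is not CouchDB's B-tree (no keys, node splitting, or append-only storage); `Leaf`, `Inner`, and `update_leaf` are hypothetical names, and the reduce function is assumed to be a sum. It only shows why a change to one leaf does not require re-reducing the others.

```python
class Leaf:
    """Holds map results and a cached reduction of them."""

    def __init__(self, values):
        self.values = list(values)
        self.reduction = sum(self.values)


class Inner:
    """Holds children and a cached reduction of the children's reductions."""

    def __init__(self, children):
        self.children = list(children)
        self.reduction = sum(child.reduction for child in self.children)


def update_leaf(root, index, new_values):
    """Replace one leaf's map results and refresh only the affected reductions."""
    leaf = root.children[index]
    leaf.values = list(new_values)
    leaf.reduction = sum(leaf.values)
    # The other leaves are not re-reduced; their cached reductions are reused.
    root.reduction = sum(child.reduction for child in root.children)
    return root.reduction


root = Inner([Leaf([1, 2, 3]), Leaf([10, 20]), Leaf([100])])
before = root.reduction
after = update_leaf(root, 1, [10, 20, 5])
```

In a deeper tree the same refresh would walk only the path from the changed leaf to the root.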
24. VIEWS: EXAMPLE
key value
28. REPLICATION
• Peer-based, bi-directional replication using normal HTTP calls
• Mediated by a replicator process which can live on the
source, target, or somewhere else entirely
• Replicate a subset of documents in a DB meeting criteria
defined in a custom filter function (coming soon)
• Applications (_design documents) replicate along with the
data
• Ideal for offline applications -- “ground computing”
29. REPLICATION
Participants: Source, Replicator, Target
• Replicator asks Source “What updates do you have since X?”: GET /db/_changes?since=X
• Replicator asks Target “Which of these are you missing?”: POST /db/_missing_revs
• Target needs these id@rev documents: [GET /db/doc?open_revs=...]
• Replicator tells Target “Save these documents”: POST /db/_bulk_docs
• Record the latest source update seen by the target
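The exchange above can be sketched as a toy in-memory simulation in Python. The `Database` class and `replicate` function are hypothetical stand-ins: each method is named after the HTTP endpoint it imitates, but nothing here talks to a real CouchDB, and revision strings are made up.

```python
class Database:
    """Toy in-memory stand-in for one database (not a real CouchDB client)."""

    def __init__(self):
        self.docs = {}     # (doc_id, rev) -> document body
        self.changes = []  # ordered (seq, doc_id, rev) entries

    def save(self, doc_id, rev, body):
        if (doc_id, rev) not in self.docs:
            self.docs[(doc_id, rev)] = body
            self.changes.append((len(self.changes) + 1, doc_id, rev))

    def changes_since(self, seq):
        """Imitates GET /db/_changes?since=X."""
        return [entry for entry in self.changes if entry[0] > seq]

    def missing_revs(self, id_revs):
        """Imitates POST /db/_missing_revs."""
        return [id_rev for id_rev in id_revs if id_rev not in self.docs]

    def open_revs(self, doc_id, rev):
        """Imitates GET /db/doc?open_revs=..."""
        return self.docs[(doc_id, rev)]

    def bulk_docs(self, docs):
        """Imitates POST /db/_bulk_docs."""
        for doc_id, rev, body in docs:
            self.save(doc_id, rev, body)


def replicate(source, target, since=0):
    """One replication pass; returns the checkpoint to use next time."""
    changes = source.changes_since(since)
    if not changes:
        return since
    id_revs = [(doc_id, rev) for _seq, doc_id, rev in changes]
    missing = target.missing_revs(id_revs)
    docs = [(doc_id, rev, source.open_revs(doc_id, rev)) for doc_id, rev in missing]
    target.bulk_docs(docs)
    return changes[-1][0]  # latest source update seen by the target


source, target = Database(), Database()
source.save("foo", "1-a", {"n": 1})
source.save("bar", "1-b", {"n": 2})
checkpoint = replicate(source, target)
source.save("foo", "2-c", {"n": 3})
checkpoint2 = replicate(source, target, since=checkpoint)
```

Because the pass restarts from the recorded checkpoint, the second call only considers the one change made after the first pass.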
33. CONFLICT RESOLUTION
• Replication can introduce conflicts in a multi-master setup
• CouchDB deterministically chooses a winner; the loser is
saved as a conflict revision
• Conflict revisions are replicated; if you replicate A→B and
B→A, both will agree on the winning and losing revisions
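A small Python sketch of what "deterministically chooses a winner" can look like. The slide does not state the rule, so the ordering used here is an assumption: the revision with the higher generation number wins, and ties are broken by comparing the rest of the revision string. Deleted revisions are ignored, and `rev_key` and `pick_winner` are hypothetical helpers.

```python
def rev_key(rev):
    """Split 'N-hash' into a sortable (generation, hash) pair."""
    generation, _, digest = rev.partition("-")
    return int(generation), digest


def pick_winner(revs):
    """Return (winner, losers) using an order that does not depend on input order."""
    winner = max(revs, key=rev_key)
    losers = sorted((rev for rev in revs if rev != winner), key=rev_key, reverse=True)
    return winner, losers


# Two replicas saw the same conflicting revisions in different orders.
on_a = pick_winner(["2-abc", "2-abd"])
on_b = pick_winner(["2-abd", "2-abc"])
```

Since the choice depends only on the revisions themselves, every replica that holds the same set of revisions picks the same winner without any coordination.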
34. CONFLICT RESOLUTION
Case 1: save the same update to multiple replication partners
PUT /a/foo PUT /b/foo
replicate
No Conflict
35. CONFLICT RESOLUTION
Case 2: save different updates to the same document
PUT /a/foo PUT /b/foo
replicate
Conflict
36. CONFLICT RESOLUTION
replicate
No Conflict
37. CONFLICT RESOLUTION
save different updates to the same document to replication
partners
38. REPLICATION IN ACTION: UBUNTU 9.10
“When Ubuntu 9.10 is released on October 29th every single Ubuntu user will
have an address book stored in CouchDB that replicates with one.ubuntu.com,
and Tomboy notes that are replicated via a web API at the application but then
stored in CouchDB and carried along in the CouchDB replication that we have
set up. Optionally they can also store all their Firefox bookmarks in CouchDB
and have those replicated as well. We'll be doing our best to help teach
application developers to use CouchDB in order to ‘cloud-enable’ their apps. A
couch on every desktop!”
Elliot Murphy, 10/13/2009
replication is a potential game-changer
39. STUFF I DIDN’T TALK ABOUT
• Server-side processing: validate, upsert, show, list
• Authentication using HTTP Basic, Cookie, OAuth
• couchapp: serve lightweight HTML+JavaScript web apps directly out of
CouchDB using _show and _list functions to transform JSON [1]
• _external indexers, including full-text search powered by CouchDB-Lucene [2]
• The many excellent client libraries, view servers, etc. contributed by the
community [3]
• SQLite wrapper for CouchDB [4], and much more...
[1]: http://github.com/couchapp/couchapp
[2]: http://github.com/rnewson/couchdb-lucene
[3]: http://wiki.apache.org/couchdb/Related_Projects
[4]: http://apsw.googlecode.com/svn/publish/couchdb.html
40. TAKE IT FOR A SPIN
hosted free:
cloudant.com
easy offline:
CouchDBX