Hadoop World 2011: Lily: Smart Data at Scale, Made Easy, by Cloudera, Inc.
Lily is a repository made for the age of Data. It combines CDH, HBase and Solr in a powerful, high-level, developer-friendly backing store for content-centric applications with ambitions to scale. In this session, we highlight why we chose HBase as the foundation for Lily, and how Lily will allow users not only to store, index and search vast quantities of data, but also to track audience behaviour and generate recommendations, all in real time.
The webinar discusses Lily, a smart Big Data solution. It provides an overview of opportunities and challenges of Big Data, and how Lily addresses these through its flexible storage and scaling architecture. Use cases are presented for Lily in media, education and retail companies. An adoption program is described to help companies explore, adopt and deploy Lily.
KVIV / NoSQL : the new generation of database servers, by NGDATA
presentation for KVIV on 2010/06/03
As usual with my presentations: if you weren't there, you missed half of the fun, as some of the more important ideas are not in the slides.
This document provides an overview of NoSQL and Hadoop technologies. It discusses the trends driving these technologies like increasing data size, connectivity of data, semi-structured data, and decoupled service architectures. It introduces concepts from academic research like Amazon Dynamo, Google BigTable, and Brewer's CAP theorem. Specific technologies are explained like Hadoop for processing large datasets using MapReduce on the Hadoop Distributed File System.
GLORIAD's New Measurement and Monitoring System, by Ed Dodds
GLORIAD has developed a new system for measuring and monitoring global network infrastructure that focuses on individual customers rather than links. The new system collects and analyzes 200-400 million network records per day using open-source Argus software. It aims to (1) understand network utilization of individual customers, (2) identify poor application performance in near real-time, (3) mitigate poor performance by identifying fabric weaknesses, and (4) provide rich visualization tools. GLORIAD has transitioned from its previous netflow-based system to the new Argus-based system to realize this new focus on individual customers. The presentation provides details on GLORIAD's new measurement and monitoring approach and tools.
The document discusses the Spark ecosystem. It provides an overview of Spark, a cluster computing framework developed at UC Berkeley, including its core components like Resilient Distributed Datasets (RDDs) and projects like Shark. Spark aims to improve on Hadoop and MapReduce by allowing more interactive queries and streaming data analysis through its use of RDDs to cache data in memory across clusters.
The document discusses big data tools of the past, present, and future. It summarizes three types of big data processing tools and challenges for the future. First, it discusses Hadoop and MapReduce frameworks which were used for large-scale batch processing of static "data at rest." Second, it covers current in-memory tools like HBase that can process "data in motion" in real-time. Third, it mentions streaming data collection tools like Storm and Kafka. It concludes that future big data architectures will require hybrid approaches and addresses big data security as an important issue going forward.
N-O-SQL, new database technologies on the rise, by NGDATA
The document discusses new database technologies called NoSQL or non-relational databases that are gaining popularity. It provides an overview of the reasons for the rise of these technologies, including the need to scale databases for large amounts of data and high user volumes. It also discusses some of the core concepts behind NoSQL databases like document stores, key-value stores, and column-oriented databases.
Lily for the Bay Area HBase UG - NYC edition, by NGDATA
The document discusses Lily, an open source content application developed by Outerthought that uses HBase for scalable storage and SOLR for search. It provides a high-level overview of Lily's architecture, which maps content to HBase, indexes it in SOLR, and uses a queue implemented on HBase to connect updates between the systems. Future plans for Lily include a 1.0 release with additional features like user management and a UI framework.
Sirris innovate2011 - Lily, Smart Data at scale made easy, Steven Noels, Oute..., by Sirris
Data growth is rapidly surpassing Moore's Law, as data sets are growing increasingly large, hence deriving insights from these large data sets is becoming more and more complex. Lily, a software product made by Outerthought, allows you to store, index and search vast quantities of data. In the next few years, successful business models will be based on monetization of data. Steven Noels will highlight the raison d'être of Lily, discussing challenges that every data-intensive organisation encounters.
The document discusses Outerthought's vision and strategy for addressing the growing amount of digital content and user data. Their mission is to become the premier provider of content application technologies for the emerging "content as opportunity" age. They are developing technologies like Lily, a NoSQL content repository, and frameworks to help clients capture, process, and extract knowledge from large amounts of user data at scale. Their partnership strategy involves collaborating with technology companies, domain experts, and businesses to develop customized solutions and mutually support an open software platform.
Building a CMS on top of NoSQL (for ParisJUG), by NGDATA
The document discusses building a content management system (CMS) using NoSQL technologies like HBase. It describes some of the scaling challenges faced with the traditional CMS architecture. These include issues with caching, access control computations, and data merging across different data stores. It explores using a database like HBase that can scale out through horizontal partitioning and replication to address these problems. Key requirements for the NoSQL database are also outlined.
The document discusses the rise of big data and NoSQL databases. It notes that organizations are drowning in large amounts of data from various sources like user-generated content. However, traditional relational databases struggle to handle this type and volume of semi-structured data in a distributed, scalable manner. This has led to the emergence of NoSQL databases that are more flexible and better suited for the distributed, large-scale requirements of big data.
Introduction to RESTful development with Java, focusing on JAX-RS (and kauriproject.org).
Slides presented at devoxx-2010.
See http://devoxx.com/display/Devoxx2K10/RESTful+development+with+Java for abstract.
Devoxx 2010 | Tools In Action : Kauri and Lily, by NGDATA
Introduction to our web-scale tools for "RESTful app development" and "big data store and search" (Lily)
Slides presented at devoxx-2010
http://devoxx.com/display/Devoxx2K10/Scalable+and+RESTful+web+applications++at+the+crossroads+of+Kauri+and+Lily
The document describes Lily, a platform for managing and analyzing large datasets in real-time. It handles data storage, indexing, and recommendations. Lily uses distributed processing on technologies like BigTable and Solr to scale across infrastructure. The document outlines Lily's current roadmap and provides a sample schema for modeling book data with fields.
The document discusses the Lily project, which aims to provide smart data at scale. It describes Lily's architecture and components, including using HBase for storage, Solr for indexing, and a real-time data processing engine. It outlines Lily's roadmap, which includes releasing version 1.0 in April 2011 and adding real-time analytics and data insights.
The document discusses the challenges of managing large-scale data and the need for real-time analytics. It proposes an integrated approach called Lily that can store all data, perform real-time processing, and provide insights by combining the data with domain knowledge. This moves beyond current batch processing methods to enable interactive use of data and instant feedback. Lily aims to help organizations maximize the value of the data they collect.
The world is the computer and the programmer is you, by Davide Carboni
This document discusses the past, present, and future of connecting physical objects to the internet and computing networks. It outlines the evolution of related technologies over time from the 1950s to present. It also describes two approaches to programming these connected systems - a top-down approach using tools like PySense, and a bottom-up approach using a model called Hyperpipe that is based on pi-calculus.
The document discusses building cloud-ready applications. It outlines limitations of traditional hosting and how cloud computing addresses these through scaling, flexibility, and automation. It promotes a "pets vs cattle" philosophy where infrastructure is treated as standardized resources rather than individual machines. It also emphasizes that monolithic applications need restructuring to follow "12 factor principles" and integrate with cloud architectures through loose coupling, configuration as code, and other best practices.
Afterwork big data et data viz - du lac à votre écran, by Joseph Glorieux
This document discusses a data visualization workshop hosted by OCTOSuisse on exploring and visualizing big data from a data lake. It provides an overview of OCTO's big data capabilities and projects. It then uses a case study of Swiss public transportation data to demonstrate data exploration, analysis, and visualization techniques using tools like Tableau. The goal is to understand data, identify insights, and effectively communicate findings to others.
SERENE 2014 School: Measurement-Driven Resilience Design of Cloud-Based Cyber..., by SERENEWorkshop
SERENE 2014 School on Engineering Resilient Cyber Physical Systems
Talk: Measurement-Driven Resilience Design of Cloud-Based Cyber-Physical Systems, by Imre Kocsis
This document discusses a presentation on big data and data visualization from lake to screen. It covers exploring data in a data lake using tools like Tableau and Jupyter notebooks. Models can be built to predict things like train delays. Visualizations are then created using technologies like D3.js to communicate insights from the data and models. The goal is to extract value from large, raw data sources through the entire data science process from exploration to communication.
VoltDB is a high performance database for real-time analytics that can be deployed on SoftLayer cloud infrastructure. The document outlines the process to install and run VoltDB on SoftLayer, including unpacking the VoltDB distribution, installing Java, exporting the VoltDB binaries to the path, and running VoltDB using the run.sh script. It also discusses how VoltDB enables real-time analytics by ingesting and exporting data to Netezza for deeper historical analysis in a closed loop system.
NGDATA brings big data technology and machine intelligence together, allowing organizations to capitalize on the massive amounts of data that are generated today. NGDATA develops Lily, a big data management platform that offers an easy way to extract powerful business insights in real time and benefit from enriched data to make an immediate impact on business performance. NGDATA's global partner community provides expert services best suited to meet evolving big data needs. NGDATA is a privately held company with headquarters in Ghent, Belgium. More information and recent updates are available at www.ngdata.com.
The document repeatedly lists an address in Zwijnaarde, Belgium and a website. It provides no other context or information beyond stating "Two things more..." in the final line.
1) The document discusses NoSQL databases and provides advice on choosing a platform, hardware requirements, backup/replication strategies, and common issues like bottlenecks and consistency.
2) It recommends analyzing your problem and understanding the benefits of your chosen platform, as well as considering the CAP theorem.
3) Contact information is provided for several NoSQL experts on Twitter and mailing lists to stay up to date on new developments.
HBase is a distributed, column-oriented database that provides random access reads and writes on top of HDFS. It uses a multi-dimensional key-value data model where keys are composed of a table, row, column family, column qualifier, and timestamp. Column families allow for locality of storage and efficient access. Data is stored at the intersection of row keys and column families/qualifiers, which is sometimes called a "cell". HBase can be used as a normal datastore with static column qualifiers or in more advanced ways by using dynamic qualifiers to build secondary indexes or embed data in the qualifier.
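The key structure described above can be pictured as a nested map: row key, then column family, then column qualifier, then timestamp. The following is a minimal in-memory sketch of that idea, not HBase's client API. The class `ToyTable`, its methods `put`, `get` and `qualifiers`, and the sample `books` data are all hypothetical names invented for illustration.

```python
import time
from collections import defaultdict


class ToyTable:
    """In-memory toy model of one HBase-style table (hypothetical, not the HBase API)."""

    def __init__(self, name, families):
        self.name = name
        # Column families are declared up front; qualifiers are free-form.
        self.families = set(families)
        # row key -> family -> qualifier -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    def put(self, row, family, qualifier, value, timestamp=None):
        """Store one cell version; a missing timestamp defaults to 'now'."""
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if timestamp is None:
            timestamp = time.time_ns()
        self._rows[row][family][qualifier][timestamp] = value

    def get(self, row, family, qualifier, timestamp=None):
        """Return the newest version at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(family, {}).get(qualifier, {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def qualifiers(self, row, family):
        """List the qualifiers of one row and family in sorted order."""
        return sorted(self._rows.get(row, {}).get(family, {}))


# Static qualifiers: "title" behaves like an ordinary column with versions.
table = ToyTable("books", ["info", "tags"])
table.put("book-1", "info", "title", "Draft title", timestamp=1)
table.put("book-1", "info", "title", "Final title", timestamp=2)

# Dynamic qualifiers: the data is embedded in the qualifier, the value is empty.
table.put("book-1", "tags", "nosql", b"", timestamp=1)
table.put("book-1", "tags", "hbase", b"", timestamp=1)

latest_title = table.get("book-1", "info", "title")
older_title = table.get("book-1", "info", "title", timestamp=1)
tag_names = table.qualifiers("book-1", "tags")
```

In this sketch a cell is addressed by the full (row, family, qualifier, timestamp) coordinate, and reading the `tags` family back as a sorted list of qualifiers mirrors the dynamic-qualifier usage mentioned above. A real HBase table additionally keeps rows sorted by key and persists each column family separately on HDFS, which the toy model does not attempt.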
Essentials of Automations: The Art of Triggers and Actions in FME, by Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
20 Comprehensive Checklist of Designing and Developing a Website, by Pixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf, by Malak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
Van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you..., by Zilliz
Join us for an introduction to Milvus Lite, a vector database that runs on notebooks and laptops, shares the same API as Milvus, and integrates with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Climate Impact of Software Testing at Nordic Testing Days, by Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Removing Uninteresting Bytes in Software Fuzzing, by Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating the uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools: Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns, and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
TrustArc Webinar - 2024 Global Privacy Survey (TrustArc)
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
1. Welcome to the age of data!
BIGDATA.BE
IIC » TECHNOLOGIEPARK 3 » B-9052 ZWIJNAARDE (GENT) » www.outerthought.org
2. who am i
» Steven Noels
» Founder & VP Product
» Makers of Lily: Interactive Big Data platform
» Open Source / Apache Software Foundation
» co-founder bigdata.be
3. Houston, we have a problem.
36. map/reduce
» Batch-oriented
» Data locality (code is shipped around)
» Heavy parallelization
» Process management
» Append-only files
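The map/reduce model listed above can be sketched in a few lines of plain Python. This is only a single-process illustration of the map, shuffle and reduce steps, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example, and nothing here is distributed or data-local.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data at scale"]))
```

In a real cluster the map calls would run in parallel on the nodes holding the input blocks, and the shuffle would move data between machines; the sketch only shows the shape of the computation.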
37. Hadoop ecosystem
» Hadoop Common
» Subprojects
» Flume/SQOOP: Data collection systems for large distributed systems.
» HBase: A scalable, distributed database that supports structured data storage for large/wide tables.
» HDFS: A distributed file system that provides high throughput access to application data.
» Hive: A data warehouse infrastructure that provides data summarization and ad hoc querying.
» MapReduce: A software framework for distributed processing of large data sets on compute clusters.
» Pig: A high-level data-flow language and execution framework for parallel computation.
» ZooKeeper: A high-performance coordination service for distributed applications.
» Mahout: machine learning libraries
38. High-level data model / easy API, indexes
[Stack diagram built on CDH; recoverable box labels follow]
» High-level data model / easy API; indexes; Search
» UI Framework (HUE); SDK (HUE SDK)
» Workflow (OOZIE); Scheduling (OOZIE); Metadata (HIVE)
» Languages / Compilers (PIG, HIVE); Data Integration (FLUME, SQOOP); Fast Read/Write Access (HBASE)
» Usage metrics, analytics & recommendations (PIG, HIVE)
» Dev2Dev tutoring, integrated deployment and enterprise support
» Coordination (ZOOKEEPER)
» CDH
39. real-time big data architecture
» speed layer (storm)
1. compensate for high latency of updates to serving layer
2. fast, incremental algorithms
3. batch layer eventually overrides speed layer
» serving layer
1. random access to batch views
2. updated by batch layer
» batch layer
1. store master dataset (append-only)
2. compute arbitrary views
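The interplay of the three layers can be sketched as a toy page-view counter. This is a minimal in-memory illustration under simplifying assumptions, not how Storm or Hadoop are actually wired together: the class `LambdaCounter` and its methods are hypothetical names, and the batch run is instantaneous here, whereas in a real system events arriving during a long batch run would stay in the speed layer.

```python
from collections import Counter


class LambdaCounter:
    """Toy page-view counter with batch, serving and speed layers."""

    def __init__(self):
        self.master = []             # batch layer: append-only master dataset
        self.batch_view = Counter()  # serving layer: view from the last batch run
        self.speed_view = Counter()  # speed layer: increments since the last batch run

    def ingest(self, page):
        """Append the event to the master dataset and update the speed view."""
        self.master.append(page)
        self.speed_view[page] += 1

    def run_batch(self):
        """Recompute the view from the full master dataset.

        The fresh batch view overrides whatever the speed layer had accumulated.
        """
        self.batch_view = Counter(self.master)
        self.speed_view.clear()

    def query(self, page):
        """Answer a query by merging the batch view with the speed view."""
        return self.batch_view[page] + self.speed_view[page]


if __name__ == "__main__":
    views = LambdaCounter()
    views.ingest("/home")
    views.ingest("/home")
    views.run_batch()
    views.ingest("/home")
    print(views.query("/home"))
```

Queries stay up to date between batch runs because the speed view covers only the events the batch view has not yet seen.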
42. The start of Lily.
43. Thank you !
for your attention
for your questions
» steven.noels@outerthought.com
» @stevenn