Running complex data queries in a distributed systemArangoDB Database
With the always-growing amount of data, it is getting increasingly hard to store and get it back efficiently. While the first versions of distributed databases have put all the burden of sharding on the application code, there are now some smarter solutions that handle most of the data distribution and resilience tasks inside the database.
This poses some interesting questions, e.g.
- how are other than by-primary-key queries actually organized and executed in a distributed system, so that they can run most efficiently?
- how do the contemporary distributed databases actually achieve transactional semantics for non-trivial operations that affect different shards/servers?
This talk will give an overview of these challenges and the available solutions that some open source distributed databases have picked to solve them.
Searching Relational Data with Elasticsearchsirensolutions
Second Galway Data Meetup, 29th April 2015
Elasticsearch was originally developed for searching flat documents. However, as real world data is inherently more complex, e.g., nested json data, relational data, interconnected documents and entities, Elasticsearch quickly evolves to support more advanced search scenarios. In this presentation, we will review existing features and plugins to support such scenarios, discuss their advantages and disadvantages, and understand which one is more appropriate for a particular scenario.
by Joseph Idziorek, Sr. Product Manager, AWS
Database Week at the AWS Loft is an opportunity to learn about Amazon’s broad and deep family of managed database services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into Amazon RDS and Amazon Aurora relational databases, Amazon DynamoDB non-relational databases, Amazon Neptune graph databases, and Amazon ElastiCache managed Redis, along with options for database migration, caching, search and more. You'll will learn how to get started, how to support applications, and how to scale.
These are the slides to the webinar about Custom Pregel algorithms in ArangoDB https://youtu.be/DWJ-nWUxsO8. It provides a brief introduction to the capabilities and use cases for Pregel.
Running complex data queries in a distributed systemArangoDB Database
With the always-growing amount of data, it is getting increasingly hard to store and get it back efficiently. While the first versions of distributed databases have put all the burden of sharding on the application code, there are now some smarter solutions that handle most of the data distribution and resilience tasks inside the database.
This poses some interesting questions, e.g.
- how are other than by-primary-key queries actually organized and executed in a distributed system, so that they can run most efficiently?
- how do the contemporary distributed databases actually achieve transactional semantics for non-trivial operations that affect different shards/servers?
This talk will give an overview of these challenges and the available solutions that some open source distributed databases have picked to solve them.
Searching Relational Data with Elasticsearchsirensolutions
Second Galway Data Meetup, 29th April 2015
Elasticsearch was originally developed for searching flat documents. However, as real world data is inherently more complex, e.g., nested json data, relational data, interconnected documents and entities, Elasticsearch quickly evolves to support more advanced search scenarios. In this presentation, we will review existing features and plugins to support such scenarios, discuss their advantages and disadvantages, and understand which one is more appropriate for a particular scenario.
by Joseph Idziorek, Sr. Product Manager, AWS
Database Week at the AWS Loft is an opportunity to learn about Amazon’s broad and deep family of managed database services. These services provide easy, scalable, reliable, and cost-effective ways to manage your data in the cloud. We explain the fundamentals and take a technical deep dive into Amazon RDS and Amazon Aurora relational databases, Amazon DynamoDB non-relational databases, Amazon Neptune graph databases, and Amazon ElastiCache managed Redis, along with options for database migration, caching, search and more. You'll will learn how to get started, how to support applications, and how to scale.
These are the slides to the webinar about Custom Pregel algorithms in ArangoDB https://youtu.be/DWJ-nWUxsO8. It provides a brief introduction to the capabilities and use cases for Pregel.
Elasticsearch Introduction to Data model, Search & AggregationsAlaa Elhadba
An overview of Elasticsearch features and explains performing smart search, data aggregations, and relevancy through scoring functions. How Elasticsearch works as a distributed scalable data storage. Finally, showcasing some use cases that are currently becoming core functionalities in Zalando.
Optiq is a dynamic query planning framework. It can potentially help integrate Pentaho Mondrian and Kettle with various SQL, NoSQL and BigData data sources.
How to integrate Splunk with any data solutionJulian Hyde
A presentation Julian Hyde gave to the Splunk 2012 User conference in Las Vegas, Tue 2012/9/11. Julian demonstrated a new technology called Optiq, described how it could be used to integrate data in Splunk with other systems, and demonstrated several queries accessing data in Splunk via SQL and JDBC.
Use Cases for Elastic Search PercolatorMaxim Shelest
Overview of possible Use Cases for Elastic Search Percolator. Deep dive into Percolator feature of Elasticsearch with various examples for source code, HTTP requests and architecture.
Accelerating distributed joins in Apache Hive: Runtime filtering enhancementsPanagiotis Garefalakis
Apache Hive is an open-source relational database system that is widely adopted by several organizations for big data analytic workloads. It combines traditional MPP (massively parallel processing) techniques with more recent cloud computing concepts to achieve the increased scalability and high performance needed by modern data intensive applications. Even though it was originally tailored towards long running data warehousing queries, its architecture recently changed with the introduction of LLAP (Live Long and Process) layer. Instead of regular containers, LLAP utilizes long-running executors to exploit data sharing and caching possibilities within and across queries. Executors eliminate unnecessary disk IO overhead and thus reduce the latency of interactive BI (business intelligence) queries by orders of magnitude. However, as container startup cost and IO overhead is now minimized, the need to effectively utilize memory and CPU resources across long-running executors in the cluster is becoming increasingly essential. For instance, in a variety of production workloads, we noticed that the memory bandwidth of early decoding all table columns for every row, even when this row is dropped later on, is starting to overwhelm the performance of single query execution. In this talk, we focus on some of the optimizations we introduced in Hive 4.0 to increase CPU efficiency and save memory allocations. In particular, we describe the lazy decoding (or row-level filtering) and composite bloom-filters optimizations that greatly improve the performance of queries containing broadcast joins, reducing their runtime by up to 50%. Over several production and synthetic workloads, we show the benefit of the newly introduced optimizations as part of Cloudera’s cloud-native Data Warehouse engine. At the same time, the community can directly benefit from the presented features as are they 100% open-source!
Talk given for the #phpbenelux user group, March 27th in Gent (BE), with the goal of convincing developers that are used to build php/mysql apps to broaden their horizon when adding search to their site. Be sure to also have a look at the notes for the slides; they explain some of the screenshots, etc.
An accompanying blog post about this subject can be found at http://www.jurriaanpersyn.com/archives/2013/11/18/introduction-to-elasticsearch/
Relevance trilogy may dream be with you! (dec17)Woonsan Ko
Introducing new BloomReach Experience Plugins which changes the game of DREAM (Digital Relevance Experience & Agility Management), to increase productivity and business agility.
Elasticsearch Introduction to Data model, Search & AggregationsAlaa Elhadba
An overview of Elasticsearch features and explains performing smart search, data aggregations, and relevancy through scoring functions. How Elasticsearch works as a distributed scalable data storage. Finally, showcasing some use cases that are currently becoming core functionalities in Zalando.
Optiq is a dynamic query planning framework. It can potentially help integrate Pentaho Mondrian and Kettle with various SQL, NoSQL and BigData data sources.
How to integrate Splunk with any data solutionJulian Hyde
A presentation Julian Hyde gave to the Splunk 2012 User conference in Las Vegas, Tue 2012/9/11. Julian demonstrated a new technology called Optiq, described how it could be used to integrate data in Splunk with other systems, and demonstrated several queries accessing data in Splunk via SQL and JDBC.
Use Cases for Elastic Search PercolatorMaxim Shelest
Overview of possible Use Cases for Elastic Search Percolator. Deep dive into Percolator feature of Elasticsearch with various examples for source code, HTTP requests and architecture.
Accelerating distributed joins in Apache Hive: Runtime filtering enhancementsPanagiotis Garefalakis
Apache Hive is an open-source relational database system that is widely adopted by several organizations for big data analytic workloads. It combines traditional MPP (massively parallel processing) techniques with more recent cloud computing concepts to achieve the increased scalability and high performance needed by modern data intensive applications. Even though it was originally tailored towards long running data warehousing queries, its architecture recently changed with the introduction of LLAP (Live Long and Process) layer. Instead of regular containers, LLAP utilizes long-running executors to exploit data sharing and caching possibilities within and across queries. Executors eliminate unnecessary disk IO overhead and thus reduce the latency of interactive BI (business intelligence) queries by orders of magnitude. However, as container startup cost and IO overhead is now minimized, the need to effectively utilize memory and CPU resources across long-running executors in the cluster is becoming increasingly essential. For instance, in a variety of production workloads, we noticed that the memory bandwidth of early decoding all table columns for every row, even when this row is dropped later on, is starting to overwhelm the performance of single query execution. In this talk, we focus on some of the optimizations we introduced in Hive 4.0 to increase CPU efficiency and save memory allocations. In particular, we describe the lazy decoding (or row-level filtering) and composite bloom-filters optimizations that greatly improve the performance of queries containing broadcast joins, reducing their runtime by up to 50%. Over several production and synthetic workloads, we show the benefit of the newly introduced optimizations as part of Cloudera’s cloud-native Data Warehouse engine. At the same time, the community can directly benefit from the presented features as are they 100% open-source!
Talk given for the #phpbenelux user group, March 27th in Gent (BE), with the goal of convincing developers that are used to build php/mysql apps to broaden their horizon when adding search to their site. Be sure to also have a look at the notes for the slides; they explain some of the screenshots, etc.
An accompanying blog post about this subject can be found at http://www.jurriaanpersyn.com/archives/2013/11/18/introduction-to-elasticsearch/
Relevance trilogy may dream be with you! (dec17)Woonsan Ko
Introducing new BloomReach Experience Plugins which changes the game of DREAM (Digital Relevance Experience & Agility Management), to increase productivity and business agility.
Learning To Run - XPages for Lotus Notes Client DevelopersKathy Brown
You’re an experienced Lotus Notes developer. You’ve been doing “classic” development for years. You know LotusScript better than your native language. You know @Formula like the back of your hand. But when it comes to Xpages and Javascript, you feel like you’re learning to walk all over again. This session will cover some tips and tricks to get you up and running in Xpages. Learn how to translate what you already know, into what you need to know for Xpages. Find out where to get the information to be just as skillful at Xpages as you are with Notes client development.
The web has changed! Users spend more time on mobile than on desktops and they expect to have an amazing user experience on both platforms. APIs are the heart of the new web as the central point of access data, encapsulating logic and providing the same data and same features for desktops and mobiles.
In this talk, I will show you how in only 45 minutes we can create full REST API, with documentation and admin application build with React.
Rapid and Scalable Development with MongoDB, PyMongo, and MingRick Copeland
This talk, given at PyGotham 2011, will teach you techniques using the popular NoSQL database MongoDB and the Python library Ming to write maintainable, high-performance, and scalable applications. We will cover everything you need to become an effective Ming/MongoDB developer from basic PyMongo queries to high-level object-document mapping setups in Ming.
Tuning and optimizing webcenter spaces application white paperVinay Kumar
This white paper focuses on Oracle WebCenter Spaces performance problem and analysis after post production deployment. We will tune JVM ( JRocket). Webcenter Portal, Webcenter content and ADF task flow.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
1. Compass Framework
Lightning introduction
into full-text search framework in 7 minutes
Lukáš Vlček, jOpenSpace 2009
2. Agenda
• What is Compass
• Code snippets
• Compass integration
• Distributed indexing and searching
• Alternatives
• Resources
• Q&A
3. What it Compass
Compass Goal
quot;Simplify the integration of Search Engine into any applicationquot;
Author: Shay Banon (working for GigaSpaces)
http://www.kimchy.org/
„Using Spring Framework, Hibernate, and all the other tools out there that makes a
developer life simple, he was surprised to find nothing similar in the search engine
department. Now, don't get him wrong, Lucene is an amazing search engine (or IR
library), but developers want simplicity, and Lucene comes with an added complexity that
caused Shay to start Compass.“
4. @Searchable(alias= quot;casequot;, root = true)
Metadata definitions
@SearchableDynamicMetaData(
for the class
name = quot;descriptionquot;,
expression = quot;quot; +
quot;(data.caseComments == null ? '' : data.caseComments) + ' ' + quot; +
quot;(data.caseDescription == null ? '' : data.caseDescription) + ' ' + quot; +
quot;(data.caseDispo == null ? '' : data.caseDispo)quot;,
converter = quot;groovyquot;,
Code
termVector = TermVector.WITH_POSITIONS_OFFSETS) snippets
@Entity
@Table(name=quot;CDT_CASEquot;)
public class Case implements Serializable, LockableEntity{
@SearchableId(boost = 1.2f) Searchable ID
@SearchableMetaData(name = quot;idquot;)
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator=quot;CDT_CASE_SEQquot;)
@SequenceGenerator(name=quot;CDT_CASE_SEQquot;, sequenceName=quot;CDT_CASE_SEQquot;, allocationSize=1)
@GenericGenerator(
name = quot;CDT_CASE_SEQquot;, strategy = quot;sequencequot;, parameters = {
@Parameter(name = quot;sequencequot;, value = quot;CDT_CASE_SEQquot;),
@Parameter(name = quot;allocationSizequot;, value = quot;1quot;)
}
) Searchable property
private Long id;
@SearchableProperty(name = quot;titlequot;, termVector = TermVector.WITH_POSITIONS_OFFSETS)
@Column(length=1024)
private String caseTitle; Searchable collection
(collection entitiy must have its own mapping)
@SearchableComponent
@OneToMany(fetch=FetchType.EAGER, cascade=javax.persistence.CascadeType.ALL)
@org.hibernate.annotations.Fetch(org.hibernate.annotations.FetchMode.SUBSELECT)
private Set<EventLotId> lotId;
5. Compass integration
Compass integrates with Spring in the same manner ORM Frameworks
integration is done within the Spring Framework code-base. It also
integrates with Spring transaction abstraction layer, AOP support, and MVC
library.
Integration with different ORM frameworks (Hibernate, JPA, JDO, OJB),
allowing for almost transparent integration between a search engine and an
ORM view of content that resides in a database.
Not using ORM or Spring?
Having XML data?
Having no domain model?
… never mind, Compass can still support you!
6. Distributed indexing and searching
~ Slide contributed by Shay Banon ~
Storing the index in a distributed fashion:
Compass supports storing the index on GigaSpaces, Coherence, and
Terracotta. The first two allow to partition the data, the last (tc) works a bit
different. Bind that with optimized local cache on the client side, and you
get a very nice solution where the index is stored in a data grid.
Performing both the indexing and the search in a map reduce
fashion:
You can check the blog I wrote about GigaSpaces, as well as the part
about collocated indexing and searching to understand how the
architecture would look like. This is implemented in Compass for
GigaSpaces, and probably it can be done for Coherence as well.
7. Alternatives
• Database build in full text search (Oracle, Postgres,...)
o close coupling to database
o not using Lucene
• Hibernate Search
o using Lucene
o limited to Hibernate (not for iBatis or JDBCTemplate)
• Solr
o not a direct alternative - used in different scenarios
o Compass could be used in similar scenario as Solr
o http://wiki.apache.org/solr/DataImportHandler
o http://www.sematext.com/product-db-indexer.html
9. More Resources...
Compass on Google App Engine
http://www.kimchy.org/searchable-google-appengine-with-compass/
Compass in media
http://www.tradingmarkets.com/.site/news/Stock%20News/2301092/