Caching and tuning fun for high scalability (Wim Godden)
Caching has been a 'hot' topic for a few years. But caching takes more than merely taking data and putting it in a cache: the right caching techniques can improve performance and reduce load significantly. But we'll also look at some major pitfalls, showing that caching the wrong way can bring down your site. If you're looking for a clear explanation about various caching techniques and tools like Memcached, Nginx and Varnish, as well as ways to deploy them in an efficient way, this talk is for you.
Beyond PHP - it's not (just) about the code (Wim Godden)
Most PHP developers focus on writing code. But creating Web applications is about much more than just writing PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.
LAMP was originally Linux, Apache, MySQL, PHP. While the L and A parts have become more flexible, most people still use MySQL. With the recent acquisition by Oracle there's no better time to demystify PostgreSQL. For years PostgreSQL has had a reputation of being difficult, but this is far from the truth.
This presentation by Asher Snyder will cover installation, basic queries, stored procedures, triggers, and full-text search.
Databases are a key part of any application. The storage subsystem contributes most to database performance. Recently, new storage technologies such as solid-state drives (SSDs) and high-performance drives have become cheaper and more accessible, but it takes careful planning to use them cost-effectively for the best price-performance.
As any engineer will tell you, choosing a database for your next great project is an important but challenging step, as it’s hard to anticipate exactly where your project roadmap will lead you. Switching databases is even harder, and while you can choose the right tool for each task, you then end up with several database solutions to maintain.
What if there was a product that was hybrid enough to suit multiple workloads? While there is a grain of truth in the “jack of all trades, master of none” expression, after a recent deep-dive into the Apache Ignite database we may have that “master of all” database. In this Meetup, led by DoiT International Staff Cloud Architect Zaar Hai, we will explore together to find out.
And finally, does it run on Kubernetes?
We need to renovate the concept of innovation once again. Innovation is nothing more or less than the sum of creativity plus entrepreneurship. That creativity can come from a scientific or technological angle, but just as well from a business idea or a societal angle, as well as from the creativity embedded in the creative sectors.
We therefore need Renaissance people who can play along, or at least join the conversation, in each of those three domains: technology - business - creativity
MobileDiagnosis Onlus
tax code (codice fiscale) 97261360826
IBAN: IT37 R050 18046 0000 0000 14 11 55
What we have done:
270 students trained in 9 countries
15 courses
Uganda, Bangladesh (Comilla, Tangail-Bhuapur, Dinajpur), Madagascar, Afghanistan, Democratic Republic of the Congo, Thailand (Shoklo Malaria Research Center on the border with Myanmar; refugee camps of Mae La, MKT, Wang Pha), India (Assam), Nepal (Kathmandu).
Identification of four forms of malaria and of babesia in the village of Tshimbulu. Screening to reduce child mortality.
In DRC we helped build, physically as well:
1- the TBC Room, the room dedicated to tuberculosis, so that staff can work with a minimum of safety
2- two rooms for isolation and decontamination in case of emergencies with a very high infectious risk, such as hemorrhagic fevers and particularly serious infectious diseases
With the director, Valerio Fullin, we designed and built, using local labor, a unit consisting of two rooms, a bathroom, and a changing room with a shower for decontaminating staff.
3- permanent repair of the incinerator, which was open and accessible to children, dogs, and anyone who happened to be in that area of the hospital.
4- identification of the type of malaria responsible for the deaths of so many children, with the help of the team of Prof. Cancrini and Prof. Gabrielli of La Sapienza University.
Thanks to your support we were able to send Tshimbulu 200 packs of chloroquine, which is effective against non-falciparum forms of malaria.
Thanks to your Christmas gifts we were able to sponsor a free screening for the children of Tshimbulu.
All children who tested positive for malaria received treatment immediately and will not die, at least for now, from the terrible anemia linked to malaria.
Help us keep going
Choose to contribute your 5 x 1000 to our work
Support us by donating your 5 x 1000 to
MobileDiagnosis Onlus
tax code (codice fiscale) 97261360826
IBAN: IT37 R050 18046 0000 0000 14 11 55
thank you
livia
Dr Vassilis Zachariadis explained the research activities that have been developed at CASA, UCL.
Flood impact assessment in mega cities under urban sprawl and climate change kick-off workshop
Oracle hardware includes a full-suite of scalable engineered systems, servers, and storage that enable enterprises to optimize application and database performance, protect crucial data, and lower costs.
With Oracle, customers have freedom from the complexity of having multiple databases, analytics tools, and machine learning environments. Oracle's data management platform makes it easier and faster for application developers to create microservices-based applications with multiple data types.
Accelerating Spark SQL Workloads to 50X Performance with Apache Arrow-Based F... (Databricks)
In the Big Data field, Spark SQL is an important data processing module that lets Apache Spark work with structured, row-based data in the majority of its operators. A field-programmable gate array (FPGA) with highly customized intellectual property (IP) can accelerate the CPU-intensive segments of an application, bringing not only better performance but also lower power consumption.
With Ceph cloud storage you can start building your cloud foundation or big data storage backend on an affordable small PC. Even with really cheap cloud storage, learning on and using 40 TB of storage costs over 3,400 USD a year.
Let's take a look at how we can build a cloud storage of our own.
Security Best Practice: Oracle passwords, but secure! (Stefan Oehrli)
Authentication is an integral part of database security. If authentication or passwords are insufficient or inadequate, all further security measures are generally useless. But how do you ensure that passwords are complex and authentication is secure? In this presentation, the password hashes will be explained and it will be shown how to make sure passwords and authentication are state of the art. Focusing on the current versions of the Oracle database, the following topics will be discussed:
– Oracle database authentication
– Password verification and hashes
– Where can I find password hashes?
– Check and password hashes.
– Discussion of various risks related to authentication.
– Discussion of password policies and strong passwords.
– Customer Use Case in the DB Vault environment "ups we have forgotten the passwords"
2018 Infortrend All Flash Arrays Introduction (GS3025A) (infortrendgroup)
Infortrend All Flash Array Storages deliver lightning speed performance and enable users to optimize SSDs and energy efficiency. It also provides all the benefits of SAN, NAS and Cloud Gateway storage together in one single system, making it the ideal choice for enterprise applications (such as database, virtualization, video editing, file sharing, backup, and cloud data integration). For hardware, all flash array storage features a 2U-25bay form factor, flexible host boards to choose from, and stable, reliable modular design with high expandability; as for software, it comes with complete data services and simple, intuitive management interfaces.
More Information:
https://www.infortrend.com/global/products/FS
Recommended Product:
https://www.infortrend.com/global/products/families/fs/all-flash-arrays
EEvolution slides from EEUK2013 to use as a reference to our talk. Let us know if you need a hand with anything or further explanation... we know it was quite a heavy presentation.
This presentation was given to the Dublin Node (JS) Community on May 29th 2014.
Presented by: Chris Lawless, Kevin Yu Wei Xia, Fergal Carroll @phergalkarl, Ciarán Ó hUallacháin, and Aman Kohli @akohli
Intro to goldilocks inmemory db - low latency (Dongpyo Lee)
Goldilocks is an in-memory DB solution with tremendous performance. It can process millions of user data operations per second.
It's simple, lightweight, and easy to use.
We strongly recommend that you try it. If you do, you'll see the best low-latency data solution you've ever met.
Talk given by Luciano Palma at Intel Software Day 2013 (22/10/2013)
Get to know the architecture of the Intel Xeon Phi, a coprocessor capable of delivering more than 2 TFlops of processing for your HPC (High Performance Computing) solution.
Cloud economics design, capacity and operational concerns (Marcos García)
Learn how to choose your e-commerce infrastructure, and how to forecast the TCO based on a simple model, including the explanations on how public, private and hybrid cloud computing work.
Initial presentation of openstack (for montreal user group) (Marcos García)
Introduction to OpenStack: basic concepts, latest Havana project release, cloud terminology (including IaaS, PaaS and SaaS). This presentation was shown at the first OpenStack Montreal user group on November 19, 2013 (http://montrealopenstack.org/)
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs (Alex Pruden)
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by Rik Marselis and me from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We ended with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf (Peter Spielvogel)
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Securing your Kubernetes cluster_ a step-by-step guide to success ! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
The new frontiers of AI in RPA with UiPath Autopilot™ (UiPathCommunity)
In this free online event, organized by the UiPath Italian Community, you can explore the new features of Autopilot, the tool that integrates Artificial Intelligence into the development and use of automations.
📕 Together we will look at some examples of using Autopilot in different tools of the UiPath Suite:
Autopilot for Studio Web
Autopilot for Studio
Autopilot for Apps
Clipboard AI
GenAI applied to Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
2. SOLR
● SOLR is a standalone search server that can scale separately from the application that uses it
● e.g. it avoids the case where an e-commerce server is slowed down by users searching the product catalog
● SOLR is accessed using HTTP/XML REST-like and JSON APIs
● Multi-platform, multi-language and client-independent
● Results in XML, CSV, or JSON (with custom variations for Ruby, Python, PHP)
● 100% open source, written in Java, runs in the JVM
● Apache Foundation top-level project
● Most widely-used search server in industry
3. SOLR : A Lucene server
● Solr is a search platform that provides all the features of the Lucene search engine *
– High-performance indexing
– Incremental and batch indexing
– Small footprint (RAM and disk)
● And has all of Lucene's features
– Ranked searching
– Many query types (phrase, wildcard, regexp, range, geospatial proximity)
– Many field types, meaningful sorting
– Multi-index search and merge of results
– Faceting
– Language recognition (stemming)
– Suggestions
* (the two projects merged in March 2010; SOLR 3.1 was the first release after the merge)
4. Simple SOLR Example
● Index a product catalog (e.g. iPod Video)
● Data in XML format
<doc>
<field name="id">MA147LL/A</field>
<field name="name">Apple 60 GB iPod with Video Playback Black</field>
<field name="features">2.5-inch, 320x240 color TFT LCD display with LED backlight</field>
<field name="features">Up to 20 hours of battery life</field>
<field name="features">Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless, H.264 video</field>
<field name="price">399.00</field>
<field name="inStock">true</field>
<field name="store">37.7752,-100.0232</field> <!-- Dodge City store -->
</doc>
● Schema configuration
<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="text" indexed="true" stored="true"/>
<field name="features" type="text" indexed="true" stored="true" multiValued="true"/>
<field name="price" type="float" indexed="true" stored="true"/>
<field name="inStock" type="boolean" indexed="true" stored="true" />
<field name="store" type="location" indexed="true" stored="true"/>
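The XML document on this slide can also be generated programmatically. The following is a minimal sketch, assuming the document is wrapped in Solr's XML update message (`<add><doc>...</doc></add>`); the helper name `build_add_payload` is hypothetical, and actually sending the payload (typically an HTTP POST to the core's update handler) is left out.

```python
import xml.etree.ElementTree as ET


def build_add_payload(doc):
    """Build a Solr XML <add> message from a dict (hypothetical helper).

    List values become repeated <field> elements, which is how a
    multiValued field such as "features" is expressed.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Sample data taken from the slide's iPod document.
ipod = {
    "id": "MA147LL/A",
    "name": "Apple 60 GB iPod with Video Playback Black",
    "features": [
        "Up to 20 hours of battery life",
        "Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless, H.264 video",
    ],
    "price": "399.00",
    "inStock": "true",
    "store": "37.7752,-100.0232",
}

payload = build_add_payload(ipod)
print(payload)
```

Using a list for `features` keeps the Python side aligned with the `multiValued="true"` declaration in the schema.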
5. Simple SOLR Example
● Query
● Return all products with "video" in any field, sorted by descending price, showing just name, price, inStock
curl "http://localhost:8983/solr/collection1/select?q=video&sort=price+desc&fl=name,price,inStock&indent=true"
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1</int>
<lst name="params">
<str name="fl">name,price,inStock</str>
<str name="sort">price desc</str>
<str name="indent">true</str>
<str name="q">video</str>
</lst>
</lst>
<result name="response" numFound="3" start="0">
<doc>
<str name="name">ATI Radeon X1900 XTX 512 MB PCIE Video Card</str>
<float name="price">649.99</float>
<bool name="inStock">false</bool></doc>
<doc>
<str name="name">ASUS Extreme N7800GTX/2DHTV (256 MB)</str>
<float name="price">479.95</float>
<bool name="inStock">false</bool></doc>
<doc>
<str name="name">Apple 60 GB iPod with Video Playback Black</str>
<float name="price">399.0</float>
<bool name="inStock">true</bool></doc>
</result>
</response>
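The same select request can be assembled from a parameter dictionary instead of a hand-written URL. This is a small sketch using only the standard library; the function name `select_url` is hypothetical, the host and core name are the ones from the slide, and the field list uses the schema's spelling `inStock`.

```python
from urllib.parse import urlencode


def select_url(base, **params):
    """Return a /select URL for the given query parameters (hypothetical helper)."""
    # safe="," keeps the comma-separated field list readable in the URL.
    return f"{base}/select?{urlencode(params, safe=',')}"


url = select_url(
    "http://localhost:8983/solr/collection1",
    q="video",              # search term
    sort="price desc",      # most expensive first
    fl="name,price,inStock",  # fields to return
    indent="true",          # pretty-print the response
)
print(url)
```

Building the query string with `urlencode` takes care of escaping the space in `price desc`, which is easy to get wrong when typing the URL by hand.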
7. Simple SOLR Example
● Filter Query
● Uses a different cache than the search cache (useful for big result sets)
Filter Query : all products priced from 300 to 499 USD
q=*:*&fl=name,price&fq=price:[300 TO 499]
<result name="response" numFound="4" start="0">
<doc>
<str name="name">Maxtor DiamondMax 11 - hard drive - 500 GB – SATA-300</str>
<float name="price">350.0</float>
</doc>
<doc>
<str name="name">Apple 60 GB iPod with Video Playback Black</str>
<float name="price">399.0</float>
</doc>
<doc>
<str name="name">Canon PowerShot SD500</str>
<float name="price">329.95</float>
</doc>
<doc>
<str name="name">ASUS Extreme N7800GTX/2DHTV (256 MB)</str>
<float name="price">479.95</float>
</doc>
</result>
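In the range syntax, square brackets make both bounds inclusive. The sketch below URL-encodes the filter query parameters and then emulates the range test locally on prices taken from the slides; the local emulation only illustrates the semantics and is not how Solr evaluates the filter.

```python
from urllib.parse import urlencode, parse_qs

# Parameters of the filter query, written unescaped.
params = {"q": "*:*", "fl": "name,price", "fq": "price:[300 TO 499]"}

# urlencode escapes the characters that are not URL-safe (":", "[", "]", spaces).
query_string = urlencode(params)
print(query_string)

# Decoding the query string gives back the original filter expression.
decoded = parse_qs(query_string)

# Local emulation of the inclusive range [300 TO 499] on sample prices.
prices = [649.99, 479.95, 399.0, 350.0, 329.95]
in_range = [p for p in prices if 300 <= p <= 499]
print(in_range)
```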
8. Simple SOLR Example
● Spatial Query
● Store data:
– <field name="store">45.17614,-93.87341</field> <!-- Buffalo store -->
– <field name="store">40.7143,-74.006</field> <!-- NYC store -->
– <field name="store">37.7752,-122.4232</field> <!-- San Francisco store -->
● We are at 45.15,-93.85 (3.437 km from the Buffalo store)
● Find all products in a store within 5 km of our position:
QUERY : &fl=name,store&q=*:*&fq={!geofilt%20pt=45.15,-93.85%20sfield=store%20d=5}
"response":{"numFound":3,"start":0,"docs":[
{
"name":"Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300",
"store":"45.17614,-93.87341"},
{
"name":"Belkin Mobile Power Cord for iPod w/ Dock",
"store":"45.18014,-93.87741"},
{
"name":"A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM",
"store":"45.18414,-93.88141"}]
}
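The distance quoted on this slide can be checked with a great-circle calculation. The following sketch uses the haversine formula with a mean Earth radius of 6371 km (an assumption; the slide does not say how the 3.437 km figure was obtained) and applies a 5 km cutoff to the three stores listed above. It illustrates the idea behind a radius filter and is not a description of Solr's internal implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Store coordinates from the slide.
stores = {
    "Buffalo": (45.17614, -93.87341),
    "NYC": (40.7143, -74.006),
    "San Francisco": (37.7752, -122.4232),
}
here = (45.15, -93.85)

distances = {name: haversine_km(*here, *pos) for name, pos in stores.items()}
within_5km = [name for name, d in distances.items() if d <= 5]

print(round(distances["Buffalo"], 2), "km to the Buffalo store")
print(within_5km)
```

Only the Buffalo store falls inside the 5 km radius; the other two are more than a thousand kilometres away.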
9. SOLR Features
● SOLR Cloud
– Cluster configuration using ZooKeeper
– Easy sharding and failover management
– Self-healing, no single point of failure
● SOLR Cell (aka ExtractingRequestHandler)
– TIKA integration for binary document parsing
– Parses DOC, PDF, XLS, MIME, etc.
● DataImportHandlers
– Automatically fetch and index SQL databases, e-mails, RSS feeds, files in a folder, etc.
10. SOLR Features
● Multiple Solr cores
– Many index collections in the same server
– Different schema definitions for each collection
– Different configurations for storage, replication, etc.
● Caching
– Recurrent searches are cached, which improves speed
– Advanced warming techniques
– Adding content triggers just a partial cache update
● Advanced
– Language detection
– Natural Language Processing
– Clustering to scale both search and document retrieval
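The benefit of caching recurrent searches can be illustrated with a toy example. This sketch uses Python's `functools.lru_cache` around a stand-in search function; the catalog, the function names and the cache size are all hypothetical, and Solr's own caches (configured in solrconfig.xml) work differently in detail.

```python
from functools import lru_cache

# Tiny stand-in catalog; product names taken from the slides.
CATALOG = (
    "ATI Radeon X1900 XTX 512 MB PCIE Video Card",
    "Apple 60 GB iPod with Video Playback Black",
    "Canon PowerShot SD500",
)

backend_calls = 0  # counts how often the "expensive" search really runs


def run_search(query):
    """Pretend backend search: case-insensitive substring match."""
    global backend_calls
    backend_calls += 1
    return tuple(name for name in CATALOG if query.lower() in name.lower())


@lru_cache(maxsize=128)
def cached_search(query):
    """Serve repeated queries from memory instead of re-running the search."""
    return run_search(query)


first = cached_search("video")
second = cached_search("video")  # answered from the cache
print(first, backend_calls)
```

The second call returns the stored result without touching the backend, which is the effect the slide describes for recurrent searches.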
12. SOLR TIKA integration
● SOLRCell embeds TIKA for binary file parsing
● TIKA parses DOC, PDF, XLSX, HTML... and represents them using XHTML, JSON or CSV
● Full list of accepted formats : http://tika.apache.org/1.3/formats.html
● For some files, it can just index metadata (MP3, JPG, AVI)
● SOLRCell internally takes the TIKA output and stores it so we can search it
● SOLR does not store the original binary file
15. SOLR Use Cases
● Liferay Search
– As Liferay already uses Lucene, we can connect it to a SOLR server
– Offloads the Liferay server and lets the SOLR cluster handle all the user searches in the portal
● Magento E-Commerce
– Avoids using MySQL for searching
– Better search results
– Better overall performance
● Alfresco Search
– Currently, Alfresco recommends setting up SOLR from the beginning
– By default, Lucene+Tika is used internally