MongoDB presentation for NYC Python's June meetup. Brief discussion on non-relational databases in general followed by an example of using MongoDB as a blog's backend
Since its first appearance in 2009, NodeJS has come a long way. Many frameworks have been developed on top of it. These all make our task easy and quick. It is us who need to decide which one to choose? So, here is the list of top 10 NodeJS frameworks that will help you build an awesome application.
These are the slides I presented at the Nosql Night in Boston on Nov 4, 2014. The slides were adapted from a presentation given by Steve Francia in 2011. Original slide deck can be found here:
http://spf13.com/presentation/mongodb-sort-conference-2011
MongoDB is the most famous and loved NoSQL database. It has many features that are easy to handle when compared to conventional RDBMS. These slides contain the basics of MongoDB.
Students of Navgujarat College of Computer Applications, Ahmedabad felt excit...cresco
Cresco's panel included one of the best experts in Open Source Technology, who has vast experience in PHP/MySQL programming. Our expert shared information about Open Source Technology and PHP programming and its benefits, starting from an introduction and moving through the various modules of PHP.
27. A web server can refer to either the hardware (the
computer) or the software (the computer
application) that helps to deliver Web content
that can be accessed through the Internet.
The most common use of web servers is to host
websites, but there are other uses such as
gaming, data storage or running enterprise
applications.
Apache
* Runs on all major platforms (*NIX, Windows, Mac, FreeBSD, and
others)
* Largest market share holder for web servers since 1996 and,
at the time of this talk, still growing.
36. A Copy of real data with faster (and/or
cheaper) access.
From Wikipedia : "A cache is a
collection of data duplicating original
stored elsewhere or computed earlier,
where the original data is expensive to
fetch (owing to longer access time) or
to compute, compared to the cost of
reading the cache."
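The definition above can be sketched as a tiny read-through cache. This is only an illustration under stated assumptions: the class name `ReadThroughCache` and the `slowSource` function are hypothetical and stand in for whatever expensive fetch or computation is being cached.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read-through cache: keeps a faster copy of data that is slow to fetch.
final class ReadThroughCache<K, V> {
    private final Map<K, V> copies = new HashMap<>();
    private final Function<K, V> slowSource; // stands in for a database or API call
    private int misses = 0;

    ReadThroughCache(Function<K, V> slowSource) {
        this.slowSource = slowSource;
    }

    V get(K key) {
        V cached = copies.get(key);
        if (cached != null) {
            return cached;               // cache hit: no trip to the slow source
        }
        misses++;                        // cache miss: fetch once, then keep a copy
        V fresh = slowSource.apply(key);
        copies.put(key, fresh);
        return fresh;
    }

    int misses() {
        return misses;
    }

    public static void main(String[] args) {
        ReadThroughCache<String, Integer> cache = new ReadThroughCache<>(String::length);
        cache.get("lucene");
        cache.get("lucene");             // second read is served from the copy
        System.out.println("misses = " + cache.misses());
    }
}
```

Only the first read of a key pays the cost of the original source; later reads return the stored copy.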
37. MySQL query cache: cache in the DB
Disk: file cache
In memory: Memcached
38. What is Memcached?
Free & open source, high-performance, distributed
memory object caching system, generic in nature,
but intended for use in speeding up dynamic web
applications by alleviating database load.
Memcached is an in-memory key-value store for
small chunks of arbitrary data (strings, objects)
from results of database calls, API calls, or page
rendering.
Memcached is simple yet powerful. Its simple
design promotes quick deployment, ease of
development, and solves many problems facing large
data caches. Its API is available for most popular
languages.
65. Introduction
MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and
traditional RDBMS systems (which provide rich queries and deep functionality).
MongoDB is document-oriented, schema-free, scalable, high-performance, and open source. It is written in C++.
Mongo is not a relational database like MySQL.
Goodbye rows and tables, hello documents and collections.
Features
Document-oriented
* Documents (objects) map nicely to programming language data types
* Embedded documents and arrays reduce the need for joins
* No joins and no multi-document transactions, for high performance and easy scalability
High performance
* No joins, combined with embedding, make reads and writes fast
* Indexes, including indexes on keys from embedded documents and arrays
High availability
* Replicated servers with automatic master failover
Easy scalability
* Automatic sharding (auto-partitioning of data across servers)
* Reads and writes are distributed over shards
* No joins or multi-document transactions make distributed queries easy and fast
* Eventually-consistent reads can be distributed over replicated servers
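To make "auto-partitioning of data across servers" concrete, here is a minimal sketch of routing a document to a shard by the range its shard key falls into. It is not MongoDB's implementation (the real system also splits and rebalances ranges automatically); the class `RangeShardRouter`, the shard names, and the key ranges are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical router: each shard owns the keys from its lower bound up to the next bound.
final class RangeShardRouter {
    private final TreeMap<String, String> lowerBoundToShard = new TreeMap<>();

    void addRange(String lowerBound, String shardName) {
        lowerBoundToShard.put(lowerBound, shardName);
    }

    String shardFor(String shardKey) {
        // floorEntry finds the greatest lower bound that is <= the key.
        Map.Entry<String, String> range = lowerBoundToShard.floorEntry(shardKey);
        if (range == null) {
            throw new IllegalArgumentException("no range covers key: " + shardKey);
        }
        return range.getValue();
    }

    public static void main(String[] args) {
        RangeShardRouter router = new RangeShardRouter();
        router.addRange("", "shard-a");   // keys before "m"
        router.addRange("m", "shard-b");  // keys from "m" onward
        System.out.println(router.shardFor("alice"));
        System.out.println(router.shardFor("nina"));
    }
}
```

Because every read or write for a given key goes to exactly one shard, load spreads across servers as ranges are added.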
66. Why?
Cost: MongoDB is free.
MongoDB is easy to install.
MongoDB supports various programming languages, such as C, C++, Java, JavaScript, and PHP.
MongoDB is fast.
MongoDB is schemaless.
Ease of scale-out
If load increases, it can be distributed to other nodes across computer networks.
It's trivially easy to add more fields -- even complex fields -- to your objects.
So as requirements change, you can adapt code quickly.
Background indexing
MongoDB is a stand-alone server.
Development time is faster, too, since there are no schemas to manage.
It supports server-side JavaScript execution, which allows a developer to use a single
programming language for both client-side and server-side code.
67. Limitations
Mongo is limited to a total data size of 2GB for all databases in 32-bit mode.
No referential integrity
Data size in MongoDB is typically higher.
At the moment Map/Reduce (e.g. to do aggregations/data analysis) is OK, but not blisteringly fast.
Group By: fewer than 10,000 keys.
For larger grouping operations without limits, use map/reduce.
Lack of predefined schema is a double-edged sword
No support for Joins & transactions
68. Mongo data model
A Mongo system (see deployment above) holds a set of databases
A database holds a set of collections
A collection holds a set of documents
A document is a set of fields
A field is a key-value pair
A key is a name (string)
A value is a
* basic type like string, integer, float, timestamp, binary, etc.,
* a document, or
* an array of values

MySQL Term    Mongo Term
database      database
table         collection
index         index
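The nesting described above (fields whose values are basic types, documents, or arrays) can be sketched with plain Java maps and lists. This does not use the MongoDB driver; the class `BlogPostDocument` and every field name and value in it are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical document: a set of fields, each a key (string) with a value.
final class BlogPostDocument {
    static Map<String, Object> samplePost() {
        return Map.of(
            "title", "Hello documents",                  // basic type: string
            "views", 42,                                 // basic type: integer
            "author", Map.of("name", "alice"),           // embedded document
            "tags", List.of("mongodb", "nosql"));        // array of values
    }

    public static void main(String[] args) {
        Map<String, Object> post = samplePost();
        // Embedded data is read in place, without a join against another table.
        Map<?, ?> author = (Map<?, ?>) post.get("author");
        System.out.println(post.get("title") + " by " + author.get("name"));
    }
}
```

In the MySQL terms of the table above, a collection of such documents plays the role of a table, with the author and tags embedded instead of stored in separate tables.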
75. Xdebug
Xdebug is a PHP extension that aims to
lend a helping hand in the process of
debugging your applications. Xdebug
offers features like:
* Automatic stack trace upon error
* Function call logging
* Display features such as enhanced
var_dump() output and code
coverage information
Open Source
Free
80. Apache Lucene is a free/open source
information retrieval software library,
originally created in Java by Doug
Cutting.
81. Scalable, High-Performance Indexing
* small RAM requirements
* incremental indexing as fast as batch indexing
* index size roughly 20-30% the size of text indexed
Powerful, Accurate and Efficient Search Algorithms
* ranked searching -- best results returned first
* many powerful query types: phrase queries, wildcard
queries, proximity queries, range queries and more
* fielded searching (e.g., title, author, contents)
* date-range searching
* sorting by any field
* multiple-index searching with merged results
* allows simultaneous update and searching
Cross-Platform Solution
* Available as Open Source software under the Apache
License which lets you use Lucene in both commercial
and Open Source programs
* 100%-pure Java
* Implementations in other programming languages
available that are index-compatible
85. Pitfalls
* Update = Delete + Add
* No partial document update
* No joins
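The first two pitfalls can be illustrated with a toy inverted index. This is not Lucene code: `TinyIndex` and its methods are hypothetical, and the sketch only mirrors the delete-then-add behaviour that Lucene documents for `IndexWriter.updateDocument`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index mapping each term to the ids of documents containing it.
final class TinyIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    void add(String id, Map<String, String> fields) {
        for (String value : fields.values()) {
            for (String term : value.toLowerCase().split("\\s+")) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
    }

    void delete(String id) {
        for (Set<String> ids : postings.values()) {
            ids.remove(id);
        }
    }

    // Update = delete + add: the caller must resend the whole document,
    // because there is no way to change a single field in place.
    void update(String id, Map<String, String> fullDocument) {
        delete(id);
        add(id, fullDocument);
    }

    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Set.of());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("1", Map.of("title", "old title"));
        index.update("1", Map.of("title", "new title"));
        System.out.println("old -> " + index.search("old"));
        System.out.println("new -> " + index.search("new"));
    }
}
```

Any field left out of the document passed to `update` is gone afterwards, which is why a partial update has to be done by reloading the full document, changing it, and re-adding it.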
86. Code: FS Indexer
// Uses the older Lucene API shown on the slide; newer releases changed these constructors.
import java.io.File;
import java.io.FileFilter;
import java.io.FileReader;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Indexer {
    private IndexWriter writer;

    // Create a new index in indexDir (the `true` flag overwrites any existing one).
    public Indexer(String indexDir) throws IOException {
        Directory dir = FSDirectory.open(new File(indexDir));
        writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_CURRENT), true,
                IndexWriter.MaxFieldLength.UNLIMITED);
    }

    public void close() throws IOException {
        writer.close();
    }

    // Index every file in dataDir accepted by the filter (a null filter accepts all).
    public void index(String dataDir, FileFilter filter) throws Exception {
        File[] files = new File(dataDir).listFiles(filter);
        if (files == null) {
            return; // dataDir is not a readable directory
        }
        for (File f : files) {
            Document doc = new Document();
            doc.add(new Field("contents", new FileReader(f)));
            doc.add(new Field("filename", f.getName(),
                    Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.addDocument(doc);
        }
    }
}
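87. Code: Searcher
// Uses the older Lucene API shown on the slide; newer releases changed these constructors.
import java.io.File;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Searcher {
    // Run query q against the "contents" field and print the filenames of the top 10 hits.
    public void search(String indexDir, String q) throws IOException, ParseException {
        Directory dir = FSDirectory.open(new File(indexDir));
        IndexSearcher is = new IndexSearcher(dir, true); // true = open read-only
        QueryParser parser = new QueryParser("contents",
                new StandardAnalyzer(Version.LUCENE_CURRENT));
        Query query = parser.parse(q);
        TopDocs hits = is.search(query, 10);
        System.err.println("Found " + hits.totalHits + " document(s)");
        for (int i = 0; i < hits.scoreDocs.length; i++) {
            ScoreDoc scoreDoc = hits.scoreDocs[i];
            Document doc = is.doc(scoreDoc.doc);
            System.out.println(doc.get("filename"));
        }
        is.close();
    }
}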