Open source Technology

Simple: You can read the code.
You can see how it's made

Two main characteristics
First, it's free of charge.

Second (much more important and
interesting), it’s free as in freedom.

Four Freedoms
* The freedom to run the program for any
  purpose

* The freedom to study how the program
  works, and adapt it to your needs

* The freedom to redistribute copies

* The freedom to improve the program

Anyone can do whatever they like with it.
Nobody owns it, everyone can use it, anyone
can improve it.

Improved in terms of quantity of code
(functionality)
People add layers on top of other people’s code

As the code base grows, the potential grows
Improves chances of it being used for something
not intended by the originator

What does it take to be a Web Developer?

Let's take a brief look at what a
“Web Developer” is

And that was just the Ruby stack

What does it take to be a Web Developer?

Linux
* Very reliable OS

* Extremely powerful

* Performs well even with limited
resources

* Compelling graphics

* Powerful programming support

* Scalable

* No piracy issues

Web server can refer to either the hardware (the
computer) or the software (the computer
application) that helps to deliver Web content
that can be accessed through the Internet.

The most common use of web servers is to host
websites, but there are other uses such as
gaming, data storage or running enterprise
applications.

Apache
* Runs on all major platforms
   (*NIX, Windows, Mac, FreeBSD, and just about any
   other you can name)

* Largest market share among web servers
   since 1996, and still growing.

MySQL
* Relational database

* One of the world’s fastest-growing open
   source database servers.

* Fast performance, high reliability
   and ease of use.

* It's used on every continent.
   Yes, even Antarctica.

* Works on more than 20 platforms,
   including Linux, Windows, OS X, HP-UX,
   AIX and NetWare, to name a few

* Supports various storage engines

PHP
* Open source server-side scripting
   language designed specifically for the
   web.

* Most widely used language on the web

* Outputs not only HTML but also XML,
   images (JPG & PNG), PDF files and even
   Flash movies (using libswf and Ming), all
   generated on the fly. It can also write these
   files to the filesystem.

* Supports a wide range of databases
   (20 + ODBC).

* Perl- and C-like syntax. Relatively easy
   to learn.

A copy of real data with faster (and/or
cheaper) access.

From Wikipedia: "A cache is a
collection of data duplicating original
values stored elsewhere or computed earlier,
where the original data is expensive to
fetch (owing to longer access time) or
to compute, compared to the cost of
reading the cache."

MySQL query cache: cache in the DB

Disk: file cache

In memory: Memcached

What is Memcached?
Free & open source, high-performance, distributed
memory object caching system, generic in nature,
but intended for use in speeding up dynamic web
applications by alleviating database load.

Memcached is an in-memory key-value store for
small chunks of arbitrary data (strings, objects)
from results of database calls, API calls, or page
rendering.

Memcached is simple yet powerful. Its simple
design promotes quick deployment, ease of
development, and solves many problems facing large
data caches. Its API is available for most popular
languages.

Memcached Users

Facebook
Naukri
LiveJournal
Wikipedia
Flickr
Bebo
Twitter
Typepad
Yellowbot
YouTube
Digg
WordPress.com
Craigslist
Mixi

Pattern

Fetch from cache

If there, return

Else calculate, place in cache, return

Program
function get_foo(foo_id)
    foo = memcached_get("foo:" . foo_id)
    return foo if defined foo

    foo = fetch_foo_from_database(foo_id)
    memcached_set("foo:" . foo_id, foo)
    return foo
end
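
The fetch / return / calculate-and-store pattern can be sketched in Python as follows. This is only an illustration: a plain dict stands in for a Memcached client, and `fetch_foo_from_database` and the `stats` counter are hypothetical helpers added to show that the slow path runs once per key.

```python
# A dict stands in for a Memcached client; a real client would talk to a server.
cache = {}
stats = {"db_calls": 0}  # hypothetical counter: how often the slow path runs


def fetch_foo_from_database(foo_id):
    """Hypothetical slow lookup; here it just builds a record."""
    stats["db_calls"] += 1
    return {"id": foo_id, "name": f"foo-{foo_id}"}


def get_foo(foo_id):
    key = f"foo:{foo_id}"
    foo = cache.get(key)                   # fetch from cache
    if foo is not None:
        return foo                         # if there, return
    foo = fetch_foo_from_database(foo_id)  # else calculate,
    cache[key] = foo                       # place in cache,
    return foo                             # and return


first = get_foo(42)
second = get_foo(42)                       # served from the cache
print(first == second, stats["db_calls"])  # → True 1
```

A real deployment would also set an expiry time on each entry and delete or overwrite the cached value whenever the underlying row changes.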

Let's add Memcached to the code

Gearmand
Daemon that manages the work.

Does not do any work itself.

Accepts a job ID and a binary payload from
clients.

Workers keep connections open at all
times.

Client

Clients connect to Gearmand and ask for
  work to be done

The client can fire and forget or wait for
  a response

Multiple jobs can be done asynchronously
  by workers for one client.

Workers

A single worker can do just one job or
can do many jobs.

Does not have to be written in the
same language as the clients.

An Example Client
# Create our client object.
$client = new GearmanClient();

# Add default server (localhost).
$client->addServer();

echo "Sending job\n";

# Send the "reverse" job and wait for the result.
# doNormal() replaces the deprecated do() in newer pecl/gearman releases.
$result = $client->doNormal("reverse", "Hello!");
if ($result) {
    echo "Success: $result\n";
}

An Example Worker
# Create our worker object.
$worker = new GearmanWorker();

# Add default server (localhost).
$worker->addServer();

# Register function "reverse" with the server.
$worker->addFunction("reverse", "reverse_fn");

# Process jobs until an error occurs.
while (1)
{
  print "Waiting for job...\n";
  $ret = $worker->work();
  if ($worker->returnCode() != GEARMAN_SUCCESS)
    break;
}

# A much simpler reverse function
function reverse_fn($job)
{
  $workload = $job->workload();
  echo "Received job: " . $job->handle() . "\n";
  echo "Workload: $workload\n";
  $result = strrev($workload);
  echo "Result: $result\n";
  return $result;
}
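
To make the division of roles concrete, here is a rough in-process analogy in Python. It does not use Gearman at all: a `queue.Queue` plays the part of the job server (it holds work but does none), a thread plays the worker, and `submit` plays a client that waits for a response. All names here are made up for the sketch.

```python
import queue
import threading

jobs = queue.Queue()  # stands in for the job server: holds work, does none


def reverse_fn(workload):
    return workload[::-1]


functions = {"reverse": reverse_fn}  # functions this worker has registered


def worker_loop():
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: shut the worker down
            break
        name, workload, reply = job
        reply.put(functions[name](workload))


def start_worker():
    thread = threading.Thread(target=worker_loop, daemon=True)
    thread.start()
    return thread


def submit(name, workload):
    """Client side: hand a job to the queue and wait for the result."""
    reply = queue.Queue(maxsize=1)
    jobs.put((name, workload, reply))
    return reply.get(timeout=5)


worker = start_worker()
print(submit("reverse", "Hello!"))  # → !olleH
jobs.put(None)
worker.join()
```

With a real Gearman setup the queue lives in a separate daemon, so clients and workers can run on different machines and be written in different languages.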

Database paradigms

* Relational (RDBMS)

* NoSQL
  * Key-value stores
  * Document databases
  * Graph databases

* Others

Relational Databases
* ACID
  * Atomicity
  * Consistency
  * Isolation
  * Durability

* SQL

* Mature
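
Atomicity is the easiest of the ACID properties to demonstrate. The sketch below uses Python's built-in `sqlite3` module as a stand-in relational database; the `accounts` table, the `transfer` helper and the account names are invented for the example. A transfer that would overdraw an account fails half-way, and the earlier credit is rolled back with it.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money in one transaction; returns True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:  # the CHECK constraint rejected the debit
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = make_db()
print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer credits bob first and then fails on alice's debit, yet the balances printed afterwards are unchanged from the first transfer: either both updates happen or neither does.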

NoSQL
* No relational tables

* No fixed table schemas

* No joins

* No risk, no fun!

* Massive data stores

* Scaling is easy

* Simpler to implement

Goodbye rows and tables, hello documents and collections

Lots of pretty pictures to fool you.

Introduction

MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and
traditional RDBMS systems (which provide rich queries and deep functionality).

MongoDB is document-oriented, schema-free, scalable, high-performance, open source. Written in C++

Mongo is not a relational database like MySQL

Goodbye rows and tables, hello documents and collections

Features
Document-oriented
* Documents (objects) map nicely to programming language data types
* Embedded documents and arrays reduce the need for joins
* No joins and no multi-document transactions, for high performance and easy scalability

High performance
* No joins, and embedding makes reads and writes fast
* Indexes, including indexing of keys from embedded documents and arrays

High availability
* Replicated servers with automatic master failover

Easy scalability
* Automatic sharding (auto-partitioning of data across servers)
* Reads and writes are distributed over shards
* No joins or multi-document transactions make distributed queries easy and fast
* Eventually-consistent reads can be distributed over replicated servers
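
The idea behind partitioning data across servers can be illustrated with a toy router. Note the assumption: this sketch uses simple hash partitioning on a shard key, which is not a description of how MongoDB itself assigns or balances data; the shard names and helper functions are hypothetical.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical shard names


def shard_for(key, shards=SHARDS):
    """Pick a shard deterministically from the hash of the shard key."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def partition(docs, shard_key, shards=SHARDS):
    """Group documents by the shard their shard-key value maps to."""
    placement = {name: [] for name in shards}
    for doc in docs:
        placement[shard_for(doc[shard_key], shards)].append(doc)
    return placement


docs = [{"user_id": i, "name": f"user{i}"} for i in range(10)]
for shard, members in partition(docs, "user_id").items():
    print(shard, [d["user_id"] for d in members])
```

Because every document with the same key always lands on the same shard, a read or write for one key touches a single server, which is what lets load spread out as shards are added.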

Why?

* Cost: MongoDB is free.
* MongoDB is easy to install.
* MongoDB supports various programming languages, such as C, C++, Java, JavaScript and PHP.
* MongoDB is blazingly fast.
* MongoDB is schemaless.
* Ease of scale-out: if load increases, it can be distributed to other nodes across computer networks.
* It's trivially easy to add more fields, even complex fields, to your objects, so as requirements change you can adapt code quickly.
* Background indexing.
* MongoDB is a stand-alone server.
* Development time is faster, too, since there are no schemas to manage.
* It supports server-side JavaScript execution, which allows a developer to use a single programming language for both client-side and server-side code.

Limitations

Mongo is limited to a total data size of 2GB for all databases in 32-bit mode.

No referential integrity

Data size in MongoDB is typically higher.

At the moment Map/Reduce (e.g. to do aggregations/data analysis) is OK,
but not blisteringly fast.

Group by: results are limited to fewer than 10,000 keys.
For larger grouping operations, use map/reduce.

Lack of predefined schema is a double-edged sword

No support for joins and transactions

Mongo data model


A Mongo system (see deployment above) holds a set of databases

A database holds a set of collections

A collection holds a set of documents

A document is a set of fields

A field is a key-value pair

A key is a name (string)

A value is one of:

* a basic type such as string, integer, float, timestamp, binary, etc.
* a document
* an array of values
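
As a rough illustration of this data model, a document maps naturally onto a nested Python dict. The `post` document and its field names are invented for the example, and `get_path` is a hypothetical helper that mimics dotted field paths such as `author.name`.

```python
# One document: a set of fields, each a key (string) and a value.
post = {
    "title": "Hello documents",                 # basic type: string
    "views": 42,                                # basic type: integer
    "rating": 4.5,                              # basic type: float
    "author": {"name": "Ann", "city": "Pune"},  # value that is a document
    "tags": ["mongodb", "nosql"],               # value that is an array
    "comments": [                               # array of embedded documents
        {"by": "Bob", "text": "Nice"},
        {"by": "Cy", "text": "Thanks"},
    ],
}


def get_path(doc, dotted):
    """Follow a dotted path through nested dicts and lists."""
    value = doc
    for part in dotted.split("."):
        value = value[int(part)] if isinstance(value, list) else value[part]
    return value


print(get_path(post, "author.name"))    # → Ann
print(get_path(post, "comments.1.by"))  # → Cy
```

The author and the comments live inside the post itself, which is the sense in which embedding reduces the need for joins.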

MySQL Term    Mongo Term
----------    ----------
database      database
table         collection
index         index

Continued ...
SQL Statement Mongo Statement
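
To give a feel for how a SQL statement corresponds to a Mongo-style query document, here is a toy matcher in Python. It runs against a plain list of dicts, not a real MongoDB server, and supports only equality plus the `$gt` and `$lt` operators; the `users` data and the `find` helper are assumptions made for the sketch.

```python
users = [
    {"name": "ann", "age": 34, "city": "Pune"},
    {"name": "bob", "age": 27, "city": "Delhi"},
    {"name": "cy", "age": 41, "city": "Pune"},
]

# Only two comparison operators are modelled in this toy version.
OPERATORS = {
    "$gt": lambda value, bound: value > bound,
    "$lt": lambda value, bound: value < bound,
}


def matches(doc, query):
    """True if doc satisfies every field condition in the query document."""
    for field, condition in query.items():
        if field not in doc:
            return False
        if isinstance(condition, dict):
            for op, bound in condition.items():
                if not OPERATORS[op](doc[field], bound):
                    return False
        elif doc[field] != condition:
            return False
    return True


def find(collection, query):
    return [doc for doc in collection if matches(doc, query)]


# SQL:   SELECT * FROM users WHERE city = 'Pune'
# Mongo: db.users.find({city: "Pune"})
print([u["name"] for u in find(users, {"city": "Pune"})])

# SQL:   SELECT * FROM users WHERE age > 30 AND age < 40
# Mongo: db.users.find({age: {$gt: 30, $lt: 40}})
print([u["name"] for u in find(users, {"age": {"$gt": 30, "$lt": 40}})])
```

The query is itself a document, so it can be built and passed around like any other data structure instead of being assembled as a string.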

Why & How ?

* Bugs are bad

* Locate issues during runtime

* Speed up issue resolution

* Breakpoints

* Xdebug

Xdebug
Xdebug is a PHP extension that aims to
lend a helping hand in the process of
debugging your applications. Xdebug
offers features like:

    * Automatic stack trace upon error
    * Function call logging
    * Display features such as enhanced
      var_dump() output and code
      coverage information

    * Open source
    * Free

Enabling Xdebug in php.ini

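; Load the extension (the path depends on your PHP build).
zend_extension="/usr/lib/php5/20090626+lfs/xdebug.so"

; Remote debugging: connect back to an IDE on localhost:9000.
xdebug.remote_enable=1
xdebug.remote_host="127.0.0.1"
xdebug.remote_port=9000

; Show local variables in stack traces.
xdebug.show_local_vars=On

; Function traces.
xdebug.trace_output_dir="/tmp/xprofile/"
xdebug.trace_output_name=%t.trace

; Profiler output.
xdebug.profiler_enable=1
xdebug.profiler_output_dir="/tmp/xprofile/"
xdebug.profiler_output_name=%s.%t.profile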

Apache Lucene is a free/open source
information retrieval software library,
originally created in Java by Doug
Cutting.

Scalable, High-Performance Indexing

* small RAM requirements
* incremental indexing as fast as batch indexing
* index size roughly 20-30% the size of text indexed

Powerful, Accurate and Efficient Search Algorithms

* ranked searching: best results returned first
* many powerful query types: phrase queries, wildcard
  queries, proximity queries, range queries and more
* fielded searching (e.g., title, author, contents)
* date-range searching
* sorting by any field
* multiple-index searching with merged results
* allows simultaneous update and searching

Cross-Platform Solution

* Available as Open Source software under the Apache
  License, which lets you use Lucene in both commercial
  and Open Source programs
* 100% pure Java
* Implementations in other programming languages
  available that are index-compatible


Pitfalls
* Update = Delete + Add
* No partial document update
* No joins

Code: FS Indexer
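// Targets the Lucene 2.9/3.x-era API used on the original slide.
import java.io.File;
import java.io.FileFilter;
import java.io.FileReader;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Indexer {
    private IndexWriter writer;
    public Indexer(String indexDir) throws IOException {
        Directory dir = FSDirectory.open(new File(indexDir));
        writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_CURRENT), true,
                IndexWriter.MaxFieldLength.UNLIMITED);
    }
    public void close() throws IOException {
        writer.close();
    }
    // Index every regular file in dataDir accepted by filter (null accepts all).
    public void index(String dataDir, FileFilter filter) throws Exception {
        File[] files = new File(dataDir).listFiles(filter);
        if (files == null) return; // dataDir is not a readable directory
        for (File f : files) {
            if (f.isDirectory()) continue;
            Document doc = new Document();
            doc.add(new Field("contents", new FileReader(f)));
            doc.add(new Field("filename", f.getName(),
                    Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.addDocument(doc);
        }
    }
}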

Code: Searcher
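// Targets the Lucene 2.9/3.x-era API used on the original slide.
import java.io.File;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class Searcher {
    public void search(String indexDir, String q) throws IOException, ParseException {
        Directory dir = FSDirectory.open(new File(indexDir));
        IndexSearcher is = new IndexSearcher(dir, true); // true = read-only

        // Parse the query against the "contents" field filled by the indexer.
        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "contents",
                new StandardAnalyzer(Version.LUCENE_CURRENT));
        Query query = parser.parse(q);
        TopDocs hits = is.search(query, 10); // top 10 matches
        System.err.println("Found " + hits.totalHits + " document(s)");

        for (ScoreDoc scoreDoc : hits.scoreDocs) {
            Document doc = is.doc(scoreDoc.doc);
            System.out.println(doc.get("filename"));
        }

        is.close();
    }
}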
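
The data structure underneath an indexer and searcher like these is an inverted index: a map from each term to the documents that contain it. The Python sketch below is a deliberately tiny version with a naive term-count score; it is not Lucene's analysis or ranking, and the `TinyIndex` class is hypothetical. It also shows why an update amounts to a delete followed by an add.

```python
import re
from collections import Counter, defaultdict


class TinyIndex:
    """Toy inverted index: term -> {doc_id: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(Counter)
        self.docs = set()

    @staticmethod
    def analyze(text):
        # Crude analyzer: lowercase and split on non-alphanumerics.
        return re.findall(r"[a-z0-9]+", text.lower())

    def delete(self, doc_id):
        if doc_id in self.docs:
            self.docs.remove(doc_id)
            for counts in self.postings.values():
                counts.pop(doc_id, None)

    def add(self, doc_id, text):
        self.delete(doc_id)  # update = delete + add
        self.docs.add(doc_id)
        for term in self.analyze(text):
            self.postings[term][doc_id] += 1

    def search(self, query, limit=10):
        # Score = how often the query terms occur in each document.
        scores = Counter()
        for term in self.analyze(query):
            for doc_id, tf in self.postings.get(term, {}).items():
                scores[doc_id] += tf
        return [doc_id for doc_id, _ in scores.most_common(limit)]


index = TinyIndex()
index.add("a.txt", "open source search")
index.add("b.txt", "search, search engine")
print(index.search("search"))  # → ['b.txt', 'a.txt']
index.add("b.txt", "engine only")  # re-adding replaces the old version
print(index.search("search"))  # → ['a.txt']
```

Searching only touches the posting lists for the query terms, which is why lookups stay fast as the number of documents grows.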
