Open source Technology


  1. 1. Open Source Technologies
  2. 2. What is Open Source?
  3. 3. Simple: You can read the code. You can see how it's made.
  4. 4. Two main characteristics. First, it's FREE.
  5. 5. Second (much more important and interesting), it's free as in freedom.
  6. 6. Four Freedoms
      * The freedom to run the program for any purpose
      * The freedom to study how the program works, and adapt it to your needs
      * The freedom to redistribute copies
      * The freedom to improve the program
  7. 7. Why is this cool?
  8. 8. Anyone can do whatever they like with it. Nobody owns it, everyone can use it, anyone can improve it.
  9. 9. Improved in terms of quantity of code (functionality). People add layers on top of other people's code.
  10. 10. As the code base grows, the potential grows. It improves the chances of the code being used for something not intended by the originator.
  11. 11. What does it take to be a Web Developer?
  12. 12. HTML & PHP
  13. 13. Let's take a brief look at what a “Web Developer” is
  14. 14. And that was just the Ruby stack
  15. 15. Now back to the question
  16. 16. What does it take to be a Web Developer?
  17. 17. A Passion for Learning
  18. 18. LAMP
  19. 19. Linux
  20. 20.
      * Very reliable OS
      * Extremely powerful
      * Performs well even with limited resources
      * Compelling graphics
      * Powerful programming support
      * Scalable
      * No piracy issues
  21. 21. Apache
  22. 22. Web server can refer to either the hardware (the computer) or the software (the computer application) that helps to deliver Web content that can be accessed through the Internet. The most common use of web servers is to host websites, but there are other uses such as gaming, data storage, or running enterprise applications.
      Apache
      * Runs on all major platforms (*NIX, Windows, Mac, FreeBSD, and others)
      * Largest market share among web servers since 1996 and still growing
  23. 23. MySQL
  24. 24.
      * Relational database
      * One of the world's fastest-growing open source database servers
      * Fast performance, high reliability, and ease of use
      * It's used on every continent. Yes, even Antarctica.
      * Works on more than 20 platforms, including Linux, Windows, OS X, HP-UX, AIX, and NetWare, to name a few
      * Supports various storage engines
  25. 25. PHP
  26. 26.
      * Open source server-side scripting language designed specifically for the web
      * One of the most widely used languages on the web
      * Outputs not only HTML but also XML, images (JPG & PNG), PDF files, and even Flash movies (using libswf and Ming), all generated on the fly. It can also write these files to the filesystem.
      * Supports a wide range of databases (20 + ODBC)
      * Perl- and C-like syntax. Relatively easy to learn.
  27. 27. LAMP Overview
  28. 28. Let's CODE :)
  29. 29. Memcache
  30. 30. What is Caching?
  31. 31. A copy of real data with faster (and/or cheaper) access. From Wikipedia: "A cache is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer access time) or to compute, compared to the cost of reading the cache."
  32. 32.
      * MySQL query cache: cache in the DB
      * Disk: file cache
      * In memory: Memcached
  33. 33. What is Memcache?
      Free and open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
      Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
      Memcached is simple yet powerful. Its simple design promotes quick deployment and ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.
  34. 34. Memcache Users: Facebook, Naukri, LiveJournal, Wikipedia, Flickr, Bebo, Twitter, Typepad, Yellowbot, YouTube, Digg, WordPress.com, Craigslist, Mixi
  35. 35. Pattern
      - Fetch from cache
      - If there, return
      - Else calculate, place in cache, return
  36. 36. Program
      function get_foo(foo_id)
          foo = memcached_get("foo:" . foo_id)
          return foo if defined foo

          foo = fetch_foo_from_database(foo_id)
          memcached_set("foo:" . foo_id, foo)
          return foo
      end
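
The fetch / return / else-calculate pattern can be sketched in Java roughly as follows. This is a minimal illustration, not a memcached client: a `HashMap` stands in for the cache, and the class name `CacheAsideDemo`, the `databaseCalls` counter, and the lookup function passed to the constructor are all hypothetical names introduced here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Cache-aside sketch. A HashMap stands in for memcached (an assumption for
// illustration); a real deployment would call a memcached client instead.
class CacheAsideDemo {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<Integer, String> database; // hypothetical slow lookup
    int databaseCalls = 0;                            // counts cache misses

    CacheAsideDemo(Function<Integer, String> database) {
        this.database = database;
    }

    String getFoo(int fooId) {
        String key = "foo:" + fooId;
        String foo = cache.get(key);   // 1. fetch from cache
        if (foo != null) {
            return foo;                // 2. if there, return
        }
        databaseCalls++;
        foo = database.apply(fooId);   // 3. else calculate,
        cache.put(key, foo);           //    place in cache,
        return foo;                    //    return
    }

    public static void main(String[] args) {
        CacheAsideDemo demo = new CacheAsideDemo(id -> "row-" + id);
        System.out.println(demo.getFoo(7)); // miss: goes to the "database"
        System.out.println(demo.getFoo(7)); // hit: served from the cache
        System.out.println("database calls: " + demo.databaseCalls);
    }
}
```

The second call for the same id never reaches the lookup function, which is the whole point of the pattern: the expensive source is consulted only on a miss.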
  37. 37. Let's add Memcache to the CODE
  38. 38. GEARMAN ?
  39. 39. MANAGER
  40. 40. Gearmand
      - Daemon that manages the work.
      - Does not do any work itself.
      - Accepts a job ID and a binary payload from clients.
      - Workers keep connections open at all times.
  41. 41. Client
      - Clients connect to Gearmand and ask for work to be done.
      - The client can fire and forget or wait on a response.
      - Multiple jobs can be done asynchronously by workers for one client.
  42. 42. Workers
      - A single worker can do just one job or can do many jobs.
      - Does not have to be written in the same language as the clients.
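
The division of labour between the job server, clients, and workers can be sketched with a toy, in-process dispatcher. This is only an illustration of the roles under simplifying assumptions: there is no network, no daemon, and no asynchrony, and `ToyJobServer`, `addFunction`, and `submit` are hypothetical names, not Gearman's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Toy job server: it only routes payloads to registered workers and does
// no work itself. Everything runs in one process (an assumption for brevity).
class ToyJobServer {
    // function name -> worker callback registered for it
    private final Map<String, UnaryOperator<String>> workers = new HashMap<>();

    // A worker announces which named function it can perform.
    void addFunction(String name, UnaryOperator<String> worker) {
        workers.put(name, worker);
    }

    // A client submits a function name and a payload, then waits for the result.
    String submit(String name, String payload) {
        UnaryOperator<String> worker = workers.get(name);
        if (worker == null) {
            throw new IllegalArgumentException("no worker registered for " + name);
        }
        return worker.apply(payload);
    }

    public static void main(String[] args) {
        ToyJobServer server = new ToyJobServer();
        // Worker side: register a "reverse" function.
        server.addFunction("reverse", s -> new StringBuilder(s).reverse().toString());
        // Client side: ask for the work to be done.
        System.out.println(server.submit("reverse", "Hello!")); // prints "!olleH"
    }
}
```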
  43. 43. An Example Client
      <?php
      # Create our client object.
      $client = new GearmanClient();

      # Add default server (localhost).
      $client->addServer();

      echo "Sending job\n";

      # Send reverse job and wait for the result.
      # doNormal() replaces do(), which is deprecated in pecl/gearman 1.0.0 and later.
      $result = $client->doNormal("reverse", "Hello!");
      if ($result) {
        echo "Success: $result\n";
      }
  44. 44. An Example Worker
      <?php
      # Create our worker object.
      $worker = new GearmanWorker();

      # Add default server (localhost).
      $worker->addServer();

      # Register function "reverse" with the server.
      $worker->addFunction("reverse", "reverse_fn");

      # Process jobs until one fails.
      while (1) {
        print "Waiting for job...\n";
        $ret = $worker->work();
        if ($worker->returnCode() != GEARMAN_SUCCESS)
          break;
      }

      # A much simpler reverse function
      function reverse_fn($job) {
        $workload = $job->workload();
        echo "Received job: " . $job->handle() . "\n";
        echo "Workload: $workload\n";
        $result = strrev($workload);
        echo "Result: $result\n";
        return $result;
      }
  45. 45. NOSQL
  46. 46. Database paradigms
      * Relational (RDBMS)
      * NoSQL
        * Key-value stores
        * Document databases
        * Graph databases
      * Others
  47. 47. Relational Databases
      * ACID: Atomicity, Consistency, Isolation, Durability
      * SQL
      * Mature
  48. 48. NoSQL
      * No relational tables
      * No fixed table schemas
      * No joins
      * No risk, no fun!
      * Massive data stores
      * Scaling is easy
      * Simpler to implement
  49. 49. Goodbye rows and tables, hello documents and collections
  50. 50. Lots of pretty pictures to fool you.
  51. 51. Noise
  52. 52. Introduction
      MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMS systems (which provide rich queries and deep functionality).
      MongoDB is document-oriented, schema-free, scalable, high-performance, and open source. It is written in C++.
      Mongo is not a relational database like MySQL.
      Goodbye rows and tables, hello documents and collections.
      Features
      * Document-oriented
        * Documents (objects) map nicely to programming language data types
        * Embedded documents and arrays reduce need for joins
        * No joins and no multi-document transactions for high performance and easy scalability
      * High performance
        * No joins and embedding makes reads and writes fast
        * Indexes, including indexing of keys from embedded documents and arrays
      * High availability
        * Replicated servers with automatic master failover
      * Easy scalability
        * Automatic sharding (auto-partitioning of data across servers)
        * Reads and writes are distributed over shards
        * No joins or multi-document transactions make distributed queries easy and fast
        * Eventually-consistent reads can be distributed over replicated servers
  53. 53. Why?
      * Cost: MongoDB is free
      * MongoDB is easy to install
      * MongoDB supports various programming languages such as C, C++, Java, JavaScript, and PHP
      * MongoDB is blazingly fast
      * MongoDB is schemaless
      * Ease of scale-out: if load increases, it can be distributed to other nodes across computer networks
      * It's trivially easy to add more fields, even complex fields, to your objects. So as requirements change, you can adapt code quickly.
      * Background indexing
      * MongoDB is a stand-alone server
      * Development time is faster, too, since there are no schemas to manage
      * It supports server-side JavaScript execution, which allows a developer to use a single programming language for both client-side and server-side code
  54. 54. Limitations
      * Mongo is limited to a total data size of 2GB for all databases in 32-bit mode
      * No referential integrity
      * Data size in MongoDB is typically higher
      * At the moment Map/Reduce (e.g. to do aggregations/data analysis) is OK, but not blisteringly fast
      * Group By: less than 10,000 keys. For larger grouping operations without limits, use map/reduce.
      * Lack of predefined schema is a double-edged sword
      * No support for joins and transactions
  55. 55. Mongo data model
      * A Mongo system (see deployment above) holds a set of databases
      * A database holds a set of collections
      * A collection holds a set of documents
      * A document is a set of fields
      * A field is a key-value pair
      * A key is a name (string)
      * A value is
        * a basic type like string, integer, float, timestamp, binary, etc.,
        * a document, or
        * an array of values

      MySQL Term | Mongo Term
      database   | database
      table      | collection
      index      | index
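
To make the data model concrete, here is a small sketch that builds one document as nested Java maps and lists. It uses plain `java.util` collections rather than a MongoDB driver, and the field names (`title`, `views`, `author`, `tags`) and the class name `DocumentModelDemo` are hypothetical examples, not taken from the slides.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// A document is a set of fields; each field is a key-value pair whose value
// is a basic type, another document, or an array of values.
class DocumentModelDemo {
    static Map<String, Object> buildPost() {
        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "alice");                  // basic type: string
        author.put("karma", 42);                      // basic type: integer

        Map<String, Object> post = new LinkedHashMap<>();
        post.put("title", "Hello documents");
        post.put("views", 10);
        post.put("author", author);                   // embedded document
        post.put("tags", Arrays.asList("nosql", "intro")); // array of values
        return post;
    }

    public static void main(String[] args) {
        System.out.println(buildPost());
    }
}
```

Because the author and the tags live inside the post itself, reading the post needs no join, which is the trade-off the data model is built around.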
  56. 56. SQL to Mongo Mapping Chart
  57. 57. Continued ... SQL Statement Mongo Statement
  58. 58. Debugging & Profiling
  61. 61. Why & How?
      * Bugs are bad
      * Locate issues during runtime
      * Speed up issue resolution
      * Breakpoints
      * Xdebug
  62. 62. Xdebug
      Xdebug is a PHP extension that aims to lend a helping hand in the process of debugging your applications. Xdebug offers features like:
      * Automatic stack trace upon error
      * Function call logging
      * Display features such as enhanced var_dump() output and code coverage information
      - Open Source
      - Free
  63. 63. Enabling Xdebug in php.ini
      zend_extension="/usr/lib/php5/20090626+lfs/xdebug.so"
      xdebug.remote_enable=1
      xdebug.remote_host="127.0.0.1"
      xdebug.remote_port=9000
      xdebug.profiler_enable=1
      xdebug.show_local_vars=On
      xdebug.trace_output_dir="/tmp/xprofile/"
      xdebug.trace_output_name=%t.trace
      xdebug.profiler_output_name=%s.%t.profile
      xdebug.profiler_output_dir="/tmp/xprofile/"
  66. 66. Lucene
  67. 67. Apache Lucene is a free/open source information retrieval software library, originally created in Java by Doug Cutting.
  68. 68. Scalable, High-Performance Indexing
      * small RAM requirements
      * incremental indexing as fast as batch indexing
      * index size roughly 20-30% the size of text indexed
      Powerful, Accurate and Efficient Search Algorithms
      * ranked searching: best results returned first
      * many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
      * fielded searching (e.g., title, author, contents)
      * date-range searching
      * sorting by any field
      * multiple-index searching with merged results
      * allows simultaneous update and searching
      Cross-Platform Solution
      * Available as Open Source software under the Apache License, which lets you use Lucene in both commercial and Open Source programs
      * 100%-pure Java
      * Implementations in other programming languages available that are index-compatible
  72. 72. Pitfalls
      * Update = Delete + Add
      * No partial document update
      * No joins
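
The "Update = Delete + Add" and "No partial document update" pitfalls can be illustrated with a toy index. This is a sketch of the behaviour only, not Lucene's API: `ToyIndex`, its methods, and the document id `doc1` are hypothetical, and documents are plain maps.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a search index (not Lucene's API): documents are stored
// whole, keyed by a hypothetical id, so an update replaces the entire document.
class ToyIndex {
    private final Map<String, Map<String, String>> docs = new HashMap<>();

    void add(String id, Map<String, String> fields) {
        docs.put(id, new HashMap<>(fields));
    }

    void delete(String id) {
        docs.remove(id);
    }

    // Update = delete + add: the old document is dropped entirely.
    void update(String id, Map<String, String> fields) {
        delete(id);
        add(id, fields);
    }

    // Returns the field value, or null if the document or field is absent.
    String field(String id, String name) {
        Map<String, String> doc = docs.get(id);
        return doc == null ? null : doc.get(name);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, String> original = new HashMap<>();
        original.put("title", "Old title");
        original.put("author", "alice");
        index.add("doc1", original);

        // Supplying only the changed field loses every other field.
        Map<String, String> changed = new HashMap<>();
        changed.put("title", "New title");
        index.update("doc1", changed);

        System.out.println(index.field("doc1", "title"));  // prints "New title"
        System.out.println(index.field("doc1", "author")); // prints "null"
    }
}
```

The practical consequence is that the caller has to rebuild the complete document, including unchanged fields, every time one field changes.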
  73. 73. Code: FS Indexer
      // Uses the Lucene 2.9/3.x-era API shown on the slide.
      import java.io.File;
      import java.io.FileFilter;
      import java.io.FileReader;
      import java.io.IOException;
      import org.apache.lucene.analysis.standard.StandardAnalyzer;
      import org.apache.lucene.document.Document;
      import org.apache.lucene.document.Field;
      import org.apache.lucene.index.IndexWriter;
      import org.apache.lucene.store.Directory;
      import org.apache.lucene.store.FSDirectory;
      import org.apache.lucene.util.Version;

      public class Indexer {
        private IndexWriter writer;

        public Indexer(String indexDir) throws IOException {
          Directory dir = FSDirectory.open(new File(indexDir));
          // 'true' creates a new index, overwriting any existing one.
          writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_CURRENT), true,
                                   IndexWriter.MaxFieldLength.UNLIMITED);
        }

        public void close() throws IOException {
          writer.close();
        }

        public void index(String dataDir, FileFilter filter) throws Exception {
          // Apply the filter when listing files (a null filter accepts every file).
          File[] files = new File(dataDir).listFiles(filter);
          for (File f : files) {
            Document doc = new Document();
            doc.add(new Field("contents", new FileReader(f)));  // indexed, not stored
            doc.add(new Field("filename", f.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED));
            writer.addDocument(doc);
          }
        }
      }
  74. 74. Code: Searcher
      public void search(String indexDir, String q) throws IOException, ParseException {
        Directory dir = FSDirectory.open(new File(indexDir));
        IndexSearcher is = new IndexSearcher(dir, true);  // true = open read-only

        // Pass the match version explicitly, as newer Lucene releases require.
        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "contents",
                                             new StandardAnalyzer(Version.LUCENE_CURRENT));
        Query query = parser.parse(q);

        // Fetch the 10 best-scoring hits.
        TopDocs hits = is.search(query, 10);
        System.err.println("Found " + hits.totalHits + " document(s)");

        for (int i = 0; i < hits.scoreDocs.length; i++) {
          ScoreDoc scoreDoc = hits.scoreDocs[i];
          Document doc = is.doc(scoreDoc.doc);
          System.out.println(doc.get("filename"));
        }
        is.close();
      }
