slides (PPT)

  • Thank you for your kind introduction. Hello, everyone. I would like to speak to you about our study, titled "Secure and High-performance Web Server System for Shared Hosting Service". I’m Daisuke Hara. I was a master’s degree student until March at the Department of Computer Science, the University of Electro-Communications, Japan. Now, I’m a research engineer at Nippon Telegraph and Telephone Corporation.
  • The presentation is organized as follows: introduction, background of this study, design and implementation of our proposed system, Hi-sap, evaluation, and conclusions.
  • We begin with the introduction. Existing web servers have a problem: server embedded interpreters cannot be used safely in large-scale environments such as shared hosting services. To solve this problem, we designed and implemented a web server system, Hi-sap. In this system, the web objects stored on a server are divided into partitions. A “partition” is a unit of division, for example a site or a piece of content. Server processes run under the privilege of a different user for each partition. The system solves the problem and achieves high performance and scalability.
  • Next, we describe the background of this study. Recently, more people are creating their own websites as the Internet grows in popularity. Shared hosting services, where many customers share a server, are widely used.
  • Also, server embedded interpreters, for example PHP, mod_ruby, and mod_perl, are now commonly used. Because the server processes themselves contain the language interpreter, they can improve performance in processing dynamic content such as weblogs and wikis.
  • Next, we describe the problem of existing web servers. This picture shows HTTP authentication. [Click] With this mechanism, only users who have a valid ID and password can browse the authentication content. [Click] However, internal users can steal and delete the authentication content without authenticating, by using the cp and rm commands or malicious CGI scripts. [Click] The problem occurs because read permission must be granted to “other” in the UNIX permission model, “owner/group/other”, so that the server process can read the files.
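The permission issue described above can be illustrated with a minimal Python sketch. It only inspects mode bits with the standard `stat` module; the helper name `other_can_read` and the two example modes are illustrative assumptions, not part of the talk.

```python
import stat


def other_can_read(mode: int) -> bool:
    """Return True if the "other" class has read permission in a UNIX mode."""
    return bool(mode & stat.S_IROTH)


# rw-r--r-- (0o644): the mode shown on the slide; "other" can read the file,
# so any local user (or a CGI script run by one) can copy it.
world_readable = stat.S_IFREG | 0o644

# rw-r----- (0o640): "other" has no access at all.
restricted = stat.S_IFREG | 0o640

print(stat.filemode(world_readable), other_can_read(world_readable))
print(stat.filemode(restricted), other_can_read(restricted))
```

With the first mode, the server's dedicated user can read the file, but so can every other account on the shared host, which is the weakness the talk describes.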
  • The existing solution to this problem is POSIX ACL combined with suEXEC. In this solution, CGI scripts run under the privilege of the site owner by using suEXEC, and permissions on publicly accessible files are granted only to the dedicated user by using POSIX ACL. The “dedicated user” is the user account that runs the server processes, for example www, apache, or www-data. [Click] With this solution, read permission no longer has to be granted to “other”.
  • However, even if POSIX ACL and suEXEC are used, the problem still occurs when server embedded interpreters are used. Because dynamic content that uses a server embedded interpreter also runs under the privilege of the dedicated user, [Click] malicious PHP scripts can steal and delete authentication content.
  • To solve this problem, we developed a web server system called Harache before Hi-sap. Harache is the predecessor of Hi-sap. In Harache, server processes run under the privilege of the site owner. This picture shows the procedure in Harache. [Click] First, a browser sends a request to user A's website. [Click] Second, the privilege of the server process is changed to user A. [Click] Third, the server process processes the request. [Click] Last, it returns a response to the browser.
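The four steps above can be sketched roughly in Python. This is not Harache's code: the function names `owner_from_path` and `serve_as_owner`, the `/~user/` path pattern, and the choice to fork one child per request are assumptions made for illustration. The privilege switch only works when the parent runs as root.

```python
import os
import pwd
import re
from typing import Callable, Optional

# Assumed URL layout: /~userA/... maps to the site owner "userA".
_USERDIR = re.compile(r"^/~([A-Za-z0-9_][A-Za-z0-9_.-]*)(?:/|$)")


def owner_from_path(path: str) -> Optional[str]:
    """Extract the site owner from a request path, or None if absent."""
    match = _USERDIR.match(path)
    return match.group(1) if match else None


def serve_as_owner(path: str, handler: Callable[[str], None]) -> None:
    """Hypothetical per-request flow: fork, drop to the owner, handle, exit."""
    owner = owner_from_path(path)
    if owner is None:
        raise ValueError("request path does not name a site owner")
    account = pwd.getpwnam(owner)  # raises KeyError for an unknown user
    pid = os.fork()
    if pid == 0:
        # Child: this is the server process for this one request.
        status = 1
        try:
            os.setgroups([])            # drop supplementary groups first
            os.setgid(account.pw_gid)   # group before user, while still root
            os.setuid(account.pw_uid)   # irreversible switch to the owner
            handler(path)               # process the request as the owner
            status = 0
        finally:
            os._exit(status)            # the process ends after the request
    os.waitpid(pid, 0)
```

Because the child exits after every request, the interpreter state is thrown away each time, which is the CGI-like cost discussed next in the talk.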
  • With Harache, server embedded interpreters can be used safely because file permissions for a dedicated user are not necessary. Permissions need to be granted only to the site owner. However, Harache cannot fully exploit the speed advantage of server embedded interpreters because server processes terminate after each session, in the same way as CGI. Hi-sap solves Harache’s performance problem.
  • Our goal is to realize a secure, high-performance, and scalable web server system, named Hi-sap. In Hi-sap, first, the scripts of one partition cannot access other partitions. Second, dynamic content can be processed at high speed by fully exploiting the speed advantage of server embedded interpreters. Last, a large number of partitions can be housed in a single server.
  • Here, we present the design of Hi-sap. First, to achieve high security, server processes run under the privilege of a different user for each partition, in the same way as Harache. In addition, the system enforces access control with a secure OS. Second, to achieve high performance, the system pools server processes that run under the privileges of the different users, unlike Harache. Last, to achieve high scalability in the number of partitions per server, the system controls the creation and termination of server processes. This mechanism is named the Content Access Scheduler.
  • The Content Access Scheduler is a web-server-level scheduler. It enhances the scalability of the number of partitions in a server by controlling the creation and termination of server processes. By using a scheduler suited to the purpose, the system achieves high scalability.
  • We implemented Hi-sap on Linux with SELinux. The system consists of a dispatcher, 1000 workers, and hisapd. The dispatcher is a reverse proxy server that distributes requests to the workers. It was implemented as an Apache module, mod_hisap. Each worker runs under the privilege of a different user and processes requests for its own dedicated partition. Although 1000 Apache instances were used as workers in this implementation, any web server software can be used. The Content Access Scheduler and the other worker management facilities were implemented as a daemon, hisapd.
  • This picture shows an overview of request processing. [Click] When the dispatcher receives a request from a browser, for example for partition C in this picture, [Click] it checks whether the dedicated worker for partition C is active. If worker C is inactive, [Click] the dispatcher asks hisapd to activate it. [Click] [Click] After hisapd activates the worker, [Click] the dispatcher forwards the request to the worker. [Click] The worker processes the request. [Click] Then the dispatcher receives a response from the worker and sends the response to the browser. [Click] When the server is under heavy load, [Click] hisapd chooses a worker to terminate, for example worker A, [Click] and terminates it. [Click] The heavy load state is then relieved.
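The request flow above can be sketched as a small in-process model. This is only an illustration under stated assumptions: the class names `Hisapd` and `Dispatcher` and their methods are hypothetical, workers are plain Python callables rather than Apache instances, and direct method calls stand in for the HTTP and UNIX domain socket communication shown on the slide.

```python
from typing import Callable, Dict, Set


class Hisapd:
    """Stand-in for the hisapd daemon: tracks which workers are active."""

    def __init__(self) -> None:
        self.active: Set[str] = set()

    def activate(self, partition: str) -> None:
        self.active.add(partition)

    def terminate(self, partition: str) -> None:
        self.active.discard(partition)


class Dispatcher:
    """Stand-in for the reverse proxy that routes requests by partition."""

    def __init__(self, hisapd: Hisapd,
                 workers: Dict[str, Callable[[str], str]]) -> None:
        self.hisapd = hisapd
        self.workers = workers  # partition name -> worker request handler

    def handle(self, partition: str, request: str) -> str:
        if partition not in self.workers:
            raise KeyError(f"no worker configured for {partition}")
        # Check whether the dedicated worker is active; if not, ask hisapd.
        if partition not in self.hisapd.active:
            self.hisapd.activate(partition)
        # Forward the request and relay the worker's response.
        return self.workers[partition](request)


hisapd = Hisapd()
dispatcher = Dispatcher(hisapd, {"www.C.net": lambda req: f"C handled {req}"})
response = dispatcher.handle("www.C.net", "GET /")
print(response)
```

Keeping activation in a separate component mirrors the split in the talk, where the dispatcher only routes and hisapd alone decides which workers exist.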
  • We developed the Content Access Scheduler to avoid thrashing, which dramatically decreases the performance of web servers. The scheduling algorithm for worker activation is that hisapd dynamically activates workers in response to requests from the dispatcher. The scheduling algorithm for worker termination is that, when thrashing seems to be occurring, hisapd terminates workers that have not been requested recently.
  • hisapd judges that thrashing seems to be occurring under the following conditions: a swap-in occurs, a swap-out occurs, and memory use is 99% or more. hisapd chooses a worker to terminate under the following conditions: the worker is active, and the worker does not appear in the most recent 10,000 requests.
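The two sets of conditions above can be sketched as follows. The names `MemorySample`, `thrashing_suspected`, `RequestHistory`, and `workers_to_terminate` are hypothetical, and treating the three thrashing conditions as all required at once is an assumption, since the talk lists them without saying how they combine.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, List, Set


@dataclass
class MemorySample:
    swap_ins: int               # swap-in events seen in the sampling interval
    swap_outs: int              # swap-out events seen in the sampling interval
    memory_used_percent: float  # memory use as a percentage


def thrashing_suspected(sample: MemorySample, threshold: float = 99.0) -> bool:
    """Assumed conjunction: swap-in AND swap-out AND memory use >= 99%."""
    return (sample.swap_ins > 0
            and sample.swap_outs > 0
            and sample.memory_used_percent >= threshold)


class RequestHistory:
    """Remembers which partitions appear in the most recent N requests."""

    def __init__(self, size: int = 10_000) -> None:
        self._recent: Deque[str] = deque(maxlen=size)

    def record(self, partition: str) -> None:
        self._recent.append(partition)  # the oldest entry falls off when full

    def recent_partitions(self) -> Set[str]:
        return set(self._recent)


def workers_to_terminate(active_workers: Iterable[str],
                         history: RequestHistory) -> List[str]:
    """Active workers that do not appear in the recent request window."""
    recent = history.recent_partitions()
    return [worker for worker in active_workers if worker not in recent]


# Small window so the effect is visible: A's only request has aged out.
history = RequestHistory(size=3)
for partition in ["A", "B", "C", "C"]:
    history.record(partition)
victims = workers_to_terminate(["A", "B", "C"], history)
print(victims)
```

A bounded deque keeps the history at a fixed memory cost, which matters for a component whose purpose is to relieve memory pressure.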
  • Now, we evaluate the proposed web server system through experiments. This figure shows the experimental environment.
  • We performed two experiments. First, we evaluated the basic performance in processing dynamic content. Next, we evaluated the scalability of the number of partitions in a server in processing dynamic content. In these experiments, we sent requests to a PHP script that calls phpinfo(). The script displays the system information of the PHP language processor and generates 40 KB of traffic per request.
  • First, we performed a basic performance evaluation to determine whether our system delivers useful performance. For this experiment, Apache, One-to-one, and suEXEC were used for comparison. One-to-one uses a reverse proxy and has a dispatcher and many workers, each dedicated to processing requests for one partition. Although One-to-one is similar to our system, mod_hisap and hisapd are not installed, so all of the workers run from beginning to end. We used the httperf benchmark to measure the performance.
  • This graph shows the experimental results. The x-axis shows the request rate, and the y-axis shows the throughput. The system loses an average of 28.0% of the throughput relative to Apache. This overhead is caused by the reverse proxy. However, the system has high throughput relative to suEXEC. Also, because the system loses an average of only 1.0% of the throughput relative to One-to-one, the overhead of mod_hisap and hisapd is very low.
  • Next, we performed a scalability evaluation to determine the effectiveness of the Content Access Scheduler. For this experiment, One-to-one was used for comparison. We used the Apache benchmark tool to measure the performance.
  • This graph shows the experimental results. The x-axis shows the number of partitions in the server, and the y-axis shows the throughput. Our system’s scalability is high because the throughput dropped only slightly as the number of partitions increased. In contrast, with One-to-one, the OS crashed due to a memory shortage when the number of partitions reached 600.
  • This graph shows the change in memory use during the experiment. The x-axis shows the number of partitions in the server, and the y-axis shows the memory use. The swap use of One-to-one increases dramatically as the number of partitions increases. This is the cause of the OS crash. In contrast, our system uses far less swap space because of the Content Access Scheduler.
  • This table shows a comparison of approaches. The systems used for comparison each have some weak points. On the other hand, Hi-sap gets high marks in all items and does not have any weak points. Therefore, it is the most effective.
  • Finally, we conclude this presentation. We have designed and implemented a secure and high-performance web server system, Hi-sap. Experimental results show that our system achieves high performance and scalability.
  • Our future work is to create various Content Access Schedulers and evaluate them.
  • That's all for my presentation. Thank you for your kind attention. Any questions?

    1. Secure and High-performance Web Server System for Shared Hosting Service
         Daisuke Hara and Yasuichi Nakayama
         The University of Electro-Communications, Tokyo, Japan
    2. Outline
         • Introduction
         • Background
             • Problems of large-scale hosting service and web server
         • Proposal - Hi-sap
             • Design
             • Implementation
         • Evaluation
         • Conclusions
    3. Introduction
         • Problem of existing web servers
             • Server embedded interpreters cannot be used safely in large-scale environments like a shared hosting service.
         • Proposal - Hi-sap
             • Web objects that are stored in a server are divided into partitions*.
             • Server processes run under the privilege of different users in every partition.
         • Achievement
             • Hi-sap solves the problem.
             • It achieves high performance & scalability.
         (*) A “partition” is a unit of division of web objects (e.g. site, content, QUERY_STRING).
    4. Background
         • More people are creating their own websites as the Internet grows in popularity.
             • weblog, wiki, CMS
         • Shared hosting services are widely used.
             • Many customers share a server.
                 • 100s - 1000s sites/server
             • low price & flexible
                 • custom CGI, etc.
    5. Server embedded interpreters
         • e.g. PHP, mod_ruby, mod_perl
         • Because they have server processes including interpreters of language processors, they can improve performance in processing dynamic content like weblogs and wikis.
    6. Problem of existing web servers
         [Figure: a browser and a server hosting A’s, B’s, and C’s websites with auth content; labels: authentication (ID & Pass), steal & delete]
         • Internal users can steal & delete authentication content without authentication (cp, rm commands or malicious CGI scripts).
         • It is required to grant read permission to “other” (rw-r--r--).
    7. Problem of existing web servers (cont.)
         • Existing solution: POSIX ACL & suEXEC
             • CGI scripts run under the privilege of the site owner by using suEXEC.
             • Permissions of public access files are granted only to the dedicated user* by using POSIX ACL.
             • It is not required to grant read permission to “other”.
         (*) The “dedicated user” is the user account that runs server processes (e.g. www, apache, www-data).
    8. Problem of existing web servers (cont.)
         • Even if POSIX ACL & suEXEC are used, the problem occurs when server embedded interpreters are used.
             • Dynamic content that uses server embedded interpreters (e.g. PHP, mod_ruby, mod_perl) also runs under the privilege of a dedicated user.
             • Malicious PHP scripts can steal & delete authentication content.
    9. Harache ([13][14])
         • Predecessor of Hi-sap
         • Server processes run under the privilege of the site owner.
         [Figure: a browser sends GET /~userA/ to the Harache server process; privilege labels: root, userA]
         ① A browser sends a request to user A's website.
         ② The privilege of the server process is changed to user A.
         ③ The server process processes the request.
         ④ It returns a response to the browser.
    10. Harache (cont.)
         • Server embedded interpreters can be used safely.
             • File permissions to a dedicated user are not necessary.
             • It is required to grant permissions only to the site owner.
         • But, it cannot fully use the increased speed of server embedded interpreters.
             • Server processes terminate after each session. (= CGI)
         Hi-sap solves Harache’s performance problem.
    11. Goal
         • Realization of a secure, high-performance, and scalable web server system, Hi-sap
             • Secure: Scripts of a partition cannot access other partitions.
             • High performance: Dynamic content can be processed at high speed by fully using the increased speed of server embedded interpreters.
             • Scalable: A number of partitions can be housed in a server.
    12. Design
         • Security
             • Server processes run under the privilege of different users in every partition. (= Harache)
             • The system brings access control into operation with a secure OS.
         • Performance
             • The system pools server processes that run under the privilege of the different users. (!= Harache)
         • Scalability
             • The system controls the creation and termination of server processes. (Content Access Scheduler)
    13. Content Access Scheduler
         • Web-server level scheduler
             • [aim] It enhances the scalability of the number of partitions in a server.
             • [method] It controls the creation and termination of server processes.
         By using the suitable scheduler for the purpose, it achieves high scalability.
    14. Implementation
         • OS: Linux OS with SELinux
         • dispatcher
             • reverse proxy server
             • Apache 2.0.55 + mod_hisap
         • workers
             • Each worker runs under the privilege of a different user and processes requests for a specific dedicated partition.
             • Apache 2.0.55 x 1000
                 • Any web server software can be used.
         • hisapd
             • Content Access Scheduler
    15. Overview of request processing
         [Figure: a browser sends "GET / HTTP/1.1, Host: www.C.net" over HTTP to the dispatcher (reverse proxy) on the server, which also runs hisapd and workers A, B, C, ...; hisapd is reached over a UNIX domain socket. Privilege labels in the figure: www, root, A, B, C. Labeled steps: confirming if worker C is active; asking to activate worker C; activating worker C; OK; process the request; sending the response. Under heavy load: worker A has no requests; terminating worker A.]
    16. Scheduling algorithm
         • We developed Content Access Scheduler to avoid thrashing.
             • Thrashing decreases the performance of web servers dramatically.
         • Algorithm of worker activation
             • hisapd dynamically activates workers after requests from the dispatcher.
         • Algorithm of worker termination
             • When thrashing seems to occur, hisapd terminates workers that have not been requested recently.
    17. Scheduling algorithm (cont.)
         • Conditions for which hisapd judges that thrashing seems to occur
             • A swap-in occurs.
             • A swap-out occurs.
             • Memory use is 99% or more.
         • Conditions for which hisapd chooses workers to terminate
             • The worker is active.
             • The worker is not recorded in the most recent 10,000 requests.
    18. Evaluation
         • Experimental environments
             • Client
                 • CPU: Intel Pentium III Xeon 500 MHz x 4
                 • Memory: 256 MB (swap 512 MB)
                 • OS: Fedora Core 4 (kernel 2.6.14)
                 • NIC: Intel PRO/1000XT PWLA8490XT 1 Gbps
             • Server
                 • CPU: AMD Opteron 240EE 1.4 GHz x 2
                 • Memory: 4 GB (swap 8 GB)
                 • OS: Fedora Core 4 (kernel 2.6.14)
                 • NIC: Broadcom BCM5704C 1 Gbps
             • Network
                 • Switching hub: DELL PowerConnect 2724 (1000BASE-T x 24)
                 • Links: Gigabit Ethernet
    19. Evaluation (cont.)
         • Basic performance evaluation
             • We evaluated the basic performance in processing dynamic content.
         • Scalability evaluation
             • We evaluated the scalability of the number of partitions in a server in processing dynamic content.
         • Target content
             • We sent requests to a PHP script that calls phpinfo().
                 • The script displays the system information of the PHP language processor. (40 KB per request)
    20. Basic performance evaluation
         • Aim
             • to determine useful performance of our system
         • Systems for comparison
             • Apache
             • One-to-one
                 • It uses networks with a reverse proxy, and has a dispatcher and many workers that are dedicated to process requests for each partition.
                 • Although it is similar to our system, mod_hisap and hisapd are not installed.
             • Apache with suEXEC
         • Benchmark
             • httperf benchmark ver. 0.8
    21. Basic performance evaluation (cont.)
         • The system loses an avg. of 28.0% of the throughput relative to Apache.
             • The overhead of the system is because of a reverse proxy.
         • However, the system has high throughput relative to suEXEC.
         • The system loses an avg. of 1.0% of the throughput relative to One-to-one.
             • The overhead of mod_hisap & hisapd is very low.
    22. Scalability evaluation
         • Aim
             • to determine the effectiveness of Content Access Scheduler
         • Comparison system
             • One-to-one
                 • mod_hisap and hisapd (Content Access Scheduler) are not installed.
         • Benchmark
             • Apache benchmark ver. 2.0.41-dev
    23. Scalability evaluation (cont.)
         • Our system’s scalability is high.
             • The throughput decrement due to an increase in the number of partitions was low.
         • For One-to-one, the OS crashed due to a memory shortage when the number of partitions was 600.
    24. Scalability evaluation (cont.)
         • The swap use of One-to-one dramatically increases due to an increase in the number of partitions.
             • This is the reason for the OS crash.
         • Our system does not use swap space as much because of Content Access Scheduler.
    25. Comparison of approaches
         (Ratings per system: Security in a Server / Basic Performance / Scalability / Generality)
         • Apache: very poor / excellent / good / good
         • suEXEC & POSIX ACL: good / very poor / good / good
         • PHP safe mode: good / excellent / good / very poor
         • Apache perchild MPM: good / - / poor / good
         • Sandbox / VM: excellent / excellent / poor to very poor / good
         • Harache: good / poor / good / good
         • One-to-one: good / good / poor / good
         • Hi-sap: excellent / good / good / good
    26. Conclusions
         • Proposal: Hi-sap
             • Secure and high-performance web server system
         • Implementation:
             • On a Linux OS with SELinux.
         • Achievement:
             • High performance
             • High scalability
    27. Future Work
         • Creating various Content Access Schedulers
             • for wiki
             • for weblog
             • for CMS, etc.
         • Evaluating these schedulers
    28. Thank you.
         • Any questions/comments?
