Common Gateway Interface


  • Scripts can be accessed by their virtual pathname, followed by extra information at the end of this path. The extra information is sent as PATH_INFO. If it comes from a URL, the server should decode it before passing it to the CGI script. The "extra path info" is the information that follows the filename in a URL when separated by a "/" (as opposed to query string info, which is what follows a "?").
  • AUTH_TYPE: The name of the authentication scheme used to protect the servlet. For example, BASIC, SSL, or null if the servlet was not protected.
  • CONTENT_LENGTH: The length of the request body in bytes made available by the input stream, or -1 if the length is not known. For HTTP servlets, the value returned is the same as the value of the CGI variable CONTENT_LENGTH.
  • CONTENT_TYPE: The MIME type of the body of the request, or null if the type is not known. For HTTP servlets, the value returned is the same as the value of the CGI variable CONTENT_TYPE.
  • GATEWAY_INTERFACE: The revision of the CGI specification used by the server to communicate with the script. It is "CGI/1.1".
  • HTTP_ACCEPT: Variables with names beginning with "HTTP_" contain values from the request headers, if the scheme used is HTTP. HTTP_ACCEPT specifies the content types the browser supports. For example, text/xml.
  • HTTP_ACCEPT_CHARSET: Character preference information. Indicates the client's preferred character set, if any. For example, utf-8;q=0.5.
  • HTTP_ACCEPT_ENCODING: Defines the type of encoding that may be applied to content returned to the client. For example, compress;q=0.5.
  • HTTP_ACCEPT_LANGUAGE: Defines the languages in which the client would prefer to receive content. For example, en;q=0.5. If nothing is returned, no language preference is indicated.
  • HTTP_FORWARDED: If the request was forwarded, shows the address and port of the proxy server it passed through.
  • HTTP_HOST: Specifies the Internet host and port number of the resource being requested. Required for all HTTP/1.1 requests.
  • HTTP_PROXY_AUTHORIZATION: Used by a client to identify itself (or its user) to a proxy that requires authentication.
  • HTTP_USER_AGENT: The type and version of the browser the client is using to send the request. For example, Mozilla/1.5.
  • PATH_INFO: Optionally contains extra path information from the HTTP request that invoked the script, specifying a path to be interpreted by the CGI script. PATH_INFO identifies the resource or sub-resource to be returned by the CGI script, and it is derived from the portion of the URI path following the script name but preceding any query data.
  • PATH_TRANSLATED: Maps the script's virtual path to the physical path used to call the script. This is done by taking any PATH_INFO component of the request URI and performing any appropriate virtual-to-physical translation.
  • QUERY_STRING: The query string contained in the request URL after the path.
  • REMOTE_ADDR: The IP address of the client that sent the request. For HTTP servlets, the value returned is the same as the value of the CGI variable REMOTE_ADDR.
  • REMOTE_HOST: The fully qualified name of the client that sent the request, or the IP address of the client if the name cannot be determined. For HTTP servlets, the value returned is the same as the value of the CGI variable REMOTE_HOST.
  • REMOTE_USER: The login of the user making this request if the user has been authenticated, or null if the user has not been authenticated.
  • REQUEST_METHOD: The name of the HTTP method with which this request was made. For example, GET, POST, or PUT.
  • SCRIPT_NAME: The virtual path of the script being executed, i.e., the part of the URL path that identifies the script (before any extra path information and before the query string).
  • SERVER_NAME: The host name of the server that received the request. For HTTP servlets, it is the same as the value of the CGI variable SERVER_NAME.
  • SERVER_PORT: The port number on which this request was received. For HTTP servlets, the value returned is the same as the value of the CGI variable SERVER_PORT.
  • SERVER_PROTOCOL: The name and version of the protocol the request uses, in the form protocol/majorVersion.minorVersion. For example, HTTP/1.1. For HTTP servlets, the value returned is the same as the value of the CGI variable SERVER_PROTOCOL.
  • SERVER_SOFTWARE: The name and version of the servlet container on which the servlet is running.
  • HTTP_COOKIE: The HTTP cookie string.
  • WEBTOP_USER: The user name of the user who is logged in.
  • NCHOME: The NCHOME environment variable.

    1. Common Gateway Interface
       Web Technologies
       Piero Fraternali
    2. Outline
       • Architectures for dynamic content publishing
         – CGI
         – Java Servlet
         – Server-side scripting
         – JSP tag libraries
    3. Motivations
       • Creating pages on the fly based on the user's request and from structured data (e.g., database content)
       • Client-side scripting and components do not suffice
         – They manipulate an existing document/page; they do not create a new one from structured content
       • Solution:
         – Server-side architectures for dynamic content production
    4. Common Gateway Interface
       • An interface that allows the Web Server to launch external applications that create pages dynamically
       • A kind of «double client-server loop»
    5. What CGI is/is not
       • It is not
         – A programming language
         – A telecommunication protocol
       • It is
         – An interface between the web server and the applications that defines some standard communication variables
       • The interface is implemented through system (environment) variables, a universal mechanism present in all operating systems
       • A CGI program can be written in any programming language
    6. Invocation
       • The client specifies in the URI the name of the program to invoke
       • The program must be deployed in a specified location at the web server (e.g., the cgi-bin directory)
         – http://my.server.web/cgi-bin/xyz.exe
    7. Execution
       • The server recognizes from the URI that the requested resource is an executable
         – Permissions must be set in the web server for allowing program execution
         – E.g., the extensions of executable files must be explicitly specified
           • http://my.server.web/cgi-bin/xyz.exe
    8. Execution
       • The web server decodes the parameters sent by the client and initializes the CGI variables
         • REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH, CONTENT_TYPE
         • http://my.server.web/cgi-bin/xyz.exe?par=val
    9. Execution
       • The server launches the program in a new process
    10. Execution
       • The program executes and «prints» the response on the standard output
    11. Execution
       • The server builds the response from the content emitted to the standard output and sends it to the client
    12. Handling request parameters
       • Client parameters can be sent in two ways
         – With the HTTP GET method
           • Parameters are appended to the URL (1)
         – With the HTTP POST method
           • Parameters are inserted as an HTTP entity in the body of the request (when their size is substantial)
           • Requires the use of HTML forms to let users enter data into the body of the request
       • (1) The HTTP specification does not set a maximum URI length; practical limits are imposed by web browser and server software
    13. HTML Form
        <HTML>
        <BODY>
        <FORM action="" method="post">
          <P>Tell me your name:</P>
          <P><INPUT type="text" NAME="whoareyou"></P>
          <INPUT type="submit" VALUE="Send">
        </FORM>
        </BODY>
        </HTML>
    14. Structure of a CGI program
       • Read environment variables
       • Execute business logic
       • Print MIME heading "Content-type: text/html"
       • Print HTML markup
    15. Parameter decoding
       • Read variable REQUEST_METHOD
         – GET: read variable QUERY_STRING
         – POST: read variable CONTENT_LENGTH, then read CONTENT_LENGTH bytes from the standard input
    16. CGI development
       • A CGI program can be written in any programming language:
         – C/C++
         – Fortran
         – PERL
         – TCL
         – Unix shell
         – Visual Basic
       • If a compiled programming language is used, the source code must be compiled
         – Normally source files are in cgi-src
         – Executable binaries are in cgi-bin
       • If instead an interpreted scripting language is used, the source files are deployed
         – Normally in the cgi-bin folder
    17. Overview of CGI variables
       • Clustered per type:
         – server
         – request
         – headers
    18. Server variables
       • These variables are always available, i.e., they do not depend on the request
         – SERVER_SOFTWARE: name and version of the server software
           • Format: name/version
         – SERVER_NAME: hostname or IP of the server
         – GATEWAY_INTERFACE: supported CGI version
           • Format: CGI/version
    19. Request variables
       • These variables depend on the request
         – SERVER_PROTOCOL: transport protocol name and version
           • Format: protocol/version
         – SERVER_PORT: port to which the request is sent
         – REQUEST_METHOD: HTTP request method
         – PATH_INFO: extra path information
         – PATH_TRANSLATED: translation of PATH_INFO from virtual to physical
         – SCRIPT_NAME: invoked script URL
         – QUERY_STRING: the query string
    20. Other request variables
       • REMOTE_HOST: client hostname
       • REMOTE_ADDR: client IP address
       • AUTH_TYPE: authentication type used by the protocol
       • REMOTE_USER: username used during the authentication
       • CONTENT_TYPE: content type in case of POST and PUT request methods
       • CONTENT_LENGTH: content length
    21. Environment variables: headers
       • The HTTP headers contained in the request are stored in the environment with the prefix HTTP_
         – HTTP_USER_AGENT: browser used for the request
         – HTTP_ACCEPT_ENCODING: encoding type accepted by the client
         – HTTP_ACCEPT_CHARSET: charset accepted by the client
         – HTTP_ACCEPT_LANGUAGE: language accepted by the client
    22. CGI script for inspecting variables
        #include <stdio.h>
        #include <stdlib.h>

        /* Print one variable. getenv() returns NULL when the variable is
           not set, and passing NULL to %s is undefined behavior. */
        static void print_var(const char *name)
        {
            const char *value = getenv(name);
            printf("%s: %s<br>\n", name, value ? value : "(not set)");
        }

        int main(void)
        {
            /* The header must be followed by an empty line. */
            printf("Content-type: text/html\n\n");
            printf("<html><head><title>Request variables</title></head>");
            printf("<body><h1>Some request header variables:</h1>");
            fflush(stdout);
            print_var("SERVER_SOFTWARE");
            print_var("GATEWAY_INTERFACE");
            print_var("REQUEST_METHOD");
            print_var("QUERY_STRING");
            print_var("HTTP_USER_AGENT");
            print_var("HTTP_ACCEPT_ENCODING");
            print_var("HTTP_ACCEPT_CHARSET");
            print_var("HTTP_ACCEPT_LANGUAGE");
            print_var("HTTP_REFERER");
            print_var("REMOTE_ADDR");
            printf("</body></html>");
            return 0;
        }
    23. Example output
    24. Problems with CGI
       • Performance and security issues in web server to application communication
       • When the server receives a request, it creates a new process in order to run the CGI program
         • This requires time and significant server resources
         • A CGI program cannot interact back with the web server
       • The process of the CGI program is terminated when the program finishes
         • No sharing of resources between subsequent calls (e.g., reuse of database connections)
         • No main memory preservation of the user's session (database storage is necessary if session data are to be preserved)
       • Exposing to the web the physical path to an executable program can breach security
    26. Complete example
       • Step 1: first request
       • Step 2: retrieval of the resource Form.html
       • Step 3: response
       • Step 4: second call
       • Step 5: environment variables and request are set
       • Step 6: Mult.cgi computes the response
       • Step 7: the response is sent
       • Mult.c was previously compiled into Mult.cgi
    27. The form (form.html)
       • The ACTION attribute holds the URL that is called
        <HTML>
        <HEAD><TITLE>Multiplication form</TITLE></HEAD>
        <BODY>
          <FORM ACTION="">
            <P>Enter the numbers to multiply</P>
            <INPUT NAME="m" SIZE="5"><BR/>
            <INPUT NAME="n" SIZE="5"><BR/>
            <INPUT TYPE="SUBMIT" VALUE="Multiply">
          </FORM>
        </BODY>
        </HTML>
    28. The script
        #include <stdio.h>
        #include <stdlib.h>

        int main(void)
        {
            char *data;
            long m, n;

            /* Print statements that write the response to the output */
            printf("%s%c%c\n", "Content-Type:text/html;charset=iso-8859-1", 13, 10);
            printf("<HTML>\n<HEAD>\n<TITLE>Multiplication result</TITLE>\n</HEAD>\n");
            printf("<BODY>\n<H3>Multiplication result</H3>\n");

            /* Retrieval of the values from the environment variables */
            data = getenv("QUERY_STRING");
            if (data == NULL)
                printf("<P>Error! Error receiving the data from the form.</P>\n");
            else if (sscanf(data, "m=%ld&n=%ld", &m, &n) != 2)
                printf("<P>Error! Invalid data. It must be numeric.</P>\n");
            else
                printf("<P>Result: %ld * %ld = %ld</P>\n", m, n, m * n);

            printf("</BODY>\n</HTML>\n");
            return 0;
        }
    29. Compilation and local test
       • Compilation:
        $ gcc -o mult.cgi mult.c
       • Local test (the environment variable containing the query string is set manually):
        $ export QUERY_STRING="m=2&n=3"
        $ ./mult.cgi
       • Result:
        Content-Type:text/html;charset=iso-8859-1

        <HTML>
        <HEAD>
        <TITLE>Multiplication result</TITLE>
        </HEAD>
        <BODY>
        <H3>Multiplication result</H3>
        <P>Result: 2 * 3 = 6</P>
        </BODY>
        </HTML>
    30. Considerations on CGI
       • Possible security problems
       • Performance (overhead)
         – Creating and terminating processes takes time
         – Context switches take time
       • CGI processes:
         – Are created at each invocation
         – Do not inherit process state from previous invocations (e.g., database connections)