Common Gateway Interface (CGI)
CGI
• CGI is one of the important server-side
programming techniques.
• CGI connects a web server to an external
application.
• When a CGI-enabled web server receives
a request for a CGI program, it executes
the program on the server and sends the
program's output back to the client.
• A CGI program can be written in C or C++,
Perl, ASP, PHP, Python, Tcl, shell scripts,
and many other languages.
• CGI is a standard mechanism that defines:
– How URLs are associated with programs
that the web server can run.
– How the request is passed to the external
program.
– How the external program sends its
response to the client.
CGI URLs
• The web server provides a mapping between
URLs and CGI programs. The exact mapping
is not standardized (the web server
administrator sets it up).
• Typically:
– requests whose path starts with /cgi-bin/,
/CGI-BIN/, /cgi/, etc. refer to CGI programs.
• Different directory paths produce different
behavior at the web server:
– regular directory => returns the file
– cgi-bin => returns the output of the program
• The server determines which behavior applies:
– based on the directory, the file extension, ...
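As one possible illustration, on an Apache HTTP Server such a mapping might be configured roughly as follows. The filesystem paths are placeholders chosen for this sketch, and other web servers use different directives.

```apache
# Map URLs under /cgi-bin/ to a directory of executable programs
ScriptAlias /cgi-bin/ "/usr/local/apache2/cgi-bin/"

# Alternatively, treat files ending in .cgi as CGI programs in one directory
<Directory "/usr/local/apache2/htdocs/scripts">
    Options +ExecCGI
    AddHandler cgi-script .cgi
</Directory>
```

The first form selects CGI behavior by directory, the second by file extension.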
Request CGI program
• The web server sets some environment
variables with information about the
request.
• The CGI program gets information about
the request from environment variables.
Environment Variables
• To pass data from the server to the
script, the server uses environment
variables.
• There are two types of environment
variables:
– Non-request-specific variables: those
set for every request
– Request-specific variables: those that
depend on the request being
fulfilled by the CGI script
Environment Variables
• SERVER_NAME
– The server's host name or IP address.
• SERVER_SOFTWARE
– The name and version of the server software that is
answering the client request.
• SERVER_PROTOCOL
– The name and revision of the protocol the
request came in with.
• REQUEST_METHOD
– The method with which the request was issued (for example, GET or POST).
Environment Variables Cont...
• QUERY_STRING
– The query information passed to the program. It is appended
to the URL after a "?".
• CONTENT_TYPE
– The MIME type of the data sent with the request, such as
"application/x-www-form-urlencoded" for form data.
• CONTENT_LENGTH
– The length in bytes of the data passed to the CGI program
through standard input.
• HTTP_USER_AGENT
– The browser the client is using to issue the request.
• DOCUMENT_ROOT
– The server's document root directory.
Example of some Environment Variables
• SERVER_SOFTWARE = Apache/1.3.14
• SERVER_NAME = www.ncsi.iisc.ernet.in
• GATEWAY_INTERFACE = CGI/1.1
• SERVER_PROTOCOL = HTTP/1.0
• SERVER_PORT = 80
• REQUEST_METHOD = GET
• HTTP_ACCEPT = 'image/gif, image/x-xbitmap, image/jpeg, */*'
• SCRIPT_NAME = /cgi-bin/environment-example
• REMOTE_HOST = ece.iisc.ernet.in
• REMOTE_ADDR = 144.16.64.3
• How a program accesses the value of an
environment variable depends on the scripting
or programming language used.
• In C, you would use the library function
getenv (declared in the standard header
<stdlib.h>) to access the value as a
string.
• You might then use various techniques to
pick data out of the string, convert parts of
it to numeric values, etc.
Where does the data for the CGI Script come from?
• The most common way to send data to CGI scripts
is through HTML forms. HTML forms offer many input
types for collecting data, such as radio buttons,
check boxes, text inputs and pull-down menus.
• Once the inputs the script needs and their types
have been decided, there are two main methods for
submitting the form data: GET and POST. The data
reaches the script differently depending on which
method is used.
GET Method
• The form data is encoded and then
appended to the URL after a "?" mark.
• The part of the URL after the "?" is passed
to the program in the QUERY_STRING
environment variable; it consists of
name=value pairs separated by
ampersands (&).
• GET http://www.ncsi.iisc.ernet.in/cgi-bin/example/simple.pl?first=Amit&last=Sharma
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *data;
    long m, n;

    /* The header block ends with an empty line (CRLF CRLF) */
    printf("Content-type: text/html\r\n\r\n");
    printf("<TITLE>Multiplication results</TITLE>\n");
    printf("<H3>Multiplication results</H3>\n");

    /* For GET requests the form data arrives in QUERY_STRING */
    data = getenv("QUERY_STRING");
    if (data == NULL)
        printf("<P>Error! Error in passing data from form to script.");
    else if (sscanf(data, "m=%ld&n=%ld", &m, &n) != 2)
        printf("<P>Error! Invalid data. Data must be numeric.");
    else
        printf("<P>The product of %ld and %ld is %ld.", m, n, m * n);
    return 0;
}
POST Method
• The difference between the GET and POST methods
is primarily defined in terms of how the form data
is encoded into the request.
• With the POST method, the server passes the
information contained in the submitted form to
the CGI program on standard input (stdin).
POST Method ...
• The length of the information (in bytes) is
also sent to the server, so that the CGI
script knows how much data it has to
read.
• The environment variable
CONTENT_LENGTH tells the script
how much data is being
transferred from the HTML form.
• For forms that use METHOD="POST", the CGI
specification says that the data is passed
to the script or program on the standard
input stream (stdin), and the length of the
data in bytes is passed in an environment
variable called CONTENT_LENGTH.
[Figure: the HTTP server and the CGI program, connected by stdin, stdout and environment variables]
Form Fields
• Each field within a form has a name and a
value.
• The browser creates a query that consists of
a sequence of “name=value” substrings
joined together with the ‘&’ character.
• The space character ‘ ’ is replaced by
‘+’.
• The ‘+’ character is replaced by “%2B”.
• Most non-alphanumeric characters are
encoded as a ‘%’ followed by the 2 hex
digits of the character's ASCII code.
Form fields and encoding
• 2 fields: name and occupation.
• If the user types in “Dave H.” as the name and
“none” for the occupation, the query would
look like this:
“name=Dave+H%2E&occupation=none”
