MIT OPENCOURSEWARE
MIRROR SITE PROGRAM
  DOCUMENTATION


       Last Updated
     November 24, 2008




                  ...
TABLE OF CONTENTS

ABOUT THIS DOCUMENT ......................................................................................
ABOUT THIS DOCUMENT

                  This document provides extensive documentation about the OCW Mirror Site program, h...
OCW MIRROR SITE PROGRAM OVERVIEW

                  Background and Goals


                  A fundamental part of OCW’s m...
MIRROR SITE LAUNCH

                  Once the mirror site has been installed and is publicly accessible on your network, ...
OCW MIRROR SITE CONTENTS

                  Organization of external hard drive


                  The organization of th...
Terms of Use


                  OCW materials


                  With the exception of the content within the Tools dire...
MIRROR SITE CONFIGURATION

                  Mirror Site Technical Requirements


                  Software


           ...
Downloadable files


                  Hosting them as downloadable files is the simplest option and requires no changes t...
directly off the external hard drive, without a Web server, but it will have limited performance and other
               ...
$ sudo ./linux_install.sh <path-to-ocw-content> <server-name>                <server-admin-email>
                  Ex. su...
Compile/Install
                  Windows users should simply run the MSI file to install Apache. Linux users will need to...
4.     Below DocumentRoot should be a Directory tag followed by a path in quotes (the same path we
                       ...
2.     Open httpd.conf in a text editor.

                  3.     Search for the line containing the text DirectoryIndex ...
</VirtualHost>

                          <VirtualHost *>
                             ServerName ocw.localhost
          ...
a.   Right-click the Web Sites directory > New > Web Site.

                       b.   Set the description to OCW and lea...
Windows                                        Linux/ Unix-type system
                   Web Browser                     ...
Here’s how to add a logo:
                   •   The maximum dimensions of your logo are 275 pixels wide by 36 pixels high...
<a href="[your Web site address]"><img src="/OcwWeb/images/[your image filename]"
                              style="pad...
Impact of customization on site updates


                  If you do any customization of your mirror site, please keep i...
synchronization. If you make the logo change to the page header throughout the site, all you need to do is
               ...
- Domains/countries of hosts visitors
                  - Hosts list, last visits and unresolved IP addresses list
       ...
MIRROR SITE UPDATES

                  Overview


                  OCW has a bi-annual publication schedule, which involv...
•      Audio, video, and other enhanced OCW content is also available to be synced. This content is available
            ...
•      Extract the SSH private key (which is provided in a password-protected .ZIP file, emailed from the
                ...
•      Command line options: -arvz --stats
                  •      Location of the private key file: SSH directory under ...
that are appropriate for your environment.
                  •      Make sure that the rsync commands are being run after ...
obsolete files and sub-directories, the summary information at the end of the log may look something like:


             ...
1.         Find out the path where rsync was installed. (generally /usr/local/bin)
                               2.      ...
Reason: Incorrect remote path
                                     Solution: Make sure that the remote path is either one ...
782448 63% 110.64kB/s          0:00:04


                                         Current file size
                      ...
Option 1: Regain Desktop search


                  Overview


                  If you fall into one of the first two cat...
a stand-alone console application and the search mask on the other hand, which is a .war archive that has
                ...

MIT OPENCOURSEWARE MIRROR SITE PROGRAM DOCUMENTATION

Last Updated November 24, 2008

If you have any questions about this document, please contact Yvonne Ng, OCW External Outreach Manager, at yng@mit.edu or +1-617-253-4719.
TABLE OF CONTENTS

ABOUT THIS DOCUMENT
OCW MIRROR SITE PROGRAM OVERVIEW
    Background and Goals
    Responsibilities of Participating Institution
    Responsibilities of OCW
MIRROR SITE LAUNCH
OCW MIRROR SITE CONTENTS
    Organization of external hard drive
    Terms of Use
MIRROR SITE CONFIGURATION
    Mirror Site Technical Requirements
    Basic Instructions
    Advanced Instructions (see installation videos for these instructions on the hard drive)
    Technical Requirements for User Environment
MIRROR SITE USAGE TRACKING
    Overview
    Monthly Reports
    AWStats Introduction
    AWStats Requirements
MIRROR SITE UPDATES
    Overview
    OCW Rsync Program
    Rsync Server Connection
    Recommended Rsync Configuration
    Rsync Commands
    Rsync Logging
    Rsync Troubleshooting
APPENDIX
    Search Functionality
    Feedback Functionality
    AWStats Installation with Apache on Windows
    AWStats Installation with IIS on Windows
    AWStats Installation on Unix-like Operating Systems
    Rsync Installation
ABOUT THIS DOCUMENT

This document provides extensive documentation about the OCW Mirror Site program, hosted by MIT OpenCourseWare (OCW), found at http://ocw.mit.edu. The first two sections of this document, “OCW Mirror Site Program Overview” and “Mirror Site Launch,” contain important details about institutional participation and responsibilities. The remaining bulk of the document provides extensive documentation about how to use the OCW Mirror Site to install and maintain a mirror site, or local copy, of OCW materials.

The HTML version of this file can be viewed in a Web browser, and the PDF version of this file can be viewed in a PDF reader. If you do not have access to a PDF reader, see the “Technical Requirements for User Environment” section in this document for suggestions on how to obtain a PDF reader.

Questions related to this document, or the OCW Mirror Site program, can be referred to Yvonne Ng, OCW External Outreach Manager, at yng@mit.edu or +1-617-253-4719.

OCW Mirror Site ReadMe Page 2 September 2008
OCW MIRROR SITE PROGRAM OVERVIEW

Background and Goals

A fundamental part of OCW’s mission is to extend the reach and impact of OCW materials throughout the world. Much of OCW’s efforts have focused on Africa, where OCW materials are largely underutilized due to limited Internet connectivity. OCW has installed a small number of mirror sites, or local copies of OCW content, at African educational institutions, and our evaluation data show that these mirror sites are having a significant positive impact. The OCW Mirror Site program intends to build on the success of these mirror sites and substantially scale up the effort by distributing OCW Mirror Site packages to universities throughout the developing world.

Responsibilities of Participating Institution

The participating institution has agreed to the following as part of the OCW Mirror Site program:

•   Host and maintain a mirror site that contains a complete, updated, and accurate copy of the OCW site and is publicly accessible on campus;
•   Provide to OCW a monthly report of mirror site usage (which can be collected with the usage tracking software provided on the hard drive);
•   Promote the mirror site to at least one major local or national media outlet;
•   Provide a staff member to serve as the central point of contact for OCW inquiries; and
•   Upon request, contribute to a case study or other activities for the OCW Mirror Site program.

Please refer to the next section on “Mirror Site Launch” for additional details on how to carry out these responsibilities.

Responsibilities of OCW

OCW has agreed to the following as part of the OCW Mirror Site program:

•   Provide the participating institution with a free OCW Mirror Site package;
•   Acknowledge the participating institution on the OCW Web site;
•   Grant authenticated secure shell (SSH) access using digital keys to our mirror site synchronization server, for easy and scheduled content updates; and
•   Provide an OCW staff member who will serve as the central point of contact for technical assistance and other inquiries.
MIRROR SITE LAUNCH

Once the mirror site has been installed and is publicly accessible on your network, please make sure to complete the following steps:

•   Notify Yvonne Ng, OCW External Outreach Manager, at yng@mit.edu or +1-617-253-4719 and provide the following information:
    o   Mirror site URL (if the URL is not accessible from outside the network, please indicate this);
    o   AWStats statistics URL (if the URL is not accessible from outside the network, provide a contact person who can manually email the statistics on a monthly basis);
    o   Plans for promoting the mirror site; and
    o   Central point of contact for the mirror site.
•   Implement strategies for promoting the mirror site.

[NOTE: MIT’s name must not be used in ways that suggest or imply the endorsement of other organizations, their products, or their services. The use of the Institute's name, seal, and photographs in the advertising and other promotional material and activities of outside organizations is prohibited when such use is likely to be understood as an endorsement, even if such an endorsement is not the intention of the person or organization seeking to use MIT's name. All proposals, therefore, for the use of MIT's name or other identification in advertising, sales literature and videos, and commercial publicity must be submitted to the Director of the News Office. Please contact Yvonne Ng (yng@mit.edu) to facilitate your promotion.]

Please feel free to use the materials provided on the external hard drive in the PromotionMaterials folder for these promotional activities, but coordinate your efforts with Yvonne Ng.

Once your mirror site is launched, OCW will include your institution’s name on a page dedicated to the OCW Mirror Site program on our main site, at http://ocw.mit.edu. In addition, we will include you on any relevant communications to our supported mirror sites.

Your central point of contact at OCW will be Yvonne Ng, OCW External Outreach Manager, who can be reached at yng@mit.edu or +1-617-253-4719.
OCW MIRROR SITE CONTENTS

Organization of external hard drive

The organization of this OCW Mirror Site is described below:

•   NR (13.0 GB) - This folder contains half of the main content from the OCW site, including PDF files for all the course section pages (course home, calendar, syllabus, lecture notes, assignments, exams, etc.)
•   OCWExternal (205 GB) - This folder contains all of the enhanced content from the OCW site, including video, audio, applets, and course zip packages.
•   OcwWeb (960 MB) - This folder contains the other half of the main content from the OCW site, including HTML files for all the course section pages (course home, calendar, syllabus, lecture notes, assignments, exams, etc.)
•   Promotion Materials (5.58 MB) - This folder contains helpful information and templates about OCW and the OCW Mirror Site Program, which institutions can use to announce and promote their mirror site.
•   Tools (163 MB) - This folder contains optional software tools and documentation that can be used, but are not required, to enhance the mirror site. Subfolders include:
    o   Installer
    o   PDFReader
    o   Rsync
    o   Search
    o   UsageTracking
    o   WebTemplate
•   index.htm - This file is the home page for the OCW Web site. If you open this file from the drive directly, you will be able to browse the OCW site locally off the drive.
•   MIT OpenCourseWareLegal Notices.htm or pdf - This file contains the full legal license that governs the use of the OCW materials.
•   MITOCWWelcomeLetter.htm or pdf - This file contains a welcome message from OCW’s Executive Director.
•   ReadMe.htm, pdf, or doc - This file contains extensive information about the OCW Mirror Site program, including configuration of the mirror site using the OCW Mirror Site contents.
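Before copying or serving the content, it can be useful to confirm that the top-level items listed above are actually present. The following is a minimal sketch in Python, not part of the OCW package: the function name missing_items is hypothetical, and the folder names are taken from the listing above. On a case-sensitive file system the names must match exactly.

```python
from pathlib import Path

# Top-level names taken from the drive listing above.
REQUIRED_DIRS = ("NR", "OCWExternal", "OcwWeb")
REQUIRED_FILES = ("index.htm",)


def missing_items(root):
    """Return the required top-level entries that are absent under root.

    Hypothetical helper for illustration only.
    """
    root = Path(root)
    missing = [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    missing += [name for name in REQUIRED_FILES if not (root / name).is_file()]
    return missing
```

For example, missing_items("/mnt/ocw") returns an empty list when all three folders and index.htm are found at the top level of /mnt/ocw (the mount point used in the installer example later in this document).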
Terms of Use

OCW materials

With the exception of the content within the Tools directory and subdirectories, all materials on the OCW Mirror Site are licensed by the Massachusetts Institute of Technology under a Creative Commons License. For more details, please refer to the HTML and/or PDF versions of the files MITOCWFullLicense and MITOCWterms-of-use, located on the root drive of this external hard drive.

The underlying premise and purpose of OCW is to make course materials used in MIT courses freely and openly available to others for non-commercial educational purposes. Through OCW, MIT grants the right to anyone to use the materials, either "as is," or in a modified form. There is no restriction on how a user can modify the materials for the user's purpose. Materials may be edited, translated, combined with someone else's materials, reformatted, or changed in any other way.

However, there are three requirements that an OCW user must meet to use the materials:

•   Non-commercial: Use of OCW materials is open to all except for profit-making entities who charge a fee for access to educational materials.
•   Attribution: Any and all use or reuse of the material, including use of derivative works (new materials that incorporate or draw on the original materials), must be attributed to MIT and, if a faculty member's name is associated with the material, to that person as well.
•   Share alike (aka "copyleft"): Any publication or distribution of original or derivative works, including production of electronic or printed class materials or placement of materials on a Web site, must offer the works freely and openly to others under the same terms that OCW first made the works available to the user.

Other materials

The content within the Tools directory and subdirectories is not affiliated with, or endorsed by, OCW and is licensed under different terms. All licenses are provided in the same folders as the installation files for each tool. Please read through the licenses carefully to make sure that your use complies with the terms of use provided in the licenses.

Please note that these tools are optional and should be used at your own discretion. OCW is not able to provide any support for these tools, and is including them as part of the external hard drive as a convenience to mirror sites.
MIRROR SITE CONFIGURATION

Mirror Site Technical Requirements

Software

The information provided below includes requirements for mirror sites running either Windows or Linux/Unix-type operating systems, although other operating systems can be used as well.

Windows                                    Linux/Unix-type system
IIS or Apache                              Apache
AWStats 6.8 (included in Tools folder      AWStats 6.8 (included in Tools folder
on external hard drive)                    on external hard drive)
cwRsync (included in Tools folder on       rsync (native to operating system)
external hard drive)

In addition, Java Run-time Environment (JRE) 1.4.x or newer is recommended. If your system does not already have this installed, you may download it at: http://www.java.com/en/download/manual.jsp.

Disk space

There are two options for hosting OCW content on the mirror site: (1) Host the entire site, including video, audio files, and course zip packages, or (2) Host the bulk content of the site, including most of the course materials minus the video and audio files. The size of the OCW content is broken up in the following ways:

    Bulk content of site (NR, OcwWeb folders): 14 GB
    Video, audio, and other enhanced content (OcwExternal folder): 205 GB
    Entire site (NR, OcwExternal, OcwWeb folders): 219 GB

Please note that the OCW site will not grow considerably, as only new courses and updates will be published, while older versions of courses are unpublished and archived to DSpace, an online repository.

Hosting video and audio

If you decide to host the entire site, you will need to decide how you’d like to host the RealMedia streaming video and audio files on the site. There are three options for hosting this content: (1) Host them as downloadable files, (2) Host them as HTTP streaming files through your Web server, or (3) Host them as streaming files through a streaming server.
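The disk space figures above can be compared against the free space on the target server before choosing between the two hosting options. The sketch below is an illustration only: the function names, the 10% headroom, and the use of decimal gigabytes are assumptions, while the 14 GB and 205 GB figures come from the Disk space section.

```python
import shutil

GB = 1000 ** 3      # assumption: sizes quoted in decimal gigabytes
BULK_GB = 14        # NR and OcwWeb folders, per the Disk space section
ENHANCED_GB = 205   # OcwExternal folder, per the Disk space section


def hosting_option(free_bytes, headroom=0.10):
    """Suggest a hosting option for a given amount of free space.

    headroom is a hypothetical safety margin, not an OCW requirement.
    """
    need_bulk = BULK_GB * GB * (1 + headroom)
    need_full = (BULK_GB + ENHANCED_GB) * GB * (1 + headroom)
    if free_bytes >= need_full:
        return "entire site"
    if free_bytes >= need_bulk:
        return "bulk content only"
    return "insufficient space"


def hosting_option_for_path(path):
    """Apply hosting_option to the free space of the volume holding path."""
    return hosting_option(shutil.disk_usage(path).free)
```

Calling hosting_option_for_path with the directory where the content will be stored reports which option fits on that volume.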
Downloadable files

Hosting them as downloadable files is the simplest option and requires no changes to the pages or your server setup. However, in order for your users to access these files, they will need to have sufficient disk space on their local environments to download the files completely. In addition, depending on the local network speed, it may take some time to download the files.

HTTP streaming files

You can also host the video and audio files as HTTP streaming files through your existing Web server. However, if you anticipate heavy usage of the mirror site, you may want to consider the streaming server option, since HTTP streaming will incur a heavier load on your Web server as more users access the streaming files. To set up HTTP streaming, you will need to follow this process for every streaming audio and video file on the site:

•   Go to the list of courses with complete sets of audio and video at: OcwWeb/Global/OCWHelp/avocw.htm
•   Starting with the first course on the list, go to the section of the course that links to the audio/video.
•   Identify the first audio/video file that you’d like to stream.
•   Open a text editor (such as Notepad) and enter the URL of the file.
    Example: OcwExternal/3/3.091/f04/video/ocw-3.091-lec01-56k.rm
•   Save the file with the same filename, but with the extension .ram instead of .rm. This .ram file is known as a metafile, a file which contains data about another file.
    Example: OcwExternal/3/3.091/f04/video/ocw-3.091-lec01-56k.ram
•   Open the HTML code for that section of the course in a text editor (such as Notepad).
    Example: OcwWeb/Materials-Science-and-Engineering/3-091Fall-2004/LectureNotes/index.htm
•   In the HTML code, find the link for the .rm file and change it to link to the .ram file:
    Example: <a href="OcwExternal/3/3.091/f04/video/ocw-3.091-lec01-56k.ram">RAM - 56K</a>
•   Follow this process for all audio/video files in the course.
•   Repeat this entire process for all other courses listed at: OcwWeb/Global/OCWHelp/avocw.htm.

Streaming files

If you anticipate significant usage of your mirror site and would like to host the audio and video files as streaming, you can set up a streaming server using RealMedia's free Helix server or an open-source streaming server. Additional information on the Helix server can be found at: https://helix-server.helixcommunity.org/.

Basic Instructions

A Web server is strongly recommended for running the mirror site. It is possible to host the mirror site directly off the external hard drive, without a Web server, but it will have limited performance, and other features, such as content updates, usage tracking, and search functionality, will not be possible to implement. If you decide to run the mirror site off the external hard drive directly, make sure that the location of all content is unchanged and remains at the top level of the drive, so that all the links work properly. Please note that if your users view the mirror site using the Firefox browser, they may have issues viewing the content if it is served from the external hard drive, because Firefox assumes that the content is located on the same drive as the browser. You may be able to remedy this situation by having users modify their settings in the Firefox browser.

For mirror sites that plan to host the content on a Web server, the external hard drive contents are organized such that a simple copy and paste operation will be sufficient for transferring the contents onto the server. The NR, OCWExternal, and OcwWeb folders, and the index.htm file, all found at the root directory of the external hard drive, should be copied over to the desired location on the mirror site server. As long as these 3 folders and the index.htm file all reside at the same directory level, and no contents within the folders are moved, the mirror site will work in the local environment without any further modification.

If you do not plan on hosting audio, video, and course zip packages (see section on Disk Space above), you do not need to copy the OCWExternal folder to your mirror site server. Please note that links to these files will be broken if you do not include the contents of this folder on the mirror site. If you decide to host these files at a later date, you can always copy over the OCWExternal folder from the external hard drive onto your mirror site server.
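The HTTP streaming procedure described earlier in this section calls for one .ram metafile per .rm file, which is tedious to do by hand. The following Python sketch shows one way the metafile step could be automated. It is an illustration under stated assumptions, not an OCW-provided tool: the function name write_ram_metafiles is hypothetical, and prefixing each path with a base URL is an assumption (the example in this document shows only the relative path). The HTML links still have to be changed from .rm to .ram separately.

```python
from pathlib import Path


def write_ram_metafiles(content_root, base_url):
    """Create a .ram metafile next to every .rm file under content_root.

    Each metafile holds a single line: the URL of the matching .rm file.
    base_url (for example "http://ocw.myschool.edu") is an assumed prefix.
    Returns the list of metafiles written.
    """
    content_root = Path(content_root)
    created = []
    for rm_file in sorted(content_root.rglob("*.rm")):
        if not rm_file.is_file():
            continue
        # Path of the media file relative to the site root, with forward slashes.
        relative = rm_file.relative_to(content_root).as_posix()
        # Same filename, but with the extension .ram instead of .rm.
        ram_file = rm_file.with_suffix(".ram")
        ram_file.write_text(f"{base_url.rstrip('/')}/{relative}\n", encoding="utf-8")
        created.append(ram_file)
    return created
```

Run it against a copy of the content first, since it writes new files into the OcwExternal tree.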
Advanced Instructions (see installation videos for these instructions on the hard drive)

Single Step Installer

In an effort to make installation as simple as possible, we have provided installer scripts that will install Apache and AWStats, and configure them for your mirror site. For both Windows and Linux you must have administrator/root access to complete the installation. Each installer takes 3 arguments:

•   Server Name: IP address (ex. 18.72.0.3) or hostname (ex. ocw.mit.edu) of the server. (Note: A hostname is preferable to an IP address.)
•   Server Administrator: Email address of the server administrator. When Apache experiences an error, this address will be displayed so that clients can report the problem to the server administrator.
•   OCW Content Location: Path to the OCW content. (Note: This path must be to a directory that contains the subdirectories OcwWeb, OCWExternal, and NR.)

Linux (see OCW Mirror Site Installation for Linux.wmv video)

The Linux installer is a shell script that will compile Apache, copy the AWStats files to /usr/local, and configure both applications to work with one another. Please download the Perl application from http://www.activestate.com/Products/activeperl/index.mhtml and install the program onto your system. To run the script, do the following:

1.  Open a terminal and cd to the location of the installer. (The installer is located by default at <path-to-ocw-drive>/Tools/Installer/Linux/.)
2.  Run linux_install.sh as root:

    $ sudo ./linux_install.sh <path-to-ocw-content> <server-name> <server-admin-email>

    Ex. sudo ./linux_install.sh /mnt/ocw ocw.myschool.edu ocw@myschool.edu

Notes: There is a single space between each command line argument. Some distributions use “su” rather than “sudo” to run commands with root access. The path to the content should NOT include a trailing slash.

When the script runs you will see the verbose output of the Apache installation and status messages from the installer. After it is complete, the script will provide you with two URLs: one pointing to your OCW mirror site, and the other to your AWStats page.

Windows (see OCW Mirror Site Installation for Windows.wmv video)

For Windows we have created an executable file with a graphical user interface. Here is what you need to do:

1.  Run setup.exe. (The installer is located by default at <path-to-ocw-drive>/Tools/Installer/Windows/.)
2.  Provide the server name, admin’s email, and content path when requested.
3.  Click Install.

In addition to installing the three applications mentioned above, the Windows installer will create a scheduled task, OCW AWStats Update, that will update AWStats daily. Once the installer completes, it will automatically open two web pages for you. The first points to your OCW mirror site, and the second to your AWStats page.

Manual Installation

Manual installation of the applications requires installing the three applications separately and modifying Apache’s httpd.conf file to serve OCW content. This method should be used if you have an existing web site/server configuration and would like to add a directory or virtual host for OCW content.

Note: ALWAYS make a backup of any configuration files BEFORE editing them. This method of installation is only recommended for those who have experience modifying Apache configuration files.

Here is an overview of the process:

1.  Use the individual installers for Apache/IIS and Perl.
    •   Integrate Perl with IIS, if using IIS.
2.  Copy AWStats files to a local directory on your filesystem.
3.  Configure Apache/IIS to serve the OcwWeb, OCWExternal, and NR directories from the root of the site.
4.  Configure Apache/IIS to serve the awstats, awstatsicons, awstatscss, and awstatsclasses directories.
5.  Create and edit awstats.ocw.conf to configure AWStats for your OCW installation.

Apache

Most Linux distributions include Apache. To determine the default location of Apache for your distribution, refer to http://wiki.apache.org/httpd/DistrosDefaultLayout. If you cannot locate Apache on your system, or if you are running Windows, source code and an MSI can be found in <path-to-ocw-drive>/Tools/Installer/Apache/. If you are running a firewall on your server, you should open TCP port 80 to allow client connections.
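Once Apache is running and the firewall has been adjusted, a quick way to confirm that TCP port 80 accepts connections is to attempt a connection from another machine on the network. The following Python sketch uses only the standard library; the function name port_is_open and the 3-second timeout are assumptions chosen for illustration.

```python
import socket


def port_is_open(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, port_is_open("ocw.myschool.edu") run from a client machine indicates whether the firewall is letting Web traffic through to the server (the hostname is the one used in the installer example above). A result of False does not say whether the firewall or the Web server itself is the cause.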
Compile/Install

Windows users should simply run the MSI file to install Apache. Linux users will need to do the following to compile and install Apache:

1.  Copy httpd-2.2.8.tar.gz to a location in your local file system.
2.  cd to the location in step 1 and run the following commands from a terminal:

        tar zxf httpd-2.2.8.tar.gz
        cd httpd-2.2.8
        ./configure
        make
        make install

    These commands will install Apache in /usr/local/apache2.
3.  Set Apache to run at system boot.
    a.  cd back to the directory containing httpd-2.2.8.tar.gz.
    b.  Copy the httpd script from this directory to /etc/init.d with the following commands:

            cp httpd /etc/init.d/httpd
            cd /etc/init.d

    c.  Finish the setup:
        Debian/Ubuntu users run the following:

            update-rc.d httpd defaults

        Other distributions run:

            /sbin/chkconfig --add httpd
            /sbin/chkconfig --level 2345 httpd on

4.  Start Apache with the command /usr/local/apache2/bin/apachectl -k start.
5.  Test the installation by navigating to http://localhost in a web browser. You should see a default Apache welcome message: It works!

Configure Apache for OCW

In order for your server to serve OCW content, it needs to know where to find the content. (Note: We highly recommend copying the OCW content from the drive provided to an internal drive on your local server so that you’ll have a backup in case the external drive is damaged or misplaced.) Apache gets this information from its httpd.conf file. The configuration process is the same for both Windows and Linux.

OCW-only Server

This procedure describes how to set up Apache to serve only OCW content (i.e. http://yourdomain.edu will point to the OCW homepage).

1.  Make a backup of httpd.conf.
2.  Open httpd.conf in a text editor.
3.  Find the line that begins with DocumentRoot and replace the path in quotes with the path to your OCW content.
4. Below DocumentRoot should be a Directory tag followed by a path in quotes (the same path we replaced earlier). Replace this path again with the path to your OCW content. Again, make sure you keep the quotes.
5. Below the closing </Directory> add the following lines, replacing <CONTENT-ROOT> with the path that was used earlier:

    <Directory "<CONTENT-ROOT>/OcwWeb">
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
    <Directory "<CONTENT-ROOT>/OCWExternal">
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>
    <Directory "<CONTENT-ROOT>/NR">
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>

6. Search for the line containing the text DirectoryIndex index.html and append index.htm to it. (Note: There should be a space between index.html and index.htm.)
7. Configure Apache to use the combined log format:
   a. Locate the line containing LogFormat "%h %l %u %t \"%r\" %>s %b" common and comment it out by placing a # at the beginning of the line.
   b. Comment out the line CustomLog "logs/access.log" common.
   c. Uncomment the line CustomLog "logs/access.log" combined.
8. Save the httpd.conf file and restart Apache from the command line:

   Windows

    net stop apache2
    net start apache2

   Linux (Note: Replace /usr/local/apache2 with the appropriate path to your installation.)

    /usr/local/apache2/bin/apachectl -k restart

9. Navigate to http://yourdomain.edu in your browser once more and you should see the OCW homepage.

OCW as a Directory

This procedure sets up OCW to be served as a directory off the server root (i.e. http://yourdomain.edu/ocw).

1. Make a backup of httpd.conf.
2. Open httpd.conf in a text editor.
3. Search for the line containing the text DirectoryIndex index.html and append index.htm to it. (Note: There should be a space between index.html and index.htm.)
4. Copy and paste the lines below at the end of httpd.conf, replacing <CONTENT-ROOT> with the path to your OCW content:

    # OCW Content Directories
    Alias /ocw "<CONTENT-ROOT>/"
    Alias /OcwWeb "<CONTENT-ROOT>/OcwWeb/"
    Alias /OCWExternal "<CONTENT-ROOT>/OCWExternal/"
    Alias /NR "<CONTENT-ROOT>/NR/"
    <Directory "<CONTENT-ROOT>/">
        Options None
        AllowOverride None
        Order allow,deny
        Allow from all
    </Directory>

5. Configure Apache to use the combined log format:
   a. Locate the line containing LogFormat "%h %l %u %t \"%r\" %>s %b" common and comment it out by placing a # at the beginning of the line.
   b. Comment out the line CustomLog "logs/access.log" common.
   c. Uncomment the line CustomLog "logs/access.log" combined.
6. Save the httpd.conf file and restart Apache.
7. Navigate to http://yourdomain.edu/ocw in your browser once more and you should see the OCW homepage.

OCW as a Subdomain

This procedure sets up OCW to be served from a subdomain (i.e. http://ocw.yourdomain.edu). Before starting this procedure you must set a DNS record so that ocw.yourdomain.edu points to your server’s IP address. Without a DNS change, the following procedure will not work.

1. Make a backup of httpd.conf.
2. Open httpd.conf in a text editor.
3. Search for the line containing the text DirectoryIndex index.html and append index.htm to it. (Note: There should be a space between index.html and index.htm.)
4. Search for the line containing the ServerName declaration (i.e. ServerName yourdomain.edu:80) and uncomment it, if necessary, by removing the # at the beginning of the line.
5. Copy and paste the lines below at the end of httpd.conf, replacing <CONTENT-ROOT> with the path to your OCW content and <YOUR-DOMAIN> with the hostname of your server. (Note: <YOUR-DOMAIN> should be the same as the hostname declared on the ServerName line from step 4.)

    # OCW Virtual Host
    NameVirtualHost *
    <VirtualHost *>
        ServerName localhost
        ServerAlias <YOUR-DOMAIN>
    </VirtualHost>
    <VirtualHost *>
        ServerName ocw.localhost
        ServerAlias ocw.<YOUR-DOMAIN>
        DocumentRoot "<CONTENT-ROOT>"
        <Directory "<CONTENT-ROOT>">
            Options Indexes FollowSymLinks
            AllowOverride None
            Order allow,deny
            Allow from all
        </Directory>
        ErrorLog "logs/ocw-error.log"
        CustomLog "logs/ocw-access.log" combined
    </VirtualHost>

6. Configure Apache to use the combined log format:
   a. Locate the line containing LogFormat "%h %l %u %t \"%r\" %>s %b" common and comment it out by placing a # at the beginning of the line.
   b. Comment out the line CustomLog "logs/access.log" common.
   c. Uncomment the line CustomLog "logs/access.log" combined.
7. Save the httpd.conf file and restart Apache.
8. Navigate to http://ocw.yourdomain.edu in your browser once more and you should see the OCW homepage.

IIS

The instructions below were developed for IIS 6 on Windows Server 2003. While it may be possible to adapt these instructions to IIS 6 on Windows XP or IIS 7 on Windows Vista/Server 2008, this has not been tested and is not recommended unless you have experience with IIS setup and administration.

1. If you intend to use AWStats you should first install ActiveState Perl using the instructions below.
2. Set index.htm as a default file:
   a. Right-click on the Web Sites directory > Properties.
   b. Click the Documents tab.
   c. Click the Add button, type index.htm, and click OK twice to save the changes.

OCW-only Server

This procedure describes how to set up IIS to serve only OCW content (i.e. http://yourdomain.edu will point to the OCW homepage).

1. Open the IIS Manager (Start > Control Panel > Administrative Tools > Internet Information Services (IIS) Manager).
2. Expand the folder labeled “Web Sites” where you should find the Default Web Site. Stop this site, if it is running, by right-clicking on it and selecting “Stop”.
3. Rather than overwriting the default web site, create a new site:
   a. Right-click the Web Sites directory > New > Web Site.
   b. Set the description to OCW and leave the IP/TCP settings unchanged.
   c. At the Path field enter the path to the root of your OCW content. (This is the drive/folder that contains the subdirectories NR, OCWExternal, and OcwWeb.)
   d. Set the permissions for the site to “Read.”
4. Set index.htm as a default file:
   a. Right-click on the Web Sites directory > Properties.
   b. Click the Documents tab.
   c. Click the Add button, type index.htm, and click OK twice to save the changes.
5. Navigate to http://yourdomain.edu in your browser once more and you should see the OCW homepage.

OCW as a Directory

This procedure sets up OCW to be served as a directory off the server root (i.e. http://yourdomain.edu/ocw).

1. Open the IIS Manager (Start > Control Panel > Administrative Tools > Internet Information Services (IIS) Manager).
2. Create a new virtual directory:
   a. Right-click on your existing site > New > Virtual Directory.
   b. Use the alias OCW.
   c. At the Path field enter the path to the root of your OCW content. (This is the drive/folder that contains the subdirectories NR, OCWExternal, and OcwWeb.)
   d. Set the permissions for the site to “Read.”
3. Navigate to http://yourdomain.edu/ocw in your browser once more and you should see the OCW homepage.

Technical Requirements for User Environment

The information provided below includes recommended software for users accessing the mirror site either via Windows or Linux/Unix-type operating systems, although other operating systems are possible as well.
Windows

   Web Browser (one of the following is recommended):
      Internet Explorer 6.0+
      Firefox 1.0+
      Mozilla 1.4+

   Media Player (one of the following is recommended):
      RealOne Player software (streaming or downloadable video)
      QuickTime Player (downloadable video)
      Windows Media Player (downloadable video)

Linux/Unix-type system

   Web Browser (one of the following is recommended):
      Firefox 1.0+
      Mozilla 1.4+

   Media Player (one of the following is recommended):
      RealOne Player software (streaming or downloadable video)
      QuickTime Player (downloadable video)

PDF reader

The bulk of our site, including lecture notes, assignments, exams, etc., consists of PDF files, which require a PDF reader to be viewed. If your mirror site users do not have access to a PDF reader, one possible option to consider is the free tool Ghostscript/Ghostview (http://www.cs.wisc.edu/~ghost/gsview/index.htm). Ghostscript is an interpreter for PDF files, and Ghostview provides a graphical interface for Ghostscript. The installers and documentation are provided on the external hard drive at: Tools/PDFReader. In addition to Ghostscript/Ghostview, there are a number of other free software programs that can be used as well (you can search for other PDF readers at http://www.sourceforge.net).

Customization of Mirror Site

This section describes the steps that can be taken to add a limited degree of customization and branding to your mirror site. Three changes are possible, each discussed in detail below: (1) adding your institution’s logo to the mirror site’s home page, (2) adding text about your institution and/or the mirror site on the home page, and/or (3) adding your institution’s logo to the page header, which appears on every page of the site except for the home page.

Note: There are sample pages which follow the instructions below on the external hard drive at: Tools/WebTemplate. These are sample pages for reference purposes only and should not replace the actual Web pages for the mirror site. Please follow the instructions and make the modifications to the actual pages listed below.

Adding your institution’s logo to the home page

A logo may be displayed to the right of the MIT OpenCourseWare logo on the mirror site’s home page (see the “University of Smithville” example below).
Here’s how to add a logo:

• The maximum dimensions of your logo are 275 pixels wide by 36 pixels high. Exceeding this size limitation in either dimension will cause unwanted changes to the page layout.
• If you wish to make your image a transparent GIF that will match the existing beige background color, the hexadecimal value is #edecdf.
• Your image should be placed in the /OcwWeb/images/ directory on your mirror drive when it is ready.
• In order to place your image, you will need to add one line of HTML code to four different copies of the home page. These files must be edited individually; do not edit one and save it to all four locations. The files can be found on your mirror drive at:
   • /index.htm
   • /OcwWeb/index.htm
   • /OcwWeb/web/index.htm
   • /OcwWeb/web/home/home/index.htm
• Within each file, search for the text: div class="logo"
• On the following line you will see the link and image that represents the MIT OpenCourseWare logo:

    <h1><a href="/OcwWeb/web/home/home/index.htm"><img src="/OcwWeb/images/logo-ocw-home_new.gif" alt="MIT OpenCourseWare" width="289" height="36" /></a></h1>

• You will be adding your image (linked if you choose) between the last two HTML tags on that line (between </a> and </h1>).
• The line you add should be as follows:

    <a href="[your Web site address]"><img src="/OcwWeb/images/[your image filename]" style="padding-left: 15px;" /></a>

  In this example, [your Web site address] should be replaced with the complete URL that you want people sent to when they click on your image. If you do not want your logo to link to your site, you may remove the portion before “<img” as well as the “</a>” at the end. [your image filename] should be replaced with the name of the logo you made for this site (like “our-logo.gif”).
• As an example, this is what the completed line would look like, after having added your logo next to the MIT OpenCourseWare logo:

    <h1><a href="/OcwWeb/web/home/home/index.htm"><img src="/OcwWeb/images/logo-ocw-home_new.gif" alt="MIT OpenCourseWare" width="289" height="36" /></a><a href="http://www.smithville.edu"><img src="/OcwWeb/images/our-logo.gif" style="padding-left: 15px;" /></a></h1>

Adding text to the home page

Text about your organization and/or mirror site may be added to replace the area of the home page initially labeled as “Featured Course.” See the example below.

Here’s how to add the text:

• Using the same four index.htm files referenced in the previous portion of this document, search for the text “FEATURED COURSE”. You can replace this text with whatever you would like the heading for your text to be. It will remain red and bold as long as you do not change any of the surrounding HTML.
• Below that, you will see 6 lines of HTML that begin with “<div” and end with “</div>”. Leave those two lines intact, but you may delete everything in between those two lines and replace it with your own text and images.
• This area is limited to 320px wide, but will extend vertically to suit as much or as little content as you choose to add.
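Because the logo line has to be added to four separate copies of the home page, the insertion described above can be scripted. The following is a minimal Python sketch, not part of the OCW tooling: the function name add_logo and its parameters are hypothetical, and it assumes each page contains the text class="logo" followed by the MIT OpenCourseWare link ending in </a></h1>, as quoted in the instructions.

```python
from typing import Optional


def add_logo(html: str, image_name: str, site_url: Optional[str] = None) -> str:
    """Insert an institution logo between </a> and </h1> in the logo heading."""
    src = f"/OcwWeb/images/{image_name}"
    if src in html:
        return html  # logo already present; avoid inserting it twice

    marker = html.find('class="logo"')
    if marker == -1:
        raise ValueError('could not find class="logo" in the page')
    close = html.find("</a></h1>", marker)
    if close == -1:
        raise ValueError("could not find the MIT OpenCourseWare logo link")

    img = f'<img src="{src}" style="padding-left: 15px;" />'
    # Link the logo only when a site address was supplied
    snippet = f'<a href="{site_url}">{img}</a>' if site_url else img
    insert_at = close + len("</a>")
    return html[:insert_at] + snippet + html[insert_at:]
```

Apply the function to each of the four index.htm files separately (read the file, transform the text, write it back), and keep a backup copy of every file before overwriting it.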
Impact of customization on site updates

If you do any customization of your mirror site, please keep in mind that these changes could be overwritten when you update the mirror site via our rsync process. If you make the logo and/or text changes to the home page, please make sure to save a copy of the customized home page in a separate location so that you can replace the updated home page after synchronization. If you make the logo change to the page header throughout the site, all you need to do is maintain a copy of the logo file in a separate location and replace the file after synchronization. The correct logo will be shown if this process is followed.

Please read through the “Mirror Site Updates” section of this document carefully for more details on how to reinstate your mirror site’s customized elements after updates.

MIRROR SITE USAGE TRACKING

Overview

After installing the mirror site, the most important task is to configure the mirror site to collect usage statistics. One possible option to consider is the open-source tool AWStats (http://awstats.sourceforge.net/), and detailed instructions on installing the tool are provided below. The installers and documentation are also provided on the external hard drive at: Tools/UsageTracking. In addition to AWStats, there are a number of other free software programs that can be used as well (you can search for other Web server log tools at http://www.sourceforge.net).

Monthly Reports

As part of the monthly reporting requirement for mirror sites, we ask that you provide the number of visits to your mirror site. If you are able to automatically generate the statistics to a Web page (which is configurable in AWStats; see instructions below), please send the URL to Yvonne Ng, OCW External Outreach Manager, at yng@mit.edu. This will be sufficient to meet the monthly reporting requirement. If you are unable to automatically generate the statistics to a Web page, please identify a central point of contact whom we can email on a monthly basis to collect the data. Please provide this contact information to Yvonne at yng@mit.edu.

AWStats Introduction

AWStats is a free tool that generates advanced Web server statistics, graphically. This log analyzer works as a Perl CGI or from the command line and shows you all possible access information contained in your Web server log, in a few graphical Web pages.
It is able to process large log files, often and quickly. It can analyze log files from all major server tools, such as Apache log files (NCSA combined/XLF/ELF log format or common/CLF log format) and IIS log files (W3C log format).

AWStats reports the following statistics:

- Number of visits, and number of unique visitors
- Visit duration and last visits
- Authenticated users, and last authenticated visits
- Days of week and rush hours (pages, hits, KB for each hour and day of week)
- Domains/countries of visitors' hosts
- Hosts list, last visits and unresolved IP addresses list
- Most viewed entry and exit pages
- File types
- OS used by visitors
- Browsers used by visitors
- Search engines, keyphrases and keywords used to find your site
- HTTP errors
- And more...

AWStats supports the following features:

- It can analyze many log formats: NCSA combined log files (XLF/ELF), common (CLF), IIS log files (W3C), WebStar native log files.
- It works from a command line and from a browser as a CGI.
- Updates of statistics can be made from a Web browser and not only from your scheduler.
- It can process log files of unlimited size, and it supports split log files (load balancing system).
- It supports 'not correctly sorted' log files, even for entry and exit pages.
- It allows reverse DNS lookup before or during analysis and supports DNS cache files.
- Country detection from IP location (geoip) or domain name.
- It supports several languages.
- No rare Perl libraries are needed. Any basic Perl interpreter can run AWStats.
- Dynamic reports as CGI output.
- Static reports in one or framed HTML/XHTML pages, experimental PDF export.
- Look and colors can match your site design, and can use CSS.
- Help and tooltips on HTML reported pages.
- Easy to use. There is just one configuration file to edit.

AWStats Requirements

AWStats runs on Unix, Linux and Windows operating systems, requiring only a basic Perl installation. (No specialized Perl modules are needed.) Perl is almost always present on Unix-like operating system platforms, but usually needs to be installed on Windows. The Perl installer should be downloaded from http://www.activestate.com/Products/activeperl/index.mhtml and installed onto your system. Please place it in your Tools folder.

To use AWStats, the following specific requirements need to be met:

- Your server must log Web access in a log file you can read.
- You must be able to run Perl scripts (.pl files) from the command line and/or as CGI (Perl 5.00503 or higher is required to run AWStats 6.0 or higher).

Detailed AWStats documentation is included on the external hard drive at: Tools/UsageTracking/Unix-Linux/awstats_docs.pdf or Tools/UsageTracking/Windows/awstats_docs.pdf.
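AWStats needs a readable access log in a format it recognizes. As a quick sanity check before configuring it, the following Python sketch parses lines in the Apache combined log format and counts requests per client host. It is illustrative only: the function names are hypothetical, the regular expression ignores escaped quotes inside fields, and counting hosts this way is not how AWStats computes visits.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Combined format: host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)


def parse_combined(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of one combined-format log line, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def requests_per_host(lines: Iterable[str]) -> Counter:
    """Count parsed requests per client host, skipping lines that do not match."""
    counts: Counter = Counter()
    for line in lines:
        entry = parse_combined(line)
        if entry is not None:
            counts[entry["host"]] += 1
    return counts
```

A line written in the common format has no referer or user-agent fields, so parse_combined returns None for it; a log full of such lines suggests the switch to the combined format has not taken effect yet.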
MIRROR SITE UPDATES

Overview

OCW publishes on a twice-yearly schedule, with a major publication of courses in April and October of every year. After each publication is complete, mirror sites should update their content to include the new and updated courses published on the main OCW site and to remove the old courses that were unpublished from it. There are two methods for updating a mirror site’s content: (1) mirror sites with no or very limited Internet connectivity can send the hard drive back to OCW for a new version of the entire site to be loaded onto the drive and shipped back, or (2) mirror sites that have some type of Internet connectivity are encouraged to update content via the Internet using an rsync process. This section contains instructions on how to update your mirror site via the rsync process.

Rsync is an open source utility that provides fast incremental file transfers. The rsync remote-update protocol allows one to efficiently transfer just the differences between two sets of files even across “slow” (i.e. low bandwidth, high latency) network connections. This makes rsync ideally suited to synchronizing content when both endpoints contain different versions of the same files. Rsync is run in daemon mode on the OCW mirror server, and partner mirror sites usually connect to this process using a client-side rsync command-line utility. The data transfer between the partner site and OCW occurs over a secure SSH connection. For a more technical overview of rsync technology, see http://samba.anu.edu.au/ftp/rsync/rsync.html.

OCW Rsync Program

OCW maintains a dedicated rsync server (the “mirror server” located at ocw-mirror.mit.edu) which can be used by partner sites to obtain updated versions of OCW content. The OCW mirror server contains all OCW material up to the most recent publishing cycle. OCW partner sites can use the freely-available rsync command-line utility to connect to the mirror server and download content updates.
Mirror sites can rsync with the OCW mirror server on a schedule that is aligned with the publishing schedule, typically two months after each major publication, to allow time for the content to stabilize on the main OCW site. At these times, OCW makes its mirror server available and mirror sites can “pull” content for a period of time.

Key components

The following summarizes the key components of the rsync program:

• Rsync is currently available only to registered OCW mirror sites. Access is secured through a private SSH key, which is provided by the OCW External Outreach Manager.
• Content to be synced includes two directories from the OCW Web site: OcwWeb_for_Intranets (or OcwWeb) and NR.
• Audio, video, and other enhanced OCW content is also available to be synced. This content is available in a third directory called OCWExternal.
• If you plan to locally host the audio and video content, make sure to sync with the OcwWeb_for_Intranets directory, instead of the OcwWeb directory. The OcwWeb directory contains HTML files with the original OCW links to the audio and video content and would require an Internet connection to access.
• Content will be current as of the most recent major publishing cycle.
• Any files that have been deleted from the OCW Web site (e.g. courses that have been retired and/or updated) will be deleted from your mirror site directories after rsync.
• If you have customized HTML files on your mirror site, these files could get overwritten during rsync, unless you configure your server not to sync these files. In this case you will need to maintain rsync content updates in a directory that is distinct from your actual Web site DocumentRoot directory. This is especially important if you decide to customize your mirror site (see “Customization of Mirror Site” section of this document).

Prerequisites

The following are some of the prerequisites for a mirror site to participate in rsync:

• At least 30 GB of disk space to accommodate basic OCW HTML and PDF content and the site’s anticipated growth over the next 1.5 years, when the site will reach a steady-state size. (At this point, the size will not grow considerably, as only new courses and updates will be published, while older versions of courses are unpublished.) As of 9/1/2008, this content (in /OcwWeb_for_Intranets and /NR) was about 14 GB.
• An additional 270 GB of disk space if a mirror site plans to accommodate enhanced OCW content (audio, video, zip packages, etc.) over the next 1.5 years. As of 9/1/2008, this content totaled about 205 GB.
• Rsync 2.6.4 or newer (provided on the external hard drive).
• OpenSSH 1.2.32 or newer (provided on the external hard drive).
• An SSH private key (provided via email from the OCW External Outreach Manager).

Mirror sites using current versions of the Unix or Linux operating system may already have workable versions of rsync and ssh installed. If not, then these can easily be built from source or downloaded as part of a binary distribution. Please see the appropriate section below for installation instructions.

Mirror sites using a Windows server platform will need to follow the instructions below and use the cwRsync installer package, which installs both the needed rsync and ssh components.

Rsync Server Connection

In order to connect to the OCW mirror server at ocw-mirror.mit.edu, please follow these steps:
• Extract the SSH private key (which is provided in a password-protected .ZIP file, emailed from the OCW External Outreach Manager).
• Place the SSH private key into the appropriate ssh directory. The path depends on your operating system and configuration; for Windows, the path may be “C:\Program Files\cwRsync\bin\ssh”, and for Unix-like systems, you should locate the administrator’s .ssh/ directory.
• The private key is password protected, and this password must be entered each time you use the SSH key to connect to ocw-mirror.mit.edu. The password will be communicated to you by OCW.
• Now test rsync and ssh by obtaining a directory listing from the mirror server using a command like:

    $ rsync -arvz -e "ssh -i .ssh/partner4" ocwpartner@ocw-mirror.mit.edu::

  - Please replace the “partner4” text above with the actual name of your key file.
  - For Windows users the “.ssh” directory in the command above should be “ssh”. The “.” character is not always supported in Windows folder names.
• You should now be able to mirror all OCW Web content by using the commands described in the subsequent sections.

Recommended Rsync Configuration

Please make sure that your local rsync directories being synchronized with the OCW mirror server are NOT the same as your Web site DocumentRoot directory. Instead, it is recommended that partner sites maintain separate locations in the file system to hold rsync updates. This is especially recommended for the OcwWeb/ and NR/ directories. Although this will require a manual reconciliation between the rsync updates and the live mirror content, this configuration ensures that the content on the live mirror site is not left in an inconsistent state if the rsync operation is not completed successfully.
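Keeping a separate rsync directory means the updated files have to be copied into the live site by hand. One way to sketch that copy step in Python, under stated assumptions: the function name promote, the staging and live paths, and the preserve list of customized files are all hypothetical, and the sketch only adds or overwrites files. It does not remove files that were deleted upstream, and it should only be run after an rsync run has finished cleanly.

```python
import os
import shutil
from typing import Iterable, List


def promote(staging: str, live: str, preserve: Iterable[str] = ()) -> List[str]:
    """Copy files from staging into live, skipping relative paths listed in preserve.

    Returns the sorted relative paths that were copied.
    """
    keep = {os.path.normpath(p) for p in preserve}
    copied = []
    for dirpath, _dirnames, filenames in os.walk(staging):
        rel_dir = os.path.relpath(dirpath, staging)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if rel in keep:
                continue  # customized page: leave the live copy untouched
            target = os.path.join(live, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.copy2(os.path.join(dirpath, name), target)
            copied.append(rel)
    return sorted(copied)
```

Listing the customized home pages in preserve keeps them from being overwritten; any upstream changes to those pages then have to be merged in manually.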
In addition, please note the following important considerations:

• Due to the large size of the OCWExternal/ directory, it may not be possible for mirror sites to host two copies of it; such sites may choose to rsync directly with their live OCWExternal directory.
• If you are hosting local copies of audio and video files, you will be syncing from the OcwWeb_for_Intranets directory on the mirror server. Please make sure that you construct the rsync commands such that your local path points to the OcwWeb directory (not OcwWeb_for_Intranets), since all of the relative links are written to point to OcwWeb/. For sample commands, please refer to the following section.

Rsync Commands

General Format

The rsync command to be used typically has the following format:

    rsync <command line options> -e "ssh -i <location of the private key file>/<name of the key file>" <remote user name>@<remote host name>::<remote path> <local path> 1> <rsync log file path> 2> <rsync error log file path>
• Command line options: -arvz --stats
• Location of the private key file: SSH directory under the home directory of the administrator, i.e. ~/.ssh
• Name of the key file: identityfile
• Remote user name: ocwpartner
• Remote host name: ocw-mirror.mit.edu
• Remote path: OcwWeb_for_Intranets, NR, or OcwExternal (OcwWeb is also available)
• Local path: Path on the local machine where the content needs to be copied (OcwWeb should be used here for content synced from OcwWeb_for_Intranets/)
• Rsync log file path: Path on the local machine where rsync will log its output
• Rsync error log file path: Path on the local machine where rsync will log error output

Apart from the remote user name, remote host name, and remote path, all other parameters can be modified as needed.

Sample Commands

Once the server has been set up with the rsync functionality and SSH keys as described in the previous sections, you can start receiving updates to your local OCW Web content using the following commands:

    rsync -arvz --stats --delete -e "ssh -i ~/.ssh/identityfile" ocwpartner@ocw-mirror.mit.edu::OcwWeb_for_Intranets OcwWeb 1> rsync.log 2> error.log

    rsync -arvz --stats --delete -e "ssh -i ~/.ssh/identityfile" ocwpartner@ocw-mirror.mit.edu::NR NR 1> rsync.log 2> error.log

    rsync -arvz --stats --delete -e "ssh -i ~/.ssh/identityfile" ocwpartner@ocw-mirror.mit.edu::OcwExternal OcwExternal 1> rsync.log 2> error.log

Please note: For Windows users the “.ssh” directory in the commands above should be “ssh”. The “.” character is not always supported in Windows folder names.

Please also keep the following in mind when using these rsync commands:

• These commands should be adequate for most applications. Other rsync command options can be used, as described in the additional documentation on the external hard drive.
• These commands will use the SSH key file (provided by OCW) to connect to the OCW mirror server, and will begin mirroring OCW content directories (OcwWeb_for_Intranets, NR, OcwExternal) to corresponding directories on your Web server.
• These commands have been tested in a Bash shell. Other shells may require different syntax.
• The target directories will be created within the rsync directories on your server if they do not exist.
• Make sure that the synced directories are then (manually) copied to the Web-accessible directories.
• Please verify that any updated files moved to your server's DocumentRoot directory have permissions that are appropriate for your environment.
• Make sure that the rsync commands are being run after logging in as root. Executing rsync under a "root-like" account helps sidestep two additional problems that could occur otherwise:
   o After downloading the OcwWeb_for_Intranets content, moving the content to the live Web directory and a possible restart of the Web servers may require root-like privileges.
   o In order to Web-publish the content that was just downloaded, you may need to change permissions on a large number of files. An account with explicit root privileges is guaranteed to be able to do this.
• Again, it is strongly recommended that your local (rsync) directories being synchronized with the OCW mirror server are NOT the same as your Web site DocumentRoot directory. Instead, it is recommended that partner sites maintain separate locations in the filesystem to hold rsync updates. This ensures that, if the rsync operation is not completed successfully, the content on the live Web site is not left in an inconsistent state.

Rsync Logging

Rsync logs provide useful information for mirror site partners. For example, the logs detail which files have been added or deleted during an rsync run. As mentioned earlier, when obtaining course content from the OCW mirror server, the mirror site administrators will use a command line similar to:

    $ rsync -arvz --stats --delete -e "ssh -i ~/.ssh/identityfile" ocwpartner@ocw-mirror.mit.edu::OcwWeb_for_Intranets OcwWeb 1> rsync.log 2> error.log

The corresponding file transfer log, called rsync.log in the above example, can be accessed to view additional details on the content transfer process. Important summary information, such as the number of files transferred and the total transferred file size, is contained at the end of this log file and can be used to confirm whether or not the rsync process was successful. A sample of this summary found in the log file can be seen below.
    Number of files: 195
    Number of files transferred: 179
    Total file size: 1052231 bytes
    Total transferred file size: 1052231 bytes
    Literal data: 1052231 bytes
    Matched data: 0 bytes
    File list size: 3985
    Total bytes sent: 3668
    Total bytes received: 257140
    sent 3668 bytes received 257140 bytes 57957.33 bytes/sec
    total size is 1052231 speedup is 4.03

When the administrator downloads the same directory again several months later, after OCW removes obsolete files and sub-directories, the summary information at the end of the log may look something like:

    Number of files: 18
    Number of files transferred: 0
    Total file size: 166809 bytes
    Total transferred file size: 0 bytes
    Literal data: 0 bytes
    Matched data: 0 bytes
    File list size: 533
    Total bytes sent: 88
    Total bytes received: 609
    sent 88 bytes received 609 bytes 278.80 bytes/sec
    total size is 166809 speedup is 239.32

These examples show that both new file/directory creations and deletions are clearly shown in the rsync transfer log.

Rsync Troubleshooting

Any problems with rsync operation can be reported by sending an e-mail to ocw-rsync@mit.edu.

1. The rsync connection is getting denied by the OCW Mirror Server.

   Reason: If the rsync connection is getting refused by the OCW Mirror Server, it may mean that the internal rsync on the OCW Mirror Server is currently taking place. This operation blocks any external SSH access on the OCW directories so that the content being synced by the partner mirror sites is not in an inconsistent state, i.e. half updated and half out-of-date. This operation takes around half an hour to complete.

   Solution: Retry rsync commands after half an hour. If the connection keeps getting refused, please refer to the “How to report issues” section to report the matter.

2. The given rsync command does not work and throws errors.

   If the output from the rsync command is being logged, the following error messages will appear in the log file rather than on the console:

   • Error message: rsync: command not found
     Reason: rsync command path is not in the PATH.
     Solution:
     Option 1
  1. Find out the path where rsync was installed (generally /usr/local/bin).
  2. Add the rsync path to the PATH variable so that rsync can be run from anywhere.
  Option 2
  1. Run the rsync command with the full path:
     /usr/local/bin/rsync -arvz --stats --delete -e "ssh -i ~/.ssh/identityfile" ocwpartner@ocw-mirror.mit.edu::OcwWeb_for_Intranets OcwWeb

• Error message: rsync: Failed to exec ssh: No such file or directory
  Reason: The SSH path is not set.
  Solution:
  Option 1
  Add the SSH path to the PATH variable so that SSH can be run from anywhere. You can get the SSH path by running the following command:
     which ssh
  Option 2
  Run the rsync command with the full path to SSH:
     /usr/local/bin/rsync -arvz --stats --delete -e "/usr/local/bin/ssh -i ~/.ssh/identityfile" ocwpartner@ocw-mirror.mit.edu::OcwWeb_for_Intranets OcwWeb

• Warning: Identity file ~/.ssh/identityfile does not exist.
  Reason: The identity file has not been placed in the correct directory.
  Solution:
  1. Make sure you are logged in as the user that runs the rsync command.
  2. Go to the .ssh directory.
  3. Make sure the identity file exists there.
  4. Run the command again.

• Error message: host/servname not known
  Reason: The remote server name is misspelled.
  Solution: Make sure that the remote host name is ocw-mirror.mit.edu, as in the commands above.

• Issue: Password is not accepted
  Reason: The remote user name is misspelled.
  Solution: Make sure that the remote user name is ocwpartner.

• Error message: link_stat failed: No such file or directory
  Reason: The remote path is incorrect.
  Solution: Make sure that the remote path is one of the following:
     /var/https/jakarta-tomcat-4.0.6-LE-jdk14-prod/Webapps/OcwWeb_for_Intranets
     /var/https/jakarta-tomcat-4.0.6-LE-jdk14-prod/Webapps/NR
     /var/https/jakarta-tomcat-4.0.6-LE-jdk14-prod/Webapps/OcwExternal

• Error message: failed to set times on <directory name>
  Reason: The directory does not have the correct permissions set.
  Solution: Make sure that the directory to which the content is being copied has write permissions for group and other.

3. Incomplete rsync operation

To determine whether the rsync operation failed to complete, check the following:

• If the command is being run from the command line in the foreground, rsync will print an error on the console such as “some files could not be transferred.”
• The following command-line options make rsync more robust and help in catching any abrupt, abnormal termination of the rsync command. (A detailed description of each of these options can be found at: http://hpux.connect.org.uk/hppd/hpux/Networking/Admin/rsync-2.6.4/man.html)
  o --checksum
    ▪ Forces the sender to checksum all files using a 128-bit MD4 checksum before transfer. The checksum is then explicitly checked on the receiver.
    ▪ Any files of the same name that already exist and have the same checksum and size on the receiver are not transferred.
    ▪ This option can be quite slow.
    ▪ Helps reduce the number of transfers if rsync has to be resumed.
  o --partial
    ▪ By default, rsync deletes any partially transferred file if the transfer is interrupted.
    ▪ This option keeps partially transferred files so that a subsequent transfer of the rest of the file is much faster.
  o --partial-dir=DIR
    ▪ Specifies a DIR that will be used to hold the partial data instead of writing it out to the destination file.
    ▪ On the next transfer, rsync will use a file found in this dir as data to speed up the resumption of the transfer, and then delete it after it has served its purpose.
  o --progress
    ▪ Prints information showing the progress of the transfer. Implies --verbose if it wasn't already specified.
    ▪ While a file is transferring, the data looks like this:
         782448 63% 110.64kB/s 0:00:04

         782448        Current file size
         63%           Percentage of the transfer that is complete
         110.64kB/s    Current calculated file-completion rate
         0:00:04       Estimated time remaining in this transfer

    ▪ After a file is complete, the data looks like this:

         1238099 100% 146.38kB/s 0:00:08 (5, 57.1% of 396)

         1238099       Final file size
         100%          Percentage of the transfer that is complete
         146.38kB/s    Final transfer rate for the file
         0:00:08       Amount of elapsed time it took to transfer the file
         (5, 57.1% of 396)   Total-transfer summary in parentheses:
            5              How many files have been updated
            57.1% of 396   What percent of the total number of files has been scanned

  o Once it has been determined that the previous rsync operation ended abnormally, it can be resumed by invoking the rsync commands again.

APPENDIX

Search Functionality

The search functionality will not work on your mirror site by default, because the search on the OCW site relies on code hosted on our servers and is not localizable. This section offers potential solutions for mirror sites that wish to provide some form of site-wide content search to their users.

We recommend two general ways to make the OCW mirror content searchable, chosen for their relative simplicity and ease of use. When implementing a search feature, you first have to decide which of the following statements applies to your site:

1) You DON'T have a reliable or high-bandwidth Internet connection AND you use a Web server to publish the OCW content in your local area network (LAN) only.
2) Your site uses OS file sharing mechanisms (Windows File-sharing, Samba or NFS) to publish content on your local area network (LAN) only.
3) Your site is well connected to the Internet AND uses a Web server to publish the OCW content to locations outside your local area network (LAN).
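The three statements above amount to a small decision rule. The sketch below is only an illustration of that rule, not part of the OCW tooling: the function name, its parameters, and the returned labels are hypothetical, and the mapping of statements to options follows the two options described in this appendix (Regain Desktop Search for the first two statements, Google search for the third). Site profiles that match none of the three statements return None, because this document does not give a recommendation for them.

```python
from typing import Optional

REGAIN = "Option 1: Regain Desktop search"
GOOGLE = "Option 2: Google search"


def recommend_search_option(
    reliable_internet: bool,
    uses_web_server: bool,
    lan_only: bool,
) -> Optional[str]:
    """Map a site profile to one of the appendix's search options.

    All parameter names are illustrative assumptions:
      reliable_internet -- the site has a reliable, high-bandwidth connection
      uses_web_server   -- content is published through a Web server
                           (False means OS file sharing: Windows, Samba, NFS)
      lan_only          -- content is published on the local network only
    """
    # Statement 2: OS file sharing, LAN only.
    if not uses_web_server and lan_only:
        return REGAIN
    # Statement 1: Web server, LAN only, no reliable Internet connection.
    if uses_web_server and lan_only and not reliable_internet:
        return REGAIN
    # Statement 3: Web server, well connected, published outside the LAN.
    if uses_web_server and reliable_internet and not lan_only:
        return GOOGLE
    # Any other combination is not covered by the three statements.
    return None


if __name__ == "__main__":
    print(recommend_search_option(False, True, True))
    print(recommend_search_option(True, True, False))
```

A site that is well connected but publishes through a Web server on its LAN only, for example, falls outside the three statements and gets None here; such sites would need to weigh the two options themselves.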
Option 1: Regain Desktop search

Overview

If you fall into one of the first two categories, you can implement a content search feature using an open-source software product called Regain Desktop Search (http://regain.sourceforge.net/), which is installed on the site's Web server.

Regain is a Java search engine based on the Jakarta Lucene project. It indexes and searches files in many formats (currently HTML, XML, Excel, Powerpoint, Word, PDF and RTF). The Desktop Version allows users to search content that is either stored on a server's local hard disk drive or published on a Web site.

The Desktop Version runs on both Windows and Linux machines, and is available in the form of a simple-to-use installer package. This package will install all the necessary components, Java and a small dedicated Webserver. The installer files for Windows and Linux/Unix operating systems are provided on the external hard drive at: Tools/Search.

Please Note:

• If you are running another Unix-like operating system (Solaris, *BSD, etc.) on your server, you will need to obtain the Regain Server Version, which runs as a Web application within Tomcat (or another Java servlet container such as Resin or Jetty). Thus the Server version requires you to install Java AND Tomcat as appropriate for your server.
• Regain software is intended to completely replace the Search and Advanced Search fields that are visible on your mirror site. Those fields will not work. Instead, to search, your users will visit separate, unique URLs (see below).

Desktop versus Server Versions

(Documentation here adapted from http://regain.murfman.de/wiki/en/index.php/Main_Page)

Both Regain Desktop and Regain Server include the crawler, which is needed to create the search index, and the search mask, which is needed for searching on a search index.
Both variants are also configured in the same way, through the XML files CrawlerConfiguration.xml and SearchConfiguration.xml, but the Desktop Search offers a Web interface that writes the most important settings to these two files.

The Desktop Search also comes with a small program that manages these two parts. On Windows, this program integrates itself in the task bar and automatically starts the crawler, so the search index is updated regularly or whenever the configuration has changed. Furthermore, Regain Desktop provides a Web server, which the search mask needs in order to work.

For the Server Search, the crawler is an independent program, which the administrator has to run by hand (or through automation) in order to update the search index. Furthermore, the administrator has to run the search mask in a servlet engine. In this case, the Web server is not delivered with Regain.

Technically speaking, the Desktop Search is a stand-alone application, that is, an application that needs no programs other than Java in order to work. The Server Search is split into the crawler on the one hand, which is
a stand-alone console application, and the search mask on the other hand, which is a .war archive that has to be integrated in a Java servlet engine (like Tomcat).

Installation and Configuration of Regain Desktop

1. Use the appropriate installer for your operating system from the external hard drive at: Tools/Search. On Windows you may use the Windows installer; on other platforms you have to unpack the zip file. Follow the step-by-step instructions.
2. Regain must be running if you want to search for something. On Windows you can tell that Regain is running by the Regain symbol (a blue "r") in the task bar beside the clock.
3. The Windows installer sets Regain to start automatically (even after rebooting). At the first start your browser will come up and show the welcome page. If you have downloaded the zip file for Linux, you can start Regain by entering the following commands in the console:

   cd [Regain-Directory]
   java -jar regain.jar

Please note the following:

• Regain Desktop ships with its own built-in Webserver. You will need to make sure that this Webserver application is not bound to the same TCP port as your primary Webserver!
• Regain Desktop also requires Java to be installed on your server. Even a small run-time version of Java will work.

Option 2: Google search

Mirror sites falling into the third category, which have a reliable Internet connection, can implement a content search feature by leveraging the free indexing service provided by the Google search engine company. This involves two things:

• Invite Google to index your Website by registering here: http://www.google.com/addurl/?continue=/addurl
• Allow visitors to search the content on your site using a modified version of the Google search script below, hosted on your server. (Script courtesy of Dr.
Dilvan de Abreu Moreira, University of São Paulo, Brazil)

------- Sample JSP script ------------------------------------------------------------------------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="en">
<head>
<title>MIT OpenCourseWare | Advanced Search</title>
<meta name="Author" content="Dilvan de Abreu Moreira">
