OK, now you know the requirements and how to offer the services, but not how to get the data. :-) This section explains how to actually mirror the various parts of FreeBSD, what tools to use, and where to mirror from.
The FTP area is the largest amount of data that needs to be mirrored. It includes the distribution sets required for network installation, the branches, which are actually snapshots of checked-out source trees, the ISO images used to write CD-ROMs with the installation distribution, a live file system, the ports tree, distfiles, and a huge number of packages, all of course for various FreeBSD versions and various architectures.
You can use an FTP mirror program to get the files. Some of the most commonly used are:
ftp/mirror was very popular, but seemed to have some drawbacks: it is written in perl(1) and had real problems mirroring large directories like those of a FreeBSD site. There are rumors that the current version has fixed this by allowing a different algorithm for comparing the directory structure to be specified.
In general, FTP is not really good for mirroring. It transfers the whole file if it has changed, and it does not create a single data stream that would benefit from a large TCP congestion window.
A better way to mirror the FTP area is rsync. You can install the port net/rsync and then use rsync to sync with your upstream host. rsync is already mentioned in Section 2.4.2. Since rsync access is not required, your preferred upstream site may not allow it. You may need to hunt around a little bit to find a site that allows rsync access.
Note: Since the number of rsync clients has a significant impact on the server machine, most admins impose limitations on their server. As a mirror, you should ask the maintainer of the site you are syncing from about their policy, and perhaps request an exception for your host (since you are a mirror).
A command line to mirror FreeBSD might look like:
% rsync -vaz --delete ftp4.de.FreeBSD.org::FreeBSD/ /pub/FreeBSD/
Consult the documentation for rsync, which is also available at http://rsync.samba.org/, about the various options to be used with rsync. If you sync the whole module (rather than a subdirectory of it), be aware that the module directory (here "FreeBSD") will not be created, so you cannot omit the target directory. You might also want to set up a script framework that calls such a command via cron(8).
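The point about the target directory can be sketched as a small /bin/sh helper. This is only an illustration: the function name mirror_rsync and the RSYNC variable are assumptions, not part of rsync. The helper normalises both paths to end in a slash, so that the contents of the source are copied into the target directory rather than into a new subdirectory of it.

```shell
#!/bin/sh
# Sketch only. RSYNC and mirror_rsync are hypothetical names; the options
# -vaz --delete are the ones used in the command line above.
RSYNC="${RSYNC:-rsync}"

# mirror_rsync SOURCE TARGET [extra rsync options...]
# Strips one trailing slash (if any) and adds one back, so both paths
# always end in exactly one "/".
mirror_rsync() {
    src="${1%/}/"
    dst="${2%/}/"
    shift 2
    "$RSYNC" -vaz --delete "$@" "$src" "$dst"
}
```

A first run could pass rsync's -n (dry run) option to see what would be transferred without changing anything, for example: mirror_rsync ftp4.de.FreeBSD.org::FreeBSD /pub/FreeBSD -n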
A few sites, including the one-and-only ftp-master.FreeBSD.org even offer CVSup to mirror the contents of the FTP space. You need to install a CVSup client, preferably from the port net/cvsup. (Also reread Section 2.4.4.) A sample supfile suitable for ftp-master.FreeBSD.org looks like this:
#
# FreeBSD archive supfile from master server
#
*default host=ftp-master.FreeBSD.org
*default base=/usr
*default prefix=/pub
#*default release=all
*default delete use-rel-suffix
*default umask=002

# If your network link is a T1 or faster, comment out the following line.
#*default compress

FreeBSD-archive release=all preserve

It seems CVSup would be the best way to mirror the archive in terms of efficiency, but it is only available from a few sites.
Note: Please have a look at the CVSup documentation, such as cvsup(1), and consider using the -s option. This reduces I/O operations by assuming the recorded information about each file is correct.
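As a sketch, an unattended CVSup run with the -s option might be wrapped like this. The function name run_cvsup, the CVSUP variable, and the supfile path in the usage line are assumptions for illustration; -g (no GUI), -L (verbosity level), and -s are options described in cvsup(1).

```shell
#!/bin/sh
# Sketch only. CVSUP and run_cvsup are hypothetical names.
CVSUP="${CVSUP:-cvsup}"

# run_cvsup SUPFILE
#   -g    do not use the GUI, as needed when run from cron(8)
#   -L 1  verbosity level for non-GUI operation
#   -s    trust the recorded file information instead of checking each file
run_cvsup() {
    "$CVSUP" -g -L 1 -s "$1"
}
```

It could be called as run_cvsup /usr/local/etc/ftp-supfile, where the path stands for wherever you saved your supfile.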
There are various ways to mirror the CVS repository. CVSup is the most common method.
CVSup is described in some detail in Section 2.4.4 and Section 3.1.3.
It is very easy to set up a CVSup mirror. Installing net/cvsup-mirror will make sure all of the needed programs are installed and will then gather all the information needed to configure the mirror.
Note: Please do not forget to consider the hint about the -s option mentioned in the note above.
Using methods other than CVSup is generally not recommended. We describe them briefly here anyway. Since most sites offer the CVS repository as part of the FTP fileset under the path /pub/FreeBSD/development/FreeBSD-CVS, the following methods could be used.
FTP
Rsync
HTTP
Important: AnonCVS cannot be used to mirror the CVS repository since CVS does not allow you to access the repository itself, only checked out versions of the modules.
The best way is to check out the www distribution from CVS. If you have a local mirror of the CVS repository, it is as easy as:
% cvs -d /home/ncvs co www
and a cron job that calls cvs up -d -P on a regular basis, perhaps just after your repository has been updated. Of course, the files need to remain in a directory available for public WWW access. The installation and configuration of a web server is not discussed here.
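The checkout and the regular update can be combined in one helper that a cron job calls. This is a minimal sketch: the function name update_www and the CVS and CVSROOT_DIR variables are assumptions, and the parent directory you pass in stands for wherever your web server expects the files.

```shell
#!/bin/sh
# Sketch only. CVS, CVSROOT_DIR and update_www are hypothetical names;
# /home/ncvs is the local repository path used in the text.
CVS="${CVS:-cvs}"
CVSROOT_DIR="${CVSROOT_DIR:-/home/ncvs}"

# update_www PARENT_DIR
# Checks out the www module into PARENT_DIR/www on the first run,
# and updates the existing working copy on later runs.
update_www() {
    parent="$1"
    if [ -d "$parent/www/CVS" ]; then
        (cd "$parent/www" && "$CVS" up -d -P)
    else
        (cd "$parent" && "$CVS" -d "$CVSROOT_DIR" co www)
    fi
}
```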
If you do not have a local repository, you can use CVSup to maintain an “up to date copy” of the www pages. A sample supfile can be found in /usr/share/examples/cvsup/www-supfile and could look like this:
#
# WWW module supfile for FreeBSD
#
*default host=cvsup3.de.FreeBSD.org
*default base=/usr
*default prefix=/usr/local
*default release=cvs tag=.
*default delete use-rel-suffix

# If your network link is a T1 or faster, comment out the following line.
*default compress

# This collection retrieves the www/ tree of the FreeBSD repository
www
Using ftp/wget or other web-mirror tools is not recommended.
Since the documentation is referenced a lot from the web pages, it is recommended that you mirror the FreeBSD documentation as well. However, this is not as trivial as the www-pages alone.
First of all, you should get the doc sources, again preferably via CVSup. Here is a corresponding sample supfile:
#
# FreeBSD documentation supfile
#
*default host=cvsup3.de.FreeBSD.org
*default base=/usr
*default prefix=/usr/share
*default release=cvs tag=.
*default delete use-rel-suffix

# If your network link is a T1 or faster, comment out the following line.
#*default compress

# This will retrieve the entire doc branch of the FreeBSD repository.
# This includes the handbook, FAQ, and translations thereof.
doc-all

Then you need to install a couple of ports. You are lucky: there is a meta-port, textproc/docproj, to do the work for you. You need to set up some environment variables, like SGML_CATALOG_FILES. Also have a look at your /etc/make.conf (copy /usr/share/examples/etc/make.conf if you do not have one), and look at the DOC_LANG variable. Now you are probably ready to run make in your doc directory (/usr/share/doc by default) and build the documentation. Again, you need to make it accessible to your web server and make sure the links point to the right location.
Important: The building of the documentation, as well as lots of side issues, is itself documented in the FreeBSD Documentation Project Primer. Please read this piece of documentation, especially if you have problems building the documentation.
Every mirror should be updated on a regular basis. You will certainly need some script framework for it that will be called by cron(8). Since nearly every admin does this in their own way, we cannot give specific instructions. It could work like this:
Put the command to run your mirroring application in a script. Use of a plain /bin/sh script is recommended.
Add some output redirections so diagnostic messages are logged to a file.
Test if your script works. Check the logs.
Use crontab(1) to add the script to the appropriate user's crontab(5). This should be a different user from the one your FTP daemon runs as, so that files inside your FTP area that are not world-readable cannot be accessed by anonymous FTP. This is used to “stage” releases, making sure all of the official mirror sites have all of the necessary release files on release day.
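The first two steps above could be sketched as a plain /bin/sh helper that runs any mirroring command and appends its diagnostic output to a log file. The function name run_logged and the default LOGFILE path are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only. LOGFILE and run_logged are hypothetical names.
LOGFILE="${LOGFILE:-/var/log/mirror-ftp.log}"

# run_logged COMMAND [ARGS...]
# Runs the command, appends its stdout and stderr to LOGFILE between
# start and end markers, and returns the command's exit status.
run_logged() {
    {
        echo "=== start: $(date)"
        "$@"
        status=$?
        echo "=== end: $(date), exit status $status"
    } >>"$LOGFILE" 2>&1
    return "$status"
}
```

A mirroring script would then contain a line such as run_logged rsync -vaz --delete ftp4.de.FreeBSD.org::FreeBSD/ /pub/FreeBSD/ and the log can be checked after a test run, as the third step suggests.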
Here are some recommended schedules:
FTP fileset: daily
CVS repository: hourly
WWW pages: daily
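These schedules might translate into crontab(5) entries like the ones printed by the sketch below. The script names, the default directory, and the exact minutes and hours are assumptions; only the daily and hourly frequencies come from the list above.

```shell
#!/bin/sh
# Sketch only. The script names and the default directory are hypothetical.

# print_mirror_crontab [SCRIPT_DIR]
# Prints crontab(5) lines: FTP fileset daily, CVS repository hourly,
# WWW pages daily.
print_mirror_crontab() {
    dir="${1:-/usr/local/mirror/bin}"
    cat <<EOF
# minute hour mday month wday command
17 3 * * * $dir/mirror-ftp.sh
5 * * * * $dir/mirror-cvs.sh
47 4 * * * $dir/mirror-www.sh
EOF
}
```

Review the output before installing it. Note that piping it into crontab - as the mirror user replaces that user's existing crontab. Picking minutes other than 0 helps to avoid hitting the upstream site at the same moment as everyone else.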