Project Infrastructure Migration 2007

This document discusses the needs of projects currently going through incubation. It addresses infrastructure needs, strategies for migrating to OSGeo servers, and strategies to mitigate disruption if OSGeo stops using CollabNet services.

= Infrastructure Components =

The following components make up the typical needs of an open source project. Each section documents what the current projects use and what CollabNet offers. The "pain" of migration is evaluated by Frank (thanks Frank).

Web Pages
Currently the member projects use:
 * Wiki: OSSIM + GRASS (Twiki), GeoTools + MapBuilder (Confluence), Mapbender + OSGeo (Mediawiki)
 * CMS: MapServer (Plone)
 * Static HTML: MapGuide, OSGeo
 * PHP HTML: GRASS (but CMS forthcoming) + weekly generated software user docs
 * Doxygen (nightly generated HTML): GDAL, GRASS (for programmer's manual)

CollabNet offers static HTML pages under SVN for web sites. This works fine for MapGuide and GDAL, but is not acceptable to MapServer or GRASS. Migration is acceptable for Mapbender, as most action is taking place in the current Wiki (and this will not change). Not sure about OSSIM, MapBuilder or GeoTools.

It would seem that a Wiki solution for the web site should be offered.

For MapGuide and GDAL, migrating into the CollabNet website mechanism is no problem. Migrating out again (if needed) should also be straightforward. Migrating in and out for other projects would be moderately difficult due to all the reformatting needed.

From Daniel Brookshier - We can put any collaboration tool, like we did for the OSGeo Wiki, as a subdomain of OSGeo.org. We can also do a cookie-based single sign-on. If you decide to move and can't live without such a tool, please consider moving your code repository and putting other resources as subdomains to osgeo.org. I am happy to oblige, as wiki capability in the CollabNet tool is many months down the road.

Source Code Control
Some projects now use SVN, while others use CVS.

CollabNet offers both. From Daniel Brookshier - Note, we only have svn turned on for requests at this time. If you really want cvs, let me know and I can configure it for you.

Migrating in should be quite easy. The main downside is that all existing committer authentication will be lost, and will need to be set up again via the CollabNet infrastructure.

Will the CVS/SVN history be maintained when moving into/out of CollabNet? From Daniel Brookshier - There are tools like cvs2svn - not sure about how complete it is, but you can check out the docs and see if it meets your needs. CollabNet is also about to launch services to do this where our people do the import/conversion from cvs to subversion.

Some projects currently use CVS (or SVN?) triggers to launch actions such as IRC (via CIA-bot) notifications, mailing list notifications, web site updates, and automated builds. CollabNet provides a mailing list for updates, but does not allow arbitrary commit hooks (as far as I know). Perhaps we should look at using the commit mailing list to drive stuff like CIA. From Daniel Brookshier - The only hook we currently support is to a standard list in each project. If you submit a specific request to me detailing the hook and documentation to do it with a standard svn, I will put in a statement of work to CollabNet to request it. As with all changes to the underlying software, I can't promise anything until we run the request through to its conclusion.
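The idea of driving tools like CIA from the commit mailing list could look roughly like the sketch below. It only shows the parsing step: turning one commit mail into a one-line summary that a bot could relay. The subject layout (`svn commit: r<rev> - <path>`), the sample mail, and the function name are assumptions made for illustration; the real CollabNet commit mails may be formatted differently.

```python
import re
from email import message_from_string

# Assumed subject layout; adjust to whatever the commit list really sends.
SUBJECT_RE = re.compile(r"svn commit: r(?P<rev>\d+) - (?P<path>\S+)")


def summarize_commit_mail(raw_mail):
    """Return a one-line summary of a commit mail, or None if it does not match."""
    msg = message_from_string(raw_mail)
    match = SUBJECT_RE.search(msg.get("Subject", ""))
    if match is None:
        return None
    author = msg.get("From", "unknown")
    body = msg.get_payload().strip()
    # Use the first line of the log message as the short description.
    first_line = body.splitlines()[0] if body else ""
    return f"r{match.group('rev')} by {author} in {match.group('path')}: {first_line}"


# Invented sample mail, for illustration only.
SAMPLE = """From: frank@example.org
Subject: svn commit: r1234 - trunk/gdal

Fix typo in driver docs.
"""

print(summarize_commit_mail(SAMPLE))
```

A script like this would still need something to deliver the mail to it (for example a list subscription piped to a script) and something to post the summary to IRC; neither is shown here.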

CVS projects might want to take this opportunity to consider SVN, which is superior technology. Howard Butler is knowledgeable about how to do a CVS to SVN transition that preserves history. This would of course add some extra disruption for developers. From Daniel Brookshier - one key feature is https access, which is far better than ssh tunneling for getting around your corporate firewall (note that in some companies, opening a tunnel is a fireable offense and in some countries you can be sent to prison).

Migrating out of Collabnet SVN is pretty easy assuming Collabnet provides access to the raw SVN archive (which they have agreed to do). The main disruption would be related to user authentication and a new location. SVN is open source so there is no need to change to a new tool if migrating out. From Daniel Brookshier - yes we support a full svn image in the event of migration out.

Bug / Issue Tracking
Currently member projects use:
 * Bugzilla: GDAL, MapServer, OSSIM
 * Jira: GeoTools, MapBuilder
 * SF bug tracker: Mapbender
 * RT: GRASS (Gforge planned)
 * CN Issue Tracker: MapGuide

CollabNet offers an issue tracker with roughly comparable capabilities to other bug trackers. (please note significant distinctions here) However, it is not clear that there is a clean way to migrate bugs from other systems to the CN issue tracker. Lacking this, it seems unlikely that projects with substantial historical bug databases will be eager to migrate to the CN infrastructure.

Should any projects migrate to the CN issue tracker, it is also unclear that there is any way to migrate bugs out again should we drop CN support, though apparently CN does support an XML export of the database so hand crafting tools should be *possible* with some lossiness.

It would seem that migration to (and from) the CN bug tracker is going to be painful.

Mailing Lists / Forums
Currently member projects use:
 * Mailman: GDAL, GRASS, OSSIM, MapBuilder, GeoTools, MapServer, Mapbender (currently in transition)
 * CN Mailing Lists: MapGuide, Mapbender

CN offers a mailing list mechanism (ezmlm). It supports an easy mechanism for administrators to batch subscribe email addresses, so migrating existing lists to it is relatively easy (though digest or other config options may be lost). It has a few quirks (no apparent way to limit messages by size, digest is by accumulated mail size rather than something like daily). But the mailing lists seem to work fine, and are integrated into the platform. Migration will require all subscribers to update their address books with a new email address. The Mapbender PSC has experimentally copied the existing dev-mailman list to the CN ezmlm box. It works fine, with no interruption. This does not solve the problem of the archive, which is left behind.

However, is it possible to migrate the mailing list archives when moving into/out of CollabNet? Some projects such as GRASS and GDAL do want to maintain it.

When migrating out, it is easy to capture the subscriber lists and set up a new external mailing list instance. It may involve a new email address again and will likely result in loss of email subscriber options. Is there an option to get the mailing list archives out of CN?

Overall, there is not a high cost to migrating in or out of the CN mailing list architecture.

The above does not address migrating archives, which is also highly desirable, but apparently very difficult with the CN mailing list archive manager.

Download Server
Existing projects offer source, binary and data downloads through http and ftp.

CollabNet offers an http based download facility from the "Documents and Files" area on the left nav bar. This seems roughly analogous to the download support in SourceForge, and is generally adequate for downloads. (are there any perceived problems?)

There may be a migration hassle for folks moving large amounts of existing files into CN but generally speaking migration to CN for downloads should be straightforward.

Migrating out should also be relatively easy. Just "wget" the files to another server or something similar.

One downside of CN is that it doesn't offer ftp download services, but it isn't at all obvious to me that this is important in 2006. (comments?)

The other issue that could arise is that sufficient popularity for OSGeo projects could push the limits of our CN bandwidth limit (not sure what it is) in which case we might need to move some big things to the telascience hosted servers.

Wiki
Current projects:
 * No wiki: GDAL (want one!), MapServer (had one but wiki-spammed), MapGuide
 * Twiki: OSSIM, GRASS
 * Mediawiki: Mapbender, OSGeo
 * Confluence: GeoTools, MapBuilder

Currently CN does not offer a wiki. Arnulf has kindly asked terrestris (Till Adams) to host a Mediawiki instance for OSGeo.

It isn't clear if there are benefits to moving into a common wiki for projects. We could likely host Mediawiki, Confluence, Twiki, etc. instances for each project at telascience if needed.

There would be no pressing need to migrate out as the Wiki won't be dependent on collabnet.

Automated Build/Smoke Test System
Current projects:
 * GDAL: BuildBot (prototype)
 * GeoTools: cruise control + maven 2 (and may consider Continuum)
 * GRASS: script based build system for Linux, MacOSX, MinGW; script/HTML based testsuite; additionally automated clone testing and function size/structure Quality Control
 * Mapbender: currently testing Selenium
 * MapBuilder: Setting up a fitnesse/ruby/WATIR/Autoit solution for AJAX type testing
 * MapGuide: Cruise Control (I think)
 * MapServer: BuildBot (prototype)
 * OSSIM:

Moving automated Build/Smoke Test Systems to CollabNet infrastructure could go along with a build farm. Alternatively, automated Build/Smoke Test Systems could be installed on the telascience hosts.

Demo Site
Current Projects:


 * GeoTools: n/a (library project); demo examples in wiki and included with source download
 * GRASS: a couple of Demo Live CDROMs/DVDs are available (Linux, MS-Windows)
 * Mapbender: There are several Mapbender demo installations operated by different companies and clients, all linked from the Wiki. It is planned to create a "full stack" demo site on the telascience servers asap. Every Mapbender installation contains demo data (Capabilities URLs) and demo interfaces. A MapServer WMS demo is operated by CCGIS, hosting the Free Data project "Germany"; a GeoServer WFS-T demo installation hosts the Mapbender users.

IRC

 * GDAL: irc://irc.freenode.org#gdal
 * GeoTools: irc://irc.freenode.org#geotools
 * GRASS: irc://irc.freenode.org#grass
 * Mapserver: irc://irc.freenode.org#mapserver
 * Mapbuilder: irc://irc.freenode.org#mapbuilder

It would be desirable to have automated archiving of IRC channels.

Security

 * Common LDAP infrastructure for single sign-on
 * SSL certificates for OSGeo sites (currently CN owned ?)

= Migration schedule =

The following priorities have been identified (thanks to Hobu's mail to IncCom). Maybe we can continue to refine this list here.

DNS
Host this somewhere else. Identify a hosting provider and acquire a temporary (and similar) DNS name we can use for bootstrapping the migration.

Maillists
We need mboxes of all of the lists for the message archives to drop into mailman. Need subscription lists (unless we are willing to have everyone resubscribe, which is not always a bad thing) to feed into mailman if we are going to automatically resubscribe everyone.
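Before dropping an exported archive into mailman, it may be worth checking that the mbox actually parses and that the subscriber export is clean. The following is a minimal sketch using only the Python standard library; the file name, the sample archive, and the helper names are invented for illustration, and the step of importing into mailman itself is not shown.

```python
import mailbox
import os
import tempfile
from email.utils import parseaddr


def summarize_mbox(path):
    """Return (message_count, sorted unique sender addresses) for an mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        senders = set()
        count = 0
        for message in box:
            count += 1
            _, address = parseaddr(message.get("From", ""))
            if address:
                senders.add(address.lower())
        return count, sorted(senders)
    finally:
        box.close()


def normalize_subscribers(lines):
    """Deduplicate a raw subscriber export, ignoring blank lines and case."""
    return sorted({line.strip().lower() for line in lines if line.strip()})


# Tiny two-message archive, invented for this example.
SAMPLE_MBOX = (
    "From alice@example.org Sat Apr  1 10:00:00 2006\n"
    "From: Alice <alice@example.org>\n"
    "Subject: first\n"
    "\n"
    "hello\n"
    "\n"
    "From bob@example.org Sat Apr  1 11:00:00 2006\n"
    "From: bob@example.org\n"
    "Subject: second\n"
    "\n"
    "world\n"
)

with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "dev-list.mbox")
    with open(sample_path, "w") as handle:
        handle.write(SAMPLE_MBOX)
    print(summarize_mbox(sample_path))
```

Comparing the message count against what the old archive reports is a cheap way to notice a truncated export before the old list is shut down.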

LDAP
Out-of-the-box OpenLDAP+SSL. Need unix (posix) account schema and a LDAP group layout.
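As a rough illustration of what a posix account schema and group layout could mean in practice, the sketch below generates LDIF entries using the standard `posixAccount` and `posixGroup` object classes. The base DN, the `ou=People` / `ou=Groups` layout, the home directory pattern, and the example user are all assumptions, not decisions recorded on this page.

```python
# Assumed directory layout; the real base DN and OUs are still to be decided.
BASE_DN = "dc=osgeo,dc=org"


def posix_account_ldif(uid, full_name, surname, uid_number, gid_number):
    """Build an LDIF entry for one committer account (values are not escaped)."""
    lines = [
        f"dn: uid={uid},ou=People,{BASE_DN}",
        "objectClass: inetOrgPerson",
        "objectClass: posixAccount",
        f"uid: {uid}",
        f"cn: {full_name}",
        f"sn: {surname}",
        f"uidNumber: {uid_number}",
        f"gidNumber: {gid_number}",
        f"homeDirectory: /home/{uid}",
    ]
    return "\n".join(lines) + "\n"


def posix_group_ldif(name, gid_number, member_uids):
    """Build an LDIF entry for one project group, e.g. the committers of a project."""
    lines = [
        f"dn: cn={name},ou=Groups,{BASE_DN}",
        "objectClass: posixGroup",
        f"cn: {name}",
        f"gidNumber: {gid_number}",
    ]
    lines.extend(f"memberUid: {uid}" for uid in member_uids)
    return "\n".join(lines) + "\n"


# Hypothetical user and group, for illustration only.
print(posix_account_ldif("jdoe", "Jane Doe", "Doe", 10001, 20001))
print(posix_group_ldif("gdal", 20001, ["jdoe"]))
```

One group per project would let Apache + SVN restrict commit access by group membership, which is presumably why the group layout needs to be settled before the SVN step.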

Apache + SVN + LDAP Authentication
Easy once the first three items (DNS, Maillists, LDAP) are up and running.

CMS
Special considerations for authentication and database(s) used for CMS. Otherwise, should be fairly straightforward.

Wiki
Hold off on moving this until things stabilize (other than moving a DNS record).

Bug tracking
Some folks hold their bug trackers very dear. I have experience with Bugzilla->Trac migration and Trac customization. Any bug tracker that is chosen will have an administration load that is probably the second or third largest item on the admin's duty list (Maillists being #1 and CMS babysitting being #2).

Further integration
Hold off on these until everything else is up and stabilized (and DNS is migrated).

= Infrastructure Integration =

This section describes the ways in which the components of the infrastructure interact. This is where CollabNet starts to excel; however, some catching up is needed to match the GeoTools project (Jive).

Collabnet Integration
1. Can use an existing system such as cruise control here

Q: Does the Collabnet Issue tracker interact with anything? Can we get an email of new bugs? Reply to those messages to comment?

Okay I am not doing Collabnet any justice here, what does it integrate? And how... Jive 14:29, 31 March 2006 (CEST) Can someone fill in the above table Jive 14:21, 31 March 2006 (CEST)

GeoTools Integration
Here is a worked example illustrating how the GeoTools project is integrated:


 * 1) not sure if this one works right now
 * 2) logs of IRC meetings are posted

Areas for improvements in integration:
 * single user name / password for svn, confluence, jira

GeoTools SVN Integration
We are capturing integration in a single direction: ie if we do a commit in svn what gets updated? Well everything in the following list...


 * SVN (commit) to Build
 * cruise control builds after a 30 min pause in commits
 * cruise control builds nightly
 * various cruise control instances watch different important branches, these are maintained by several organizations
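The "build after a 30 min pause in commits" rule is a quiet-period check. The sketch below shows the decision logic only; it is not how cruise control implements it, and the function and parameter names are made up for this example.

```python
from datetime import datetime, timedelta

QUIET_PERIOD = timedelta(minutes=30)


def build_due(last_commit, last_build, now, quiet_period=QUIET_PERIOD):
    """Return True if there are unbuilt commits and the repository has been quiet long enough."""
    # Nothing new since the last build: no need to build again.
    if last_build is not None and last_commit <= last_build:
        return False
    # Wait until no commit has arrived for the whole quiet period.
    return now - last_commit >= quiet_period


# Hypothetical timestamps, for illustration only.
commit_time = datetime(2006, 3, 31, 14, 0)
print(build_due(commit_time, None, datetime(2006, 3, 31, 14, 10)))
print(build_due(commit_time, None, datetime(2006, 3, 31, 14, 30)))
```

Waiting for a pause avoids building a half-finished series of commits, at the cost of delaying feedback by at least the quiet period.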


 * SVN (commit) to Tracker:
 * the Jira tracker will watch the svn code repository and pick up any commit message that mentions a Jira Number, these show up as comments against the Jira Issues, and serve to document a patch being applied in development and then stable branches
 * (not sure if we have configured this correctly)
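The commit-to-tracker link depends on recognising issue numbers in commit messages. As a sketch of that matching step (not Jira's actual implementation), a regular expression can pull out keys of the form `PROJECT-123`. The project key `GEOT` and the sample message are assumptions used for illustration.

```python
import re


def issue_keys(commit_message, project="GEOT"):
    """Return the issue keys for one project mentioned in a commit message, in order, without repeats."""
    pattern = re.compile(rf"\b{re.escape(project)}-\d+\b")
    keys = []
    for key in pattern.findall(commit_message):
        if key not in keys:
            keys.append(key)
    return keys


# Invented commit message, for illustration only.
print(issue_keys("GEOT-123: fix CRS parsing, see also GEOT-45 and GEOT-123"))
```

Restricting the match to a known project key avoids false hits on strings such as "UTF-8", which a generic `LETTERS-digits` pattern would also pick up.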


 * SVN (commit) to IRC:
 * CIA is used to host a bot on the freenode#geotools IRC channel. It cheerfully chirps up about commits, but does not answer questions


 * SVN (commit) to Email
 * there is a geotools-commits email list that can be subscribed to


 * SVN (commit) to WIKI
 * you can use wiki syntax to grab lines of a file from svn, these can be used as code examples on demo and tutorial pages


 * SVN (commit) to Web
 * cruise control is used to generate a javadoc website based on svn
 * we build a website for module information based on maven 2, a series of performance metrics are gathered using tools like clover.

= Tool Selection Criteria =

 * If a project were to leave OSGeo, it should be able to set up and use all the tools without a license cost.
 * Ideally, the tools should be open source, but a "Free for Open Source" licence is acceptable.
 * Tools that use Open Standards will be used where possible, as this reduces the risk of vendor lock-in.
 * Tools should be able to input data from existing OSGeo projects with minimal effort and without losing history.

= CollabNet's Point of View =

A few words on the CollabNet platform by Daniel Brookshier of CollabNet:

I wanted to point out a couple of things about what the CollabNet tool really is. First, it is a multi-project software collaboration system and service. It is not the same as say setting up your own email list and a subversion repository - rather it is a way to provision and manage dozens or thousands of projects within a single domain and common identity system. Also, this is not just a 'tool', but a service. You have dedicated servers, bandwidth, 24 hour support and guaranteed uptime. You also have my services to help you get going with your projects and as an advisor to you and the foundation.

As for the tool itself, I won't say that we are best in breed in all the components (except svn), but we are best in breed as an integrated multi-host platform. The only close competitor is from VA, who also runs SourceForge. Their tool is just a license to use and does not include servers, bandwidth, or 24/7 operations support and uptime guarantees. The VA tool is also not the same software as SourceForge and is in fact a different code base and does not support many features of SourceForge. We have most of the market share as compared to VA, so the consumer market seems to think we are pretty good (we have more than 800 thousand registered users). We also cost less than VA overall because you save by us consolidating support among our many customers.

A lot of you are used to different tools and are also experienced with the rich feature sets available to admins. Not seeing your favorite features is a hard thing to accept and we understand that. We can support limited customizations, but I need to kick all requests back to CollabNet and see what is possible and if there is a cost for such work. The platform may seem to have fewer features here and there, but this is in part to improve usability for users in general.

We also understand the risks of migrating in and out of the CollabNet tool and I am getting specifics on this from support. I have noted what I do know below for some of the issues addressed. We want to do as much as we can to help you migrate and to feel that it is worth doing. My goal is to create a strong community and, as part of that, the more projects hosted under one roof, the better we attract and retain great community members and their users.