Bare faced cheek

I guess a few people will mention this over the next few days but I can’t let it pass without comment. Meta Group performed a TCO (Total Cost of Ownership) study comparing MS Exchange with Lotus Domino. The report, which came out pro-Microsoft, was riddled with fundamental flaws (which are documented here) and was removed from Microsoft’s website pretty quickly after the Lotus community discovered it.

The thing which surprised me most was this statement:

“Other than the funding — sponsoring this comparison — Microsoft had nothing to do with the outcome of the survey, so we stand by the fact that this [research] is very independent,”

It just beggars belief that someone can say this with a straight face. But you have to give the guy kudos for trying!

Catch Up

After a couple of very busy weeks and no relaxation at all I have just been having a slow day of sleeping, tinkering and catching up with paperwork. It shouldn’t make much difference but working for 12 days straight really takes it out of me.

Not a good day

After the long weekend I was hoping that today might be a little easier. Unfortunately not. When I got in this morning one of the servers had a fairly massive hardware failure which took it out of commission for an hour. Then the other server in the cluster managed to corrupt its address book, so we have run for most of the day with absolutely no fallback position. It all seems to be OK now but I think there is a perception that this is due to the upgrade yesterday. I can’t see any link; it just seems to be very unfortunate timing.

On a completely different note, I spent my train journey home fixing my RSS feed. I hadn’t realised it was broken until Ed pointed it out to me. RSS seems to be overly picky in my view. I had tested the feed with NetNewsWire and the new RSS feature in Firefox, but it seems to fail other RSS validation. I suppose those two just let more through than the standards actually specify. Anyway, it seems to be working now so please let me know if there are still problems.
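For anyone wanting to run a stricter check than a forgiving reader does, here is a minimal sketch. It assumes the feed is RSS 2.0 and that the failure is a well-formedness problem such as an unescaped ampersand; that is only a guess at a common cause, not necessarily what was wrong with my feed. The class and method names are made up for illustration, and the check uses the standard Java XML parser rather than a real feed validator.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper: a strict XML parse plus a root-element check.
class FeedCheck {

    // Returns true only if the text parses as well-formed XML
    // and its root element is <rss>.
    static boolean isWellFormedRss(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return "rss".equals(doc.getDocumentElement().getNodeName());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // A strict parser rejects what a tolerant reader may accept.
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<rss version=\"2.0\"><channel><title>A &amp; B</title></channel></rss>";
        String bad = "<rss version=\"2.0\"><channel><title>A & B</title></channel></rss>";
        System.out.println("escaped ampersand:   " + isWellFormedRss(good));
        System.out.println("unescaped ampersand: " + isWellFormedRss(bad));
    }
}
```

This only catches malformed XML. A proper validator also checks the elements the RSS specification requires, so passing this test is necessary but not sufficient.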

Completed

We finished at about 9 o’clock last night when UAT was signed off. There are a couple of minor issues to sort out today but unfortunately we have now had a hardware failure on one of the servers in the cluster so that has to be dealt with first of all.

I think today might be a long one as well!

And we’re into the home straight

The servers are upgraded and compacted, the designs have been rolled out and I have done all of my config changes. The UAT team are in just running through their scripts. A couple of little funnies have been found but they are being caused by me being tired rather than any problems with the server. Hopefully we’ll have sign-off in an hour or so.

Decision Made

It’s been agreed that we carry on with the upgrade. There are some risks associated with it but the general feeling was that they were acceptable and that the impact of a rollback on the other projects would be too great. So I am in here for the rest of the day, waiting for another compact to finish before I perform all of my tasks.

It’s going to be a long day, not helped by the lack of air con in the office, which means it’s over 30 degrees in here.

Some progress but still likely to rollback

OK, we now think we understand (at least to some degree) why the server is not happy with our jar file. We had included some Java 1.2 collections classes (e.g. HashMap, Hashlist etc) which obviously were not in the 1.1 JVM. Now that the server has gone up to 1.3 our extra inclusions are no longer required. This bit I am happy with, but what I still don’t understand is why it should cause the production servers to crash when the dev and test servers have exactly the same setup and run fine.

Even though I am 90% sure that removing the offending class files from the jar will solve the problem, I still suspect that we will end up spending the day rolling back. I can’t say that this is the wrong decision but it will definitely cause a bad week of retesting, explanations to management and delays for the other systems which are waiting for this upgrade to happen.
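A rough sketch of how one might find such classes in a jar before an upgrade. It lists class files that sit under package paths the JVM itself normally supplies. The prefixes `java/` and `javax/` are an assumption for illustration (our bundled collections classes may well have lived under a different package), and the class and method names are hypothetical. It uses modern Java syntax, so it is meant to run on a workstation, not on the old server JVM.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: flags class files bundled under JVM-owned package paths.
class JarShadowCheck {

    // Returns the entries that are .class files under any of the given prefixes.
    static List<String> shadowedEntries(List<String> entryNames, List<String> prefixes) {
        List<String> hits = new ArrayList<>();
        for (String name : entryNames) {
            if (!name.endsWith(".class")) {
                continue; // ignore resources, manifests, directories
            }
            for (String prefix : prefixes) {
                if (name.startsWith(prefix)) {
                    hits.add(name);
                    break;
                }
            }
        }
        return hits;
    }

    // Reads every entry name from the jar at the given path.
    static List<String> entryNames(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarShadowCheck <path-to-jar>");
            return;
        }
        // Assumed prefixes; adjust to wherever the extra classes were packaged.
        List<String> prefixes = Arrays.asList("java/", "javax/");
        for (String hit : shadowedEntries(entryNames(args[0]), prefixes)) {
            System.out.println("bundled class under a JVM package: " + hit);
        }
    }
}
```

Anything it reports is a candidate for removal once the server JVM provides the class itself. It says nothing about why one server tolerates the duplicate and another does not.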

All this proves just how important a rollback plan is. It’s the first time in my career that I have actually had to use the plan, but it just emphasises that planning for the unexpected is never wasted.

Not good news

Unfortunately the upgrade has not gone to plan. The servers upgraded fine and the compact went without a hitch, but when the admin team tried to restart, both servers in the cluster crashed with a JVM error message.

Some very quick investigation work has narrowed the problem down to one of our jar files which is used by servlets, but I do not yet understand why this file should cause a problem. It is fine on our dev server and both test servers.

So cue an emergency phone conference at 9am tomorrow. I have to be in at 7 to test a workaround which I have proposed, but even if it works I can’t see the upgrade continuing. Then we will be into emergency rollback mode, which is not difficult but is time-consuming simply due to the size of the files involved. And then we’ll have to try again next weekend once we have spoken to Lotus Support during the week, which ties in rather nicely with an interesting discussion currently happening on Ed’s blog.

I was meant to be going out tonight to see Richard Herring but I’m not sure I can face it now. I’ll have some dinner and then decide.

Upgrade weekend

The start of what could be a nice simple weekend or a nightmare. We are upgrading a fairly large application which I am involved in to ND6. The actual server upgrades are happening this morning and then a compact starts at lunchtime, which will take some time as our NSFs are about 40 GB in total. The plan is for me to go in for an hour or so this afternoon to make sure everything is progressing OK and then get in for 07:00 tomorrow to do the main bit of upgrade work for the application. Hopefully it should all be done by 13:00 but I have learned never to make assumptions about these things.